7

Lets consider this small example:

df1<- data.frame(A=c(1,NA,"pvalue",0.0003),B=c(0.5,7,"I destroy","numbers all day"),stringsAsFactors = T)

Write file:

openxlsx::write.xlsx(df1,"Test.xlsx")

In my resulting excel file, 1 and 7 are text cells. Excel has the "intuition" that they are numbers stored as text. I can convert them by hand.

How can I convert those "flagged" values automatically to numbers from inside R?

In the "What I want" I have by hand converted the TEXT into Numbers. It's an option behind the "green triangle" in the "What I get" Part (red arrows).

enter image description here

@Roland's comment: Rearranging as list does not work.

df1<- as.data.frame(cbind(A=list(1,NA_real_,"pvalue",0.0003),B=list(0.5,7,"I destroy","numbers all day")))
openxlsx::write.xlsx(df1,"Test2.xlsx")
Andre Elrico
  • 8,959
  • 1
  • 37
  • 61
  • You will not be able to convert the 1 and 7 to numbers since they are in a `char` variable – SmitM Aug 27 '18 at 13:58
  • maybe I can setup my df1 differently. – Andre Elrico Aug 27 '18 at 14:01
  • I think you would need to make sure that the columns always contain the same data type. This means store all the numbers in one column and the text in another one. – hannes101 Aug 27 '18 at 14:02
  • That is not possible. Its for an report formatting is important and a mix von num and char is needed. – Andre Elrico Aug 27 '18 at 14:05
  • If it's just for reporting and not calculations, why does it matter if it's character or numeric format, as long as the numbers shown are correct, no? – acylam Aug 27 '18 at 14:09
  • 1
    Maybe I was not precise with the word "report". Its an output I deliver, that should look nice and have meaningful cell types and others can directly continue on working. – Andre Elrico Aug 27 '18 at 14:12
  • 2
    In a data.frame each column can hold only exactly one data type. If that doesn't work for you, you can't use a data.frame. You could use a list of lists instead (which is not nearly as convenient). I agree with the comment above, rearrange your data. Data analysis and data reporting are two distinct tasks. You can't let the latter limit how you do the former. – Roland Aug 27 '18 at 14:17
  • You should probably just use a "tidy" data.frame and write some kind of report generator that transfers the data to Excel in the desired arrangement. – Roland Aug 27 '18 at 14:18
  • Instead of writing the data with write.xlsx in one go, start using writeData. Then you can specify where the data on the sheet needs to go and if you read the formatting vignette you can format those individual sections to your own taste. But as @Roland says, you should work in a tidy format, because if you transfer character numbers to excel you can't transform them to numbers except in excel. – phiver Aug 27 '18 at 14:32
  • A nice article about working with excel in R can be found here: https://www.r-bloggers.com/writing-from-r-to-excel-with-xlsx/ . As you can see, you can format each cell seperately. If it is always the same format that you need, it might be a good idea to give it a try. – Arno Aug 27 '18 at 15:15

3 Answers3

4

I wrote a small piece of code following the suggestions of @Roland and @phiver. It starts with a tidy data.frame (to preserve the data type of each cell) and save values one by one:

library(openxlsx)
df1<- as.data.frame(cbind(A=list(1,NA_real_,"pvalue",0.0003),B=list(0.5,7,"I destroy","numbers all day")))

wb <- createWorkbook()
sheet.name <- 'test'
addWorksheet(wb, sheet.name)

for(i in seq_along(df1)){
    writeData(wb, sheet = sheet.name, names(df1)[i], startCol = i, startRow = 1)
    icol <- df1[[i]]
    for(j in seq_along(icol)){
        x <- icol[[j]]
        writeData(wb, sheet = sheet.name, x, startCol = i, startRow = j + 1)
    }
}
saveWorkbook(wb, file = "Test.xlsx")

enter image description here

Hope this works for your data.

mt1022
  • 15,027
  • 4
  • 36
  • 59
3

thanks @mt1022 added the validator to let 000123 stay 000123 in the helpers function part

A solution that can do what openxlsx::write.xlsx() can do + "finding meaningful types".

function: (its 98% openxlsx::write.xlsx)

writeXlsxWithTypes <- function(x, file, asTable = FALSE, ...) {
    library(magrittr);library(openxlsx);

    if(T) {
        setTypes <- function(x) {
            x %<>%
                lapply(function(xX){
                    lapply(xX ,function(u) {
                        if(canConvert(u)) { type.convert(as.character(u), as.is = TRUE) } else { u }
                    })
                }) %>% do.call(cbind, .) %>% as.data.frame
        } #types fun

        validateBorderStyle <- function(borderStyle){


            valid <- c("none", "thin", "medium", "dashed", "dotted", "thick", "double", "hair", "mediumDashed", 
                       "dashDot", "mediumDashDot", "dashDotDot", "mediumDashDotDot", "slantDashDot")

            ind <- match(tolower(borderStyle), tolower(valid))
            if(any(is.na(ind)))
                stop("Invalid borderStyle", call. = FALSE)

            return(valid[ind])

        }

        validateColour <- function(colour, errorMsg = "Invalid colour!"){

            ## check if
            if(is.null(colour))
                colour = "black"

            validColours <- colours()

            if(any(colour %in% validColours))
                colour[colour %in% validColours] <- col2hex(colour[colour %in% validColours])

            if(any(!grepl("^#[A-Fa-f0-9]{6}$", colour)))
                stop(errorMsg, call.=FALSE)

            colour <- gsub("^#", "FF", toupper(colour))

            return(colour)

        }
        #x="0001"
        canConvert <- function(x) {
            return( !grepl("^0+\\.?\\d",x) )
            }
    } # define helper functions

    if(T) {
        params <- list(...)
        if (!is.logical(asTable)) 
            stop("asTable must be a logical.")
        creator <- ifelse("creator" %in% names(params), params$creator, 
                          "")
        title <- params$title
        subject <- params$subject
        category <- params$category
        sheetName <- "Sheet 1"
        if ("sheetName" %in% names(params)) {
            if (any(nchar(params$sheetName) > 31)) 
                stop("sheetName too long! Max length is 31 characters.")
            sheetName <- as.character(params$sheetName)
            if ("list" %in% class(x) & length(sheetName) == length(x)) 
                names(x) <- sheetName
        }
        tabColour <- NULL
        if ("tabColour" %in% names(params)) 
            tabColour <- validateColour(params$tabColour, "Invalid tabColour!")
        zoom <- 100
        if ("zoom" %in% names(params)) {
            if (is.numeric(params$zoom)) {
                zoom <- params$zoom
            }
            else {
                stop("zoom must be numeric")
            }
        }
        gridLines <- TRUE
        if ("gridLines" %in% names(params)) {
            if (all(is.logical(params$gridLines))) {
                gridLines <- params$gridLines
            }
            else {
                stop("Argument gridLines must be TRUE or FALSE")
            }
        }
        overwrite <- TRUE
        if ("overwrite" %in% names(params)) {
            if (is.logical(params$overwrite)) {
                overwrite <- params$overwrite
            }
            else {
                stop("Argument overwrite must be TRUE or FALSE")
            }
        }
        withFilter <- TRUE
        if ("withFilter" %in% names(params)) {
            if (is.logical(params$withFilter)) {
                withFilter <- params$withFilter
            }
            else {
                stop("Argument withFilter must be TRUE or FALSE")
            }
        }
        startRow <- 1
        if ("startRow" %in% names(params)) {
            if (all(startRow > 0)) {
                startRow <- params$startRow
            }
            else {
                stop("startRow must be a positive integer")
            }
        }
        startCol <- 1
        if ("startCol" %in% names(params)) {
            if (all(startCol > 0)) {
                startCol <- params$startCol
            }
            else {
                stop("startCol must be a positive integer")
            }
        }
        colNames <- TRUE
        if ("colNames" %in% names(params)) {
            if (is.logical(params$colNames)) {
                colNames <- params$colNames
            }
            else {
                stop("Argument colNames must be TRUE or FALSE")
            }
        }
        if ("col.names" %in% names(params)) {
            if (is.logical(params$col.names)) {
                colNames <- params$col.names
            }
            else {
                stop("Argument col.names must be TRUE or FALSE")
            }
        }
        rowNames <- FALSE
        if ("rowNames" %in% names(params)) {
            if (is.logical(params$rowNames)) {
                rowNames <- params$rowNames
            }
            else {
                stop("Argument colNames must be TRUE or FALSE")
            }
        }
        if ("row.names" %in% names(params)) {
            if (is.logical(params$row.names)) {
                rowNames <- params$row.names
            }
            else {
                stop("Argument row.names must be TRUE or FALSE")
            }
        }
        xy <- NULL
        if ("xy" %in% names(params)) {
            if (length(params$xy) != 2) 
                stop("xy parameter must have length 2")
            xy <- params$xy
        }
        headerStyle <- NULL
        if ("headerStyle" %in% names(params)) {
            if (length(params$headerStyle) == 1) {
                if ("Style" %in% class(params$headerStyle)) {
                    headerStyle <- params$headerStyle
                }
                else {
                    stop("headerStyle must be a style object.")
                }
            }
            else {
                if (all(sapply(params$headerStyle, function(x) "Style" %in% 
                               class(x)))) {
                    headerStyle <- params$headerStyle
                }
                else {
                    stop("headerStyle must be a style object.")
                }
            }
        }
        borders <- NULL
        if ("borders" %in% names(params)) {
            borders <- tolower(params$borders)
            if (!all(borders %in% c("surrounding", "rows", "columns", 
                                    "all"))) 
                stop("Invalid borders argument")
        }
        borderColour <- getOption("openxlsx.borderColour", "black")
        if ("borderColour" %in% names(params)) 
            borderColour <- params$borderColour
        borderStyle <- getOption("openxlsx.borderStyle", "thin")
        if ("borderStyle" %in% names(params)) {
            borderStyle <- validateBorderStyle(params$borderStyle)
        }
        keepNA <- FALSE
        if ("keepNA" %in% names(params)) {
            if (!"logical" %in% class(keepNA)) {
                stop("keepNA must be a logical.")
            }
            else {
                keepNA <- params$keepNA
            }
        }
        tableStyle <- "TableStyleLight9"
        if ("tableStyle" %in% names(params)) 
            tableStyle <- params$tableStyle
        colWidths <- ""
        if ("colWidths" %in% names(params)) 
            colWidths <- params$colWidths
    } # params check

    if(class(x) == "data.frame") {
        x %<>% setTypes %>% list
    } else {
        lNames <- names(x)
        x %<>% lapply(setTypes)
    }

    if(T) {   
        nms <- names(x)
        nSheets <- length(x)
        if (is.null(nms)) {
            nms <- paste("Sheet", 1:nSheets)
        }
        else if (any("" %in% nms)) {
            nms[nms %in% ""] <- paste("Sheet", (1:nSheets)[nms %in% 
                                                               ""])
        }
        else {
            nms <- make.unique(nms)
        }
        if (any(nchar(nms) > 31)) {
            warning("Truncating list names to 31 characters.")
            nms <- substr(nms, 1, 31)
        }
        if (!is.null(tabColour)) {
            if (length(tabColour) != nSheets) 
                tabColour <- rep_len(tabColour, length.out = nSheets)
        }
        if (length(zoom) != nSheets) 
            zoom <- rep_len(zoom, length.out = nSheets)
        if (length(gridLines) != nSheets) 
            gridLines <- rep_len(gridLines, length.out = nSheets)
        if (length(withFilter) != nSheets) 
            withFilter <- rep_len(withFilter, length.out = nSheets)
        if (length(colNames) != nSheets) 
            colNames <- rep_len(colNames, length.out = nSheets)
        if (length(rowNames) != nSheets) 
            rowNames <- rep_len(rowNames, length.out = nSheets)
        if (length(startRow) != nSheets) 
            startRow <- rep_len(startRow, length.out = nSheets)
        if (length(startCol) != nSheets) 
            startCol <- rep_len(startCol, length.out = nSheets)
        if (!is.null(headerStyle)) 
            headerStyle <- lapply(1:nSheets, function(x) return(headerStyle))
        if (length(borders) != nSheets & !is.null(borders)) 
            borders <- rep_len(borders, length.out = nSheets)
        if (length(borderColour) != nSheets) 
            borderColour <- rep_len(borderColour, length.out = nSheets)
        if (length(borderStyle) != nSheets) 
            borderStyle <- rep_len(borderStyle, length.out = nSheets)
        if (length(keepNA) != nSheets) 
            keepNA <- rep_len(keepNA, length.out = nSheets)
        if (length(asTable) != nSheets) 
            asTable <- rep_len(asTable, length.out = nSheets)
        if (length(tableStyle) != nSheets) 
            tableStyle <- rep_len(tableStyle, length.out = nSheets)
        if (length(colWidths) != nSheets) 
            colWidths <- rep_len(colWidths, length.out = nSheets)
    }  # setup and validation

    wb <- openxlsx::createWorkbook(creator = creator, title = title, subject = subject, 
                         category = category)

    for (i in 1:nSheets) {

        if(T) {

            wb$addWorksheet(nms[[i]], showGridLines = gridLines[i], 
                            tabColour = tabColour[i], zoom = zoom[i])
            if (asTable[i]) {

                for(ii in seq_along(x[[i]])){
                    openxlsx::writeDataTable(wb = wb, sheet = i, x = names(x[[i]])[[ii]],
                                             startCol = ii, startRow = 1, 
                                             xy = xy, colNames = colNames[[i]], rowNames = rowNames[[i]], 
                                             tableStyle = tableStyle[[i]], tableName = NULL, 
                                             headerStyle = headerStyle[[i]], withFilter = withFilter[[i]], 
                                             keepNA = keepNA[[i]]
                                             )
                    icol <- x[[i]][[ii]]

                    for(j in seq_along(icol)){
                        dati <- icol[[j]]

                        openxlsx::writeData(wb = wb, sheet = i,x = dati,
                                            startCol = ii, startRow = j+1, 
                                            xy = xy, colNames = colNames[[i]], rowNames = rowNames[[i]], 
                                            tableStyle = tableStyle[[i]], tableName = NULL, 
                                            headerStyle = headerStyle[[i]], withFilter = withFilter[[i]], 
                                            keepNA = keepNA[[i]]
                                            )
                    }
                }
            }
            else {

                for(ii in seq_along(x[[i]])){

                    openxlsx::writeData(wb = wb, sheet = i, x = names(x[[i]])[[ii]],
                                        startCol = ii, startRow = 1,
                                        xy = xy, colNames = colNames[[i]], rowNames = rowNames[[i]],
                                        headerStyle = headerStyle[[i]],
                                        borders = borders[[i]], borderColour = borderColour[[i]], borderStyle = borderStyle[[i]],
                                        keepNA = keepNA[[i]]
                    )
                    icol <- x[[i]][[ii]]

                    for(j in seq_along(icol)){
                        dati <- icol[[j]]

                        openxlsx::writeData(wb = wb, sheet = i,x = dati,
                                            startCol = ii, startRow = j+1, 
                                            xy = xy, colNames = colNames[[i]], rowNames = rowNames[[i]],
                                            headerStyle = headerStyle[[i]],
                                            borders = borders[[i]], borderColour = borderColour[[i]], borderStyle = borderStyle[[i]],
                                            keepNA = keepNA[[i]]
                        )
                    }
                }
            }
            if (colWidths[i] %in% "auto") 
                setColWidths(wb, sheet = i, cols = 1:ncol(x[[i]]) + 
                                 startCol[[i]] - 1L, widths = "auto")

            } #from list



    }

    if(T) {
        freezePanes <- FALSE
        firstActiveRow <- rep_len(1L, length.out = nSheets)
        if ("firstActiveRow" %in% names(params)) {
            firstActiveRow <- params$firstActiveRow
            freezePanes <- TRUE
            if (length(firstActiveRow) != nSheets) 
                firstActiveRow <- rep_len(firstActiveRow, length.out = nSheets)
        }
        firstActiveCol <- rep_len(1L, length.out = nSheets)
        if ("firstActiveCol" %in% names(params)) {
            firstActiveCol <- params$firstActiveCol
            freezePanes <- TRUE
            if (length(firstActiveCol) != nSheets) 
                firstActiveCol <- rep_len(firstActiveCol, length.out = nSheets)
        }
        firstRow <- rep_len(FALSE, length.out = nSheets)
        if ("firstRow" %in% names(params)) {
            firstRow <- params$firstRow
            freezePanes <- TRUE
            if ("list" %in% class(x) & length(firstRow) != nSheets) 
                firstRow <- rep_len(firstRow, length.out = nSheets)
        }
        firstCol <- rep_len(FALSE, length.out = nSheets)
        if ("firstCol" %in% names(params)) {
            firstCol <- params$firstCol
            freezePanes <- TRUE
            if ("list" %in% class(x) & length(firstCol) != nSheets) 
                firstCol <- rep_len(firstCol, length.out = nSheets)
        }
        if (freezePanes) {
            for (i in 1:nSheets) openxlsx::freezePane(wb = wb, sheet = i, 
                                            firstActiveRow = firstActiveRow[i], firstActiveCol = firstActiveCol[i], 
                                            firstRow = firstRow[i], firstCol = firstCol[i])
        }
    } # additional settings/Options

    openxlsx::saveWorkbook(wb = wb, file = file, overwrite = overwrite)

    return(invisible(NULL))
}

example data:

df1 <- mtcars

df1[1,3]<-"ID =====>"
df1[1,4]<-"00000123"
df1[3,7]<-NA
df1[2,6]<-"stringi"

ldf <- list(NOW=df1, WITH=df1, LISTS=df1)

call:

writeXlsxWithTypes(df1, "test_normal3.xlsx" , rowNames = TRUE, borders = "surrounding")
writeXlsxWithTypes(ldf, "test_list3.xlsx", rowNames = TRUE, borders = "surrounding")
Andre Elrico
  • 8,959
  • 1
  • 37
  • 61
  • Nice try. A caveat is that `type.convert` is not always desirable. For example if I have a string of ID number like `"00001230"` that is to be written into a excel file, `type.convert`will convert it to a integer `1230`. However, the automatic conversion makes no sense. – mt1022 Aug 28 '18 at 11:19
0

Just in case it helps someone else, I imported an excel document, did a bunch of manipulations on the dataframe and then wrote it out as a new excel document. I didn't want to put the conversion from char to numeric in the dataframe, because it would mess with my existing code, so I put it in the writeData bit.

wb <- createWorkbook()
lapply(listOfDFs, function(x) addWorksheet(wb, sheetName = x))   
for (n in 1:length(listOfDFs)) {
  sheet <- allDFs[[n]]
  for (row in 1:nrow(sheet)){
    sheetRow <- data.frame(lapply(sheet[row,], function(x){type.convert(as.character(x))}), check.names = FALSE, stringsAsFactors = FALSE)
    if (row == 1) {
      writeData(wb, sheet = n, x = sheetRow, startRow = row, colNames = TRUE)
    } else {
      writeData(wb, sheet = n, x = sheetRow, startRow = row+1, colNames = FALSE)
    }
  }
}
saveWorkbook(wb, file = "test.xlsx", overwrite = TRUE)
vorpal
  • 193
  • 1
  • 11