Skip/delete rows having empty cells

Question

I am uploading and reading a text file in my Shiny app. Here is how I read it:

data <- reactive({
    req(input$file)
    df <- read.table(file=input$file$datapath[input$file$name==input$Select], skip = 15, sep=input$sep, header = input$header, stringsAsFactors = input$stringAsFactors)
    updateSelectInput(session, inputId = 'xcol', label = 'X Variable',
                      choices = names(df), selected = names(df)[1])
    updateSelectInput(session, inputId = 'ycol', label = 'Y Variable',
                      choices = names(df), selected = names(df)[2])
    return(df)
  })

Now I want to delete/skip the rows in the uploaded dataset that have empty cells.

My attempt:

df[!apply(df == "", 1, all),]

But it is not working.

Is there a different way to do it when using read.table?

I think [this](https://stackoverflow.com/questions/4862178/remove-rows-with-all-or-some-nas-missing-values-in-data-frame) post answers your question. — maydin, Aug 12 '20 at 18:12
If your file is a single column, then you could try `read.table(..., blank.lines.skip=TRUE)` ... which is its default, suggesting that your data has more than one column. In that case, the answer is **No**, you need to do it post-`read.table`. — r2evans, Aug 12 '20 at 18:12

score 1 · Accepted Answer · answered Aug 12 '20 at 18:16

@maydin's link works great for NA values, but you'll need a little bit more to check for a specific value (i.e., "", the empty string).

df <- data.frame(a=c('a','b',''), b="")
rowSums(df != "") == 0
# [1] FALSE FALSE  TRUE

That tells you which rows contain exactly zero non-empty strings. If even one column in a row holds anything other than the zero-length string, that row comes back FALSE.

Using this, we keep only the rows with one or more non-empty strings.

df[rowSums(df != "") > 0, ]
#   a b
# 1 a  
# 2 b
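
Note that the filter above drops only rows where every cell is empty. If the goal is to drop rows where any cell is empty, the comparison flips. The sketch below shows both variants on toy data; it also treats NA and whitespace-only strings as empty, which is an assumption about what "empty" means in your file. `is_empty` is a hypothetical helper name, not a base R function.

```r
# Toy data: row 3 has one empty cell, row 4 is entirely empty,
# row 5 has a whitespace-only cell and an NA.
df <- data.frame(a = c("a", "b", "", "", " "),
                 b = c("1", "2", "3", "", NA),
                 stringsAsFactors = FALSE)

# Hypothetical helper: TRUE where a cell is NA or blank after trimming.
is_empty <- function(d) {
  m <- sapply(d, function(col) is.na(col) | trimws(as.character(col)) == "")
  matrix(m, nrow = nrow(d))  # keep a matrix shape even for a single row
}

empty <- is_empty(df)

# Drop rows where ALL cells are empty (what the answer above does).
drop_all <- df[rowSums(!empty) > 0, , drop = FALSE]

# Drop rows where ANY cell is empty.
drop_any <- df[rowSums(empty) == 0, , drop = FALSE]
```

Here `drop_all` keeps rows 1 to 3 and `drop_any` keeps rows 1 and 2, because row 5 counts as entirely empty under the assumption above.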

r2evans

77,184
4
55
96

I did not completely understand your answer. How can I use 'df[rowSums(df != "") > 0, ]' in read.table ? – kolas0202 Aug 12 '20 at 18:52
As I said in my first comment, you cannot. What you're asking of `read.table` is to conditionally exclude rows based on individual cell contents. The arguments in `read.table` that allow omitting rows are `nrows`, `skip`, `blank.lines.skip`, `comment.char`, and possibly `skipNul`. None of those allow the logic of *"if a **column** is blank then skip the **row**"*. Anything on that scale needs to be done either before or after `read.table`. – r2evans Aug 12 '20 at 18:59
Ok, understood. – kolas0202 Aug 12 '20 at 19:10
I tried using 'df[rowSums(df != "") > 0, ]' and 'df[!apply(df == "", 0, all), ]' after read.table. Neither works. – kolas0202 Aug 12 '20 at 19:45
Did you get it to work, kolas0202? (I see you accepted it since that comment.) If it doesn't work, it might be for several things, including not-quite-empty strings, `NA`s, or other logic that might dissuade `rowSums` from doing its job. If you need more help please include sample data (via `dput(head(x))`). Thanks! – r2evans Aug 12 '20 at 20:34
I figured it out. – kolas0202 Aug 12 '20 at 23:04

score 0 · Answer 2 · answered Aug 12 '20 at 20:21

I got my answer. Here is the code.

Raw data:

data <- reactive({
    req(input$file)
    df <- read.table(file=input$file$datapath[input$file$name==input$Select], skip = 15, sep=input$sep, header = input$header, stringsAsFactors = input$stringAsFactors, skipNul = TRUE, na.strings = "")
    updateSelectInput(session, inputId = 'xcol', label = 'X Variable',
                      choices = names(df), selected = names(df)[1])
    updateSelectInput(session, inputId = 'ycol', label = 'Y Variable',
                      choices = names(df), selected = names(df)[2])
    return(df)
  })

Data after removing the rows with empty cells:

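data_1 <- reactive({
    req(input$file)
    # Keep only the selected X and Y columns
    x <- data()[, c(input$xcol, input$ycol)]
    # Treat any remaining empty strings as missing
    x[x == ""] <- NA
    # Drop every row that contains at least one NA
    M <- na.omit(x)
    return(M)
  })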
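
Outside Shiny, the same idea can be checked on a small inline table. This is a minimal sketch: the sample text, the comma separator, and the column names are made up for illustration. Since `na.strings = ""` should already turn unquoted empty fields into NA while reading, `na.omit()` on its own is enough in this example.

```r
# Made-up sample: the second data row has an empty y, the third an empty x.
txt <- "x,y
1,a
2,
,c
4,d"

# na.strings = "" makes read.table store empty fields as NA.
df <- read.table(text = txt, sep = ",", header = TRUE,
                 stringsAsFactors = FALSE, na.strings = "")

# na.omit() drops every row that contains at least one NA.
clean <- na.omit(df)
```

Only the first and last data rows survive in `clean`. If empty strings can still reach the data some other way, the explicit `x[x == ""] <- NA` step remains a reasonable safeguard.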
