I need to exclude one column from the check, because I know it contains only NA values.
I've used this topic as an example before, but I'm not sure how to exclude a column.

There are several data frames like the one below, each with a different number of columns. In each one I want to drop every row that contains an NA, ignoring the NAs in the last column.

PIG_ID      WEIGHT HERD_YEAR_BIRTH BIRTH_MONTH PARITY_BLL_DAM SEX_CODE LIVE_BORN_LITTER LIVE_BORN_LITTER2 VALID_COLUMN
13513130           7.5      65433130215          09              2        Z                9                81         <NA>
654605132           7.0      5646846421          04              3        Z                4                16         <NA>
654068065           4.0      5530201049          <NA>              3        B               15               225         <NA>

How can I get this done with complete.cases?

Expected output would be:

PIG_ID      WEIGHT HERD_YEAR_BIRTH BIRTH_MONTH PARITY_BLL_DAM SEX_CODE LIVE_BORN_LITTER LIVE_BORN_LITTER2 VALID_COLUMN
13513130           7.5      65433130215          09              2        Z                9                81         <NA>
654605132           7.0      5646846421          04              3        Z                4                16         <NA>
Bas

1 Answer

We can apply complete.cases to the subset of the dataset without 'VALID_COLUMN' and use the resulting logical vector as the row index.

df1[complete.cases(df1[setdiff(names(df1), 'VALID_COLUMN')]),]
#     PIG_ID WEIGHT HERD_YEAR_BIRTH BIRTH_MONTH PARITY_BLL_DAM SEX_CODE
#1  13513130    7.5     65433130215          09              2        Z
#2 654605132    7.0      5646846421          04              3        Z
#  LIVE_BORN_LITTER LIVE_BORN_LITTER2 VALID_COLUMN
#1                9                81         <NA>
#2                4                16         <NA>
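
Since the question mentions several data frames with different numbers of columns, the same idea could be wrapped in a small function and applied to each one. This is only a sketch: the helper name drop_incomplete and the toy data frames df_a and df_b are made up for illustration and are not part of the original data.

```r
# Hypothetical helper: keep the rows that are complete in every column
# except the one(s) named in `ignore`.
drop_incomplete <- function(df, ignore = "VALID_COLUMN") {
  keep_cols <- setdiff(names(df), ignore)
  df[complete.cases(df[keep_cols]), , drop = FALSE]
}

# Made-up example data frames with different numbers of columns
df_a <- data.frame(
  PIG_ID = c(1, 2, 3),
  BIRTH_MONTH = c("09", "04", NA),
  VALID_COLUMN = NA
)
df_b <- data.frame(
  PIG_ID = c(4, 5),
  WEIGHT = c(NA, 7.0),
  SEX_CODE = c("Z", "B"),
  VALID_COLUMN = NA
)

# Apply the helper to every data frame in a named list
cleaned <- lapply(list(a = df_a, b = df_b), drop_incomplete)
```

If the column to ignore is always the last one rather than a known name, indexing by position with df[complete.cases(df[-ncol(df)]), ] should work in the same way.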
akrun