47

If I have a large data frame (hundreds and hundreds of columns) whose column names are in no particular alphabetical order:

df.x <- data.frame(2:11, 1:10, rnorm(10))
colnames(df.x) <- c("ID", "string", "delta")

How would I reorder the columns, each together with its data, so that the column names end up in alphabetical order?

Essentially, I have hundreds of pipe-delimited text files (CSV with sep="|") whose columns I need to read into a single data frame, reorder alphabetically by column name, and then process with some other dplyr functions to get a final result. I have all of this figured out except how to put the columns in alphabetical order. I do not want to sort the values within the columns (up and down); I want to change the position of each column, with its name and its data moving together. This is analogous to cutting and pasting entire columns of data in Excel.

For example, I reviewed the approach linked below, but it sorts the rows by the values in a column, which is not what I'm looking to do.

How to sort a dataframe by column(s)?

Thanks!

Zach

4 Answers

46

Try this

df %>% select(noquote(order(colnames(df))))

or just

df[,order(colnames(df))]
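
As a minimal sketch of the base R form above, applied to the `df.x` from the question: `order()` returns the column positions in sorted order, and indexing with those positions moves each column together with its data. The `tolower()` call and `drop = FALSE` are additions that are not in the answer. The first makes the comparison case-insensitive, since the default sort order of mixed-case names depends on the locale; the second keeps the result a data frame even if only one column is present.

```r
# Data frame from the question (values are random, only the names matter here)
df.x <- data.frame(2:11, 1:10, rnorm(10))
colnames(df.x) <- c("ID", "string", "delta")

# Positions of the columns in case-insensitive alphabetical order
idx <- order(tolower(colnames(df.x)))

# Reorder the columns; the rows are left untouched
df.sorted <- df.x[, idx, drop = FALSE]

colnames(df.sorted)
```
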
Steven Beaupré
Koundy
35

An alternative way to do this in dplyr is:

iris %>% 
  select(sort(current_vars()))

current_vars() returns column names such that they're sortable, and select() will take the vector of column names.
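
Because the behaviour of `current_vars()` varies across dplyr versions, one way to sidestep the helper is to sort the names outside the selection and pass them in with `all_of()`. This is a sketch that assumes dplyr 1.0 or later is installed; it is an alternative to the code above, not the same call.

```r
library(dplyr)

# Sort the names as an ordinary character vector, then select by name.
# all_of() tells select() the vector holds column names, not a column.
sorted_iris <- iris %>%
  select(all_of(sort(colnames(iris))))

colnames(sorted_iris)
```
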

Steph Locke
  • In this form, I get the error message `Error: Variable context not set`. `current_vars()` may be deprecated? Replacing `current_vars()` with `everything()` runs fine for me. – lowndrul Jan 24 '18 at 00:23
  • `current_vars()` but not `everything()` works for me (`dplyr` 0.7.6). I don't get the above error. Also noteworthy is that `iris %>% select(sort(current_vars()), -Species)` works, but not `iris %>% select(-Species, sort(current_vars()))`. – Joe Sep 28 '18 at 08:50
  • Update Dec 2019. `current_vars()` has been deprecated in favor of `tidyselect::peek_vars()`. The above code works with this substitution: `select(sort(tidyselect::peek_vars()))` – John J. Dec 02 '19 at 02:22
  • You could also use the new `relocate()` verb: `iris %>% relocate(sort(current_vars()))` – Hany Nagaty Apr 08 '21 at 10:51
3

If a specific column (or columns) has to come first (or last) while the rest are sorted alphabetically, you can do this:

mtcars %>% tibble %>% 
  select("hp", sort(colnames(.)))
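
The same idea can be sketched in base R without dplyr. Here the pinned column is removed from the name vector with `setdiff()` before sorting and then placed back at the front; the variable names `first`, `rest`, and `pinned` are illustrative only.

```r
first <- "hp"

# All other column names, sorted alphabetically
rest <- sort(setdiff(colnames(mtcars), first))

# Pinned column first, the sorted remainder after it
pinned <- mtcars[, c(first, rest), drop = FALSE]

colnames(pinned)
```

To pin the column at the end instead, swap the order to `c(rest, first)`.
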
HBat
0

Why not just:

sort(colnames(df.x))

[1] "delta"  "ID"     "string"
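
Note that `sort(colnames(...))` on its own only returns the sorted names; the data frame itself is unchanged until it is indexed with that vector. The sketch below shows this on a small pipe-delimited file like those described in the question. The file contents are made up for illustration.

```r
# Hypothetical pipe-delimited input standing in for one of the question's files
path <- tempfile(fileext = ".txt")
writeLines(c("ID|string|delta", "2|1|0.5", "3|2|-1.2"), path)

raw <- read.table(path, header = TRUE, sep = "|")

# Sorting the names does not touch `raw` ...
sorted_names <- sort(colnames(raw))

# ... indexing with the sorted names is what reorders the columns
reordered <- raw[, sorted_names, drop = FALSE]

colnames(reordered)
```
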
Frank B.