slick one-lineRs

Question

What's your favorite one-liner in R?

Include a short companion example, and limit to one tip per post, please. Note, ; is cheating.

Example: calculate x[i] / x[i-1] for a vector x,

x <- 1:10
Reduce("/", as.data.frame(embed(x, 2)))

(collected from R-help, I forget who/when)
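For readers unfamiliar with embed(), a minimal sketch of what the example computes may help. It assumes only base R; the direct-indexing comparison is an illustration added here, not part of the original tip.

```r
x <- 1:10

# embed(x, 2) pairs each element with its predecessor:
# column 1 holds x[i], column 2 holds x[i-1], for i = 2..length(x)
pairs <- embed(x, 2)

# Reduce("/") divides the first column by the second
ratios <- Reduce("/", as.data.frame(pairs))

# The same ratios via plain indexing, for comparison
direct <- x[-1] / x[-length(x)]

all.equal(ratios, direct)
```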

Edit: after some initial controversy, it looks like the question has been reopened for entries.

Too bad, I really would've liked to see what else people come up with. Yours was really cool. Thanks! – Vincent Jun 06 '11 at 00:42
Why is this subjective and argumentative? It's either one line or it isn't, and it does not seem argumentative to me either. – G. Grothendieck Jun 06 '11 at 00:58
@G.Grothendieck: What's the question? What's _your_ favorite one-liner in R? It's a poll, it's not a question that could be objectively answered. – Jeff Mercado Jun 06 '11 at 01:23
Out of curiosity, had I phrased it "I'm looking for a concise piece of R code, limited to one line, yet providing a most elaborate example of R's functional programming paradigm in manipulating data."; would *that* have *fit in*? – baptiste Jun 06 '11 at 01:38
The most upvoted question in the R label is "What statistics should a programmer (or computer scientist) know?" and the second most upvoted one is "What is the most useful R trick?". If they are acceptable then I think this one should be too. – G. Grothendieck Jun 06 '11 at 01:54
Now that this question is open again, it should be CW because you won't get a single best answer. – Sacha Epskamp Jun 06 '11 at 10:09
Voted to close, not a real question. Please put this on your blog instead. – user7116 Jun 06 '11 at 16:41
Am I the only one perplexed that regular R users don't seem to mind this unreasonable question? – baptiste Jun 08 '11 at 07:17
@sixlettervariables, @BlueRaja, @KirkWoll, @dmckee, @Graviton [Very similar question in javascript tag](http://stackoverflow.com/questions/472644/javascript-collection-of-one-line-useful-functions) and no closing votes. So why close this one? (more examples in [tips-and-tricks](http://stackoverflow.com/questions/tagged/tips-and-tricks) and [one-liner](http://stackoverflow.com/questions/tagged/one-liner) tags). – Marek Jun 08 '11 at 15:44
@Marek: thank you for bringing those to my attention. However, some are old, thus "grandfathered". Others are on topic, others are not and I'll vote to close them. – user7116 Jun 08 '11 at 15:56
Dear officious intermeddlers; I know you all are trying to do God's work by closing this question. However the [r] community on Stack Overflow likes this question quite a lot. Would you mind backing down from your religious zealotry on this one? Thanks. – JD Long Jun 08 '11 at 16:30

score 14 · Answer 1 · answered Jun 06 '11 at 10:30

If you want to record the time that you created a file in its name (perhaps to make it unique, or prevent overwriting), then try this one-line function.

timestamp <- function(format = "%y%m%d%H%M%S")
{
  strftime(Sys.time(), format)
}

Usage is, e.g.,

write.csv(
   some_data_frame, 
   paste("some data ", timestamp(), ".csv", sep = "")
)
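A self-contained sketch of the same idea, writing into a temporary directory so it can be run anywhere. The name file_stamp is hypothetical, chosen here only because a function called timestamp would mask utils::timestamp(); the data set and output location are likewise illustrative assumptions.

```r
# Hypothetical helper name, used to avoid masking utils::timestamp()
file_stamp <- function(format = "%y%m%d%H%M%S") strftime(Sys.time(), format)

# Build a file name such as "some data <12 digits>.csv" inside tempdir()
out_file <- file.path(tempdir(), paste0("some data ", file_stamp(), ".csv"))

# Write a small built-in data set and confirm that the file exists
write.csv(head(iris), out_file, row.names = FALSE)
file.exists(out_file)
```

With the default format the stamp has a resolution of one second, so two files written within the same second would still collide.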

answered Jun 06 '11 at 10:30

Richie Cotton

+1 Handy! Thanks for sharing. – baptiste Jun 06 '11 at 10:34
+1 Nice trick to manage redundant results. – Shreyas Karnik Jun 06 '11 at 14:45

score 9 · Answer 2 · answered Jun 06 '11 at 10:37

Get odd or even indices.

odds <- function(x) seq_along(x) %% 2 > 0
evens <- function(x) seq_along(x) %% 2 == 0

Usage is, e.g.,

odds(1:5)
evens(1:5)
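Both functions return logical vectors, so they are typically used as an index. A small sketch, assuming a character vector as input; the recycling form from the comments is shown alongside for comparison.

```r
odds <- function(x) seq_along(x) %% 2 > 0
evens <- function(x) seq_along(x) %% 2 == 0

x <- letters[1:5]

x[odds(x)]    # elements at positions 1, 3, 5
x[evens(x)]   # elements at positions 2, 4

# A logical index of length 2 is recycled along x, giving the odd positions
x[c(TRUE, FALSE)]
```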

answered Jun 06 '11 at 10:37

Richie Cotton

Or you could do the same without a one-liner with the gtools package by calling `odd()` and `even()` :) – daroczig Jun 06 '11 at 11:39
Or using recycling `seq_len(5)[c(TRUE, FALSE)]` – Martin Morgan Jun 06 '11 at 12:28

score 6 · Answer 3 · answered Jun 06 '11 at 10:34

I often need fake data to illustrate, say, a regression problem. Instead of

X <- replicate(2, rnorm(100))
y <- X[,1] + X[,2] + rnorm(100)
df <- data.frame(y=y, X=X)

we can use

df <- transform(X <- as.data.frame(replicate(2, rnorm(100))), 
                y = V1+V2+rnorm(100))

to generate two uncorrelated predictors associated with the outcome y.
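A reproducible sketch of the one-liner, with a seed added so the result can be checked. The seed value and the follow-up lm() call are assumptions made for illustration only.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

# replicate() returns a 100 x 2 matrix; as.data.frame() names its columns V1 and V2
df <- transform(as.data.frame(replicate(2, rnorm(100))),
                y = V1 + V2 + rnorm(100))

str(df)

# Both slopes should come out near 1, since y was built as V1 + V2 + noise
coef(lm(y ~ V1 + V2, data = df))
```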

answered Jun 06 '11 at 10:34

chl

The one-liner is cute code, but I think the three line method is easier to understand. – Richie Cotton Jun 06 '11 at 13:03
@Richie Hey, but the OP asked for one line :-) – chl Jun 06 '11 at 13:24

score 6 · Answer 4 · answered Jun 07 '11 at 06:47

Replacing NaNs, which are a nuisance every once in a while, with NA in a vector or data frame (found some time ago on R-help)

is.na(x) <- is.na(x)

Example:

> x <- c(1, NaN, 2, NaN, 3, NA)
> is.na(x) <- is.na(x)
> x
[1]  1 NA  2 NA  3 NA
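The trick works because is.na() is TRUE for both NA and NaN, so assigning it back turns every NaN into NA. A sketch of a more explicit equivalent using is.nan(), offered as an illustration rather than as the original poster's method:

```r
x <- c(1, NaN, 2, NaN, 3, NA)

is.nan(x)  # TRUE only at the NaN positions
is.na(x)   # TRUE at the NaN positions and at the NA position

# Explicit version: overwrite only the NaN entries
y <- x
y[is.nan(y)] <- NA

# One-liner from the post, applied to a copy
z <- x
is.na(z) <- is.na(z)

identical(y, z)
```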

edited Jun 07 '11 at 06:52

answered Jun 07 '11 at 06:47

Mark Heckmann

score 5 · Answer 5 · answered Jun 06 '11 at 10:34

Convert Excel dates to R dates. Answer adapted from code by Paul Murrell.

excel_date_to_r_date <- function(excel_date, format)
{
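  # excel_date is the number of days since the 0th January 1900.  See
  # http://www.stat.auckland.ac.nz/~paul/ItDT/HTML/node67.html
  # Subtract 2: one day for the 0th-January offset and one for Excel's
  # nonexistent 29 February 1900 (so this holds for dates from March 1900 on).
  strftime(as.Date(as.numeric(excel_date) - 2, origin = "1900-01-01"), format)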
}

Usage is, e.g.,

excel_date_to_r_date(40700, "%d-%m-%Y")
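An equivalent sketch that folds the "minus 2" into the origin and returns a Date object instead of a formatted string. This assumes the Windows (1900) date system and serial numbers from March 1900 onwards.

```r
excel_serial <- 40700

# Origin 1899-12-30 is 1900-01-01 minus the two days subtracted in the function above
d <- as.Date(excel_serial, origin = "1899-12-30")
d

# Formatting is then a separate step
format(d, "%d-%m-%Y")
```

Returning a Date keeps the value usable for arithmetic and sorting; formatting can be deferred until output.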

edited Jun 06 '11 at 10:41

answered Jun 06 '11 at 10:34

Richie Cotton

Did you check whether it works the same for Windows and Mac? – chl Jun 06 '11 at 13:25
That minus 2 was killing me in some work I'm doing. So glad I stumbled on this. – JD Long Jun 06 '11 at 14:36
@chl: Good point. I think, for a Mac, you can just change the `origin` value to `1904-01-01` and don't subtract the 2. Volunteers with Macs appreciated to test this. – Richie Cotton Jun 06 '11 at 17:34

score 4 · Answer 6 · answered Jun 06 '11 at 09:19

Not quite what you are after, but fitting a multiple linear regression model in one line is great:

lm(y ~ x1 + x2)
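To make the line runnable, here is a sketch with simulated data. The coefficients 1, 2 and -1, the noise level and the seed are all arbitrary assumptions for illustration.

```r
set.seed(42)  # arbitrary seed, only for reproducibility
x1 <- rnorm(50)
x2 <- rnorm(50)

# Simulated response with known intercept 1 and slopes 2 and -1
y <- 1 + 2 * x1 - x2 + rnorm(50, sd = 0.1)

fit <- lm(y ~ x1 + x2)
coef(fit)  # estimates should land close to 1, 2 and -1
```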

answered Jun 06 '11 at 09:19

csgillespie

score 4 · Answer 7 · answered Jun 06 '11 at 09:31

Reduce() is a new kid on the block. The same can be done using do.call(), which is a little bit quicker (on my system at least):

do.call("/", as.data.frame(embed(1:10, 2)))

R> do.call("/", as.data.frame(embed(1:10, 2)))
[1] 2.000000 1.500000 1.333333 1.250000 1.200000 1.166667 1.142857 1.125000
[9] 1.111111
R> Reduce("/", as.data.frame(embed(1:10, 2)))
[1] 2.000000 1.500000 1.333333 1.250000 1.200000 1.166667 1.142857 1.125000
[9] 1.111111
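A short sketch checking that the variants agree, including the log-difference form suggested in the comments. The variable names are hypothetical, and no timing claim is made here.

```r
x <- 1:10
cols <- as.data.frame(embed(x, 2))

via_reduce <- Reduce("/", cols)    # folds "/" over the list of two columns
via_docall <- do.call("/", cols)   # calls "/" once with the two columns as arguments

# Log-difference variant: only valid for strictly positive x
via_logs <- exp(diff(log(x)))

all.equal(via_reduce, via_docall)
all.equal(via_reduce, via_logs)
```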

answered Jun 06 '11 at 09:31

Gavin Simpson

Thanks, that makes good sense, more than `Reduce` in fact. – baptiste Jun 06 '11 at 09:35
`exp(diff(log(x)))` is even a little bit quicker – Wojciech Sobala Jun 06 '11 at 10:21
@Wojciech Sobala Neat, though you might run into problems with negative values, I'd imagine. – baptiste Jun 06 '11 at 10:28
@baptiste if you don't exclude negative values in x you can't exclude 0, so you have problem anyway. – Wojciech Sobala Jun 06 '11 at 11:46
@baptiste: "imagine" -- _get it_? – sehe Jun 06 '11 at 22:28
@sehe clearly, that should read "i'd" – baptiste Jun 07 '11 at 00:24

score 3 · Answer 8 · answered Jun 06 '11 at 22:08

A function to summarize the amount of missing data for each variable in a data frame. Returns a list.

propmiss <- function(dataframe) lapply(dataframe,function(x) data.frame(nmiss=sum(is.na(x)), n=length(x), propmiss=sum(is.na(x))/length(x)))

Not a one-liner, but returning this info as a data frame is more useful.

propmiss <- function(dataframe) {
    m <- sapply(dataframe, function(x) {
        data.frame(
            nmiss=sum(is.na(x)), 
            n=length(x), 
            propmiss=sum(is.na(x))/length(x)
        )
    })
    d <- data.frame(t(m))
    d <- sapply(d, unlist)
    d <- as.data.frame(d)
    d$variable <- row.names(d)
    row.names(d) <- NULL
    d <- cbind(d[ncol(d)],d[-ncol(d)])
    return(d[order(d$propmiss), ])
}
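If a data frame result is wanted and a single line is still the goal, one possible sketch uses colSums() and colMeans() on is.na(). The name prop_missing and the toy data are hypothetical, and this is an alternative formulation, not the function from the post.

```r
# prop_missing() is a hypothetical name, not the function from the post
prop_missing <- function(df) data.frame(variable = names(df), nmiss = colSums(is.na(df)), n = nrow(df), propmiss = colMeans(is.na(df)), row.names = NULL)

# Toy data: two missing values in a, one in b, none in c
toy <- data.frame(a = c(1, NA, 3, NA), b = c("x", "y", NA, "z"), c = 1:4)

res <- prop_missing(toy)
res[order(res$propmiss), ]
```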

score 3 · Answer 9 · answered Jun 07 '11 at 14:08

Wipes the slate clean: removes all objects from the workspace.

rm(list=ls(all=TRUE))
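To try this without clearing a real workspace, the same call can be pointed at a scratch environment. The environment e and the object names are assumptions for illustration; the sketch also shows why the all.names argument matters.

```r
e <- new.env()
assign("a", 1, envir = e)
assign(".hidden", 2, envir = e)

ls(e)                    # names starting with a dot are not listed by default
ls(e, all.names = TRUE)  # includes ".hidden"

# Same idea as the one-liner, restricted to the scratch environment
rm(list = ls(e, all.names = TRUE), envir = e)

length(ls(e, all.names = TRUE))
```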

answered Jun 07 '11 at 14:08

Shreyas Karnik

score 2 · Answer 10 · answered Jun 07 '11 at 05:18

Editing multiple columns at once is one of my favourites.

E.g. to change all numeric columns to characters:

X <- iris
X[id] <- lapply(X[id <- sapply(X, is.numeric)], as.character)

or standardize them

X[id] <- lapply(X[id <- sapply(X, is.numeric)], scale)
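The inline assignment relies on R evaluating the right-hand side, where id is created, before it indexes the left-hand side. A more explicit two-step sketch of the same technique follows; the names num and Z and the as.numeric() wrapper are choices made here for illustration.

```r
X <- iris
num <- sapply(X, is.numeric)            # one logical flag per column

# Convert only the numeric columns to character
X[num] <- lapply(X[num], as.character)
sapply(X, class)

# Standardize the numeric columns of a fresh copy;
# as.numeric() keeps plain vectors instead of the 1-column matrices scale() returns
Z <- iris
Z[num] <- lapply(Z[num], function(col) as.numeric(scale(col)))
round(colMeans(Z[num]), 10)
```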

edited Jun 07 '11 at 09:58

answered Jun 07 '11 at 05:18

Marek

score 1 · Answer 11 · answered Jun 06 '11 at 09:54

I'd say look at plyr for a package full of slick oneliners!

answered Jun 06 '11 at 09:54

Sacha Epskamp

an example in particular? (I know I have mine) – baptiste Jun 06 '11 at 10:25

score 1 · Answer 12 · answered Jun 06 '11 at 09:55

Well, not really a oneliner but textConnection is great!

x <- "1,3
1,a
1,g,4
3,d,6
2,X,1,3
2,K"
read.table(textConnection(x), sep=",", header=FALSE, na.strings="", fill=TRUE)

result

  V1 V2 V3 V4
1  1  3 NA NA
2  1  a NA NA
3  1  g  4 NA
4  3  d  6 NA
5  2  X  1  3
6  2  K NA NA
>
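In R versions whose read.table() has a text argument (an assumption about the installed version), the connection can be left to read.table() itself, which sidesteps the unclosed-connection warnings mentioned in the comments. A sketch with the same input:

```r
x <- "1,3
1,a
1,g,4
3,d,6
2,X,1,3
2,K"

# text = x lets read.table() create and close the text connection internally
tab <- read.table(text = x, sep = ",", header = FALSE, na.strings = "", fill = TRUE)
tab
dim(tab)
```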

answered Jun 06 '11 at 09:55

jrara

one annoying side-effect of using `textConnection` in such a one-liner is that you get warnings when the connection closes later on. I usually have three lines; one to open, one to read, one to close the connection. – baptiste Jun 06 '11 at 10:31
@baptiste you can make it a one line by assigning *inline* `read.table(con – Gavin Simpson Jun 06 '11 at 10:36
@Gavin Simpson yeah, I suppose, though I wouldn't do that. Also, "real" one-liners should not use `;` I reckon. – baptiste Jun 06 '11 at 10:40
See `text_to_table` for a convenience wrapper to `textConnection`. http://stackoverflow.com/questions/3936285/is-there-a-way-to-use-read-csv-to-read-from-a-string-value-rather-than-a-file-in/3941145#3941145 – Richie Cotton Jun 06 '11 at 13:09

score 1 · Answer 13 · answered Jun 06 '11 at 10:37

Here's another tip collected from R-help (if memory serves, by Romain François).

Remove existing variables from the workspace:

rm( list = Filter( exists, c("a", "b") ) )
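A sketch of why the filter is useful: rm() warns about names that do not exist, whereas filtering with exists() first keeps only the names that are really there. The object names tmp_a and tmp_b are hypothetical, and the code is meant to be run at top level.

```r
tmp_a <- 1   # hypothetical object; "tmp_b" is deliberately never created

# rm(tmp_a, tmp_b) would warn that 'tmp_b' was not found.
# Filter() keeps only the names for which exists() returns TRUE.
to_remove <- Filter(exists, c("tmp_a", "tmp_b"))
to_remove

rm(list = to_remove)
exists("tmp_a")
```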

answered Jun 06 '11 at 10:37

baptiste

score 1 · Answer 14 · answered Jun 06 '11 at 12:28

My favorite one-liner can be found in the help pages of the %in% function and is basically its opposite.

f.wo <- function(x, y) x[!x %in% y]

Wrapped up into a nice, small function it comes in really handy. E.g.

R> f.wo(c("a", "b", "c"), "b")
[1] "a" "c"
R> f.wo(1:8, c(2,7))
[1] 1 3 4 5 6 8
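Two related forms, sketched for comparison. The operator definition via Negate() is one possible way to write a "not in" operator and is an assumption here, not necessarily what any commenter had in mind; setdiff() is shown because it differs in dropping duplicates.

```r
f.wo <- function(x, y) x[!x %in% y]

# One possible "not in" operator, built by negating %in%
`%nin%` <- Negate(`%in%`)

x <- c(1, 2, 2, 3, 3)

f.wo(x, 2)      # keeps duplicates among the remaining values
x[x %nin% 2]    # same result, written with the operator
setdiff(x, 2)   # also removes 2, but additionally drops duplicates
```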

answered Jun 06 '11 at 12:28

mropa

Can be made to look more like %nin% by putting this in your profile: "%nin%" – Stephen Turner Jun 06 '11 at 21:59
@StephenTurner There was question about it: http://stackoverflow.com/q/5831794/168747. I prefer direct definition of `%nin%` – Marek Jun 08 '11 at 14:08

score 1 · Answer 15 · answered Jun 06 '11 at 22:01

Function to read space delimited data from the clipboard

read.cb <- function(...) read.table(file="clipboard", ...)

e.g.

# read data from the clipboard with a header
d <- read.cb(header = TRUE)

# read data from the clipboard without a header
d <- read.cb()
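Clipboard access is platform-dependent. A hedged sketch of a variant that switches to pipe("pbpaste") on macOS, following the suggestion in R's connections documentation; the name read.cb2 is hypothetical, and the behaviour on a given system (for example a Linux machine without X11) is an assumption to verify locally.

```r
# read.cb2() is a hypothetical variant of read.cb() from the post
read.cb2 <- function(...) {
  # macOS has no "clipboard" file name; pbpaste writes the clipboard to stdout
  src <- if (Sys.info()[["sysname"]] == "Darwin") pipe("pbpaste") else "clipboard"
  read.table(file = src, ...)
}

# Usage, after copying a table with a header row:
# d <- read.cb2(header = TRUE)
```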

answered Jun 06 '11 at 22:01

Stephen Turner

score 1 · Answer 16 · answered Jun 06 '11 at 22:04

Function to convert columns of data in a data frame to factor variables

factorcols <- function(d, ...) lapply(d, function(x) factor(x, ...))

E.g. convert columns 1-4 in data frame d to factor variables

d[1:4] <- factorcols(d[1:4])
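A runnable sketch on a small made-up data frame; the column names and values are assumptions for illustration.

```r
factorcols <- function(d, ...) lapply(d, function(x) factor(x, ...))

d <- data.frame(g = c("a", "b", "a"),
                h = c(1, 2, 1),
                v = c(0.5, 1.5, 2.5),
                stringsAsFactors = FALSE)

# Convert only the first two columns; v stays numeric
d[1:2] <- factorcols(d[1:2])

sapply(d, class)
levels(d$h)
```

Because extra arguments are passed through to factor(), a call such as factorcols(d[1:2], ordered = TRUE) would produce ordered factors.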

answered Jun 06 '11 at 22:04

Stephen Turner

score 1 · Answer 17 · answered Jun 06 '11 at 22:21

Return a new matrix in which each row of the original matrix is sorted in increasing order, independently of the other rows:

newmat <- t(apply(orimat, 1, sort))
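A small sketch showing the effect and why the transpose is needed; the example matrix is made up for illustration.

```r
orimat <- matrix(c(3, 1, 2,
                   9, 7, 8), nrow = 2, byrow = TRUE)

# apply(..., 1, sort) returns each sorted row as a COLUMN of the result,
# so t() is needed to get back to the original orientation
newmat <- t(apply(orimat, 1, sort))
newmat
```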

answered Jun 06 '11 at 22:21

bill_080
