
Is there a good way to convert a year + week number to a date in R? I have tried the following:

> as.POSIXct("2008 41", format="%Y %U")
[1] "2008-02-21 EST"
> as.POSIXct("2008 42", format="%Y %U")
[1] "2008-02-21 EST"

According to ?strftime:

%Y Year with century. Note that whereas there was no zero in the original Gregorian calendar, ISO 8601:2004 defines it to be valid (interpreted as 1BC): see http://en.wikipedia.org/wiki/0_(year). Note that the standard also says that years before 1582 in its calendar should only be used with agreement of the parties involved.

%U Week of the year as decimal number (00–53) using Sunday as the first day 1 of the week (and typically with the first Sunday of the year as day 1 of week 1). The US convention.

Kyle Brandt

3 Answers

23

This is kinda like another question you may have seen before. :)

The key issue is: what day should a week number specify? Is it the first day of the week? The last? That's ambiguous. I don't know if week one is the first day of the year or the 7th day of the year, or possibly the first Sunday or Monday of the year (which is a frequent interpretation). (And it's worse than that: these generally appear to be 0-indexed, rather than 1-indexed.) So, an enumerated day of the week needs to be specified.
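To make the 0-indexing concrete, here is a small sketch of what format(..., "%U") and the POSIXlt $wday field report around the start of 2008. The variable names are illustrative only; the dates were picked because 1 January 2008 fell on a Tuesday, so the first Sunday was 6 January.

```r
# Days around the first Sunday of 2008
new.year <- as.Date(c("2008-01-01", "2008-01-05", "2008-01-06", "2008-01-07"))

weeks <- data.frame(
    day  = new.year,
    wday = as.POSIXlt(new.year)$wday,  # 0 = Sunday, ..., 6 = Saturday
    week = format(new.year, "%U")      # "00" until the first Sunday of the year
)
weeks
```

The days before the first Sunday land in week "00", and week "01" only starts on 6 January.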

For instance, try this:

as.POSIXlt("2008 42 1", format = "%Y %U %u")

The %u indicator specifies the day of the week (1–7, with Monday as 1).

Additional note: See ?strptime for the various options for format conversion. It's important to be careful about the enumeration of weeks, as these can be split across the end of the year, and day 1 is ambiguous: is it specified based on a Sunday or Monday, or from the first day of the year? This should all be specified and tested on the different systems where the R code will run. I'm not certain that Windows and POSIX systems sing the same tune on some of these conversions, hence I'd test and test again.
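As a rough sketch of the idea above, the year, week and weekday can be pasted into one string and parsed in a single step. The helper name week_to_date and its default of Monday are my own assumptions, not part of base R, and the result should still be checked on each platform where the code will run.

```r
# Hypothetical helper: turn year + %U week number + %u weekday into a Date.
# wday follows %u: 1 = Monday, ..., 7 = Sunday.
week_to_date <- function(year, week, wday = 1) {
    as.Date(
        sprintf("%d %d %d", as.integer(year), as.integer(week), as.integer(wday)),
        format = "%Y %U %u"
    )
}

monday <- week_to_date(2008, 42)
monday                  # the Monday of %U week 42 in 2008
format(monday, "%U")    # round trip back to the week number
```

Converting the result back with format(..., "%U") is a cheap sanity check that parsing and formatting agree on the week convention.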

Iterator
  • Nice answer. You might want to just paste in the documentation for `%U` in `?strptime`, which precisely specifies its behavior (i.e. first Sunday is day 1 of week 1, and earlier days belong to week 0). – Josh O'Brien Feb 21 '12 at 16:30
  • @DWin -- Thanks for that correction. Another reminder of why testing and testing again *is* an especially good idea when dealing with dates :) – Josh O'Brien Feb 21 '12 at 17:26
  • @JoshO'Brien And you may be right in the US, but wrong in a different `tz` location. Dealing with weeks, without a specific timezone and very clear test behavior, is risky. – Iterator Feb 21 '12 at 17:44
5

Day-of-week == zero in the POSIXlt DateTimeClasses system is Sunday. Not exactly Biblical, and not in agreement with the R convention of indexing from 1 either, but that's what it is. Week zero is the first (partial) week in the year. Week one (but day of week zero) starts with the first Sunday. And all the other sequence types in POSIXlt have 0 as their starting point. It's kind of interesting to see what coercing the list elements of POSIXlt objects does. The only way you can actually change a POSIXlt date is to alter the $year, the $mon or the $mday elements. The others seem to be epiphenomena.

today <- as.POSIXlt(Sys.Date())
today  # Tuesday
# [1] "2012-02-21 UTC"
today$wday <- 0  # attempt to make it Sunday
today
# [1] "2012-02-21 UTC"   The attempt fails
today$mday <- 19
today
# [1] "2012-02-19 UTC"   Success
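Building on that, one way to actually reach the preceding Sunday is to read $wday and subtract it from $mday, instead of assigning to $wday. This is only a sketch with a fixed date in place of Sys.Date(), so the result is reproducible:

```r
d <- as.POSIXlt(as.Date("2012-02-21"))  # a Tuesday, so d$wday is 2

# $wday counts days since Sunday, so stepping $mday back by that amount
# lands on the Sunday that starts the week
d$mday <- d$mday - d$wday

sunday <- as.Date(d)
sunday
```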
IRTFM
1

I did not come up with this myself (it's taken from a blog post by Forester), but nevertheless I thought I'd add this to the answer list because it's the first implementation of the ISO 8601 week number convention that I've seen in R.

No doubt, week numbers are a very ambiguous topic, but I prefer an ISO standard over the current implementation of week numbers via format(..., "%U") because it seems that this is what most people agreed on, at least in Germany (calendars etc.).

I've put the actual function def at the bottom to facilitate focusing on the output first. Also, I just stumbled across package ISOweek, maybe worth a try.
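For reference, base R can also report ISO 8601 week numbers on output through the %V (ISO week) and %G (ISO week-based year) conversions of format(). A minimal sketch, assuming a platform where these conversions are supported on output; ?strptime describes them as accepted but ignored on input, so this only covers the date-to-week direction:

```r
days <- as.Date(c("2012-01-01", "2012-01-02", "2012-01-08", "2012-01-09"))

iso <- data.frame(
    day      = days,
    week.r   = format(days, "%U"),  # US convention, weeks start on Sunday
    week.iso = format(days, "%V"),  # ISO 8601 week, weeks start on Monday
    year.iso = format(days, "%G")   # ISO week-based year
)
iso
```

Under ISO 8601, Sunday 1 January 2012 still belongs to the last week of 2011, which is why the week-based year is reported alongside the week number.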

Approach Comparison

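## Assumed inputs (not shown in the original post), inferred from the result below:
## the days of January 2012 and their week numbers from format(..., "%U")
posix   <- seq(as.Date("2012-01-01"), as.Date("2012-01-31"), by="day")
weeknum <- as.numeric(format(posix, "%U"))

x.days  <- c("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")
x.names <- sapply(seq_along(posix), function(x) {
    x.day <- as.POSIXlt(posix[x], tz="Europe/Berlin")$wday
    if (x.day == 0) {
        x.day <- 7   # POSIXlt counts Sunday as 0; move it to the end of the week
    }
    x.days[x.day]
})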

data.frame(
    posix, 
    name=x.names,
    week.r=weeknum, 
    week.iso=ISOweek(as.character(posix), tzone="Europe/Berlin")$weeknum
)

# Result

        posix name week.r week.iso
1  2012-01-01  Sun      1  4480458
2  2012-01-02  Mon      1        1
3  2012-01-03  Tue      1        1
4  2012-01-04  Wed      1        1
5  2012-01-05  Thu      1        1
6  2012-01-06  Fri      1        1
7  2012-01-07  Sat      1        1
8  2012-01-08  Sun      2        1
9  2012-01-09  Mon      2        2
10 2012-01-10  Tue      2        2
11 2012-01-11  Wed      2        2
12 2012-01-12  Thu      2        2
13 2012-01-13  Fri      2        2
14 2012-01-14  Sat      2        2
15 2012-01-15  Sun      3        2
16 2012-01-16  Mon      3        3
17 2012-01-17  Tue      3        3
18 2012-01-18  Wed      3        3
19 2012-01-19  Thu      3        3
20 2012-01-20  Fri      3        3
21 2012-01-21  Sat      3        3
22 2012-01-22  Sun      4        3
23 2012-01-23  Mon      4        4
24 2012-01-24  Tue      4        4
25 2012-01-25  Wed      4        4
26 2012-01-26  Thu      4        4
27 2012-01-27  Fri      4        4
28 2012-01-28  Sat      4        4
29 2012-01-29  Sun      5        4
30 2012-01-30  Mon      5        5
31 2012-01-31  Tue      5        5

Function Def

It's taken directly from the blog post; I've just changed a couple of minor things. The function is still kind of sketchy (e.g. the week number of the first date is far off), but I find it to be a nice start!

ISOweek <- function(
    date, 
    format="%Y-%m-%d", 
    tzone="UTC", 
    return.val="weekofyear"
){
  ##converts dates into "dayofyear" or "weekofyear", the latter providing the ISO-8601 week
  ##date should be a vector of class Date or a vector of formatted character strings
  ##format refers to the date form used if a vector of
  ##  character strings  is supplied

  ##convert date to POSIXt format 
  if(class(date)[1]%in%c("Date","character")){
    date=as.POSIXlt(date,format=format, tz=tzone)
  }

  ##stop() instead of break: break is only valid inside a loop
  if (!inherits(date, "POSIXt")) {
    stop("Date is of wrong format.")
  }else if(inherits(date, "POSIXct")){
    date=as.POSIXlt(date, tz=tzone)
  }

  if(return.val=="dayofyear"){
    ##add 1 because POSIXt is base zero
    return(date$yday+1)
  }else if(return.val=="weekofyear"){
    ##Based on the ISO8601 weekdate system,
    ## Monday is the first day of the week
    ## W01 is the week with 4 Jan in it.
    year=1900+date$year
    jan4=strptime(paste(year,1,4,sep="-"),format="%Y-%m-%d")
    wday=jan4$wday

    wday[wday==0]=7  ##convert to base 1, where Monday == 1, Sunday==7

    ##calculate the date of the first week of the year
    weekstart=jan4-(wday-1)*86400  
    weeknum=ceiling(as.numeric((difftime(date,weekstart,units="days")+0.1)/7))

    #########################################################################
    ##calculate week for days of the year occurring in the next year's week 1.
    #########################################################################
    mday=date$mday
    wday=date$wday
    wday[wday==0]=7
    year=ifelse(weeknum==53 & mday-wday>=28,year+1,year)
    weeknum=ifelse(weeknum==53 & mday-wday>=28,1,weeknum)

    ################################################################
    ##calculate week for days of the year occurring prior to week 1.
    ################################################################

    ##first calculate the number of weeks in the previous year
    year.shift=year-1
    jan4.shift=strptime(paste(year.shift,1,4,sep="-"),format="%Y-%m-%d")
    wday=jan4.shift$wday
    wday[wday==0]=7  ##convert to base 1, where Monday == 1, Sunday==7
    weekstart=jan4.shift-(wday-1)*86400
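    ##units given explicitly, as in the weeknum calculation above; with the
    ##default units="auto" the unit of the difference depends on the data
    weeknum.shift=ceiling(as.numeric((difftime(date,weekstart,units="days")+0.1)/7))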

    ##update year and week
    year=ifelse(weeknum==0,year.shift,year)
    weeknum=ifelse(weeknum==0,weeknum.shift,weeknum)

    return(list("year"=year,"weeknum"=weeknum))
  }else{
    stop("Unknown return.val")
  }
}
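One possible reading of the 4480458 in the comparison table: it is what the previous-year formula yields when the time difference is measured in seconds rather than days. This is an inference from the arithmetic, not something the blog post states. The sketch below applies the same formula to 2012-01-01 and to 2011-01-03 (the Monday that starts ISO week 1 of 2011), once with each unit; the names d and w0 are illustrative.

```r
d  <- as.POSIXct("2012-01-01", tz = "UTC")
w0 <- as.POSIXct("2011-01-03", tz = "UTC")  # Monday of ISO week 1 in 2011

# Same formula as weeknum.shift, with the unit forced each way
in.secs <- ceiling(as.numeric((difftime(d, w0, units = "secs") + 0.1) / 7))
in.days <- ceiling(as.numeric((difftime(d, w0, units = "days") + 0.1) / 7))

in.secs  # 4480458, the value shown in the table
in.days  # 52, the ISO week of 2012-01-01
```

If that reading is right, passing units="days" to the second difftime() call, as the first call already does, would be the natural thing to try.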
Rappster