I want to create a new column that identifies which dates fall in the same week.

A data.table DATA_SET contains date information, like:

DATA_SET<- data.table(transday = seq(from  = (Sys.Date()-64), to = Sys.Date(), by = 1))

For example, '2017-03-01' and '2017-03-02' are in the same week. '2017-03-01' and '2017-03-08' are both Wednesdays, but they are not in the same week.

If "2016-01-01" is in the first week of 2016 and "2017-01-01" is in the first week of 2017, both get the value 1, but they are not in the same week. So I want a unique value that specifies "the same week".

MichaelChirico
Leah210
  • Since you're using data.table: convert to IDate and then round. See `?IDate`. – Frank May 04 '17 at 06:31
  • See `%W` in `help("strftime")`. – Roland May 04 '17 at 06:31
  • By "same week" do you mean "7 or less days inclusive between two dates" or "same week of the year (week 1, week2...)" ? – neilfws May 04 '17 at 06:32
  • @neilfws "same week" means "same week across all years": use a unique integer to identify each week. That integer identifies only this one week and cannot be reused. – Leah210 May 04 '17 at 06:48
  • There's some ambiguity in how a week is defined, which is why base has both `week` and `isoweek` functions. – alistaire May 04 '17 at 06:50
  • There is a `cut.Date` function. Study its help page and examples. – IRTFM May 04 '17 at 06:54

2 Answers

The answer to this question depends strongly on

  • the definition of the first day of the week (usually Sunday or Monday) and
  • the numbering of the weeks within the year (starting with the first Sunday, Monday, or Thursday of the year, or on 1st January, etc).

A selection of different options can be seen from the example below:

      dates  isoweek day week_iso week_US week_UK DT_week DT_iso lub_week lub_iso   cut.Date
 2015-12-25 2015-W52   5 2015-W52      51      51      52     52       52      52 2015-12-21
 2015-12-26 2015-W52   6 2015-W52      51      51      52     52       52      52 2015-12-21
 2015-12-27 2015-W52   7 2015-W52      52      51      52     52       52      52 2015-12-21
 2015-12-28 2015-W53   1 2015-W53      52      52      52     53       52      53 2015-12-28
 2015-12-29 2015-W53   2 2015-W53      52      52      52     53       52      53 2015-12-28
 2015-12-30 2015-W53   3 2015-W53      52      52      53     53       52      53 2015-12-28
 2015-12-31 2015-W53   4 2015-W53      52      52      53     53       53      53 2015-12-28
 2016-01-01 2015-W53   5 2015-W53      00      00       1     53        1      53 2015-12-28
 2016-01-02 2015-W53   6 2015-W53      00      00       1     53        1      53 2015-12-28
 2016-01-03 2015-W53   7 2015-W53      01      00       1     53        1      53 2015-12-28
 2016-01-04 2016-W01   1 2016-W01      01      01       1      1        1       1 2016-01-04
 2016-01-05 2016-W01   2 2016-W01      01      01       1      1        1       1 2016-01-04
 2016-01-06 2016-W01   3 2016-W01      01      01       1      1        1       1 2016-01-04
 2016-01-07 2016-W01   4 2016-W01      01      01       2      1        1       1 2016-01-04
 2016-01-08 2016-W01   5 2016-W01      01      01       2      1        2       1 2016-01-04

which is created by this code:

library(data.table)

dates <- as.Date("2016-01-01") + (-7:7)
print(data.table(
  dates,
  isoweek   = ISOweek::ISOweek(dates),
  day       = ISOweek::ISOweekday(dates),
  week_iso  = format(dates, "%G-W%V"),
  week_US   = format(dates, "%U"),
  week_UK   = format(dates, "%W"),
  DT_week   = data.table::week(dates),
  DT_iso    = data.table::isoweek(dates),
  lub_week  = lubridate::week(dates),
  lub_iso   = lubridate::isoweek(dates),
  cut.Date  = cut(dates, "week")  # dispatches to the cut.Date() method
), row.names = FALSE)

The format YYYY-Www used in some of the columns is one of the ISO 8601 week formats. It includes the year which is required to distinguish different weeks in different years as requested by the OP.
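As a minimal sketch of how such a year-qualified label could serve as the unique week key, the snippet below applies `format(..., "%G-W%V")` to the example dates from the question. It assumes the platform's `strftime()` supports `%G` and `%V` on output; the column names `week_iso` and `week_id` are made up for illustration.

```r
library(data.table)

# Example dates taken from the question
DATA_SET <- data.table(transday = as.Date(c(
  "2017-03-01", "2017-03-02", "2017-03-08", "2016-01-01", "2017-01-01"
)))

# ISO 8601 year-week label; the ISO year keeps weeks of different years apart
DATA_SET[, week_iso := format(transday, "%G-W%V")]

# Optional integer id, numbered in order of first appearance of each week
DATA_SET[, week_id := .GRP, by = week_iso]

print(DATA_SET)
```

Here the first two dates share one label, while 2016-01-01 and 2017-01-01 end up with different labels.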

The ISO week definition is the only format which ensures that each week always consists of 7 days, also across New Year. The other definitions may start or end the year with "weeks" of fewer than 7 days. Due to the seamless partitioning of the year, the ISO week-numbering year is slightly different from the traditional Gregorian calendar year, e.g., 2016-01-01 belongs to the last ISO week 53 of 2015 (2015-W53).

As suggested in the comments, cut.Date() might be the best option for the OP: it labels each date with the first day of its week, which is unique across years.
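A possible way to apply this to the OP's data is sketched below, again with the example dates from the question. By default, `cut()` with `"week"` starts weeks on Monday; the column name `week_start` is an assumption for illustration.

```r
library(data.table)

# Example dates taken from the question
DATA_SET <- data.table(transday = as.Date(c(
  "2017-03-01", "2017-03-02", "2017-03-08", "2016-01-01", "2017-01-01"
)))

# cut() returns a factor whose labels are the Monday that starts each week;
# converting back to Date gives a key that never repeats in another year
DATA_SET[, week_start := as.Date(as.character(cut(transday, "week")))]

print(DATA_SET)
```

Two dates are in the same week exactly when their `week_start` values are equal. Passing `start.on.monday = FALSE` to `cut()` would make weeks start on Sunday instead.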

Disclosure: I'm the maintainer of the ISOweek package, which was published at a time when strptime() did not recognize the %G and %V format specifications for output in the Windows versions of R. (Even today they aren't recognized on input.)

Community
Uwe

You can use the week() function of the lubridate package in R.

library(lubridate)
DATA_SET$week <- week(DATA_SET$transday)

This will give you a new column week. Dates within the same week of the same year will have the same week number.

Imran Ali
  • Thanks, but I want the values in the column week to be unique: each value should identify just "this same week" and must not be reused later. – Leah210 May 04 '17 at 06:55
  • I don't understand why it cannot be reused later. A simple comparison `DATA_SET$week[1] > DATA_SET$week[6]` returns `FALSE` and `DATA_SET$week[1] == DATA_SET$week[3]` returns `TRUE` – Imran Ali May 04 '17 at 07:03
  • For example, if "2016-01-01" is in the first week of 2016 and "2017-01-01" is in the first week of 2017, both get the value 1, but they are not in the same week. So I want a unique value that specifies "this same week". – Leah210 May 04 '17 at 07:25
  • @Leah210 Then create another variable for year and combine it with week. You will get 2016_01 and 2017_01, for example. – Roman Luštrik May 04 '17 at 08:00
  • In that case, you can just put year in front of the week like `DATA_SET$week – amatsuo_net May 04 '17 at 08:01
  • No need to use `lubridate`, `data.table` has its own internal `week` (and `isoweek`) functions. – MichaelChirico May 04 '17 at 21:26
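The year-plus-week idea from these comments could look roughly like the sketch below, which uses data.table's own `year()` and `week()` rather than lubridate. The column name `year_week` and the `YYYY_WW` layout are assumptions for illustration. Note that with this week definition the "weeks" around New Year can have fewer than 7 days, so dates a day apart may get different labels there.

```r
library(data.table)

# Example dates taken from the question
DATA_SET <- data.table(transday = as.Date(c(
  "2017-03-01", "2017-03-02", "2017-03-08", "2016-01-01", "2017-01-01"
)))

# Prefix the week number with the calendar year so labels are not reused
DATA_SET[, year_week := sprintf("%d_%02d", year(transday), week(transday))]

print(DATA_SET)
```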