0

I have a dataset which contains names, date, amount and no of hours worked for some employees. The sample dataset looks like the one below:

| Name    | Date | Amount | 10/1/2018 | 10/2/2018 | 10/3/2018 | 10/4/2018 | 10/5/2018 |
|---------|:----:|------:|-----------|-----------|-----------|-----------|-----------|
| Mike    |      |    50 | -         | -         | -         | -         | -         |
| Mike    |      |   167 | -         | -         | -         | -         | 8.0       |
| Mike    |      |   560 | -         | -         | -         | 8.0       | -         |
| Kristie |      | 467   | -         | -         | -         | -         | -         |
| Kristie |      | 900   | -         | -         | -         | -         | -         |
| Kristie |      | 65    | -         | -         | -         | -         | -         |
| Kristie |      | 90    | -         | -         | -         | -         | -         |
| Kristie |      | 665   | -         | -         | -         | -         | -         |
| Kristie |      | 864   | -         | -         | -         | -         | -         |
| Joe     |      | 566   | -         | -         | -         | -         | -         |
| Joe     |      | 676   | -         | -         | -         | -         | -         |
| Joe     |      | 40    | 8.0       | -         | -         | -         | -         |
| Joe     |      | 45    | -         | -         | -         | -         | -         |

As one can see, the values in Date column are blank and are parsed in columns. The dates range for each day from 10/1/2018 to 11/31/2019. So each date of every month is present in the columns. The values for these dates as seen in the above example is "8" at some places. This value represents the no of hours worked. When I upload this excel file in R, the dates get converted into characters, for example, "10/01/2018" would get converted to "43374" and so do other dates. How can I make sure that these dates retain their original format?

hk2
  • 377
  • 1
  • 7
  • 1
    Are you saving your data in a CSV format or a excel spreadsheet? Excel doesn't play nicely with data. There are [existing answers about converting excel date values to proper date values in R](https://stackoverflow.com/questions/43230470/how-to-convert-excel-date-format-to-proper-date-with-lubridate). But Excel doesn't make it easy to keep the right value in the first place. – MrFlick Nov 15 '19 at 20:11
  • @MrFlick The file is an excel spreadsheet. The problem over here is that the dates are spread out in columns with the values being filled for no of hours worked. – hk2 Nov 15 '19 at 20:17
  • How are you reading the file into R, with `read.csv` or with something like `readxl::read_xlsx`? – Marcus Nov 15 '19 at 20:32
  • @MarcusA. I'm using read_excel to read the file – hk2 Nov 15 '19 at 21:03

1 Answers1

0

Excel on Windows (but not on Mac) uses an origin of December 30, 1899 and counts days after that. To convert the date numbers use this:

(daycount <- as.numeric("43374"))
# [1] 43374
(dayYMD <- as.Date(daycount, origin="1899-12-30"))
# [1] "2018-10-01"
(dayMDY <- format(dayYMD, "%m/%d/%Y"))
# [1] "10/01/2018"
dcarlson
  • 5,687
  • 2
  • 11
  • 16