0

I'm struggling with a problem here and hope you guys can help.. I have some data in the following format (see picture):

    structure(list(Activity = 1:4, Date1 = c("01/09/2019", "01/09/2019", 
    "01/09/2019", "01/09/2019"), Amount1 = c(10L, 20L, 30L, 40L), 
    Date2 = c("01/10/2019", "01/10/2019", "01/10/2019", NA), 
Amount2 = c(32L, 34L, 12L, NA), Date3 = c("01/11/2019", "01/11/2019", 
"01/11/2019", NA), Amount3 = c(54L, 12L, 43L, NA), Date4 = c("01/12/2019", 
NA, "01/12/2019", NA), Amount4 = c(56L, NA, 124L, NA)), row.names = c(NA, 
-4L), class = c("tbl_df", "tbl", "data.frame"), spec = structure(list(
cols = list(Activity = structure(list(), class = c("collector_integer", 
"collector")), Date1 = structure(list(), class = c("collector_character", 
"collector")), Amount1 = structure(list(), class = c("collector_integer", 
"collector")), Date2 = structure(list(), class = c("collector_character", 
"collector")), Amount2 = structure(list(), class = c("collector_integer", 
"collector")), Date3 = structure(list(), class = c("collector_character", 
"collector")), Amount3 = structure(list(), class = c("collector_integer", 
"collector")), Date4 = structure(list(), class = c("collector_character", 
"collector")), Amount4 = structure(list(), class = c("collector_integer", 
"collector"))), default = structure(list(), class = c("collector_guess", 
"collector"))), class = "col_spec"))

There can be hundreds of activities and all of these can have 0, 1, 2, 3, or 4 entries in date and amounts. I need to somehow "transpose" this into this format:

Wanted format

My thinking is, that this can be solved with a nested while loop inside a for loop, but I'm not very fluent in R and therefore I'm feeling a bit lost.

A few things to note:

  1. There can be many activities (not limited to just the 4 in the example)
  2. The activities can have input in any of the four "set" of columns -> date and amount, but it can also have no input.
  3. The end result need not to be ordered, I just need the specific format so we can enter this into a forecast model in another programme.

Please let me know if you need any clarifications on the above - hope you are able to help me here! :)

/Jacob

  • 1
    This is exactly what `gather` from the `tidyr` package is for. Look that up. Post your data in a form that can be copied and pasted easily instead of a picture and someone may be kind enough to walk you through it but this has already been answered a lot so I doubt it! – asachet Mar 04 '19 at 21:06
  • Reshaping data from "wide to long" is pretty common, if you search for that term you'll find lots of examples. I would particularly recommend Arun's answer at the marked duplicate (though the others should work fine too). If you can't get it working, we will reopen if you edit your question to (a)make your example reproducible. *pictures* of data are hard to work with---copy/pasteable R code is easy. Use `dput()` to share your sample data or share code to simulate sample data. [Lots of tips on that here](https://stackoverflow.com/q/5963269/903061). (b) Show what you tried from the dupe. – Gregor Thomas Mar 04 '19 at 21:09
  • Here's another [worked example](https://stackoverflow.com/a/50219149/903061) – Gregor Thomas Mar 04 '19 at 21:16
  • Maybe moderators can open it up so i can answer as feel solution is unique vs the flagged duplicates. For now put it in with the answer as want the original poster to see it. thx – Andrew Bannerman Mar 05 '19 at 02:00
  • I've seen some of the other examples, but still haven't been able to get it to work.. Sorry about the data, this is my first question I've asked, so didn't know how to add it. @Andrew, I would love to see your solution. Let me try and update the question with some data first. – Jacob Steinitz Andersen Mar 05 '19 at 19:57
  • I create index's using rep() so that I use a for loop to go through your data [row,column] wise and then drop the data into your output format again populating it [row,column] wise. I self enforce the index's needed and store them in a vector then iterate through them rows[i], cols[i] for example which would point to a [row,col] location in the input data set + output data frame. It should work for your full set as each activity is per row + the columns are fixed width ending at Amount4. It will have NA values in the output but i'l show you how to subset those out afterwards! – Andrew Bannerman Mar 06 '19 at 01:03
  • Jacob did you solve it? If not ask another question? Center the Q around using base R with no packages to do the reshaping.... – Andrew Bannerman Mar 23 '19 at 14:42

0 Answers0