I would like to go from the left table to the right table in the image below but can't seem to find the coding logic behind to reach the result using R.
Your help is much appreciated !
I would like to go from the left table to the right table in the image below but can't seem to find the coding logic behind to reach the result using R.
Your help is much appreciated !
I have created a minimal example that should do what you want. The main issue here is to phrase your question since I think there are better answer than mine to match lag values to a modality.
library(dplyr)
# --- v0 is your data simplified
v0 <- c("cA", "t1", "t2", "cB", "t3")
# --- indic tels us what are the groups
indic <- v0 %>% stringr::str_detect(string = ., pattern = "c") %>%
cumsum()
# --- here you can try the code line by line (without the %>% (pipe) operator to understand the code
dfr <- tibble(v0, indic)
dfr %>%
group_by(indic) %>%
mutate(v1 = v0[which(stringr::str_detect(v0, "c") )] ) %>%
ungroup() %>%
filter(! stringr::str_detect(v0, "c")) %>%
select(v1, v0)
#> # A tibble: 3 x 2
#> v1 v0
#> <chr> <chr>
#> 1 cA t1
#> 2 cA t2
#> 3 cB t3
# you could also use a loop
An example using base R
:
data <- c(
"cinema A", "17:45", "20:00", "cinema B", "13:00", "15:45", "16:00",
"cinema C", "08:20"
)
time_rows <- grep("cinema", data, invert = TRUE)
data.frame(
time = data[time_rows],
cinema = grep("cinema", data, value = TRUE)[cumsum(grepl("cinema", data))][time_rows]
)
As written in the comments, please provide some sample data for future posts. In this case, I have made it for you based on your attached picture.
There are many ways to solve this. Here is a three-step approach.
library(tidyverse)
library(stringr)
# Create the data
df <- tibble(
X1 = c("cinema A", 17.45, 20.00, "cinema B", 13.00, 15.45, 16.00, "cinema C", 8.20))
df
#> # A tibble: 9 x 1
#> X1
#> <chr>
#> 1 cinema A
#> 2 17.45
#> 3 20
#> 4 cinema B
#> 5 13
#> 6 15.45
#> 7 16
#> 8 cinema C
#> 9 8.2
# Step 1: detect where the cinema values are and copy them to a new column
df$cinema <- ifelse(str_detect(df$X1, "cinema"), df$X1, NA)
df
#> # A tibble: 9 x 2
#> X1 cinema
#> <chr> <chr>
#> 1 cinema A cinema A
#> 2 17.45 <NA>
#> 3 20 <NA>
#> 4 cinema B cinema B
#> 5 13 <NA>
#> 6 15.45 <NA>
#> 7 16 <NA>
#> 8 cinema C cinema C
#> 9 8.2 <NA>
# Step 2: replace NA values in the new column with the values above
df <- fill(df, cinema)
df
#> # A tibble: 9 x 2
#> X1 cinema
#> <chr> <chr>
#> 1 cinema A cinema A
#> 2 17.45 cinema A
#> 3 20 cinema A
#> 4 cinema B cinema B
#> 5 13 cinema B
#> 6 15.45 cinema B
#> 7 16 cinema B
#> 8 cinema C cinema C
#> 9 8.2 cinema C
# Step 3: remove the rows where X1 contains cinema information
df <- filter(df, !str_detect(df$X1, "cinema"))
df
#> # A tibble: 6 x 2
#> X1 cinema
#> <chr> <chr>
#> 1 17.45 cinema A
#> 2 20 cinema A
#> 3 13 cinema B
#> 4 15.45 cinema B
#> 5 16 cinema B
#> 6 8.2 cinema C
Created on 2019-11-26 by the reprex package (v0.3.0)
Here is a solution for base R
.
Assuming the input is given as a data frame, i.e.:
df <- data.frame(X = c("cinema A", 17.45, 20.00, "cinema B", 13.00, 15.45, 16.00, "cinema C", 8.20))
> df
X
1 cinema A
2 17.45
3 20
4 cinema B
5 13
6 15.45
7 16
8 cinema C
9 8.2
the following code may help you to get the table on the right hand side:
lst <- split(df,findInterval(seq(nrow(df)),grep("cinema",df$X)-1,left.open = T))
res <- Reduce(rbind,lapply(lst, function(v) data.frame(ViewingTime = v[-1,],CinemaName = v[1,])))
where the output res
looks like:
> res
ViewingTime CinemaName
1 17.45 cinema A
2 20 cinema A
3 13 cinema B
4 15.45 cinema B
5 16 cinema B
6 8.2 cinema C