Questions tagged [data-wrangling]

482 questions
2
votes
2 answers

Creating multiple new columns within a DF depending on the order of logical columns

I am trying to create three new columns with values depending on a particular order of three logical type columns, e.g. I have this: a b c 1 TRUE TRUE TRUE 2 TRUE FALSE TRUE 3 TRUE FALSE TRUE And depending if going across the row the…
jagex
2
votes
1 answer

I get an error when I run my code in RStudio

Here is the code that I use: library(quantmod) library(timetk) library(dplyr) library(tibble) library(tidyr) mdate <- "2015-10-30" edate <- "2016-01-05" tickers <- c("ABG","ACH","ADM","AEG","AEM","AGQ","AGRO","AKOb","APO") data <-…
2
votes
2 answers

Divide long data by values in another dataset in R

I have a dataset in a long data format: Date Region X Y Z T D E F 01-01-2020 RegionA 2 4 2 3 2 3 4 01-01-2020 RegionB 1 3 2 2 3 3 3 01-01-2020 RegionC 1 4 4 2 3 4 2 01-01-2020 …
GAT
2
votes
1 answer

How to get multiple outcomes for running a function on observations?

How would I run this multiple times? I have a variable called percent_people, which looks if we have 5000000 people in the variable country, and a variable called city_share, which looks at the percentage share per city, e.g. London = 40%, the…
josh
2
votes
1 answer

find different values in factor in long form

I have data in a long format, similar to the following id <- c(rep(c(1L,2L,3L),3)) year <- c(rep(c(11,12,13),3)) df <- data.frame(id, year)[-c(8,3),] df$factor <- factor(c("a", "b", "a", "c", "d","a","d")) df I would like to create an indicator…
SushiChef
2
votes
2 answers

Data wrangling from wide to long format with multiple repeating columns of different types

A dataset describes multiple repeating measurements for multiple clusters, with each measurement-cluster pair contained in a single column. I would like to wrangle the data into a long(er) format, such that one column provides information on the…
user6571411
2
votes
1 answer

How to wrangle the data using Lubridate package and Regex instead of using the separate function?

https://www.kaggle.com/shivamb/netflix-shows-and-movies-exploratory-analysis/data ---- contains the data set. This is an exploratory data analysis performed on the shows from the Netflix data set. There are two main objectives in the data wrangling…
Sri Sreshtan
2
votes
1 answer

Recoding data from 1,1,1, to 1,2,3

So I have this dataframe. Under the column potential_child, I want to recode the values so that the oldest child == 1, the second oldest == 2, third oldest == 3, etc. I have the ages of the children, but I am floundering on how exactly to do this. …
AMB1274
2
votes
4 answers

An elegant way of reading multiple pandas DataFrames and assigning dataframes names in Python using Pandas

Excuse my question, I know this is trivial but for some reason I am not getting it right. Reading dataframes one by one is highly inefficient, especially if you have a lot of dataframes you would like to read from. Remember DRY - DO NOT REPEAT…
JA-pythonista
2
votes
2 answers

Flatten a data frame, combine the values of a column into lists to populate individual cells

I have the following data frame in R: Color Value Red 1 Red 3 Red 4 Red 7 Blue 2 Blue 5 Green 1 Green 2 Green 3 What I would like to do is combine the…
larsonsm
2
votes
1 answer

Replace column names in dataframe based on a character vector that is ordered differently

My question is how to rename my columns based on a character vector while imposing the order of the vector on my data frame. I have read all the similar posts, yet none answers my question. I am working on a data frame as such: df<-…
Yach
1
vote
2 answers

String remove up to last "}"

Rvest output is inserting a long string of extra data in one of the cells: QC1 <- read_html("https://en.wikipedia.org/wiki/List_of_airports_in_Quebec")%>% html_node('body #content #bodyContent #mw-content-text .mw-parser-output table') %>% …
ibm
1
vote
2 answers

Is there a function that will extend values in a column into NA positions? [R]

Here is an example of my data. I am trying to extend the existing Hour_of_Day values down to fill in the missing information. I would appreciate some guidance in this area. Thank you! Hour_of_Day Counter Name People In People Out Day_of_Week…
Deltoidea
1
vote
1 answer

R: Passing different-lengthed inputs to purrr with nested data structures

I have two lists foo and bar, where length(foo) > length(bar). I want to apply a function length(bar) times to each element of bar and store the output of each application in its own list and store all of the applications of the function to each…
user3614648
1
vote
2 answers

R: What is the expected output of passing a character vector to dplyr::all_of()?

I am trying to understand the expected output of dplyr::group_by() in conjunction with the use of dplyr::all_of(). My understanding is that using dplyr::all_of() should convert character vectors containing variable names to the bare names so that…
user3614648