split a string in a text file

Question

I have a text file with the following contents:

19810101 20
19810102 31
19810103 1
19810701 1
19811105 5

I want to split the date into year, month and day, like this, and save the result as a CSV file:

1981 01 01 20
1981 01 02 31
1981 01 03 1
1981 07 01 1
1981 11 05 5

Is there an easy way to do this in R, bash or awk?

I looked at similar posts, [1] Split a string every 5 characters and [2] Split into 3 character length, but these only apply to strings that are all the same length.

"Is there an easy way to do this in R, bash or awk?" If that is the question, then the answer is "yes". — anishsane, Feb 18 '17 at 06:14
I think parsing the date would be overkill if the date is in such a standard format ("yyyymmdd")... `sed -r 's/(....)(..)(..)/\1 \2 \3/' file` is sufficient — anishsane, Feb 18 '17 at 06:28
When you "save as a CSV file", do you expect any commas to be added, or do you keep the spaces? — Benjamin W., Feb 18 '17 at 06:33
Sorry, I was out for a while so I didn't check my post. Before I posted here, I read some similar posts, for example http://stackoverflow.com/questions/7452156/split-into-3-character-length and http://stackoverflow.com/questions/2247045/chopping-a-string-into-a-vector-of-fixed-width-character-elements, but these all apply to splitting strings of the same length. — Liliputian, Feb 18 '17 at 11:06

score 2 · Answer 1 · answered Feb 18 '17 at 07:13

We can use extract() from tidyr, which is loaded as part of the tidyverse:

library(tidyverse)
# Split v1 into three new columns, one regex capture group per piece
extract(df1, v1, into = c("Year", "Month", "Day"), "(.{4})(.{2})(.{2})")

data

df1 <- structure(list(v1 = c(19810101L, 19810102L, 19810103L, 19810701L, 
 19811105L), v2 = c(20L, 31L, 1L, 1L, 5L)), .Names = c("v1", "v2"
), class = "data.frame", row.names = c(NA, -5L))

Akshay Hegde · Accepted Answer · 2017-02-18T06:47:12.440

Input

$ cat f
19810101 20
19810102 31
19810103 1
19810701 1
19811105 5

Output

$ awk '{print substr($1,1,4),substr($1,5,2),substr($1,7),$2}' f
1981 01 01 20
1981 01 02 31
1981 01 03 1
1981 07 01 1
1981 11 05 5

For CSV

$ awk  '{print substr($1,1,4),substr($1,5,2),substr($1,7),$2}' OFS=, f
1981,01,01,20
1981,01,02,31
1981,01,03,1
1981,07,01,1
1981,11,05,5
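
Since the question asks for a saved CSV file, the same command can simply be redirected to a file. A minimal sketch, assuming the sample data from the question; the scratch directory and the names f and out.csv are placeholders chosen for illustration, not anything from the original answer:

```shell
# Scratch directory and file names are assumptions for this example.
workdir=$(mktemp -d)
infile="$workdir/f"
outfile="$workdir/out.csv"

# Recreate the sample input from the question.
printf '%s\n' '19810101 20' '19810102 31' '19810103 1' '19810701 1' '19811105 5' > "$infile"

# substr() cuts the first field at fixed offsets; OFS=, joins the pieces with commas.
awk '{print substr($1,1,4), substr($1,5,2), substr($1,7), $2}' OFS=, "$infile" > "$outfile"

cat "$outfile"
```

Because substr($1,7) has no length argument, it takes everything from the seventh character to the end of the first field, and the second field is printed unchanged, so its varying width does not matter.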

edited Feb 18 '17 at 06:47

answered Feb 18 '17 at 06:44

Akshay Hegde

Please refrain from answering unless OP clearly shows his research towards solving their problem. – Inian Feb 18 '17 at 06:45
Wow sorry inian I didn't read fully, should I delete my answer now ? – Akshay Hegde Feb 18 '17 at 06:46
Just trying to maintain the ethics of this community. The OP, being nearly 3 years on the site, should know better than to post without showing an effort. Refer to @anishsane's comment above; it already solved the problem, but he hasn't posted it as an answer. Waiting for a proper effort to be shown. So kindly do likewise – Inian Feb 18 '17 at 06:48
@Inian : Sorry, I will take care of it from my next posts – Akshay Hegde Feb 18 '17 at 06:49

score 1 · Answer 3 · answered Feb 18 '17 at 07:19

Either of the following will work:

sed -r 's/([[:digit:]]{4})([[:digit:]]{2})([[:digit:]]{2})/\1 \2 \3/' lines.txt|tr ' ' , > newfile.csv

or

sed -r 's/(.{4})(.{2})(.{2})/\1 \2 \3/' lines.txt |tr ' ' ,  > newfile.csv
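
The tr step in both commands turns every space into a comma, which is fine for this two-column input. As a possible variation, the commas can be written by sed itself. This is only a sketch: the function name split_dates_csv is made up for illustration, and it uses -E (accepted by both GNU and BSD sed) instead of the GNU-specific -r:

```shell
# Hypothetical helper: reads lines like "19810101 20" on stdin,
# writes "1981,01,01,20" on stdout.
split_dates_csv() {
  # Three capture groups for year, month, day; the whitespace before
  # the value is replaced by the final comma.
  sed -E 's/^([0-9]{4})([0-9]{2})([0-9]{2})[[:space:]]+/\1,\2,\3,/'
}

printf '19810101 20\n19811105 5\n' | split_dates_csv
```

To write a file, redirect as in the answer, for example split_dates_csv < lines.txt > newfile.csv.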

Peddipaga

score 1 · Answer 4 · answered Feb 18 '17 at 20:57

awk '{sub(/..../,"& "); sub(/../,"& ",$2)}1' file

1981 01 01 20
1981 01 02 31
1981 01 03 1
1981 07 01 1
1981 11 05 5
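
To see why two sub calls are enough, it can help to print the record after each one. A small sketch for illustration only; the function name trace_subs is hypothetical:

```shell
# Hypothetical helper that shows the record after each substitution.
trace_subs() {
  awk '{
    sub(/..../, "& ")          # acts on $0; awk then re-splits the fields
    print "after first sub:  " $0
    sub(/../, "& ", $2)        # acts on the new $2, which holds month and day
    print "after second sub: " $0
  }'
}

printf '19810101 20\n' | trace_subs
```

The first sub inserts a space after the year, so the month and day become field 2. The second sub then splits that field, and awk rebuilds the record with the default output separator, a single space.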

Claes Wikner
