Concise and portable "join" on the Unix command-line

Question

How can I join multiple lines into one line, putting a separator where the newline characters were, while avoiding a trailing separator and, optionally, ignoring empty lines?

Example. Consider a text file, foo.txt, with three lines:

foo
bar
baz

The desired output is:

foo,bar,baz

The command I'm using now:

tr '\n' ',' <foo.txt |sed 's/,$//g'

Ideally it would be something like this:

cat foo.txt |join ,

What is:

1. the most portable, concise, and readable way?
2. the most concise way using non-standard Unix tools?

Of course I could write something, or just use an alias. But I'm interested to know the options.

possible duplicate of [Joining multiple lines into one with bash](http://stackoverflow.com/questions/2764051/joining-multiple-lines-into-one-with-bash) — Ciro Santilli新疆棉花TRUMP BAN BAD, Apr 01 '15 at 23:32

score 132 · Accepted Answer · edited Apr 01 '15 at 23:15

Perhaps a little surprisingly, paste is a good way to do this:

paste -s -d"," -

This won't deal with the empty lines you mentioned. For that, pipe your text through grep first:

grep -v '^$' | paste -s -d"," -
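
As a minimal sketch, the two commands above can be wrapped in a small shell function. The name join_lines is hypothetical (it does not come from the answer), and the separator is assumed to be a single character, since paste -d only takes single-character delimiters.

```sh
# Hypothetical helper; usage: some_command | join_lines SEPARATOR
join_lines() {
    # Drop empty lines, then merge the remaining lines serially (-s),
    # using the single-character delimiter passed as $1.
    # The trailing "-" tells paste to read standard input.
    grep -v '^$' | paste -s -d "$1" -
}

printf 'foo\n\nbar\nbaz\n' | join_lines ,   # prints foo,bar,baz
```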

edited Apr 01 '15 at 23:15

Ciro Santilli新疆棉花TRUMP BAN BAD

answered Dec 15 '11 at 16:00

Michael J. Barber

@codaddict Nor I, but I must admit that I don't find it intuitive at all - I always need to check the man pages for this. I'm definitely curious to see what others suggest. – Michael J. Barber Dec 15 '11 at 16:05
There are other ways, but none nicer (and the fun ones are a bit bashy). – sorpigal Dec 15 '11 at 16:07
It doesn't seem to ignore empty lines but this is still very nice and works for my use-case. Thanks! – butt Dec 15 '11 at 16:12
@butt Sorry, I'd missed the point about the empty lines, and just duplicated what your pipeline did. See the revisions. – Michael J. Barber Dec 15 '11 at 16:20
For enhanced portability, consider adding `-` at the end of the `paste` command whenever it's expected to read from `stdin`. (Some versions of `paste`, such as BSD's, won't read from `stdin` unless `-` is explicitly passed to it.) – kjo Mar 02 '13 at 21:03
@kjo and POSIX mandates it. Editing the post now. http://pubs.opengroup.org/onlinepubs/9699919799/utilities/paste.htm – Ciro Santilli新疆棉花TRUMP BAN BAD Apr 01 '15 at 23:14
The command name may not be that intuitive, but it does exactly what's needed here. Man page starts with `paste -- merge corresponding or subsequent lines of files`. – ericbn Aug 07 '17 at 15:06
Thanks for the hint about `paste`! I noticed it allows only single-character delimiters, and it's `\t` by default. To accomplish longer delimiters (e.g. `, `): `cat foo.txt | paste -s | sed 's/\t/, /g'` – Arild Aug 31 '17 at 07:15
`paste -sd,` to be more concise. Don't need the double quotes around `,`. – codeforester Nov 03 '18 at 02:36

jaypal singh · Answer 2 · 2011-12-15T17:49:41.953

12

This sed one-liner should work:

sed -e :a -e 'N;s/\n/,/;ba' file

Test:

[jaypal:~/Temp] cat file
foo
bar
baz

[jaypal:~/Temp] sed -e :a -e 'N;s/\n/,/;ba' file
foo,bar,baz

To handle empty lines, remove them first and pipe the result to the one-liner above:

sed -e '/^$/d' file | sed -e :a -e 'N;s/\n/,/;ba'
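
For readers who want the loop spelled out, here is an annotated sketch. It is a variation on the one-liner rather than the exact command: it uses the `$!ba` form mentioned in the comments, guards N with `$!` so that single-line input is still printed by seds that quit when N finds no next line, and puts each command in its own -e so the label stands alone. The function name join_sed is hypothetical.

```sh
# Hypothetical wrapper; joins the lines of standard input with commas.
join_sed() {
    # :a          define a label named "a"
    # $!N         unless on the last line, append the next input line to the
    #             pattern space (separated by an embedded newline)
    # $!ba        unless on the last line, branch back to label "a"
    # s/\n/,/g    once everything is collected, replace every embedded newline
    sed -e ':a' -e '$!N' -e '$!ba' -e 's/\n/,/g'
}

printf 'foo\nbar\nbaz\n' | join_sed   # prints foo,bar,baz
```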

edited Dec 15 '11 at 17:49

answered Dec 15 '11 at 17:26

jaypal singh

An explanation would be nice! – Tejas Kale May 07 '17 at 09:24
It's clearer to combine the two -e expressions into one: `sed -e ':a; N; s/\n/,/; ba'`. But this is still an O(n²) method, because sed performs a substitution every time a new line is added. `sed -e ':a; N; $!ba; s/\n/,/g'` is linear, substituting only once after all lines have been appended to sed's pattern space. `$!ba` means "unless this is the last line (`$!`), branch (`b`) to label `a`", so the loop ends at the last line. – zhazha Jan 28 '18 at 11:39

score 9 · Answer 3 · answered Jul 29 '15 at 07:21

How about using xargs?

For your case:

$ sed '$!s/$/,/' foo.txt | xargs

Be careful about the input length limit of xargs: for a very long input file it runs echo more than once, so the output is split over several lines.

answered Jul 29 '15 at 07:21

plhn

I found the `-L` flag on xargs helpful: `-L 50` will put 50 items per line. – jmunsch Oct 24 '16 at 22:04

mykhal · Answer 4 · 2014-06-17T14:36:48.730

6

Perl:

cat data.txt | perl -pe 'if(!eof){chomp;$_.=","}'

or yet shorter and faster, surprisingly:

cat data.txt | perl -pe 'if(!eof){s/\n/,/}'

or, if you want:

cat data.txt | perl -pe 's/\n/,/ unless eof'

edited Jun 17 '14 at 14:36

answered Jun 17 '14 at 14:21

mykhal

The nice thing about this is you can use any string instead of just a simple comma. The accepted answer is less versatile. I especially like the final iteration, although I would have written it like: `perl -pe 's/\n/,/ unless eof' data.txt` (no need for the spurious cat). – Mike S Apr 01 '16 at 16:04

score 4 · Answer 5 · answered Dec 15 '11 at 16:14

Just for fun, here's an all-builtins solution

IFS=$'\n' read -r -d '' -a data < foo.txt ; ( IFS=, ; echo "${data[*]}" ; )

You can use printf instead of echo if the trailing newline is a problem.

This works by setting IFS, the set of delimiters that read splits on, to just a newline rather than all whitespace. read is then told to keep reading until it reaches a NUL (-d '') instead of stopping at the first newline, and to store each item it reads in the array data (-a). Then, in a subshell so as not to clobber the IFS of the interactive shell, we set IFS to , and expand the array with *, which joins the items using the first character of IFS.
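
The same idea can be sketched as a reusable bash function. The name join_file and its argument order are assumptions made for illustration; a function-local IFS stands in for the subshell. Because newline counts as IFS whitespace, runs of newlines act as one delimiter, so empty lines are skipped.

```bash
#!/usr/bin/env bash
# Hypothetical helper (bash only: relies on arrays and read -d).
# Usage: join_file SEPARATOR FILE
join_file() {
    # Making IFS local keeps the caller's IFS untouched.
    local IFS=$'\n' data
    # -d '' makes read consume the whole file. It returns non-zero when it
    # hits end of input without finding a NUL, which is expected here.
    read -r -d '' -a data < "$2" || true
    # "${data[*]}" joins the elements with the first character of IFS.
    IFS=$1
    printf '%s\n' "${data[*]}"
}
```

Calling `join_file , foo.txt` on the question's sample file should print foo,bar,baz. Only the first character of the separator argument is used.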

answered Dec 15 '11 at 16:14

sorpigal

interesting, however portability is not excellent, since there's no `-d` option in pure `sh` shell `read` command. – mykhal Jun 16 '14 at 14:38
@mykhal: True. However, `bash` can be found on many systems, so it has some utility. If you want portability, arrays are probably out too; otherwise you could simply use a `while` loop to work around the lack of `-d`. For a proper, portable all-builtins version you'd want something like `c= ; while IFS= read -r d ; do if [ -n "$d" ] ; then printf '%s%s' "$c" "$d" ; c=, ; fi ; done < foo.txt`. It still fails for a `read` that doesn't know `-r` (though that could be omitted), and it assumes a builtin `printf`, so `echo` is probably better there if efficiency is important. Still, the accepted answer is much better! – sorpigal Jun 16 '14 at 17:18
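
Spelled out over several lines, the portable loop from the comment above might look like the sketch below. The function name join_loop and the variable names are hypothetical. Plain sh has no local variables, so sep and line stay set in the calling shell.

```sh
# Hypothetical helper; usage: join_loop SEPARATOR < file
join_loop() {
    sep=''
    # The "|| [ -n "$line" ]" part keeps a final line that lacks a newline.
    while IFS= read -r line || [ -n "$line" ]; do
        # Skip empty lines.
        [ -n "$line" ] || continue
        # Pass the data as arguments, never as the printf format string.
        printf '%s%s' "$sep" "$line"
        sep=$1
    done
    # End the joined output with a single newline.
    printf '\n'
}
```

Unlike paste, this accepts a multi-character separator, for example `join_loop ', ' < foo.txt`.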

score 0 · Answer 6 · answered Nov 06 '13 at 00:55

I needed to accomplish something similar, printing a comma-separated list of fields from a file, and was happy with piping STDOUT to xargs and ruby, like so:

cat data.txt | cut -f 16 -d ' ' | grep -Eo '[0-9]+' | xargs ruby -e "puts ARGV.join(', ')"

answered Nov 06 '13 at 00:55

mchail

kenorb · Answer 7 · 2015-12-21T21:46:09.230

A simple way to join the lines with spaces, in place, is to use ex (which also ignores blank lines):

ex +%j -cwq foo.txt

If you want to print the results to the standard output, try:

ex +%j +%p -scq! foo.txt

To join lines without spaces, use +%j! instead of +%j.

To use a different delimiter, it's a bit trickier:

ex +"g/^$/d" +"%s/\n/_/e" +%p -scq! foo.txt

Here g/^$/d (or v/\S/d) removes blank lines, and s/\n/_/ is a substitution that works much as it does in sed, but is applied to all lines (%). Once the substitution is done, %p prints the buffer. Finally, -cq! executes the vi command q!, which quits without saving (-s silences the output).

Please note that ex is equivalent to vi -e.

This method is quite portable, as most Linux/Unix systems ship with ex/vi by default. It is also more compatible than sed for in-place editing: sed's in-place option (-i) is a non-standard extension, and the utility itself is stream-oriented, so that approach is less portable.

score 0 · Answer 8 · edited Dec 11 '15 at 20:19

I had a log file where some data was broken into multiple lines. When this occurred, the last character of the first line was the semi-colon (;). I joined these lines by using the following commands:

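# Replace spaces with "|" so that each line survives word splitting as one item.
for LINE in $(tr -s " " "|" < "$FILE")
do
    if echo "$LINE" | grep -q ';$'
    then
        # The line ends with ";": write it without a newline so the next line joins it.
        printf '%s' "$LINE" | tr -s "|" " " >> "$MYFILE"
    else
        echo "$LINE" | tr -s "|" " " >> "$MYFILE"
    fi
done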

The result is a file where lines that were split in the log file were one line in my new file.
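
A possible alternative, sketched here with awk rather than the loop above, avoids the space-to-"|" round trip altogether. The function name join_continued is hypothetical. Unlike the loop, it leaves runs of spaces and any literal "|" characters untouched.

```sh
# Hypothetical helper; usage: join_continued < logfile > newfile
join_continued() {
    # A line ending in ";" is printed without its newline, so the following
    # line continues on the same output line; other lines print normally.
    awk '{ if ($0 ~ /;$/) printf "%s", $0; else print }'
}
```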

score -1 · Answer 9 · edited Jan 28 '16 at 03:25

My answer is:

awk '{printf "%s", ","$0}' foo.txt

printf is enough. We don't need -F"\n" to change the field separator.

edited Jan 28 '16 at 03:25

tumultous_rooster

answered Jan 28 '16 at 03:01

Duc Chi

This adds a spurious comma at the beginning of the output. -1 for not testing. – Mike S Apr 01 '16 at 15:51
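
One way to avoid the leading comma, sketched under the assumption that empty lines should also be skipped, is to print the separator before every item except the first. The function name join_awk and the awk variable names are hypothetical.

```sh
# Hypothetical helper; usage: join_awk SEPARATOR < file
join_awk() {
    # out_sep starts out empty, so nothing is printed before the first item;
    # after that it holds the requested separator.
    awk -v sep="$1" '
        $0 != "" { printf "%s%s", out_sep, $0; out_sep = sep }
        END      { print "" }
    '
}

printf 'foo\n\nbar\nbaz\n' | join_awk ', '   # prints foo, bar, baz
```

Note that awk interprets backslash escapes in values passed with -v, so a separator containing a backslash would need extra care.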
