2301

I have this string stored in a variable:

IN="bla@some.com;john@home.com"

Now I would like to split the strings by ; delimiter so that I have:

ADDR1="bla@some.com"
ADDR2="john@home.com"

I don't necessarily need the ADDR1 and ADDR2 variables. If they are elements of an array that's even better.


After suggestions from the answers below, I ended up with the following which is what I was after:

#!/usr/bin/env bash

IN="bla@some.com;john@home.com"

mails=$(echo $IN | tr ";" "\n")

for addr in $mails
do
    echo "> [$addr]"
done

Output:

> [bla@some.com]
> [john@home.com]

There was a solution that involved setting the internal field separator (IFS) to ;. I am not sure what happened to that answer. How do you reset IFS back to its default?

RE: IFS solution, I tried this and it works, I keep the old IFS and then restore it:

IN="bla@some.com;john@home.com"

OIFS=$IFS
IFS=';'
mails2=$IN
for x in $mails2
do
    echo "> [$x]"
done

IFS=$OIFS

BTW, when I tried

mails2=($IN)

I only got the first string when printing it in a loop; without the parentheses around $IN it works.

codeforester
stefanB
  • 20
    With regards to your "Edit2": You can simply "unset IFS" and it will return to the default state. There's no need to save and restore it explicitly unless you have some reason to expect that it's already been set to a non-default value. Moreover, if you're doing this inside a function (and, if you aren't, why not?), you can set IFS as a local variable and it will return to its previous value once you exit the function. – Brooks Moses May 01 '12 at 01:26
  • 20
    @BrooksMoses: (a) +1 for using `local IFS=...` where possible; (b) -1 for `unset IFS`, this doesn't exactly reset IFS to its default value, though I believe an unset IFS behaves the same as the default value of IFS ($' \t\n'), however it seems bad practice to be assuming blindly that your code will never be invoked with IFS set to a custom value; (c) another idea is to invoke a subshell: `(IFS=$custom; ...)` when the subshell exits IFS will return to whatever it was originally. – dubiousjim May 31 '12 at 05:21
  • I just want to have a quick look at the paths to decide where to throw an executable, so I resorted to run `ruby -e "puts ENV.fetch('PATH').split(':')"`. If you want to stay pure bash won't help but using *any scripting language* that has a built-in split is easier. – nicooga Mar 07 '16 at 15:32
  • This is kind of a drive-by comment, but since the OP used email addresses as the example, has anyone bothered to answer it in a way that is fully RFC 5322 compliant, namely that any quoted string can appear before the @ which means you're going to need regular expressions or some other kind of parser instead of naive use of IFS or other simplistic splitter functions. – Jeff Apr 22 '18 at 17:51
  • 6
    `for x in $(IFS=';';echo $IN); do echo "> [$x]"; done` – user2037659 Apr 26 '18 at 20:15
  • 3
    In order to save it as an array I had to place another set of parenthesis and change the `\n` for just a space. So the final line is `mails=($(echo $IN | tr ";" " "))`. So now I can check the elements of `mails` by using the array notation `mails[index]` or just iterating in a loop – afranques Jul 03 '18 at 14:08
  • For what it's worth, the `tr` solution doesn't work the same in zsh. – Ben Kushigian Oct 03 '18 at 20:13
  • for `$IFS` see [What is the exact meaning of `IFS=$'\n'`](https://stackoverflow.com/questions/4128235/what-is-the-exact-meaning-of-ifs-n/66942306#66942306) – Pmpr Apr 04 '21 at 21:55

32 Answers

1394

You can set the internal field separator (IFS) variable, and then let it parse into an array. When the assignment is placed directly in front of a command, it only takes effect in that single command's environment (here, read). read then parses the input according to the IFS value into an array, which we can then iterate over.

This example will parse one line of items separated by ;, pushing it into an array:

IFS=';' read -ra ADDR <<< "$IN"
for i in "${ADDR[@]}"; do
  # process "$i"
done

This other example is for processing the whole content of $IN, each time one line of input separated by ;:

while IFS=';' read -ra ADDR; do
  for i in "${ADDR[@]}"; do
    # process "$i"
  done
done <<< "$IN"
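To make the scoping concrete, here is a minimal sketch (assuming bash; the variable name `before` is only there for the comparison and is not part of the original answer) that checks IFS is unchanged after the prefixed read:

```shell
#!/usr/bin/env bash
IN="bla@some.com;john@home.com"

before=$IFS                      # remembered only so we can compare afterwards
IFS=';' read -ra ADDR <<< "$IN"  # IFS=';' applies to this read command only

[ "$IFS" = "$before" ] && echo "IFS unchanged"
printf '> [%s]\n' "${ADDR[@]}"
# IFS unchanged
# > [bla@some.com]
# > [john@home.com]
```

Because the assignment never leaks out of the read command, there is nothing to save or restore.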
robe007
Johannes Schaub - litb
  • 25
    This is probably the best way. How long will IFS persist in its current value, can it mess up my code by being set when it shouldn't be, and how can I reset it when I'm done with it? – Chris Lutz May 28 '09 at 02:25
  • 7
    now after the fix applied, only within the duration of the read command :) – Johannes Schaub - litb May 28 '09 at 03:04
  • 1
    I knew there was a way with arrays, just couldn't remember what it was. I like setting the IFS but am not sure about redirecting from $IN and going through read just to populate an array. Isn't just restoring IFS easier? Anyway, +1 for the IFS suggestion, thanks. – stefanB May 28 '09 at 03:11
  • 1
    I didn't like this saved="$IFS"; IFS=';'; ADDR=($IN); IFS="$saved" mess. :) – Johannes Schaub - litb May 28 '09 at 03:14
  • 16
    You can read everything at once without using a while loop: read -r -d '' -a addr <<< "$in" # The -d '' is key here, it tells read not to stop at the first newline (which is the default -d) but to continue until EOF or a NULL byte (which only occur in binary data). – lhunath May 28 '09 at 06:14
  • lhunath, ah nice idea :) However when i say "-d ''", then it always adds a linefeed as last element to the array. I don't know why that is :( – Johannes Schaub - litb May 28 '09 at 15:23
  • 1
    Seems to me the natural solution to the problem of splitting a line in bash with a custom word delimiter in a safe manner. Help me a lot. – Eduardo Lago Aguilar Sep 08 '11 at 15:33
  • +1 Only a side note: shouldn't it be recommendable to keep the old IFS and then restore it? (as shown by stefanB in his edit3) people landing here (sometimes just copying and pasting a solution) might not think about this. – Luca Borrione Sep 03 '12 at 09:23
  • 64
    @LucaBorrione Setting `IFS` on the same line as the `read` with no semicolon or other separator, as opposed to in a separate command, scopes it to that command -- so it's always "restored"; you don't need to do anything manually. – Charles Duffy Jul 06 '13 at 14:39
  • I noticed that parentheses are needed around $IN. Otherwise the whole string gets put into ADDR[0]. Why is this the case? – imagineerThat Jan 09 '14 at 21:20
  • 5
    @imagineerThis There is a bug involving herestrings and local changes to IFS that requires `$IN` to be quoted. The bug is fixed in `bash` 4.3. – chepner Oct 02 '14 at 03:50
  • 1
    Does not parse `newline` (`\n`) correctly, neither when `IN` declared like `IN=$'...'` nor when `IN="..."`. To see it, try `echo $i` in `for` loop, or `declare -p ADDR`. See [that solution](http://stackoverflow.com/a/24426608/1566267) for a workaround. – John_West Jan 08 '16 at 12:03
  • 4
    Doesn't process included newlines. Also add a trailing newline. – Isaac Oct 26 '16 at 03:28
  • This produces an extra empty array elements if the string to split by has more than one character. – ssc Dec 07 '16 at 09:19
  • In docker alpine (with `apk add bash`), this approach results in this error `cannot create temp file for here-document: Text file busy` – hychou May 11 '20 at 12:39
  • What about if I only need the first part? – Aaron Franke Jul 27 '20 at 03:51
1157

Taken from Bash shell script split array:

IN="bla@some.com;john@home.com"
arrIN=(${IN//;/ })
echo ${arrIN[1]}                  # Output: john@home.com

Explanation:

This construction replaces all occurrences of ';' (the initial // means global replace) in the string IN with ' ' (a single space), then interprets the space-delimited string as an array (that's what the surrounding parentheses do).

The syntax used inside of the curly braces to replace each ';' character with a ' ' character is called Parameter Expansion.

There are some common gotchas:

  1. If the original string has spaces, you will need to use IFS (with the delimiter of the example string, ;):
  • IFS=';'; arrIN=($IN); unset IFS;
  2. If the original string has spaces and the delimiter is a new line, you can set IFS with:
  • IFS=$'\n'; arrIN=($IN); unset IFS;
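As a hedged sketch of how the word-splitting approach can be made safer, the function below (the name `split_unquoted` is hypothetical) keeps the IFS change local and turns globbing off while the array is built. It assumes bash and assumes globbing was enabled before the call, since it unconditionally re-enables it:

```shell
#!/usr/bin/env bash
IN="bla@some.com;Full Name <fulnam@other.org>;*"

split_unquoted() {
    local IFS=';'   # local: the caller's IFS is untouched once we return
    set -f          # disable globbing so a field like '*' stays literal
    arrIN=($IN)     # unquoted on purpose: split on ';' only
    set +f          # re-enable globbing (assumes it was on before the call)
}

split_unquoted
printf '> [%s]\n' "${arrIN[@]}"
# > [bla@some.com]
# > [Full Name <fulnam@other.org>]
# > [*]
```

This still relies on shell-wide state (`set -f`), so `read -ra` remains the simpler choice when it is available.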
amo-ej1
palindrom
  • 103
    I just want to add: this is the simplest of all, you can access array elements with ${arrIN[1]} (starting from zeros of course) – oz123 Mar 21 '11 at 18:50
  • 29
    Found it: the technique of modifying a variable within a ${} is known as 'parameter expansion'. – KomodoDave Jan 05 '12 at 15:13
  • 9
    If you want to split on a special character such as tilde (~) make sure to escape it: arrIN=(${IN//\~/ }) – David Parks Dec 01 '12 at 04:21
  • 25
    No, I don't think this works when there are also spaces present... it's converting the ',' to ' ' and then building a space-separated array. – Ethan Apr 12 '13 at 22:47
  • 15
    Very concise, but there are *caveats for general use*: the shell applies *word splitting* and *expansions* to the string, which may be undesired; just try it with. `IN="bla@some.com;john@home.com;*;broken apart"`. In short: this approach will break, if your tokens contain embedded spaces and/or chars. such as `*` that happen to make a token match filenames in the current folder. – mklement0 Apr 24 '13 at 14:08
  • 61
    This is a bad approach for other reasons: For instance, if your string contains `;*;`, then the `*` will be expanded to a list of filenames in the current directory. -1 – Charles Duffy Jul 06 '13 at 14:39
  • 2
    You can actually fix the spaces problem by using `IFS` instead of parameter expansion/substitution: `IFS=':' arrIN=($IN)` This is also somewhat more readable in my opinion. – Kyle Strand Feb 18 '15 at 23:37
  • 3
    @KyleStrand That sets `IFS`, then sets `arrIN`, same as if they were executed on separate lines or separated by a `;`. That is, assignments are temporary only if they appear before a *non*-assignment command. So after `IFS=':' arrIN=($IN)`, `echo "$IFS"` gives `:` and words are split on `:` for subsequent commands, which usually isn't wanted. (This is easy to overlook, since `echo $var` is sufficient to check if `$var` is `:`, when `:` is not in `$IFS`.) Therefore, except perhaps at the very end of a script, `IFS=':' arrIN=($IN) IFS=$' \t\n'` or `IFS=':' arrIN=($IN); unset IFS` is preferable. – Eliah Kagan May 08 '15 at 00:18
  • @EliahKagan Ah. Is there some benefit to that particular inconsistency? – Kyle Strand May 08 '15 at 00:50
  • @KyleStrand Yes, in that while it makes sense for a variable assignment to be scoped to a command, it doesn't really make sense for one variable assignment to be scoped to another. The shell performs variable/parameter expansion before it assigns values (or runs commands). For example, `x=foo echo $x` doesn't output `foo`, as `$x` is expanded before `foo` is assigned to `x` or `echo` is run. Likewise, if `x=foo y=$x` *were* to assign `foo` to `x` only while `y=$x` ran, then `y` would be assigned the original `$x` (not `foo`) because `$x` would expand before any variable assignments happened. – Eliah Kagan May 08 '15 at 02:42
  • @CharlesDuffy This could be avoided with `set -f`: `set -f; IN="bla@some.com;*;john@home.com"; arrIN=(${IN//;/ }); echo ${arrIN[1]}` – John_West Jan 06 '16 at 01:08
  • 5
    @John_West, yes, this approach _can_ be made usable by modifying global state to disable globbing (and taking close control of further global state in the form of `IFS`), but... well, why would you do that when `read -a` is available with none of the risks? – Charles Duffy Jan 06 '16 at 04:19
  • 1
    @Ethan thanks for pointing out the problem with when spaces are present and I am surprised this gotcha was not in the answer. I took the liberty of editing the answer to mention this gotcha and provide a solution for it (and one other gotcha). (@EliahKagan thanks for giving a good solution that is consistent with the original answer.) – Trevor Boyd Smith Jun 17 '16 at 16:10
  • 2
    Not sure why `IFS=';' declare -a arr=($IN)` isn't getting more cred here. No need to set any intermediary variables, the IFS change only applies to the one `declare` command, and we expand on the IFS rather than having to change it to something else. – ghoti Sep 03 '16 at 19:12
  • Shouldn't `IFS=':';` be `IFS=';';` to match the input string? in the later example – zpon Nov 08 '16 at 07:29
  • Why does the syntax `arrIN=(${IN//;/ })` break when put inside a for loop, in a bash script? – Nikos Alexandris Nov 05 '18 at 16:07
  • Also if you are using strings that have IFS in the process this solution is the only one that won't break the process. – Luiz Fernando Lobo Apr 01 '20 at 19:14
  • @CharlesDuffy because its read -A in zsh and i really don't like using build-in functions – caduceus May 05 '20 at 08:33
  • You can write a wrapper around a built-in function that calls the right underlying implementation/usage, if you really want to write a script that works on two mutually-incompatible shells (I strongly prefer to just use an appropriate shebang, and then a guard at the first line that exits if invoked with an incompatible interpreter). By contrast, you *can't* write portability shims for syntax. – Charles Duffy May 05 '20 at 15:30
  • for some reason, `arrIN=(${IN//;/ })` doesn't work for me on a Mac and on Linux... it had to be `arrIN=${IN//;/ }` and I can loop over the two email addresses. Otherwise, the first form, if I `echo $arrIN`, it is only 1 address – nonopolarity Dec 08 '20 at 04:51
287

If you don't mind processing them immediately, I like to do this:

for i in $(echo $IN | tr ";" "\n")
do
  # process
done

You could use this kind of loop to initialize an array, but there's probably an easier way to do it. Hope this helps, though.
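One way to initialize an array from the tr output, sketched here under the assumption that bash is available (the array name `mails` is borrowed from the question), is to read the lines one at a time so that fields containing spaces or `*` stay intact:

```shell
#!/usr/bin/env bash
IN="bla@some.com;john@home.com;Full Name <fulnam@other.org>"

mails=()
while IFS= read -r addr; do
    mails+=("$addr")   # one array element per line of tr output
done < <(printf '%s\n' "$IN" | tr ';' '\n')

printf '> [%s]\n' "${mails[@]}"
# > [bla@some.com]
# > [john@home.com]
# > [Full Name <fulnam@other.org>]
```

The process substitution keeps the loop in the current shell, so `mails` is still set after the loop ends.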

Chris Lutz
  • You should have kept the IFS answer. It taught me something I didn't know, and it definitely made an array, whereas this just makes a cheap substitute. – Chris Lutz May 28 '09 at 02:42
  • I see. Yeah i find doing these silly experiments, i'm going to learn new things each time i'm trying to answer things. I've edited stuff based on #bash IRC feedback and undeleted :) – Johannes Schaub - litb May 28 '09 at 02:59
  • 3
    You could change it to echo "$IN" | tr ';' '\n' | while read -r ADDY; do # process "$ADDY"; done to make him lucky, i think :) Note that this will fork, and you can't change outer variables from within the loop (that's why i used the <<< "$IN" syntax) then – Johannes Schaub - litb May 28 '09 at 17:00
  • 10
    To summarize the debate in the comments: *Caveats for general use*: the shell applies *word splitting* and *expansions* to the string, which may be undesired; just try it with. `IN="bla@some.com;john@home.com;*;broken apart"`. In short: this approach will break, if your tokens contain embedded spaces and/or chars. such as `*` that happen to make a token match filenames in the current folder. – mklement0 Apr 24 '13 at 14:13
  • This is very helpful answer. e.g. `IN=abc;def;123`. How can we also print the index number? `echo $count $i ?` –  Oct 10 '18 at 18:50
253

Compatible answer

There are a lot of different ways to do this in bash.

However, it's important to first note that bash has many special features (so-called bashisms) that won't work in any other shell.

In particular, arrays, associative arrays, and pattern substitution, which are used in the solutions in this post as well as others in the thread, are bashisms and may not work under other shells that many people use.

For instance: on my Debian GNU/Linux, there is a standard shell called dash; I know many people who like to use another shell called ksh; and there is also a special tool called busybox with its own shell interpreter (ash).

Requested string

The string to be split in the above question is:

IN="bla@some.com;john@home.com"

I will use a modified version of this string to ensure that my solution is robust to strings containing whitespace, which could break other solutions:

IN="bla@some.com;john@home.com;Full Name <fulnam@other.org>"

Split string based on delimiter in bash (version >=4.2)

In pure bash, we can create an array with elements split by a temporary value for IFS (the internal field separator). The IFS, among other things, tells bash which character(s) it should treat as a delimiter between elements when defining an array:

IN="bla@some.com;john@home.com;Full Name <fulnam@other.org>"

# save original IFS value so we can restore it later
oIFS="$IFS"
IFS=";"
declare -a fields=($IN)
IFS="$oIFS"
unset oIFS

In newer versions of bash, prefixing a command with an IFS definition changes the IFS for that command only and resets it to the previous value immediately afterwards. This means we can do the above in just one line:

IFS=\; read -a fields <<<"$IN"
# after this command, the IFS resets back to its previous value (here, the default):
set | grep ^IFS=
# IFS=$' \t\n'

We can see that the string IN has been stored into an array named fields, split on the semicolons:

set | grep ^fields=\\\|^IN=
# fields=([0]="bla@some.com" [1]="john@home.com" [2]="Full Name <fulnam@other.org>")
# IN='bla@some.com;john@home.com;Full Name <fulnam@other.org>'

(We can also display the contents of these variables using declare -p:)

declare -p IN fields
# declare -- IN="bla@some.com;john@home.com;Full Name <fulnam@other.org>"
# declare -a fields=([0]="bla@some.com" [1]="john@home.com" [2]="Full Name <fulnam@other.org>")

Note that read is the quickest way to do the split because there are no forks or external resources called.

Once the array is defined, you can use a simple loop to process each field (or, rather, each element in the array you've now defined):

# `"${fields[@]}"` expands to return every element of `fields` array as a separate argument
for x in "${fields[@]}" ;do
    echo "> [$x]"
done
# > [bla@some.com]
# > [john@home.com]
# > [Full Name <fulnam@other.org>]

Or you could drop each field from the array after processing using a shifting approach, which I like:

while [ "$fields" ] ;do
    echo "> [$fields]"
    # slice the array
    fields=("${fields[@]:1}")
done
# > [bla@some.com]
# > [john@home.com]
# > [Full Name <fulnam@other.org>]

And if you just want a simple printout of the array, you don't even need to loop over it:

printf "> [%s]\n" "${fields[@]}"
# > [bla@some.com]
# > [john@home.com]
# > [Full Name <fulnam@other.org>]

Update: recent bash >= 4.4

In newer versions of bash, you can also play with the command mapfile:

mapfile -td \; fields < <(printf "%s\0" "$IN")

This syntax preserves special characters, newlines and empty fields!

If you don't want to include empty fields, you could do the following:

mapfile -td \; fields <<<"$IN"
fields=("${fields[@]%$'\n'}")   # drop '\n' added by '<<<'

With mapfile, you can also skip declaring an array and implicitly "loop" over the delimited elements, calling a function on each:

myPubliMail() {
    printf "Seq: %6d: Sending mail to '%s'..." $1 "$2"
    # mail -s "This is not a spam..." "$2" </path/to/body
    printf "\e[3D, done.\n"
}

mapfile < <(printf "%s\0" "$IN") -td \; -c 1 -C myPubliMail

(Note: the \0 at end of the format string is useless if you don't care about empty fields at end of the string or they're not present.)

mapfile < <(echo -n "$IN") -td \; -c 1 -C myPubliMail

# Seq:      0: Sending mail to 'bla@some.com', done.
# Seq:      1: Sending mail to 'john@home.com', done.
# Seq:      2: Sending mail to 'Full Name <fulnam@other.org>', done.

Or you could use <<<, and in the function body include some processing to drop the newline it adds:

myPubliMail() {
    local seq=$1 dest="${2%$'\n'}"
    printf "Seq: %6d: Sending mail to '%s'..." $seq "$dest"
    # mail -s "This is not a spam..." "$dest" </path/to/body
    printf "\e[3D, done.\n"
}

mapfile <<<"$IN" -td \; -c 1 -C myPubliMail

# Renders the same output:
# Seq:      0: Sending mail to 'bla@some.com', done.
# Seq:      1: Sending mail to 'john@home.com', done.
# Seq:      2: Sending mail to 'Full Name <fulnam@other.org>', done.

Split string based on delimiter in shell

If you can't use bash, or if you want to write something that can be used in many different shells, you often can't use bashisms -- and this includes the arrays we've been using in the solutions above.

However, we don't need to use arrays to loop over "elements" of a string. There is a syntax used in many shells for deleting substrings of a string from the first or last occurrence of a pattern. Note that * is a wildcard that stands for zero or more characters:

(The lack of this approach in any solution posted so far is the main reason I'm writing this answer ;)

${var#*SubStr}  # drops substring from start of string up to first occurrence of `SubStr`
${var##*SubStr} # drops substring from start of string up to last occurrence of `SubStr`
${var%SubStr*}  # drops substring from last occurrence of `SubStr` to end of string
${var%%SubStr*} # drops substring from first occurrence of `SubStr` to end of string

As explained by Score_Under:

# and % delete the shortest possible matching substring from the start and end of the string respectively, and ## and %% delete the longest possible matching substring.
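A quick illustration of the four operators on a concrete string may help; this is only a sketch, and the variable name `var` and its value are made up for the example:

```shell
#!/bin/sh
var="a;b;c"

first_cut=${var#*;}    # shortest match of '*;' removed from the start
last_field=${var##*;}  # longest match of '*;' removed from the start
all_but_last=${var%;*} # shortest match of ';*' removed from the end
first_field=${var%%;*} # longest match of ';*' removed from the end

echo "$first_cut"      # b;c
echo "$last_field"     # c
echo "$all_but_last"   # a;b
echo "$first_field"    # a
```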

Using the above syntax, we can create an approach where we extract substring "elements" from the string by deleting the substrings up to or after the delimiter.

The code block below works well in bash (including Mac OS's bash), dash, ksh, and busybox's ash:

IN="bla@some.com;john@home.com;Full Name <fulnam@other.org>"
while [ "$IN" ] ;do
    # extract the substring from start of string up to delimiter.
    # this is the first "element" of the string.
    iter=${IN%%;*}
    echo "> [$iter]"
    # if there's only one element left, set `IN` to an empty string.
    # this causes us to exit this `while` loop.
    # else, we delete the first "element" of the string from IN, and move onto the next.
    [ "$IN" = "$iter" ] && \
        IN='' || \
        IN="${IN#*;}"
done
# > [bla@some.com]
# > [john@home.com]
# > [Full Name <fulnam@other.org>]

Have fun!
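The loop above consumes IN as it goes. As a sketch of the same technique that leaves the input untouched, it could be wrapped in a function; the name `split_each` and its two-argument interface are assumptions made for illustration, not part of the original answer. POSIX sh has no local variables, so `rest`, `delim` and `field` are ordinary globals here, and a trailing empty field is dropped just as in the loop above:

```shell
#!/bin/sh
# Hypothetical helper: print each field of $1, split on the delimiter $2.
split_each() {
    rest=$1
    delim=$2
    while [ -n "$rest" ]; do
        field=${rest%%"$delim"*}     # text before the first delimiter
        printf '> [%s]\n' "$field"
        if [ "$rest" = "$field" ]; then
            rest=''                   # no delimiter left: that was the last field
        else
            rest=${rest#*"$delim"}    # drop the first field and its delimiter
        fi
    done
}

split_each "bla@some.com;john@home.com;Full Name <fulnam@other.org>" ';'
# > [bla@some.com]
# > [john@home.com]
# > [Full Name <fulnam@other.org>]
```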

Community
F. Hauri
240

I've seen a couple of answers referencing the cut command, but they've all been deleted. It's a little odd that nobody has elaborated on that, because I think it's one of the more useful commands for doing this type of thing, especially for parsing delimited log files.

In the case of splitting this specific example into a bash script array, tr is probably more efficient, but cut can be used, and is more effective if you want to pull specific fields from the middle.

Example:

$ echo "bla@some.com;john@home.com" | cut -d ";" -f 1
bla@some.com
$ echo "bla@some.com;john@home.com" | cut -d ";" -f 2
john@home.com

You can obviously put that into a loop, and iterate the -f parameter to pull each field independently.

This gets more useful when you have a delimited log file with rows like this:

2015-04-27|12345|some action|an attribute|meta data

cut makes it easy to read such a file and select a particular field for further processing.
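The loop over the -f parameter mentioned above could look roughly like the following. This is a sketch assuming bash; the variable names `only`, `count` and `fields` are illustrative. It counts the delimiters first so that the number of fields does not have to be known in advance, and it runs cut once per field, which is slow for long lists:

```shell
#!/usr/bin/env bash
IN="bla@some.com;john@home.com"

only=${IN//[^;]/}            # keep only the ';' characters
count=$(( ${#only} + 1 ))    # number of fields = number of delimiters + 1

fields=()
for (( n = 1; n <= count; n++ )); do
    fields+=("$(printf '%s\n' "$IN" | cut -d ';' -f "$n")")
done

printf '> [%s]\n' "${fields[@]}"
# > [bla@some.com]
# > [john@home.com]
```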

DougW
  • 10
    Kudos for using `cut`, it's the right tool for the job! Much clearer than any of those shell hacks. – MisterMiyagi Nov 02 '16 at 08:42
  • 6
    This approach will only work if you know the number of elements in advance; you'd need to program some more logic around it. It also runs an external tool for every element. – uli42 Sep 14 '17 at 08:30
  • 1
    Exactly what I was looking for, trying to avoid an empty string in a CSV. Now I can point to the exact 'column' value as well. Works with IFS already used in a loop. Better than expected for my situation. – Louis Loudog Trottier May 10 '18 at 04:20
  • 1
    Very useful for pulling IDs and PIDs too i.e. – Milos Grujic Oct 21 '19 at 09:07
  • 2
    This answer is worth scrolling down over half a page :) – Gucu112 Jan 03 '20 at 17:26
152

This worked for me:

string="1;2"
echo $string | cut -d';' -f1 # output is 1
echo $string | cut -d';' -f2 # output is 2
lfender6445
Steven Lizarazo
  • 1
    Though it only works with a single character delimiter, that's what the OP was looking for (records delimited by a semicolon). – GuyPaddock Dec 12 '18 at 01:37
  • Answered about four years ago by [@Ashok](https://stackoverflow.com/a/12328162/4215651), and also, more than one year ago by [@DougW](https://stackoverflow.com/a/29903172/4215651), than your answer, with even more information. Please post different solution than others'. – MAChitgarha Apr 03 '20 at 09:41
111

I think AWK is the best and most efficient command to solve your problem. AWK is included by default in almost every Linux distribution.

echo "bla@some.com;john@home.com" | awk -F';' '{print $1,$2}'

will give

bla@some.com john@home.com

Of course, you can store each email address by redefining the awk print field.
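One possible way to store the addresses, sketched under the assumption that bash is the calling shell (the array name `addrs` is made up), is to have awk print every field on its own line and collect those lines into an array:

```shell
#!/usr/bin/env bash
IN="bla@some.com;john@home.com"

addrs=()
while IFS= read -r line; do
    addrs+=("$line")
done < <(printf '%s\n' "$IN" | awk -F';' '{ for (i = 1; i <= NF; i++) print $i }')

printf '> [%s]\n' "${addrs[@]}"
# > [bla@some.com]
# > [john@home.com]
```

Looping up to NF means the awk program does not need to know how many addresses the string holds.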

noamtm
Tong
  • 8
    Or even simpler: echo "bla@some.com;john@home.com" | awk 'BEGIN{RS=";"} {print}' – Jaro Jan 07 '14 at 21:30
  • @Jaro This worked perfectly for me when I had a string with commas and needed to reformat it into lines. Thanks. – Aquarelle May 06 '14 at 21:58
  • It worked in this scenario -> "echo "$SPLIT_0" | awk -F' inode=' '{print $1}'"! I had problems when trying to use strings (" inode=") instead of characters (";"). $1, $2, $3, $4 are set as positions in an array! If there is a way of setting an array... better! Thanks! – Eduardo Lucio Aug 05 '15 at 12:59
  • @EduardoLucio, what I'm thinking about is maybe you can first replace your delimiter `inode=` into `;` for example by `sed -i 's/inode\=/\;/g' your_file_to_process`, then define `-F';'` when apply `awk`, hope that can help you. – Tong Aug 06 '15 at 02:42
95

How about this approach:

IN="bla@some.com;john@home.com" 
set -- "$IN" 
IFS=";"; declare -a Array=($*) 
echo "${Array[@]}" 
echo "${Array[0]}" 
echo "${Array[1]}" 

Source

BLeB
  • 7
    +1 ... but I wouldn't name the variable "Array" ... pet peeve I guess. Good solution. – Yzmir Ramirez Sep 05 '11 at 01:06
  • 14
    +1 ... but the "set" and declare -a are unnecessary. You could as well have used just `IFS=";" && Array=($IN)` – ata Nov 03 '11 at 22:33
  • +1 Only a side note: shouldn't it be recommendable to keep the old IFS and then restore it? (as shown by stefanB in his edit3) people landing here (sometimes just copying and pasting a solution) might not think about this – Luca Borrione Sep 03 '12 at 09:26
  • 6
    -1: First, @ata is right that most of the commands in this do nothing. Second, it uses word-splitting to form the array, and doesn't do anything to inhibit glob-expansion when doing so (so if you have glob characters in any of the array elements, those elements are replaced with matching filenames). – Charles Duffy Jul 06 '13 at 14:44
  • 1
    Suggest to use `$'...'`: `IN=$'bla@some.com;john@home.com;bet '`. Then `echo "${Array[2]}"` will print a string with newline. `set -- "$IN"` is also necessary in this case. Yes, to prevent glob expansion, the solution should include `set -f`. – John_West Jan 08 '16 at 12:29
  • The external link does not explain this code. Please add an explanation for `set` and for `$*`. – mgutt Nov 28 '19 at 14:03
76
echo "bla@some.com;john@home.com" | sed -e 's/;/\n/g'
bla@some.com
john@home.com
lothar
  • 4
    -1 **what if the string contains spaces?** for example `IN="this is first line; this is second line" arrIN=( $( echo "$IN" | sed -e 's/;/\n/g' ) )` will produce an array of 8 elements in this case (an element for each word space separated), rather than 2 (an element for each line semi colon separated) – Luca Borrione Sep 03 '12 at 10:08
  • 5
    @Luca No the sed script creates exactly two lines. What creates the multiple entries for you is when you put it into a bash array (which splits on white space by default) – lothar Sep 03 '12 at 17:33
  • That's exactly the point: the OP needs to store entries into an array to loop over it, as you can see in his edits. I think your (good) answer missed to mention to use `arrIN=( $( echo "$IN" | sed -e 's/;/\n/g' ) )` to achieve that, and to advice to change IFS to `IFS=$'\n'` for those who land here in the future and needs to split a string containing spaces. (and to restore it back afterwards). :) – Luca Borrione Sep 04 '12 at 07:09
  • 3
    @Luca Good point. However the array assignment was not in the initial question when I wrote up that answer. – lothar Sep 04 '12 at 16:55
68

This also works:

IN="bla@some.com;john@home.com"
echo ADD1=`echo $IN | cut -d \; -f 1`
echo ADD2=`echo $IN | cut -d \; -f 2`

Be careful, this solution is not always correct. In case you pass "bla@some.com" only, it will assign it to both ADD1 and ADD2.
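The pitfall can be reproduced, and avoided, with cut's -s option, which suppresses lines that contain no delimiter. A small sketch (the variable names `ADD2` and `ADD2_strict` are illustrative):

```shell
#!/usr/bin/env bash
IN="bla@some.com"   # only one address, no ';' at all

# Without -s, cut prints the whole line when the delimiter is missing.
ADD2=$(printf '%s\n' "$IN" | cut -d ';' -f 2)

# With -s, lines without a delimiter are suppressed, so the result is empty.
ADD2_strict=$(printf '%s\n' "$IN" | cut -s -d ';' -f 2)

echo "without -s: [$ADD2]"        # without -s: [bla@some.com]
echo "with -s:    [$ADD2_strict]" # with -s:    []
```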

Boris S.
Ashok
  • 1
    You can use -s to avoid the mentioned problem: http://superuser.com/questions/896800/cut-lies-if-delimeter-doesn-t-exist "-f, --fields=LIST select only these fields; also print any line that contains no delimiter character, unless the -s option is specified" – fersarr Mar 03 '16 at 17:17
36

A different take on Darron's answer, this is how I do it:

IN="bla@some.com;john@home.com"
read ADDR1 ADDR2 <<<$(IFS=";"; echo $IN)
Community
nickjb
  • I think it does! Run the commands above and then "echo $ADDR1 ... $ADDR2" and i get "bla@some.com ... john@home.com" output – nickjb Oct 06 '11 at 15:33
  • 1
    This worked REALLY well for me... I used it to iterate over an array of strings which contained comma separated DB,SERVER,PORT data to use mysqldump. – Nick Oct 28 '11 at 14:36
  • 5
    Diagnosis: the `IFS=";"` assignment exists only in the `$(...; echo $IN)` subshell; this is why some readers (including me) initially think it won't work. I assumed that all of $IN was getting slurped up by ADDR1. But nickjb is correct; it does work. The reason is that `echo $IN` command parses its arguments using the current value of $IFS, but then echoes them to stdout using a space delimiter, regardless of the setting of $IFS. So the net effect is as though one had called `read ADDR1 ADDR2 <<< "bla@some.com john@home.com"` (note the input is space-separated not ;-separated). – dubiousjim May 31 '12 at 05:28
  • 1
    This fails on spaces and newlines, and also expand wildcards `*` in the `echo $IN` with an unquoted variable expansion. – Isaac Oct 26 '16 at 04:43
  • I really like this solution. A description of why it works would be very useful and make it a better overall answer. – Michael Gaskill Jan 30 '17 at 02:28
35

In Bash, a bulletproof way that works even if your variable contains newlines:

IFS=';' read -d '' -ra array < <(printf '%s;\0' "$in")

Look:

$ in=$'one;two three;*;there is\na newline\nin this field'
$ IFS=';' read -d '' -ra array < <(printf '%s;\0' "$in")
$ declare -p array
declare -a array='([0]="one" [1]="two three" [2]="*" [3]="there is
a newline
in this field")'

The trick for this to work is to use the -d option of read (delimiter) with an empty delimiter, so that read is forced to read everything it's fed. And we feed read with exactly the content of the variable in, with no trailing newline, thanks to printf. Note that we're also putting the delimiter in printf to ensure that the string passed to read has a trailing delimiter. Without it, read would trim potential trailing empty fields:

$ in='one;two;three;'    # there's an empty field
$ IFS=';' read -d '' -ra array < <(printf '%s;\0' "$in")
$ declare -p array
declare -a array='([0]="one" [1]="two" [2]="three" [3]="")'

The trailing empty field is preserved.


Update for Bash≥4.4

Since Bash 4.4, the builtin mapfile (aka readarray) supports the -d option to specify a delimiter. Hence another canonical way is:

mapfile -d ';' -t array < <(printf '%s;' "$in")
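As a quick check of the mapfile variant, here is a minimal sketch (assuming Bash 4.4 or later; the sample value of in is made up for illustration) suggesting that spaces, a literal *, embedded newlines and a trailing empty field all survive:

```bash
#!/usr/bin/env bash
# Requires Bash >= 4.4 for `mapfile -d`.
# Illustrative input; note the trailing ';' that creates an empty last field.
in=$'one;two three;*;there is\na newline\nin this field;'

# printf appends one more ';' so that the final (empty) field is terminated too.
mapfile -d ';' -t array < <(printf '%s;' "$in")

printf 'fields: %d\n' "${#array[@]}"
declare -p array
```

The extra ';' added by printf is what keeps the last field; without it, a trailing empty field would simply not be produced.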
Community
gniourf_gniourf
  • 5
    I found it as the rare solution on that list that works correctly with `\n`, spaces and `*` simultaneously. Also, no loops; array variable is accessible in the shell after execution (contrary to the highest upvoted answer). Note, `in=$'...'`, it does not work with double quotes. I think, it needs more upvotes. – John_West Jan 08 '16 at 12:10
35

How about this one-liner, if you're not using arrays:

IFS=';' read ADDR1 ADDR2 <<<$IN
Darron
  • 1
    Consider using `read -r ...` to ensure that, for example, the two characters "\t" in the input end up as the same two characters in your variables (instead of a single tab char). – dubiousjim May 31 '12 at 05:36
  • -1 This is not working here (ubuntu 12.04). Adding `echo "ADDR1 $ADDR1"\n echo "ADDR2 $ADDR2"` to your snippet will output `ADDR1 bla@some.com john@home.com\nADDR2` (\n is newline) – Luca Borrione Sep 03 '12 at 10:07
  • 1
    This is probably due to a bug involving `IFS` and here strings that was fixed in `bash` 4.3. Quoting `$IN` should fix it. (In theory, `$IN` is not subject to word splitting or globbing after it expands, meaning the quotes should be unnecessary. Even in 4.3, though, there's at least one bug remaining--reported and scheduled to be fixed--so quoting remains a good idea.) – chepner Sep 19 '15 at 13:59
  • This breaks if $IN contains newlines, even if $IN is quoted. It also adds a trailing newline. – Isaac Oct 26 '16 at 04:55
  • A problem with this, and many other solutions is also that it assumes there are EXACTLY TWO elements in $IN - OR that you're willing to have the second and subsequent items smashed together in ADDR2. I understand that this meets the ask, but it's a time bomb. – Steven the Easily Amused Sep 01 '19 at 14:36
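To make the points from these comments concrete, here is a small sketch (not the answerer's code; REST is a hypothetical catch-all variable added for illustration) that uses -r, quotes the here-string, and gives any extra items somewhere to go:

```bash
#!/usr/bin/env bash
# Three items on purpose, to show what happens to the surplus.
IN="bla@some.com;john@home.com;extra@example.org"

# -r keeps backslashes literal; quoting "$IN" sidesteps the here-string
# quirks mentioned above. IFS=';' applies to this one read command only.
IFS=';' read -r ADDR1 ADDR2 REST <<< "$IN"

printf 'ADDR1=%s\nADDR2=%s\nREST=%s\n' "$ADDR1" "$ADDR2" "$REST"
```

Without REST, read would put everything after the first delimiter, remaining semicolons included, into ADDR2.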
27

Without setting the IFS

If you have just one colon, you can do this:

a="foo:bar"
b=${a%:*}
c=${a##*:}

you will get:

b = foo
c = bar
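For strings with more than one colon, the four prefix/suffix forms behave differently. A minimal sketch (the variable names first, rest, last and init are only illustrative):

```bash
#!/usr/bin/env bash
a="foo:bar:baz"

first=${a%%:*}   # remove the longest suffix matching ":*"   -> foo
rest=${a#*:}     # remove the shortest prefix matching "*:"  -> bar:baz
last=${a##*:}    # remove the longest prefix matching "*:"   -> baz
init=${a%:*}     # remove the shortest suffix matching ":*"  -> foo:bar

printf 'first=%s rest=%s last=%s init=%s\n' "$first" "$rest" "$last" "$init"
```

These are plain parameter expansions, so no subprocess is started and IFS is never touched.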
Emilien Brigand
22

Here is a clean 3-liner:

in="foo@bar;bizz@buzz;fizz@buzz;buzz@woof"
IFS=';' list=($in)
for item in "${list[@]}"; do echo "$item"; done

where IFS delimits words based on the separator and () is used to create an array. Then [@] is used to return each item as a separate word.

If you have any code after that, you also need to restore $IFS, e.g. with unset IFS, because the assignment on the second line changes IFS for the rest of the shell session.
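One way to avoid restoring IFS by hand is to do the split inside a function with a local IFS. This is only a sketch under that assumption; split_semicolons is a hypothetical helper name, and the unquoted expansion is still subject to globbing:

```bash
#!/usr/bin/env bash
in="foo@bar;bizz@buzz;fizz@buzz;buzz@woof"

# Hypothetical helper: fills the global array `list` from its first argument.
split_semicolons() {
    local IFS=';'    # scoped to this function; the caller's IFS is untouched
    list=($1)        # unquoted on purpose so that word splitting happens
}

split_semicolons "$in"
for item in "${list[@]}"; do printf '%s\n' "$item"; done
```

After the call, IFS in the calling shell is whatever it was before.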

kenorb
13

The following Bash/zsh function splits its first argument on the delimiter given by the second argument:

split() {
    local string="$1"
    local delimiter="$2"
    if [ -n "$string" ]; then
        local part
        while read -r -d "$delimiter" part; do
            echo "$part"
        done <<< "$string"
        echo "$part"
    fi
}

For instance, the command

$ split 'a;b;c' ';'

yields

a
b
c

This output may, for instance, be piped to other commands. Example:

$ split 'a;b;c' ';' | cat -n
1   a
2   b
3   c
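The output can also be consumed by a loop in the current shell. The sketch below repeats the function (with -r and quoting added) so that it is self-contained, and reads the parts back through process substitution; the counter variable count is only illustrative:

```bash
#!/usr/bin/env bash
split() {
    local string="$1"
    local delimiter="$2"
    if [ -n "$string" ]; then
        local part
        while read -r -d "$delimiter" part; do
            printf '%s\n' "$part"
        done <<< "$string"
        printf '%s\n' "$part"   # last part: read hit end of input, not a delimiter
    fi
}

count=0
# Process substitution keeps the loop in the current shell, so count survives.
while IFS= read -r part; do
    count=$((count + 1))
    printf '%d: %s\n' "$count" "$part"
done < <(split 'a;b c;d' ';')
```

Because each part is printed on its own line, this only works cleanly when the parts themselves contain no newlines.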

Compared to the other solutions given, this one has the following advantages:

  • IFS is not overridden: Due to dynamic scoping of even local variables, overriding IFS over a loop causes the new value to leak into function calls performed from within the loop.

  • Arrays are not used: Reading a string into an array using read requires the flag -a in Bash and -A in zsh.

If desired, the function may be put into a script as follows:

#!/usr/bin/env bash

split() {
    # ...
}

split "$@"
Halle Knast
10

You can apply awk to many situations:

echo "bla@some.com;john@home.com"|awk -F';' '{printf "%s\n%s\n", $1, $2}'

You can also use this:

echo "bla@some.com;john@home.com"|awk -F';' '{print $1,$2}' OFS="\n"
shuaihanhungry
8

There is a simple and smart way like this:

echo "add:sfff" | xargs -d: -I{} echo {}

But you must use GNU xargs; BSD xargs does not support -d delim. If you use an Apple Mac like me, you can install GNU xargs:

brew install findutils

then

echo "add:sfff" | gxargs -d: -I{} echo {}
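One detail worth checking: echo appends a newline, which GNU xargs treats as part of the last item when -d: is in effect. A hedged sketch (assuming GNU xargs is the xargs on your PATH; the brackets are only there to make stray whitespace visible) that feeds the string with printf instead:

```bash
#!/usr/bin/env bash
# Assumes GNU xargs; BSD xargs has no -d option.
# printf '%s' emits the string without a trailing newline, so the last
# item is exactly "sfff" rather than "sfff" followed by a newline.
out=$(printf '%s' "add:sfff" | xargs -d: -I{} echo "[{}]")
printf '%s\n' "$out"
```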
Victor Choy
4

There are some cool answers here (errator esp.), but for something analogous to split in other languages -- which is what I took the original question to mean -- I settled on this:

IN="bla@some.com;john@home.com"
declare -a a="(${IN/;/ })";

Now ${a[0]}, ${a[1]}, etc, are as you would expect. Use ${#a[*]} for number of terms. Or to iterate, of course:

for i in ${a[*]}; do echo $i; done

IMPORTANT NOTE:

This works in cases where there are no spaces to worry about, which solved my problem, but may not solve yours. Go with the $IFS solution(s) in that case.

Benjamin W.
eukras
  • Does not work when `IN` contains more than two e-mail addresses. Please refer to same idea (but fixed) at [palindrom's answer](http://stackoverflow.com/a/5257398/938111) – oHo Oct 07 '13 at 13:33
  • Better use `${IN//;/ }` (double slash) to make it also work with more than two values. Beware that any wildcard (`*?[`) will be expanded. And a trailing empty field will be discarded. – Isaac Oct 26 '16 at 05:14
4

If there are no spaces, why not this?

IN="bla@some.com;john@home.com"
arr=(`echo $IN | tr ';' ' '`)

echo ${arr[0]}
echo ${arr[1]}
ghost
4

This is the simplest way to do it.

spo='one;two;three'
OIFS=$IFS
IFS=';'
spo_array=($spo)
IFS=$OIFS
echo ${spo_array[*]}
Heavy Gray
3
IN="bla@some.com;john@home.com"
IFS=';' read -ra IN_arr <<< "${IN}"
for entry in "${IN_arr[@]}"
do
    echo "$entry"
done

Output

bla@some.com
john@home.com

System : Ubuntu 12.04.1

rashok
2

Use the set built-in to load up the $@ array:

IN="bla@some.com;john@home.com"
IFS=';'; set $IN; IFS=$' \t\n'

Then, let the party begin:

echo $#
for a; do echo $a; done
ADDR1=$1 ADDR2=$2
jeberle
  • Better use `set -- $IN` to avoid some issues with "$IN" starting with dash. Still, the unquoted expansion of `$IN` will expand wildcards (`*?[`). – Isaac Oct 26 '16 at 05:17
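Combining the answer with this comment, a possible sketch is to do the set inside a function, where both IFS and the positional parameters are local to the call. print_fields is a hypothetical helper name, and wildcards in the input would still be expanded:

```bash
#!/usr/bin/env bash
# Hypothetical helper: prints the number of fields, then one field per line.
print_fields() {
    local IFS=';'    # restored automatically when the function returns
    set -- $1        # "--" protects against fields that start with a dash;
                     # only this function's positional parameters change
    echo "count=$#"
    local a
    for a; do printf '%s\n' "$a"; done
}

print_fields "-n;bla@some.com;john@home.com"
```

The caller's own "$@" and IFS are left as they were.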
2

Two Bourne-ish alternatives, neither of which requires Bash arrays:

Case 1: Keep it nice and simple: Use a NewLine as the Record-Separator... eg.

IN="bla@some.com
john@home.com"

while read i; do
  # process "$i" ... eg.
    echo "[email:$i]"
done <<< "$IN"

Note: in this first case no sub-process is forked to assist with list manipulation.

Idea: Maybe it is worth using NL extensively internally, and only converting to a different RS when generating the final result externally.

Case 2: Using a ";" as a record separator... eg.

NL="
" IRS=";" ORS=";"

conv_IRS() {
  exec tr "$1" "$NL"
}

conv_ORS() {
  exec tr "$NL" "$1"
}

IN="bla@some.com;john@home.com"
IN="$(conv_IRS ";" <<< "$IN")"

while read i; do
  # process "$i" ... eg.
    echo -n "[email:$i]$ORS"
done <<< "$IN"

In both cases a sub-list composed within the loop persists after the loop has completed. This is useful when manipulating lists in memory instead of storing lists in files. {p.s. keep calm and carry on B-) }
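The persistence mentioned above comes from feeding the loop with a here-string rather than a pipeline, so the loop body runs in the current shell. A minimal sketch (the variable kept is hypothetical) that builds a ;-separated sub-list inside the loop and uses it afterwards:

```bash
#!/usr/bin/env bash
IN="bla@some.com
john@home.com
root@localhost"

kept=""
while IFS= read -r i; do
    # Keep only addresses containing a dot; join them with ';'.
    case $i in
        *.*) kept="${kept}${kept:+;}$i" ;;
    esac
done <<< "$IN"

# Still set here, because the loop did not run in a subshell.
printf '%s\n' "$kept"
```

With `printf ... | while read ...` instead, Bash would by default run the loop in a subshell and kept would be empty afterwards.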

NevilleDNZ
2
IN='bla@some.com;john@home.com;Charlie Brown <cbrown@acme.com;!"#$%&/()[]{}*? are no problem;simple is beautiful :-)'
set -f
oldifs="$IFS"
IFS=';'; arrayIN=($IN)
IFS="$oldifs"
for i in "${arrayIN[@]}"; do
echo "$i"
done
set +f

Output:

bla@some.com
john@home.com
Charlie Brown <cbrown@acme.com
!"#$%&/()[]{}*? are no problem
simple is beautiful :-)

Explanation: Simple assignment using parentheses () converts a semicolon-separated list into an array, provided you have the correct IFS while doing that. A standard for loop handles individual items in that array as usual. Notice that the list given for the IN variable must be "hard" quoted, that is, with single quotes.

IFS must be saved and restored since Bash does not treat an assignment the same way as a command. An alternate workaround is to wrap the assignment inside a function and call that function with a modified IFS. In that case separate saving/restoring of IFS is not needed. Thanks to "Bize" for pointing that out.
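The function-based workaround could look roughly like this. It is a sketch, not the original author's code: fill_array is a hypothetical name, and it relies on Bash (in its default, non-POSIX mode) treating an assignment placed before a function call as temporary for that call:

```bash
#!/usr/bin/env bash
IN='a b;*;c'

# Hypothetical helper: splits its first argument using whatever IFS is
# in effect for the call and stores the result in the global arrayIN.
fill_array() { arrayIN=($1); }

set -f                      # keep '*' from being expanded as a glob
IFS=';' fill_array "$IN"    # IFS=';' lasts only for the duration of this call
set +f

declare -p arrayIN
```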

Peter Mortensen
ajaaskel
  • `!"#$%&/()[]{}*? are no problem` well... not quite: `[]*?` are glob characters. So what about creating this directory and file: `mkdir '!"#$%&'; touch '!"#$%&/()[]{} got you hahahaha - are no problem'` and running your command? Simple may be beautiful, but when it's broken, it's broken. – gniourf_gniourf Feb 20 '15 at 16:45
  • @gniourf_gniourf The string is stored in a variable. Please see the original question. – ajaaskel Feb 25 '15 at 07:20
  • 1
    @ajaaskel you didn't fully understand my comment. Go in a scratch directory and issue these commands: `mkdir '!"#$%&'; touch '!"#$%&/()[]{} got you hahahaha - are no problem'`. They will only create a directory and a file, with weird looking names, I must admit. Then run your commands with the exact `IN` you gave: `IN='bla@some.com;john@home.com;Charlie Brown – gniourf_gniourf Feb 25 '15 at 07:26
  • This is to demonstrate that the characters `*`, `?`, `[...]` and even, if `extglob` is set, `!(...)`, `@(...)`, `?(...)`, `+(...)` _are_ problems with this method! – gniourf_gniourf Feb 25 '15 at 07:29
  • One more argument against your method for the road: if someone uses this method with `nullglob` or `failglob` set, there'll be some surprises! you can try it: run your code with `shopt -s nullglob` and also with `shopt -s failglob`. – gniourf_gniourf Feb 25 '15 at 07:31
  • 1
    @gniourf_gniourf Thanks for detailed comments on globbing. I adjusted the code to have globbing off. My point was however just to show that rather simple assignment can do the splitting job. – ajaaskel Feb 26 '15 at 15:26
2

Apart from the fantastic answers that were already provided, if it is just a matter of printing out the data you may consider using awk:

awk -F";" '{for (i=1;i<=NF;i++) printf("> [%s]\n", $i)}' <<< "$IN"

This sets the field separator to ;, so that it can loop through the fields with a for loop and print accordingly.

Test

$ IN="bla@some.com;john@home.com"
$ awk -F";" '{for (i=1;i<=NF;i++) printf("> [%s]\n", $i)}' <<< "$IN"
> [bla@some.com]
> [john@home.com]

With another input:

$ awk -F";" '{for (i=1;i<=NF;i++) printf("> [%s]\n", $i)}' <<< "a;b;c   d;e_;f"
> [a]
> [b]
> [c   d]
> [e_]
> [f]
fedorqui 'SO stop harming'
2

In Android shell, most of the proposed methods just do not work:

$ IFS=':' read -ra ADDR <<<"$PATH"                             
/system/bin/sh: can't create temporary file /sqlite_stmt_journals/mksh.EbNoR10629: No such file or directory

What does work is:

$ for i in ${PATH//:/ }; do echo $i; done
/sbin
/vendor/bin
/system/sbin
/system/bin
/system/xbin

where // means global replacement.
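The difference between the single-slash and double-slash forms is easy to see side by side. A minimal sketch with a made-up path value (the names first_only and all are illustrative):

```bash
#!/usr/bin/env bash
p="/sbin:/vendor/bin:/system/bin"

first_only=${p/:/ }    # replaces only the first ':'
all=${p//:/ }          # replaces every ':'

printf '%s\n' "$first_only"
printf '%s\n' "$all"
```

Only the second form turns the whole value into a whitespace-separated list that an unquoted for loop can iterate over.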

Peter Mortensen
18446744073709551615
  • 1
    Fails if any part of $PATH contains spaces (or newlines). Also expands wildcards (asterisk *, question mark ? and brackets […]). – Isaac Oct 26 '16 at 05:08
1

A one-liner to split a string separated by ';' into an array is:

IN="bla@some.com;john@home.com"
ADDRS=( $(IFS=";" echo "$IN") )
echo ${ADDRS[0]}
echo ${ADDRS[1]}

This only sets IFS in a subshell, so you don't have to worry about saving and restoring its value.

Peter Mortensen
Michael Hale
  • -1 This doesn't work here (Ubuntu 12.04). It prints only the first echo, with the whole value of $IN in it, while the second is empty. You can see it if you put `echo "0: "${ADDRS[0]}\n echo "1: "${ADDRS[1]}`; the output is `0: bla@some.com;john@home.com\n 1:` (\n is a newline). – Luca Borrione Sep 03 '12 at 10:04
  • 1
    Please refer to nickjb's answer at stackoverflow.com/a/6583589/1032370 for a working alternative to this idea. – Luca Borrione Sep 03 '12 at 10:05
  • 1
    -1, 1. IFS isn't being set in that subshell (it's being passed to the environment of "echo", which is a builtin, so nothing is happening anyway). 2. `$IN` is quoted so it isn't subject to IFS splitting. 3. The command substitution is split by whitespace, but this may corrupt the original data. – Score_Under Apr 28 '15 at 17:09
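Following the points in these comments, a variant that does set IFS inside the subshell and leaves $IN unquoted could be sketched as below. It is in the spirit of the alternative linked above rather than a copy of it, and it is still only safe when the fields contain no whitespace or wildcard characters:

```bash
#!/usr/bin/env bash
IN="bla@some.com;john@home.com"

# Inside $(...): the ';' after IFS=';' makes it a real assignment in the
# subshell (not an environment prefix), and the unquoted $IN is split on ';'.
# printf then writes one field per line, which the outer array assignment
# splits again on default whitespace.
ADDRS=( $(IFS=';'; printf '%s\n' $IN) )

echo "${ADDRS[0]}"
echo "${ADDRS[1]}"
```

IFS in the calling shell is never modified, since the assignment happens only inside the command substitution.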
1

Okay guys!

Here's my answer!

DELIMITER_VAL='='

read -d '' F_ABOUT_DISTRO_R <<"EOF"
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=14.04
DISTRIB_CODENAME=trusty
DISTRIB_DESCRIPTION="Ubuntu 14.04.4 LTS"
NAME="Ubuntu"
VERSION="14.04.4 LTS, Trusty Tahr"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 14.04.4 LTS"
VERSION_ID="14.04"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
EOF

SPLIT_NOW=$(awk -F$DELIMITER_VAL '{for(i=1;i<=NF;i++){printf "%s\n", $i}}' <<<"${F_ABOUT_DISTRO_R}")
while read -r line; do
   SPLIT+=("$line")
done <<< "$SPLIT_NOW"
for i in "${SPLIT[@]}"; do
    echo "$i"
done

Why is this approach "the best" for me?

For two reasons:

  1. You do not need to escape the delimiter;
  2. You will not have problems with blank spaces. The values will be properly separated in the array!

[]'s

Eduardo Lucio
  • FYI, `/etc/os-release` and `/etc/lsb-release` are meant to be sourced, and not parsed. So your method is really wrong. Moreover, you're not quite answering the question about _splitting a string on a delimiter._ – gniourf_gniourf Jan 30 '17 at 08:26
0

Maybe not the most elegant solution, but works with * and spaces:

IN="bla@so me.com;*;john@home.com"
for i in `delims=${IN//[^;]}; seq 1 $((${#delims} + 1))`
do
   echo "> [$(echo "$IN" | cut -d';' -f"$i")]"
done

Outputs

> [bla@so me.com]
> [*]
> [john@home.com]

Other example (delimiters at beginning and end):

IN=";bla@so me.com;*;john@home.com;"
> []
> [bla@so me.com]
> [*]
> [john@home.com]
> []

Basically it removes every character other than ; making delims eg. ;;;. Then it does for loop from 1 to number-of-delimiters as counted by ${#delims}. The final step is to safely get the $ith part using cut.
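The counting step can be tried on its own. A minimal sketch using the second example input (nfields is an illustrative variable name):

```bash
#!/usr/bin/env bash
IN=";bla@so me.com;*;john@home.com;"

delims=${IN//[^;]}              # delete every character that is not ';'
nfields=$(( ${#delims} + 1 ))   # n delimiters separate n + 1 fields

printf 'delims=%s nfields=%d\n' "$delims" "$nfields"
```

Since the count is derived from the delimiters rather than from word splitting, empty fields at the beginning and end are counted as well.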

Petr Újezdský
-8

There are two simple methods:

echo "text1;text2;text3" | tr ";" "\n"

and

echo "text1;text2;text3" | sed -e 's/;/\n/g'
Peter Mortensen
-10

Yet another late answer... If you are Java-minded, here is the bashj (https://sourceforge.net/projects/bashj/) solution:

#!/usr/bin/bashj

#!java

private static String[] cuts;
private static int cnt=0;
public static void split(String words,String regexp) {cuts=words.split(regexp);}
public static String next() {return(cnt<cuts.length ? cuts[cnt++] : "null");}

#!bash

IN="bla@some.com;john@home.com"

: j.split($IN,";")    # java method call

while true
do
    NAME=j.next()     # java method call
    if [ $NAME != null ] ; then echo $NAME ; else exit ; fi
done
Fil