How does this sed command work?

Question

I tried the command sed 's/$/\r/g' linux.txt > linux2win.txt to convert a text file from Linux to Windows line endings.

And it works! All \n are converted to \r\n.

For example, hello, world \n is converted to hello, world \r\n.

What confuses me is what exactly $ refers to. Is it \n, or an empty character before \n? I don't even know what I replaced.

This might help: [The Stack Overflow Regular Expressions FAQ](http://stackoverflow.com/a/22944075/3776858) — Cyrus, Jan 04 '17 at 13:51
Possible duplicate of [Dollar sign in regular expression and new line character](http://stackoverflow.com/questions/13912373/dollar-sign-in-regular-expression-and-new-line-character) — Cyrus, Jan 04 '17 at 13:53
See [here](https://www.gnu.org/software/sed/manual/sed.html#BRE-vs-ERE), [here](https://www.gnu.org/software/sed/manual/sed.html#regexp-extensions) and last paragraph of [here](https://www.gnu.org/software/sed/manual/sed.html#The-_0022s_0022-Command). The last two references only for GNU sed. — potong, Jan 04 '17 at 15:09

Ed Morton · Accepted Answer · 2017-01-04T14:52:20.543

The answers/comments so far stating that $ matches the end of line are misleading. $ in a regexp matches end of string, that is all. The reason it appears to match end of line in sed is that by default sed reads 1 line at a time so in that context (but not in others) each string it's operating on does end at the end of the line.

So $ matches end-of-string and if your string ends at the end of a line then $ matches at the end of the line but if your string contains multiple lines (e.g. in sed you can create a multi-line string stored in a buffer) then $ does not match at the end of any given line, it simply and consistently matches at the end of the string.

Similarly ^ matches start-of-string, btw, not start-of-line as you may hear people claim.
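
A minimal sketch of this, assuming a POSIX shell and a sed that supports the standard `N` command (which appends the next input line to the buffer, so the string being matched contains an embedded newline). The marker characters `X` and `>` and the variable names are arbitrary choices for illustration:

```shell
# N joins two input lines, so the buffer holds the single string "a\nb".
# $ matches only at the end of that whole string, not after "a".
end_out=$(printf 'a\nb\n' | sed 'N;s/$/X/')
printf '%s\n' "$end_out"

# Likewise ^ matches only at the start of the string, not before "b".
start_out=$(printf 'a\nb\n' | sed 'N;s/^/>/')
printf '%s\n' "$start_out"
```

In both cases only one marker is inserted, even though the string spans two lines.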

wrt your comment:

my original line is hello, world \n$ and $ is invisible , and $ is replaced by \r, now my line is hello, world\n\r$ .

No, that is not what is happening. Your original line is:

hello, world\n

and sed reads one \n-separated line at a time so what is read into sed's buffer is the string:

hello, world

Now $ is a regexp metacharacter that matches the end-of-string so given the above string $ will match after the d (and ^ would match before h) so when you do

s/$/\r/

It changes the above string to:

hello, world\r

and then when sed prints it out it adds back the newline (because a string with no terminating newline is not a text line per POSIX) and so outputs:

hello, world\r\n

Note that $ is never part of the string, it's just a metacharacter that when used in a regexp matches the end of the string so you can test for characters appearing just at the end of a string or do other operations (like the above) after the end of the string.
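
One way to see the bytes for yourself is to dump them as hex. This is only a sketch: the `hex` helper and the short input `hi` are made up for illustration, and the `\r` escape in the replacement is assumed to be supported by your sed (it is in GNU sed, but it is not required by POSIX):

```shell
# Print stdin as a run of hex bytes so the invisible \r (0d) and \n (0a) show up.
hex() { od -An -tx1 | tr -d ' \n'; }

# The input line as stored in the file: h, i, \n.
before=$(printf 'hi\n' | hex)

# The same line after sed appends \r at end-of-string and re-adds the newline.
after=$(printf 'hi\n' | sed 's/$/\r/' | hex)

printf 'before: %s\n' "$before"
printf 'after:  %s\n' "$after"
```

The `0d` byte lands between the last character of the string and the `0a` that sed adds back on output.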

Thanks! This answer really helps me! I didn't know that sed would consume one `\n`. Without your answer maybe I will misunderstand `$` in regexp for a long time. — PYL, Jan 04 '17 at 15:05

score 0 · Answer 2 · answered Jan 04 '17 at 13:54

$ matches the end of line, so the command:

sed 's/$/\r/g'

simply adds \r to the end of line, which is not what you say. If the input is "hello, world \r\n", the output would be "hello, world \r\n".

Maroun

@PYL The command simply "appends" `\r` to the end of line (it actually replaces the "last place" with a `\r`). – Maroun Jan 04 '17 at 14:01
I think of it in this way: my original line is hello, world \n$ and $ is invisible , and $ is replaced by \r, now my line is hello, world\n\r$ . It is weird ,isn't it? – PYL Jan 04 '17 at 14:15
wrt `If the input is "hello, world \r\n", the output would be "hello, world \r\n"` - that depends on the environment you're running in and whether or not the underlying C primitives allow the `\r` from the input to get through to `sed`. If you ran on cygwin with sed in binary mode, for example, then the output would be `"hello, world \r\r\n"` – Ed Morton Jan 04 '17 at 15:02

score 0 · Answer 3 · answered Jan 04 '17 at 13:59

The premise of your question is flawed. The sed command you present converts Linux-style line terminators (newline alone) to Windows-style (carriage-return / newline), not the other way around.

It works like this:

The $ is a regex metacharacter that matches the zero-width end of the line (i.e. just prior to the line terminator, if any).
The substitution string is a carriage return character (expressed as \r); it replaces the zero-width character sequence matched by the regex, in effect inserting the carriage return immediately before the newline.

The trailing g in the sed command specifies that all matches in each line should be replaced; it is superfluous because there cannot be more than one match per line.

Note also that this can be slightly quirky: if the input file does not end with a newline, then the output will end with just \r, because the end of the file is then the end of the last line.
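
Both points can be checked with a quick sketch like the following. The `hex` helper and variable names are hypothetical, and the behavior for input without a final newline is assumed to be that of GNU sed; other sed implementations may add a newline to the last line:

```shell
# Print stdin as a run of hex bytes so \r (0d) and \n (0a) are visible.
hex() { od -An -tx1 | tr -d ' \n'; }

# With and without the g flag: there is only one end position per line to match.
with_g=$(printf 'hi\n' | sed 's/$/\r/g' | hex)
without_g=$(printf 'hi\n' | sed 's/$/\r/' | hex)
printf 'with g:     %s\n' "$with_g"
printf 'without g:  %s\n' "$without_g"

# Input whose last line has no terminating newline.
no_newline=$(printf 'hi' | sed 's/$/\r/' | hex)
printf 'no newline: %s\n' "$no_newline"
```

If the first two dumps are identical, the g flag made no difference; the third shows whether the output ends in `0d` alone.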

John Bollinger

I think of it in this way: my original line is hello, world \n$ and $ is invisible , and $ is replaced by \r, now my line is hello, world\n\r$ . It is weird ,isn't it? – PYL Jan 04 '17 at 14:14
No, @PYL, it is not weird at all. The newlines in the input file are not considered *part of* the lines by `sed`. They are line terminators -- the line ends just before, and the next line starts just after. `sed` consumes line terminators on input, and (by default) introduces new ones on output. You can get newlines into `sed`'s pattern space by other means, but you do not get them from simply reading a line into the pattern space. – John Bollinger Jan 04 '17 at 14:57
Thanks! Helps me a lot! – PYL Jan 04 '17 at 15:09
