252

I was trying to use the following code to read lines from a file, but when I read the file, its contents all come out as one line:

line_num=0
File.open('xxx.txt').each do |line|
  print "#{line_num += 1} #{line}"
end

But this file prints each line separately.


I have to use stdin, like ruby my_prog.rb < file.txt, where I can't assume what the line-ending character is that the file uses. How can I handle it?

the Tin Man
draw
  • 7
    Rather than doing `line_num = 0`, you could use `each.each_with_index` or possibly `each.with_index`. – Andrew Grimm May 16 '11 at 04:15
  • @andrew-grimm thank you, it makes cleaner code. – draw May 16 '11 at 04:26
  • See http://stackoverflow.com/q/25189262/128421 for why line-by-line IO is preferred over using `read`. – the Tin Man Aug 19 '14 at 00:09
  • Use `line.chomp` to handle the line endings (courtesy of [@SreenivasanAC](http://stackoverflow.com/a/24483583/165673)) – Yarin Feb 22 '15 at 22:14
  • Possible duplicate of [What are all the common ways to read a file in Ruby?](http://stackoverflow.com/questions/5545068/what-are-all-the-common-ways-to-read-a-file-in-ruby) – Brad Werth Mar 01 '16 at 15:31
  • I'd recommend reading http://stackoverflow.com/questions/25189262/why-is-slurping-a-file-bad – the Tin Man Dec 15 '16 at 17:54

8 Answers

556

Ruby does have a method for this:

File.readlines('foo').each { |line| puts line }

http://ruby-doc.org/core-1.9.3/IO.html#method-c-readlines

the Tin Man
Jonathan
  • This method is slower than @Olivier L.'s method. – HelloWorld Jan 12 '13 at 02:33
  • 1
    @HelloWorld Probably because it's deleting each preceding line from memory and loading in each line into memory. May be wrong, but Ruby's probably doing things properly (so that large files don't cause your script to crash). – Starkers Sep 27 '13 at 11:19
  • Can you use `with_index` with this as well? – Joshua Pinter Jul 05 '15 at 18:11
  • 2
    Yes, you can, e.g. `File.readlines(filename).each_with_index { |line, i| puts "#{i}: #{line}" }` – wulftone Jun 17 '17 at 18:24
  • This method seems better. I am reading very large files and this way it doesn't crash the application by attempting to load the entire file into memory at once. – Shelby S Aug 31 '17 at 15:10
  • @HelloWorld Only if the entire file fits into memory at once; otherwise this one will be faster, or the only one that runs without crashing. Try each method on a 1 GB file on a system with only 1 GB RAM. – Mecki Jun 30 '19 at 18:41
416
File.foreach(filename).with_index do |line, line_num|
   puts "#{line_num}: #{line}"
end

This will execute the given block for each line in the file without slurping the entire file into memory. See: IO::foreach.
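As a rough sketch of the same idea, the snippet below wraps File.foreach in a small helper that numbers lines from 1 instead of 0. The helper name numbered_lines and the temporary demo file are assumptions made for illustration, not part of the answer above.

```ruby
require "tempfile"

# Hypothetical helper: yields [line_number, line] pairs, numbered from 1.
# File.foreach reads one line at a time, so the file is never slurped.
def numbered_lines(path)
  return enum_for(:numbered_lines, path) unless block_given?

  File.foreach(path).with_index(1) do |line, line_num|
    yield line_num, line.chomp
  end
end

# Demo against a throwaway file (contents are made up for the example).
Tempfile.create("demo") do |f|
  f.write("alpha\nbeta\ngamma\n")
  f.flush
  numbered_lines(f.path) { |n, line| puts "#{n}: #{line}" }
end
```

Passing an offset to with_index is what shifts the numbering; without it the first line is reported as 0.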

ihaztehcodez
talabes
153

I believe my answer covers your new concerns about handling any type of line endings since both "\r\n" and "\r" are converted to Linux standard "\n" before parsing the lines.

To support the "\r" EOL character along with the regular "\n", and "\r\n" from Windows, here's what I would do:

line_num=0
text=File.open('xxx.txt').read
text.gsub!(/\r\n?/, "\n")
text.each_line do |line|
  print "#{line_num += 1} #{line}"
end

Of course this could be a bad idea on very large files since it means loading the whole file into memory.
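For the stdin case from the question, the same normalization could be applied to any IO object. This is only a sketch: the helper names normalize_newlines and each_normalized_line are invented here, and it still reads the whole input into memory, as noted above.

```ruby
require "stringio"

# Hypothetical helper: converts "\r\n" and bare "\r" to "\n".
def normalize_newlines(text)
  text.gsub(/\r\n?/, "\n")
end

# Hypothetical helper: reads everything from io (a File, $stdin, StringIO...)
# and yields each line with a normalized "\n" terminator.
def each_normalized_line(io)
  return enum_for(:each_normalized_line, io) unless block_given?

  normalize_newlines(io.read).each_line { |line| yield line }
end

# Demo with an in-memory stream that mixes all three line-ending styles.
demo = StringIO.new("one\r\ntwo\rthree\n")
each_normalized_line(demo).with_index(1) do |line, line_num|
  print "#{line_num} #{line}"
end
```

In a script run as ruby my_prog.rb < file.txt, passing $stdin instead of the StringIO should behave the same way.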

the Tin Man
Olivier L.
  • That regex didn't work for me. Unix format uses \n, windows \r\n, mac uses \n -- .gsub(/(\r|\n)+/,"\n") worked for me with all cases. – Pod May 02 '13 at 08:36
  • 5
    Correct regex should be ```/\r?\n/``` which will cover both \r\n and \n without combining empty lines as Pod's comment would do – Irongaze.com May 23 '13 at 17:05
  • 14
    This will read the entire file into memory, which could be impossible depending on how large the file is. – eremzeit Jun 25 '13 at 02:38
  • I think instead of text=File.open('xxx.txt').read you want File.read('xxx.txt'). Otherwise you need to close the file ? – Antoine Toulme Aug 02 '13 at 21:11
  • 1
    This method is highly inefficient; talabes' answer here http://stackoverflow.com/a/17415655/228589 is the best answer. Please compare the implementations of these two methods. – CantGetANick Jan 06 '14 at 17:52
  • @AntoineToulme No, Ruby will automatically close the file when it is garbage-collected. The real point is that we should use Jonathan's answer. – xis Mar 04 '15 at 22:00
  • 1
    This is not the ruby way. The answer below shows the right behavior. – Merovex Dec 26 '15 at 19:53
19

Your first file has Mac Classic line endings (that’s "\r" instead of the usual "\n"). Open it with

File.open('foo').each("\r") { |line| puts line.chomp("\r") }

to specify the line endings.

Josh Lee
  • 1
    Sadly, there’s nothing like the universal newlines in Python, at least that I know of. – Josh Lee May 16 '11 at 03:40
  • one more question, I have to use stdin, like ruby my_prog.rb < file.txt, where I can't assume what the line ending char the file uses... How can I handle it? – draw May 16 '11 at 03:49
  • Olivier’s answer seems helpful, if you’re OK with loading the whole file into memory. Detecting newlines while still scanning the file will take a bit more work. – Josh Lee May 16 '11 at 23:27
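One possible way to do that extra work is sketched below: read fixed-size chunks and split on any of the three terminators, holding back a trailing "\r" in case the next chunk starts with "\n". The helper name each_universal_line and the chunk size are assumptions for illustration; this is not a built-in Ruby feature, and lines come back as binary strings, so callers may need force_encoding.

```ruby
require "stringio"

# Hypothetical helper: yields lines (without terminators) from io, treating
# "\r\n", "\r", and "\n" all as line endings, without loading the whole input.
def each_universal_line(io, chunk_size = 16 * 1024)
  return enum_for(:each_universal_line, io, chunk_size) unless block_given?

  buffer = String.new # binary-encoded, matches what io.read(n) returns
  while (chunk = io.read(chunk_size))
    buffer << chunk
    # A trailing "\r" might be the first half of "\r\n", so set it aside.
    held = buffer.end_with?("\r") ? buffer.slice!(-1) : ""
    lines = buffer.split(/\r\n|\r|\n/, -1)
    # The last piece is an unfinished line; keep it for the next round.
    buffer = (lines.pop || "") + held
    lines.each { |line| yield line }
  end
  # Whatever is left is the final line (possibly ended by a held "\r").
  yield buffer.chomp("\r") unless buffer.empty?
end

# Demo: a tiny chunk size forces "\r\n" to straddle a chunk boundary.
sample = StringIO.new("a\r\nb\rc\nd")
each_universal_line(sample, 2) { |line| puts line }
```

Since it only needs read, the same helper should work with $stdin, which is the case raised in the comment above.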
9

I'm partial to the following approach for files that have headers:

File.open(file, "r") do |fh|
    header = fh.readline
    # Process the header
    while(line = fh.gets) != nil
        #do stuff
    end
end

This allows you to process a header line (or lines) differently than the content lines.
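To make the header idea concrete, here is a minimal sketch that assumes a comma-separated file whose first line names the columns. The helper name each_record and the file layout are assumptions for the example; it uses gets rather than readline for the header so that an empty file yields nothing instead of raising EOFError.

```ruby
require "tempfile"

# Hypothetical helper: treats the first line as column names and yields
# every following line as a Hash keyed by those names.
def each_record(path)
  File.open(path, "r") do |fh|
    header = fh.gets
    return if header.nil? # empty file: nothing to yield

    keys = header.chomp.split(",")
    fh.each_line do |line|
      yield keys.zip(line.chomp.split(",")).to_h
    end
  end
end

# Demo with made-up data.
Tempfile.create("people") do |f|
  f.write("name,age\nann,30\nbob,25\n")
  f.flush
  each_record(f.path) { |record| puts record.inspect }
end
```

For real CSV data with quoting or embedded commas, Ruby's csv library would be a safer choice than a plain split.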

Ron Gejman
7

It is because of the end-of-line characters in each line. Use the chomp method in Ruby to delete the trailing "\n" or "\r" at the end.

line_num=0
File.open('xxx.txt').each do |line|
  puts "#{line_num += 1} #{line.chomp}"
end
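A quick way to see what chomp does with each style of line ending is sketched below; the SAMPLES strings are made up for the demonstration. With the default record separator, chomp removes a trailing "\n", "\r", or "\r\n". Note that it only strips the terminator from a line that has already been split, so a file using bare "\r" endings would still arrive from each as one long line.

```ruby
# Made-up sample strings, one per line-ending style.
SAMPLES = ["unix\n", "windows\r\n", "classic mac\r", "no terminator"].freeze

SAMPLES.each do |raw|
  # inspect makes any leftover control characters visible
  puts "#{raw.inspect} -> #{raw.chomp.inspect}"
end
```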
Sreenivasan AC
6

How about gets?

myFile=File.open("paths_to_file","r")
while(line=myFile.gets)
  # do stuff with line
end
myFile.close
JBoy
4

Don't forget that if you are concerned about reading in a file that might have huge lines that could swamp your RAM during runtime, you can always read the file piecemeal. See "Why slurping a file is bad".

File.open('file_path', 'rb') do |io|
  while chunk = io.read(16 * 1024) do
    something_with_the chunk
    # like stream it across a network
    # or write it to another file:
    # other_io.write chunk
  end
end
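As one concrete use of the pattern, the sketch below copies a file in fixed-size chunks, which is the "write it to another file" case mentioned in the comments. The helper name copy_in_chunks and the demo data are assumptions for illustration.

```ruby
require "tempfile"

# Hypothetical helper: copies src_path to dst_path chunk by chunk and
# returns the number of bytes written. Memory use stays near chunk_size.
def copy_in_chunks(src_path, dst_path, chunk_size = 16 * 1024)
  bytes = 0
  File.open(src_path, "rb") do |src|
    File.open(dst_path, "wb") do |dst|
      # read(n) returns nil at end of file, which ends the loop
      while (chunk = src.read(chunk_size))
        dst.write(chunk)
        bytes += chunk.bytesize
      end
    end
  end
  bytes
end

# Demo: copy 40,000 bytes of made-up data between two temp files.
Tempfile.create("src") do |src|
  src.write("x" * 40_000)
  src.flush
  Tempfile.create("dst") do |dst|
    puts copy_in_chunks(src.path, dst.path)
  end
end
```

For a plain copy, IO.copy_stream from the core library does the same job; the explicit loop is mainly useful when each chunk needs processing along the way.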
Community
Nels