This is how I am planning to build my utilities for a project:

  • logdump dumps log results to the file log. If the file already exists, new results are appended to it (for example, if a new file is created every month, all results for that month are appended to the same file).

  • extract reads the log result file to extract relevant results depending on the arguments provided.

  • The thing is, I do not want to wait for logdump to finish writing to log before I begin processing it. Nor do I want to keep track of how far into log I have already read so that I can resume extracting from there; that is exactly the bookkeeping I want to avoid.

  • I need live results: whenever something is added to the log results file, extract should pick up the new results.

  • The processing that extract does will be generic (it will depend on command-line arguments to it), but it will certainly be on a line-by-line basis.

This boils down to reading a file while it is still being written, and continuing to monitor it for new data even after reaching the end of the file.

How can I do this using C or C++ or shell scripting or Perl?

Lazer
  • In these cases, I try to modify the logging to go to a database. Then it's really easy to get the records you haven't processed yet. If you haven't designed the logging part yet, that could be the way to go. – brian d foy Sep 09 '10 at 22:28
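To illustrate that suggestion: a minimal sketch in Perl with DBI and DBD::SQLite, where the log.db file, the messages table, and its schema are all assumptions made up for the example. The extractor then only has to remember the highest id it has processed:

#! /usr/bin/perl

use warnings;
use strict;
use DBI;

# assumed schema: CREATE TABLE messages (id INTEGER PRIMARY KEY, line TEXT)
my $dbh = DBI->connect("dbi:SQLite:dbname=log.db", "", "", { RaiseError => 1 });

my $last_id = 0;
for (;;) {
  # fetch only the records we have not processed yet
  my $rows = $dbh->selectall_arrayref(
    "SELECT id, line FROM messages WHERE id > ? ORDER BY id",
    undef, $last_id);
  for my $row (@$rows) {
    my ($id, $line) = @$row;
    print "extracted [$line]\n";
    $last_id = $id;
  }
  sleep 1;  # poll for new rows
}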

3 Answers


tail -f reads from a file and, when it reaches EOF, keeps monitoring it for updates instead of quitting outright. It's an easy way to read a log file "live". It could be as simple as:

tail -f log.file | extract

Or use tail -n 0 -f so that it prints only new lines, not existing ones. Or tail -n +0 -f to display the entire file and then keep printing updates as they arrive.

John Kugelman
  • While this serves my need, is there any way to do the same using C or C++? – Lazer Sep 09 '10 at 21:03
  • @Lazer: you can always "cheat" and look at the "Hacker's Man Page"--the source code to tail. IIRC, it is really simple C code. Look here: http://stackoverflow.com/questions/1439799/how-can-i-get-the-source-code-for-the-linux-utility-tail – Harold Bamford Sep 09 '10 at 21:15

The traditional unix tool for this is tail -f, which keeps reading data appended to its argument until you kill it. So you can do

tail -c +1 -f log | extract

In the unix world, reading from continuously appended-to files has come to be known as “tailing”. In Perl, the File::Tail module performs the same task.

use strict;
use warnings;
use File::Tail;

my $log_file = File::Tail->new("log");
# read blocks until a new line is appended, so this loop follows the file forever
while (defined (my $log_line = $log_file->read)) {
    process_line($log_line);  # process_line is whatever per-line extraction you need
}
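Note that read blocks until a new line arrives, and that by default File::Tail starts reading a little before the end of the file rather than at the beginning. If the existing contents matter, the constructor also has a named-parameter form with a tail option (as I recall, a negative value means return the whole file first):

my $log_file = File::Tail->new(name => "log", tail => -1);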
Gilles 'SO- stop being evil'

Use this simple stand-in for logdump

#! /usr/bin/perl

use warnings;
use strict;

open my $fh, ">", "log" or die "$0: open: $!";
select $fh;  # make $fh the default output handle...
$| = 1;      # ...so that $| enables autoflush: each message is written out immediately

for (1 .. 10) {
  print $fh "message $_\n" or warn "$0: print: $!";
  sleep rand 5;  # pause up to four seconds between messages
}

and the skeleton for extract below to get the processing you want. When logfile hits end-of-file, logfile.eof() is true; calling logfile.clear() resets the stream's error state, and then we sleep and try reading again.

#include <iostream>
#include <fstream>
#include <string>
#include <cerrno>
#include <cstring>
#include <unistd.h>

int main(int argc, char *argv[])
{
  const char *path;
  if      (argc == 2) path = argv[1];
  else if (argc == 1) path = "log";
  else {
    std::cerr << "Usage: " << argv[0] << " [ log-file ]\n";
    return 1;
  }

  std::ifstream logfile(path);
  std::string line;
  next_line: while (std::getline(logfile, line))
    std::cout << argv[0] << ": extracted [" << line << "]\n";

  if (logfile.eof()) {
    sleep(3);         // give the writer time to append more
    logfile.clear();  // reset the EOF state so getline will try again
    goto next_line;
  }
  else {
    std::cerr << argv[0] << ": " << path << ": " << std::strerror(errno) << '\n';
    return 1;
  }

  return 0;
}

It's not as interesting as watching it live, but the output is

./extract: extracted [message 1]
./extract: extracted [message 2]
./extract: extracted [message 3]
./extract: extracted [message 4]
./extract: extracted [message 5]
./extract: extracted [message 6]
./extract: extracted [message 7]
./extract: extracted [message 8]
./extract: extracted [message 9]
./extract: extracted [message 10]
^C

I left the interrupt in the output to emphasize that this is an infinite loop.

Use Perl as a glue language to make extract get lines from the log by way of tail:

#! /usr/bin/perl

use warnings;
use strict;

die "Usage: $0 [ log-file ]\n" if @ARGV > 1;
my $path = @ARGV ? shift : "log";

# run tail as a child process and read its output through a pipe;
# the multi-argument form of open avoids the shell
open my $fh, "-|", "tail", "-c", "+1", "-f", $path
  or die "$0: could not start tail: $!";

while (<$fh>) {
  chomp;
  print "$0: extracted [$_]\n";
}
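To watch it work, run the logdump stand-in above in one shell and this script in another; each message shows up in the extractor as soon as logdump writes it.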

Finally, if you insist on doing the heavy lifting yourself, there's a related Perl FAQ:

How do I do a tail -f in perl?

First try

seek(GWFILE, 0, 1);

The statement seek(GWFILE, 0, 1) doesn't change the current position, but it does clear the end-of-file condition on the handle, so that the next <GWFILE> makes Perl try again to read something.

If that doesn't work (it relies on features of your stdio implementation), then you need something more like this:

for (;;) {
  for ($curpos = tell(GWFILE); <GWFILE>; $curpos = tell(GWFILE)) {
    # search for some stuff and put it into files
  }
  # sleep for a while
  seek(GWFILE, $curpos, 0);  # seek to where we had been
}

If this still doesn't work, look into the clearerr method from IO::Handle, which resets the error and end-of-file states on the handle.
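As a sketch of that approach (the file name and the three-second sleep here are just placeholders):

#! /usr/bin/perl

use warnings;
use strict;
use IO::Handle;

open my $fh, "<", "log" or die "$0: open: $!";
for (;;) {
  while (defined(my $line = <$fh>)) {
    print "extracted [$line]";  # $line still ends with its newline
  }
  sleep 3;        # wait for the writer to append more
  $fh->clearerr;  # reset the end-of-file state so <$fh> will read again
}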

There's also a File::Tail module from CPAN.

Greg Bacon