
I want to do two things in my script below, and I am struggling with both.

About the function: it currently grabs the last 20 lines of the file and displays the requested columns from each line.

First, I want to ignore lines whose columns 5 and 13 contain the same name. For example, both lines below should be ignored: in the first line, column 5 matches column 13 exactly; in the second, the first name in column 5 matches the name in column 13.

1,42,16, 201,stackoverflow_user, 1, 6762160, 39799, 9817242, 6762160, 39884, 10010545,stackoverflow_user, 2, 1351147, 1165, 483259, 1351147, 1115, 241630, 0 
1,46,27, 201,[stackoverflow_user | stackoverflow_userother], 1, 4078465, 286991, 1594830, 4078465, 287036, 1643156,stackoverflow_user, 2, 1357147, 1115, 241630, 1357147, 1065, 120815, 0 

Second, I want to support reserved words. This is similar to the above, except that I specify the words myself: say, if column 5, column 13, or both contain the name STACK, the line should be ignored.

1,42,16, 201,STACK, 1, 6762160, 39799, 9817242, 6762160, 39884, 10010545,stackoverflow_usersecond, 2, 1351147, 1165, 483259, 1351147, 1115, 241630, 0 
1,46,27, 201,[stackoverflow_user | stackoverflow_userother], 1, 4078465, 286991, 1594830, 4078465, 287036, 1643156,STACK, 2, 1357147, 1115, 241630, 1357147, 1065, 120815, 0
1,46,27, 201,[STACK | stackoverflow_userother], 1, 4078465, 286991, 1594830, 4078465, 287036, 1643156,STACK, 2, 1357147, 1115, 241630, 1357147, 1065, 120815, 0 

All three lines above should be ignored because STACK is in the 5th column, the 13th column, or both.

This is my current function:

    function DMMRankings()
    {
        # read the file into an array, one line per element
        $lines = file('C:/path/to/file.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);

        # take the last 20 lines of the file -- i.e. the last 20 values in the array
        $last_lines = array_slice($lines, -20);

        # rank counter and accumulated template output
        $n = 1;
        $content = '';

        foreach ($last_lines as $l) {
            # treat the data as comma-separated values
            $arr = explode(",", $l);
            # if col 5 has multiple values, take the first one
            if (preg_match("/\[(.+?) \|/", $arr[4], $matches)) {
                $arr[4] = $matches[1];
            }
            # store the data we want in an output array
            $data = array('rank-pos' => $n++, 'rank-name' => $arr[4], 'rank-dmuser' => $arr[12]);
            $content .= Template::Load('rankinguserdm-' . ($n % 2 == 1 ? 2 : 1), $data);
        }

        $this->content = Template::Load('user_rankingsdm', array('rankings' => $content));
    }

Please help me work out how to add these checks to my current function. Thanks!

Monk25
  • How big is this file going to get? Slurping in an entire file and then throwing away all but 20 lines is going to be painfully wasteful as it gets larger. Plus, if it's CSV data, why not use fgetcsv() to read the lines to start with, or at least str_getcsv() to do the CSV parsing for you? – Marc B Oct 17 '14 at 14:40
  • The file may contain a lot of lines, around 5,000 (more or less). It will be a big file, but I am not sure how to do it in a better way. Are you saying it currently reads all the lines until it grabs the last 20? :( – Monk25 Oct 17 '14 at 14:42
  • no. it's going to read ALL of the lines, then throw away all but 20. – Marc B Oct 17 '14 at 14:42
  • I would really appreciate help editing it so that it won't affect the server's performance. The file is plain text, and you can see sample lines above. What should I change, and how do I include the checks described in my main post? Thanks a lot! – Monk25 Oct 17 '14 at 14:44
  • Other solutions for reading in the last **x** lines of a file: http://stackoverflow.com/questions/2961618/how-to-read-only-5-last-line-of-the-text-file-in-php . @MarcB even more painful is reading in a huge file using `fgetcsv` when you only need the last 20 lines... – i alarmed alien Oct 17 '14 at 15:05
  • I posted it as separate question, i alarmed alien. Thanks a lot for the help in the previous one. If you can help me out with this matter to add the following checks, it will be really greatly appreciated. I will accept the other answer in the previous question now. – Monk25 Oct 17 '14 at 15:14
  • @ialarmedalien: no, but you can fseek() to the end of the file, scan backwards to find the last 20 lines, then start fgetcsv()'ing – Marc B Oct 17 '14 at 16:14
  • Can you include the `fseek()` approach in your example? – Monk25 Oct 17 '14 at 16:16
  • @Monk25 No point in MarcB posting that -- just look at the answers to the linked SO question. Your approach really depends on the size of your data file and how much memory you have available. If you're parsing massive files, fseek would be a better approach. Adding more detail to your question will elicit more precise responses. – i alarmed alien Oct 17 '14 at 16:34
  • Actually, the average size of the file could be around 1 MB, with around 5,000 to 6,000 lines. I've got a really strong Windows server, but will it still be a problem to use your suggestion with `file()`, or should I change it? – Monk25 Oct 17 '14 at 16:38

1 Answer


I saw your comments. The following Stack Overflow question gives a rundown of the best ways to read the last few lines of a file:

What is the best way to read lines from end of file
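As a rough illustration of the seek-from-the-end idea mentioned in the comments, here is a minimal sketch. The function name `tailLines`, its parameters, and the chunk size are made up for this example and are not taken from the linked question. It reads the file backwards in chunks until enough newlines have been collected, then keeps the last `$count` non-empty lines.

```php
<?php
// Hypothetical helper: return up to $count non-empty lines from the end of a file
// by reading fixed-size chunks backwards instead of loading the whole file.
function tailLines(string $path, int $count = 20, int $chunkSize = 4096): array
{
    if ($count <= 0) {
        return [];
    }
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        return [];
    }

    fseek($handle, 0, SEEK_END);
    $position = ftell($handle);
    $buffer = '';

    // Prepend chunks until the buffer holds more than $count newlines
    // (so a partial first line cannot end up in the result) or the start is reached.
    while ($position > 0 && substr_count($buffer, "\n") <= $count) {
        $read = min($chunkSize, $position);
        $position -= $read;
        fseek($handle, $position);
        $buffer = fread($handle, $read) . $buffer;
    }
    fclose($handle);

    // Split on LF or CRLF and drop empty lines, like FILE_SKIP_EMPTY_LINES does.
    $lines = preg_split('/\r?\n/', $buffer, -1, PREG_SPLIT_NO_EMPTY);

    return array_slice($lines, -$count);
}
```

If the tail of the file contains blank lines, this sketch may return fewer than `$count` lines. For a file of roughly 1 MB, `file()` plus `array_slice()` is likely fine as well; the sketch mainly matters as the file grows.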

With regard to the actual processing, it's CSV, so check out the following built-in PHP function to create your array:

str_getcsv

Once you have that sorted:

    /* This is just a one-liner to parse the file */
    $aryCSV = array_map('str_getcsv', file('data.csv'));

    foreach ($aryCSV as $aryRow) {
        /* This is your "ignore check". Arrays are zero-based, so
           columns 5 and 13 are indexes 4 and 12. Note that an exact
           comparison does not cover bracketed values like "[STACK | other]". */
        if (
            trim($aryRow[4]) == 'STACK' ||
            trim($aryRow[12]) == 'STACK'
        ) {
            /* Go to next iteration */
            continue;
        }
        /* Do your template loading and data saving */
    }
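To cover both rules from the question, including the bracketed `[name | other]` form, something along these lines could work. This is only a sketch: `RESERVED_NAMES`, `fieldNames`, and `shouldIgnoreRow` are names invented here, and it assumes that any name inside the brackets counts (the question only shows the first name matching). Compare only `$names[0]` if just the first name should count.

```php
<?php
// Hypothetical list of reserved names; adjust to your own data.
const RESERVED_NAMES = ['STACK'];

// Split a field such as "[a | b]" into its individual names.
// A plain field yields a single-element array.
function fieldNames(string $field): array
{
    $field = trim($field);
    if (preg_match('/^\[(.*)\]$/', $field, $m)) {
        return array_map('trim', explode('|', $m[1]));
    }
    return [$field];
}

// Return true when the parsed CSV row should be skipped.
function shouldIgnoreRow(array $row): bool
{
    if (!isset($row[4], $row[12])) {
        return true; // malformed line: fewer than 13 columns
    }
    $names   = fieldNames($row[4]);  // column 5  (index 4)
    $dmUsers = fieldNames($row[12]); // column 13 (index 12)

    // Rule 1: a name from column 5 also appears in column 13.
    if (array_intersect($names, $dmUsers)) {
        return true;
    }
    // Rule 2: either column contains a reserved name (exact, case-sensitive match).
    return (bool) array_intersect(array_merge($names, $dmUsers), RESERVED_NAMES);
}
```

Inside the loop of the original function, the check would go right after the line is split, for example `if (shouldIgnoreRow($arr)) { continue; }`, before `$data` is built, so that `$n` only advances for rows that are actually displayed.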
Simon