I'm trying to build a page that generates a result set from a complex database query plus some PHP parsing, but that is mostly beside the point. The main point is that this takes a minute or two to complete, and I'd like to display a real progress bar rather than a generic animated "loading..." GIF.

A breakdown would be...

  • User opens Page A.
  • Page A requests data from Page B (Most likely AJAX).
  • Page B processes the 100000+ or so entries in the database and parses them.
  • Page A shows a progress bar indicating roughly how far along the process is.
  • Page B returns the result set.
  • Page A displays the result set.

I know how to return data to the AJAX request, but I don't know how to continuously return data showing the status of the process (e.g. % of rows scanned).

I've looked into EventSource / Server-Sent Events, which shows promise, but I'm not sure how to get it working properly, or whether there is a better way to do it.

I made a quick mock-up page. Using EventSource on its own works fine, but when I split it into an EventSource call (a page that monitors a session variable for changes) and an AJAX request (the actual data sending/return), it falls apart.

I'm probably missing something obvious, or doing something stupidly wrong, but this is most of what I have anyway... Any help, suggestions, tips, or even suggestions of completely other ways to do it would be awesome :)

User page:

<!DOCTYPE html>
<html>
<head>
    <title>Dynamic Progress Bar Example</title>
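    <!-- script.js calls $.ajax, so jQuery has to be loaded first -->
    <script src="https://ajax.googleapis.com/ajax/libs/jquery/1.11.3/jquery.min.js"></script>
    <script src="script.js"></script>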
</head>
<body>
    <input type="button" value="Submit" onclick="connect()" />
    <progress id='progressor' value="0" max='100' style=""></progress>
</body>
</html>

JavaScript

 var es;

   function connect() {
       startListener();
       $.ajax({
           url: "server.php",
           success: function() {
               alert("Success");
           },
           error: function() {
               alert("Error");
           }
       });
   }

   function startListener() {
       es = new EventSource('monitor.php');

       //a message is received
       es.addEventListener('message', function(e) {
           var result = JSON.parse(e.data);

           if (e.lastEventId == 'CLOSE') {
               alert("Finished!");
               es.close();
           } else {
               var pBar = document.getElementById('progressor');
               pBar.value = result;
           }
       });

       es.addEventListener('error', function(e) {
           alert('Error occurred');
           es.close();
       });
   }

   function stopListener() {
       es.close();
       alert('Interrupted');
   }

   function addLog(message) {
       var r = document.getElementById('results');
       r.innerHTML += message + '<br>';
       r.scrollTop = r.scrollHeight;
   }

Monitor PHP

<?php
session_start();
header('Content-Type: text/event-stream');
// recommended to prevent caching of event data.
header('Cache-Control: no-cache'); 

function send_message($id, $data) {
    $d = $data;
    if (!is_array($d)){
        $d = array($d);
    }

    echo "id: $id" . PHP_EOL;
    echo "data: " . json_encode($d) . PHP_EOL;
    echo PHP_EOL;

    ob_flush();
    flush();
}


$run = true;
$time = time();
$last = -10;

while($run){
    // Timeout kill checks
    if (time()-$time > 360){
        file_put_contents("test.txt", "DEBUG: Timeout Kill", FILE_APPEND);
        $run = false;
    }

    // Only update if it's changed
    if ($last != $_SESSION['progress']['percent']){
        file_put_contents("test.txt", "DEBUG: Changed", FILE_APPEND);
        $p = $_SESSION['progress']['percent'];
        send_message(1, $p); 
        $last = $p;
    }

    sleep(2);
}
?>

EDIT: I've tried a different approach, where:

  • Page A makes an AJAX call to Page B, which runs the request and saves its progress to a SESSION variable.
  • Page A makes an AJAX call to Page C every 2 seconds, which simply returns the value of the session variable. This loop terminates when the value reaches 100.

However, this is not quite working either. It seems that the two AJAX requests, or the two server-side scripts, are not running simultaneously.

Looking at the debug output, both AJAX calls are sent at about the same time, but the Page B script then runs to completion by itself, and only then does the Page C script run. Is this some limitation of PHP that I'm missing?

more code!

Server (Page B) PHP

<?php
    session_start();

    file_put_contents("log.log", "Job Started\n", FILE_APPEND);

    $job = isset($_POST['job']) ? $_POST['job'] : 'err_unknown';
    $_SESSION['progress']['job'] = $job;
    $_SESSION['progress']['percent'] = 0;

    $max = 10;
    for ($i=0; $i<=$max;$i++){
        $_SESSION['progress']['percent'] = floor(($i/$max)*100);
        file_put_contents("log.log", "Progress now at " . floor(($i/$max)*100) . "\n", FILE_APPEND);
        sleep(2);
    }

    file_put_contents("log.log", "Job Finished", FILE_APPEND);
    echo json_encode("Success. We are done.");
?>

Progress (Page C) PHP

<?php
    session_start();
    file_put_contents("log.log", "PR: Request Made", FILE_APPEND);

    if (isset($_SESSION['progress'])){
        echo json_encode(array("job"=>$_SESSION['progress']['job'],"progress"=>$_SESSION['progress']['percent']));
    } else {
        echo json_encode(array("job"=>"","progress"=>"error"));
    }
?>

Index (Page A) JS/HTML

<!DOCTYPE html>
<html>
<head>
        <title>Progress Bar Test</title>
</head>
<body>
        <input type="button" value="Start Process" onclick="start('test', 'pg');"/><br />
        <progress id="pg" max="100" value="0"></progress>

        <script src="https://ajax.googleapis.com/ajax/libs/jquery/1.11.3/jquery.min.js"></script>
        <script type="text/javascript">
            var progress = 0;
            var job = "";

            function start(jobName, barName){
                startProgress(jobName, barName);
                getData(jobName);
            }

            function getData(jobName){
                console.log("Process Started");
                $.ajax({
                    url: "server.php",
                    data: {job: jobName},
                    method: "POST",
                    cache: false,
                    dataType: "JSON",
                    timeout: 300000, // jQuery expects milliseconds; 300 would abort the request after 0.3 s
                    success: function(data){
                        console.log("SUCCESS: " + data);
                        alert(data);
                    },
                    error: function(xhr,status,err){
                        console.log("ERROR: " + err);
                        alert("ERROR");
                    }
                });
            }

            function startProgress(jobName, barName){
                console.log("PG Process Started");
                progressLoop(jobName, barName);
            }

            function progressLoop(jobName, barName){
                console.log("Progress Called");
                $.ajax({
                    url: "progress.php",
                    cache: false,
                    dataType: "JSON",
                    success: function(data){
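                        // JavaScript concatenates with "+"; "." is PHP syntax
                        console.log("pSUCCESS: " + JSON.stringify(data));
                        document.getElementById(barName).value = data.progress;
                        if (data.progress < 100 && !isNaN(data.progress)){
                            // Pass a function: setTimeout(progressLoop(...), ...) would call it immediately
                            setTimeout(function(){ progressLoop(jobName, barName); }, 2000);
                        }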
                    },
                    error: function(xhr,status,err){
                        console.log("pERROR: " + err);
                        alert("PROGRESS ERROR");
                    }
                });
            }
        </script>
</body>
</html>

Debug: log.log output

PR: Request Made
Job Started
Progress now at 0
Progress now at 10
Progress now at 20
Progress now at 30
Progress now at 40
Progress now at 50
Progress now at 60
Progress now at 70
Progress now at 80
Progress now at 90
Progress now at 100
Job Finished
PR: Request Made
Kieran
  • There are a number of similar questions and answers over there ↑ → – Raad May 13 '15 at 12:24
  • None of the ones I looked at answered what I was after (or at least they did not seem to). For example, many are about file uploads, which rely on upload-specific variables and functions. What I am also asking here, in addition to how to do it, is whether it is possible to have the server push the progress to the user, rather than the user having to continuously request it (e.g. SSE). – Kieran May 13 '15 at 12:33
  • Ah, ok - so for "push" notifications (a la COMET) you might find this useful: http://stackoverflow.com/a/14436086/1407999 and this http://www.nolithius.com/game-development/comet-long-polling-with-php-and-jquery – Raad May 13 '15 at 12:46
  • An overly simplified solution: Page A makes the call to Page B to start the job, and Page B writes its progress to a secondary table with a job number. Page A polls the progress database every x seconds, getting data like "step 1 of 1000". Then you can use simple math to calculate a % and update your progress bar accordingly. – Novocaine May 13 '15 at 12:49
  • Raad - thanks for the links, will look into those :) ... @Novocaine - See edit. I may just be doing something wrong, but it doesn't seem to be working. (I'm using session vars rather than a table, but it should be the same concept?) – Kieran May 14 '15 at 05:23
  • Another solution could be to use a websocket. I've only ever used websockets with socket.io and node.js, but it looks like there are PHP solutions for websockets: http://socketo.me/ Websockets are great because once a connection is established, they allow either client or server to initiate data transmission. – Kevin Dice May 14 '15 at 07:57
  • `startProgress` serves no purpose in this, unless you're just using it for debugging purposes. Additionally, `setTimeout` should be `setInterval`; again, you might be doing this for debugging? Otherwise I'm not sure why it might not be working. It seems logical. – Novocaine May 14 '15 at 09:38
  • No wait - that's wrong. As your loop function is recursive, of course you don't need `setInterval`. Does it at least output to the console every 2 seconds, just with the same number? – Novocaine May 14 '15 at 09:45
  • @Novocaine I'll edit the question with the debug log.log output – Kieran May 14 '15 at 12:11
  • Could it be that, once you open the session in B, it is locked until the execution completes, so that execution of C must wait for B to release its lock on session? – Matteo Tassinari May 14 '15 at 12:24
  • That's what I was thinking?? But I'm not experienced enough with the intricacies of apache and php to know for sure. It seems as though they both can't execute at the same time. Hopefully I can find out some solution for this progress bar thing! >.> – Kieran May 14 '15 at 13:16
  • @Novocaine the startProcess() etc, I pulled them all into diff functions cause I was trying a few diff things to see if I could figure out why I got that log output. I did have them all in about 2 function initially – Kieran May 14 '15 at 13:48
  • I'm thinking the same as Matteo, as the log file appears to be writing correctly. I've never tried reading/writing from the session at the same time with AJAX before, only in separate instances. I've previously used the database approach, which worked well and helps with keeping a log of everything that ever happened in case something goes wrong and needs checking. While using the session does seem to have a smaller footprint, maybe it's just not viable in this case? P.S. I thought as much about the functions being split out ;] I'd end up doing the same myself. – Novocaine May 14 '15 at 14:32
  • @Novocaine - Oh ok, so you think it's the reading/writing of the session, as opposed to the actual execution of the script? If that's the case, why wouldn't the PR log entry come up sooner, as it gets written before the session variable is accessed? Or maybe it's waiting on session_start()? ... I'll do a trial with a database or something when I get a moment and get back to you guys :) – Kieran May 14 '15 at 14:47
  • @Novocaine I switched it over to a file_put_ and file_get_contents, and it seemed to work fine. Looks like it might be something with the sessions? I'll do some more playing around... – Kieran May 15 '15 at 06:23
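
For reference, the polling idea from the comments above (the job records "step N of M" somewhere other than the session, and the page turns that into a percentage) could be sketched roughly like this. It is only a sketch: `fetchStatus`, `toPercent`, and `pollProgress` are hypothetical names, and `fetchStatus` stands in for whatever AJAX call reads the stored progress.

```javascript
// Sketch only: `fetchStatus` is a hypothetical callback that resolves to
// { step, total }, e.g. by calling a progress endpoint that reads the job's
// progress from a table or file.
function toPercent(step, total) {
    if (!(total > 0)) return 0;                       // avoid dividing by zero
    return Math.min(100, Math.floor((step / total) * 100));
}

function pollProgress(fetchStatus, intervalMs, onPercent) {
    return new Promise((resolve, reject) => {
        function tick() {
            fetchStatus().then((status) => {
                const percent = toPercent(status.step, status.total);
                onPercent(percent);
                if (percent >= 100) {
                    resolve();
                } else {
                    // Pass the function itself; tick() here would run at once.
                    setTimeout(tick, intervalMs);
                }
            }).catch(reject);
        }
        tick();
    });
}

// Stand-in for the progress endpoint so the sketch runs without a server.
let fakeStep = 0;
const fakeStatus = async () => {
    fakeStep = Math.min(1000, fakeStep + 250);
    return { step: fakeStep, total: 1000 };
};

pollProgress(fakeStatus, 10, (percent) => console.log(percent + "%"))
    .then(() => console.log("done"));
```

In the page, `onPercent` would presumably just set `document.getElementById('pg').value`.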

1 Answer

In similar cases, I usually do it this way:

  • The client sends an AJAX request to Page B. Important: on success, the client sends the same request again.
  • On the initial request, Page B says: OK, THERE ARE 54555 RECORDS. I use this count to initialize the progress bar.
  • On each subsequent request, Page B returns a chunk of data. The client counts the size of the chunk and updates the progress bar. It also collects the chunks in one list.
  • On the last request, when all data has been sent, Page B says: THAT'S ALL, and the client renders the data.

I think you get the idea.
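
A minimal client-side sketch of the steps above, assuming the server can report a record count and return a slice of rows on request. The names `loadInChunks`, `fetchCount`, `fetchChunk`, and `onProgress` are made up for illustration; in a real page the two fetch callbacks would wrap `$.ajax` calls to Page B.

```javascript
// Sketch only: `fetchCount`, `fetchChunk`, and `onProgress` are hypothetical
// callbacks, not part of any library.
async function loadInChunks(fetchCount, fetchChunk, chunkSize, onProgress) {
    const total = await fetchCount();          // "OK, THERE ARE n RECORDS"
    const rows = [];
    onProgress(0, total);
    while (rows.length < total) {
        // Ask for the next slice, starting where the last one ended.
        const chunk = await fetchChunk(rows.length, chunkSize);
        if (chunk.length === 0) break;         // server ran out of data early
        rows.push(...chunk);
        onProgress(rows.length, total);        // update the progress bar
    }
    return rows;                               // "THAT'S ALL": render this
}

// Stand-in for Page B so the sketch runs without a server.
const fakeTable = Array.from({ length: 250 }, (_, i) => "row " + i);
const fakeCount = async () => fakeTable.length;
const fakeChunk = async (offset, limit) => fakeTable.slice(offset, offset + limit);

loadInChunks(fakeCount, fakeChunk, 100, (done, total) => {
    console.log(Math.floor((done / total) * 100) + "%");
}).then((rows) => console.log("Received " + rows.length + " rows"));
```

Because each request is short, no single PHP script has to run for minutes, and the progress value comes from data the client has actually received.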

NOTE: you can also request all chunks in parallel, but that is more complex. The server (Page B) should also return a fixed chunk size in the initial response; the client then sends TOTAL_COUNT / CHUNK_SIZE requests concurrently and combines the responses once the last request has completed. This is much faster. You can use https://github.com/caolan/async in this case to make the code much more readable.
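
A rough sketch of the parallel variant. It uses plain `Promise.all` instead of the async library mentioned above, and `loadInParallel`, `fetchMeta`, and `fetchChunk` are hypothetical names standing in for the real requests to Page B.

```javascript
// Sketch only: `fetchMeta` resolves to { total, chunkSize } and `fetchChunk`
// resolves to an array of rows; both are hypothetical callbacks.
async function loadInParallel(fetchMeta, fetchChunk, onProgress) {
    const { total, chunkSize } = await fetchMeta();
    const chunkCount = Math.ceil(total / chunkSize);
    let received = 0;

    const requests = [];
    for (let i = 0; i < chunkCount; i++) {
        requests.push(
            fetchChunk(i * chunkSize, chunkSize).then((chunk) => {
                received += chunk.length;
                onProgress(received, total);   // fires as each response lands
                return chunk;
            })
        );
    }
    // Promise.all keeps results in request order, whatever the arrival order.
    const chunks = await Promise.all(requests);
    return chunks.flat();
}

// Stand-in for Page B so the sketch runs without a server.
const table = Array.from({ length: 10 }, (_, i) => i);
const meta = async () => ({ total: table.length, chunkSize: 4 });
// Later chunks answer sooner, to show that arrival order does not matter.
const chunkAt = (offset, limit) => new Promise((resolve) =>
    setTimeout(() => resolve(table.slice(offset, offset + limit)), 30 - offset));

loadInParallel(meta, chunkAt, (done, total) => console.log(done + "/" + total))
    .then((rows) => console.log(rows.join(",")));
```

This version fires every request at once. With hundreds of chunks a real client would probably want to cap how many are in flight, since browsers queue requests beyond their per-host connection limit anyway.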

Kirill Rogovoy
  • Oh ok, so the initial request tells the user how much to expect, and every other request contains data, so the user's browser can go amount received vs how much expecting?? Is that the idea? – Kieran May 15 '15 at 10:02
  • Yep, that is the idea. Extra bonus: you won't kill the user's browser by allocating a huge amount of memory all at once. You can even render every chunk "behind the scenes", so you won't kill the browser by rendering a huge number of DOM nodes in one go. – Kirill Rogovoy May 15 '15 at 10:19
  • I suppose you'd need to factor in what to do in cases of missing data or errors or whatever... I guess sending a "this is packet 3" could solve some of that... Anyway, I'll def look into this, it's actually something I didn't consider :) ... I guess the main issue is figuring out how to have the server start the process on the first request, then return parts of data on the following requests... Any suggestions on that? Because generally each time you'd query, at least in the scripts I've seen, it would just re-start (new instance), as opposed to grab data from earlier instances... – Kieran May 15 '15 at 11:34
  • Nothing stops you from sending some exclusive flag in your first message like `GIMME DATA`. On the server, you may use a session to share status between PHP processes (I mean, between different requests). But I recommend creating an idempotent server handler and managing chunking on the client, like this: – Kirill Rogovoy May 15 '15 at 15:20
  • CLIENT: `HOW MUCH RECORD DO YOU HAVE? QUERY: "ABC"` SERVER: `I HAVE 55565`, CLIENT: `GIVE ME 0-99. QUERY: "ABC"`, SERVER: `OK`, CLIENT: `GIVE ME 100-199. QUERY: "ABC"`, SERVER: `OK`, and so on... So you don't need any logic on the server besides fetching data and throwing errors. The client manages chunks and error handling. – Kirill Rogovoy May 15 '15 at 15:26
  • Ahhh ok. In my case I have a lot of scanning and comparison to do, but I think I know how I could work that to fit into the same system. (i.e. give me filter a, then filter b, then filter c, etc) ... to be specific, I'm taking 3 different tables (about 66000 entries), and then scanning a particular column(s) in them to find any values that don't match a pre-generated list, then returning those non-matched values + other identifying info from the offending row – Kieran May 15 '15 at 22:19
  • This method is working perfectly! :D ... it also reduced the execution time of the script by like 4x haha (due to handling smaller chunks)... thanks heaps :) – Kieran May 18 '15 at 06:02
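
On the error-handling question raised in the comments: if the server handler is idempotent, the client can simply ask again for a range that failed. A small sketch under that assumption, where `chunkRanges`, `fetchChunkWithRetry`, and the `fetchChunk` callback are hypothetical names and not from any library:

```javascript
// Split `total` records into { offset, limit } requests ("GIVE ME 0-99", ...).
function chunkRanges(total, chunkSize) {
    const ranges = [];
    for (let offset = 0; offset < total; offset += chunkSize) {
        ranges.push({ offset: offset, limit: Math.min(chunkSize, total - offset) });
    }
    return ranges;
}

// Sketch only: `fetchChunk` is a hypothetical callback that rejects on failure.
async function fetchChunkWithRetry(fetchChunk, offset, limit, maxAttempts) {
    let lastError;
    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
        try {
            return await fetchChunk(offset, limit);
        } catch (err) {
            lastError = err;   // same range, same answer: safe to ask again
        }
    }
    throw lastError;           // give up after maxAttempts failures
}

// Stand-in server that fails twice before it starts answering.
let failuresLeft = 2;
const flakyChunk = async (offset, limit) => {
    if (failuresLeft > 0) {
        failuresLeft--;
        throw new Error("temporary failure");
    }
    return Array.from({ length: limit }, (_, i) => offset + i);
};

(async () => {
    for (const { offset, limit } of chunkRanges(250, 100)) {
        const chunk = await fetchChunkWithRetry(flakyChunk, offset, limit, 3);
        console.log("rows " + offset + "-" + (offset + chunk.length - 1));
    }
})();
```

Because every request names its own range, there is no need for a "this is packet 3" marker: the client already knows which slice each response belongs to.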