
I have an Amazon S3 bucket with files and want to generate a forced-download zip file on Amazon EC2 (running Amazon Linux, similar to CentOS/Red Hat Linux).

I have all the signed URLs in a (POST) array and right now I'm using this code:

    <?php
    ini_set('max_execution_time', 300); //300 seconds = 5 minutes
    ini_set('memory_limit', '8192M');

    $imageNumber = 0;
    # create new zip object
    $zip = new ZipArchive();

    # create a temp file & open it
    $tmp_file = tempnam('.','');
    $zip->open($tmp_file, ZipArchive::CREATE);
    foreach ($_POST['text'] as $key => $value) {
        //Get extension
        $ext = pathinfo(parse_url($value, PHP_URL_PATH), PATHINFO_EXTENSION);

        # download file
        $download_file = file_get_contents($value);

        #add it to the zip
        $zip->addFromString(basename($imageNumber.'-image.'.$ext),$download_file);

        $imageNumber++;
    }

    # close zip
    $zip->close();

    # send the file to the browser as a download
    header('Content-disposition: attachment; filename=download.zip');
    header('Content-type: application/zip');
    readfile($tmp_file);
    # remove the temp zip once it has been sent
    unlink($tmp_file);
    ?>

My goal is to keep the EC2 costs as low as possible, and right now my instance only has 4 GB of memory. As you can see, I have increased the memory limit. The problem is that I have 50-100 files of about 5 MB each, i.e. a few hundred MB per zip file, which of course becomes a problem if several people generate a zip at the same time.

Is there a better solution (less memory use and as low cost as possible on EC2)?
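
For comparison, here is a lower-memory variant of the same loop (only a sketch, assuming `allow_url_fopen` is enabled so the signed URLs can be opened as read streams): instead of buffering each object in a string with `file_get_contents()`, it copies each one to a local temp file in chunks and adds it with `addFile()`, so a whole file never has to sit in PHP memory.

    <?php
    // Sketch only: assumes allow_url_fopen is enabled so the signed S3 URLs
    // can be opened as read streams with fopen().
    ini_set('max_execution_time', 300);

    $zip = new ZipArchive();
    $tmp_zip = tempnam(sys_get_temp_dir(), 'zip');
    $zip->open($tmp_zip, ZipArchive::CREATE | ZipArchive::OVERWRITE);

    $imageNumber = 0;
    $tmp_files = array();
    foreach ($_POST['text'] as $value) {
        $ext = pathinfo(parse_url($value, PHP_URL_PATH), PATHINFO_EXTENSION);

        // Copy the remote object to a local temp file in small chunks
        // instead of holding the whole file in a PHP string.
        $tmp = tempnam(sys_get_temp_dir(), 'img');
        $in  = fopen($value, 'rb');
        $out = fopen($tmp, 'wb');
        stream_copy_to_stream($in, $out);
        fclose($in);
        fclose($out);

        // addFile() reads from disk when the archive is closed,
        // so the file is never buffered in PHP memory.
        $zip->addFile($tmp, $imageNumber . '-image.' . $ext);
        $tmp_files[] = $tmp;
        $imageNumber++;
    }
    $zip->close();

    // Remove the per-file temps, then stream the finished zip to the browser.
    foreach ($tmp_files as $tmp) {
        unlink($tmp);
    }
    header('Content-disposition: attachment; filename=download.zip');
    header('Content-type: application/zip');
    header('Content-Length: ' . filesize($tmp_zip));
    readfile($tmp_zip);
    unlink($tmp_zip);
    ?>

With this, the 8192M memory limit should not be needed at all, but the finished zip still lands on disk before it is sent.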

Xtreme
  • if your EC2 instance is in the same region as your S3 instance I would just download, zip, upload, remove zip from EC2, then point your user to the new zip on your S3 bucket – cmorrissey Oct 05 '15 at 19:33
  • I would suggest figuring out how to make a queue in MySQL and having your PHP code figure out whether to place a ZIP request in the queue or simply process it right away based on the current processing load of the queue. You will also need a CRON job to continuously process the queue and possibly notify users when their ZIP is ready. – MonkeyZeus Oct 05 '15 at 19:33
  • Another thing which is likely to kill your memory consumption is `file_get_contents()`. I heavily recommend looking into mod_xsendfile so that Apache can take care of the file downloads. – MonkeyZeus Oct 05 '15 at 19:35
  • If you only have 4GB in your instance then why are you telling PHP that it's OK to try and use 8GB? – MonkeyZeus Oct 05 '15 at 19:37
  • don't do mysql for the queue. you're going to be burned by this at some point. use a proper queue-ing mechanism (sqs comes to mind in aws land) – Mircea Oct 05 '15 at 19:38
  • You should probably listen to @Mircea , it sounds like he's done this before :) – MonkeyZeus Oct 05 '15 at 19:38
  • also look at streaming to s3 directly and using a zip compressor while streaming – Mircea Oct 05 '15 at 19:42
  • Why should I stream (or upload) to S3? I want to generate a zip file on EC2, let the user download it and then delete the zip file. The content (URLs in the array to S3) is different for every user. – Xtreme Oct 05 '15 at 19:44
  • in this case, stream to the user directly – Mircea Oct 05 '15 at 19:50
  • http://stackoverflow.com/questions/3078266/zip-stream-in-php – Mircea Oct 05 '15 at 19:51
  • @Mircea Do I first need to download the files to EC2, and then stream? I looked at this http://stackoverflow.com/questions/4357073/on-the-fly-zipping-streaming-of-large-files-in-php-or-otherwise/4357904#4357904 and I can't replace `$fp = popen('zip -r - 1.jpg 2.jpg 3.jpg', 'r'); ` with `$fp = popen('zip -r - www.link-to-s3-image-1.jpg www.link-to-s3-image-2.jpg www.link-to-s3-image-3.jpg', 'r'); ` and then delete the files from EC2. – Xtreme Oct 05 '15 at 20:21
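
To illustrate the streaming approach from the comments above (a sketch only, assuming the `zip` package is installed on the instance and `allow_url_fopen` is enabled): the `zip` CLI cannot read remote URLs directly, so each signed URL is first copied into a temp directory on EC2, then `zip -r -` writes the archive straight to stdout and is piped to the browser, and the temp files are deleted afterwards. No finished zip is ever stored on EC2 and nothing is buffered whole in PHP memory; only the source images occupy disk for the duration of the request.

    <?php
    // Sketch only: assumes the `zip` CLI is installed and allow_url_fopen
    // is enabled so the signed S3 URLs can be opened with fopen().
    ini_set('max_execution_time', 300);

    // Copy each signed URL into a per-request temp directory on EC2.
    $dir = sys_get_temp_dir() . '/zip_' . uniqid();
    mkdir($dir);

    $imageNumber = 0;
    foreach ($_POST['text'] as $value) {
        $ext = pathinfo(parse_url($value, PHP_URL_PATH), PATHINFO_EXTENSION);
        $in  = fopen($value, 'rb');
        $out = fopen($dir . '/' . $imageNumber . '-image.' . $ext, 'wb');
        stream_copy_to_stream($in, $out);
        fclose($in);
        fclose($out);
        $imageNumber++;
    }

    header('Content-disposition: attachment; filename=download.zip');
    header('Content-type: application/zip');

    // `zip -r - .` writes the archive to stdout; pipe it straight to the
    // client so the finished zip is never written to disk on EC2.
    $fp = popen('cd ' . escapeshellarg($dir) . ' && zip -r - .', 'r');
    fpassthru($fp);
    pclose($fp);

    // Clean up the temporary source files.
    array_map('unlink', glob($dir . '/*'));
    rmdir($dir);
    ?>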

0 Answers