
Currently, I am using a prepared statement to insert a bunch of rows with the code below:

UPDATED CODE:

$sql = $mysqli->prepare("INSERT INTO `TBLName` (Col1, Col2, Col3) VALUES (?, ?, ?)");
for ($i = 0; $i < 1000; $i++) {
    $sql->bind_param('sss', $Col1Data, $Col2Data, $Col3Data);
    $sql->execute();
}
$sql->close();

However, I have seen an article saying that transactions are way faster... Could I use a transaction in combination with the prepared statement, or should I use it in a different way? Suppose that I have to insert 1000 rows per script run and the script gets triggered at an interval of 5 seconds; what is the best way to go for me?

TVA van Hesteren
  • Sidenote: Why are you using `real_escape_string()` with a prepared statement? You don't need to, and you're adding an extra function/db call each time you run this. – Funk Forty Niner Mar 29 '17 at 10:03
  • You may want to look at this: http://stackoverflow.com/questions/1176352/pdo-prepared-inserts-multiple-rows-in-single-query – Vini Mar 29 '17 at 10:04
  • @Fred-ii-, you are telling me that with a prepared statement I don't have to use the real_escape_string function? Are you serious? That would be a big win throughout my scripts already, I guess! – TVA van Hesteren Mar 29 '17 at 10:04
  • That is exactly what I'm saying, @TVAvanHesteren; the prepared statement is enough, and yes, it would save some time. – Funk Forty Niner Mar 29 '17 at 10:05
  • @Fred-ii-, wow, I'm going to adjust that to start with! So when I want to insert something like 'Tom's book', the apostrophe after the m of Tom won't be a problem? – TVA van Hesteren Mar 29 '17 at 10:07
  • @TVAvanHesteren you'll certainly gain performance and avoid an injection. Give it a whirl. – Funk Forty Niner Mar 29 '17 at 10:08
  • @Fred-ii-, might that also be the reason that NULL isn't inserted correctly when I use a prepared statement and the real_escape_string function? Will it insert correctly when I remove real_escape_string and keep the prepared statement? – TVA van Hesteren Mar 29 '17 at 10:09
  • @Fred-ii-, I did a quick test and indeed confirmed your sidenote; many thanks for this! Much appreciated. I'm going to modify my scripts with this knowledge, thanks to you. – TVA van Hesteren Mar 29 '17 at 10:12
  • @TVAvanHesteren that's hard to say; it depends on what the `$ColXData`'s are and whether your columns accept NULL values. Thing is, right now you're doing 1000x db calls, 3 values each; that's 3000 right there. Btw, you're welcome. – Funk Forty Niner Mar 29 '17 at 10:14
  • @Fred-ii-, right. They are strings with an approximate length of 10-20 characters. So when I remove the real_escape_string function it would only require 1000 SQL calls instead of 4000, right? (4000 because of the 1000 inserts plus 3 extra real_escape_string calls for each, if I understood you correctly?) – TVA van Hesteren Mar 29 '17 at 10:29
  • @TVAvanHesteren well actually, you'd only be "executing/binding" the number of times it's inserted, not using a db call each time; you're already connecting/preparing outside the loop. But in theory, you're right about what you said above. If you run a benchmark on both methods, you'll see what you lose/gain. – Funk Forty Niner Mar 29 '17 at 11:12
  • @Fred-ii-, great explanation, thanks. I have updated my scripts and they are already running faster! However, I still want to perform these queries even faster. I will update my question with the code I am using right now, so you can maybe help me speed up the execution further. – TVA van Hesteren Mar 29 '17 at 11:25
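
Putting the comments above together, a minimal sketch of the escaping-free version might look like this. It assumes `$mysqli` is an open mysqli connection and `$rows` is a placeholder array of `[Col1, Col2, Col3]` values (neither name comes from the question); since `bind_param()` binds by reference, binding once outside the loop is enough.

// Prepare and bind once; reassign the bound variables before each execute().
// No real_escape_string() is needed: the values never touch the SQL text.
$Col1Data = $Col2Data = $Col3Data = null;
$stmt = $mysqli->prepare("INSERT INTO `TBLName` (Col1, Col2, Col3) VALUES (?, ?, ?)");
$stmt->bind_param('sss', $Col1Data, $Col2Data, $Col3Data);

foreach ($rows as $row) {
    list($Col1Data, $Col2Data, $Col3Data) = $row; // update the bound variables
    $stmt->execute();                             // one round trip per row
}
$stmt->close();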

1 Answer


I recommend inserting 100 to 1000 rows in a single INSERT statement. (I don't think the code you presented will work without some more changes.)
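
A minimal sketch of such a batched INSERT with mysqli, assuming `$mysqli` is an open connection and `$rows` is a placeholder array of `[Col1, Col2, Col3]` value arrays (not names from the question):

$batchSize = 500; // an arbitrary choice within the 100-1000 range
foreach (array_chunk($rows, $batchSize) as $batch) {
    // One statement carries many rows, so per-statement overhead is paid
    // once per batch instead of once per row. The last chunk may be smaller,
    // which is why the statement is re-prepared per batch.
    $placeholders = implode(', ', array_fill(0, count($batch), '(?, ?, ?)'));
    $stmt = $mysqli->prepare("INSERT INTO `TBLName` (Col1, Col2, Col3) VALUES $placeholders");

    $types  = str_repeat('sss', count($batch));
    $values = array_merge(...$batch);      // flatten [[a,b,c],[d,e,f],...] into one list
    $stmt->bind_param($types, ...$values); // spread the values into bind_param (PHP 5.6+)
    $stmt->execute();
    $stmt->close();
    // Note: very large batches are also capped by max_allowed_packet on the server.
}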

And insert those 100-1000 rows either with autocommit=1 or inside a TRANSACTION. This has the benefit of not hogging the system and not collecting lots of data to potentially "undo", but the drawback of not doing all the rows atomically.
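
For example, the prepared statement from the question could be wrapped in an explicit transaction roughly like this. This is a sketch only; it assumes mysqli exceptions are enabled (e.g. via `mysqli_report(MYSQLI_REPORT_ERROR | MYSQLI_REPORT_STRICT)`) so that a failed execute reaches the catch block.

$mysqli->begin_transaction();
try {
    $stmt = $mysqli->prepare("INSERT INTO `TBLName` (Col1, Col2, Col3) VALUES (?, ?, ?)");
    $stmt->bind_param('sss', $Col1Data, $Col2Data, $Col3Data);
    for ($i = 0; $i < 1000; $i++) {
        // ... set $Col1Data, $Col2Data, $Col3Data for row $i here ...
        $stmt->execute();
    }
    $stmt->close();
    $mysqli->commit();    // the whole batch becomes durable with a single commit
} catch (mysqli_sql_exception $e) {
    $mysqli->rollback();  // discard the partial batch on failure
    throw $e;
}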

Fewer than 100 rows per batch leads to extra overhead; more than 1000 gets into diminishing returns and other forms of overhead.

Batches of 100-1000 rows are also replication-friendly, rather than hogging the replication stream.

If you already have the data in a CSV file, use LOAD DATA.
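
A minimal sketch of that, with a placeholder file path, assuming `$mysqli` is an open connection and `local_infile` is enabled on both the client and the server:

// Load the whole CSV in one statement; the path below is only an example.
$mysqli->query("
    LOAD DATA LOCAL INFILE '/path/to/rows.csv'
    INTO TABLE `TBLName`
    FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"'
    LINES TERMINATED BY '\\n'
    (Col1, Col2, Col3)
");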

Rick James