
I cringed when Sebastien stated he was disconnecting and reconnecting between each use of mysqli_multi_query() in Can mysqli_multi_query do UPDATE statements?, because it just didn't seem like best practice.

However, Craig, in mysqli multi_query followed by query, stated that in his case it was faster to disconnect and reconnect between each use of mysqli_multi_query() than to employ mysqli_next_result().

I would like to ask if anyone has further first-hand knowledge or benchmark evidence to suggest an approximate "cutoff" (based on query volume or something) when a programmer should choose the "new connection" versus "next result" method.

I am also happy to hear any/all concerns not pertaining to speed. Does Craig's use of a connecting function have any bearing on speed?

Is there a speed difference between Craig's while statement:

while ($mysqli->next_result()) {;}

- versus -

a while statement that I'm suggesting:

while(mysqli_more_results($mysqli) && mysqli_next_result($mysqli));

- versus -

creating a new connection for each expected multi_query, before running the first multi_query. I just tested this, and the two mysqli_multi_query() calls ran error-free, so no close() was needed:

$mysqli1 = mysqli_connect($host, $user, $pass, $db);
$mysqli2 = mysqli_connect($host, $user, $pass, $db);

- versus -

Opening and closing between each mysqli_multi_query() like Sebastien and Craig:

$mysqli = newSQL();                  // Craig's connecting function
$mysqli->multi_query($multiUpdates);
$mysqli->close();

- versus -

Anyone have another option to test against?
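For what it's worth, here is a crude timing harness for comparing the approaches (a sketch only; it assumes the $multiUpdates string from above and Craig's newSQL() connecting function):

// Hypothetical timing sketch: compares "flush with next_result()"
// against "close and reconnect" for the same batch of statements.
function timeNextResult(mysqli $mysqli, $multiUpdates) {
    $start = microtime(true);
    $mysqli->multi_query($multiUpdates);
    while ($mysqli->next_result()) {;}   // flush the remaining result sets
    return microtime(true) - $start;
}

function timeReconnect($multiUpdates) {
    $start = microtime(true);
    $mysqli = newSQL();                  // open a fresh connection
    $mysqli->multi_query($multiUpdates);
    $mysqli->close();                    // drop it without flushing results
    return microtime(true) - $start;
}

Running each variant many times over a realistic batch and averaging should expose any cutoff better than a single run would.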

mickmackusa

2 Answers


It is not next_result() that is to blame but the queries themselves. The time your code takes to run depends on how long the actual queries take to execute.

Although mysqli_multi_query() returns control quite quickly, that doesn't mean all the queries have been executed by then. On the contrary, by the time mysqli_multi_query() finishes, only the first query has been executed, while all the other queries are queued on the MySQL side for asynchronous execution.

From this you may conclude that the next_result() call doesn't add any delay by itself; it is just waiting for the next query to finish. And if a query itself takes time, then next_result() has to wait as well.

Knowing that, you can already tell which way to choose: if you don't care about the results, you may just close the connection. But in fact that would just be sweeping the dirt under the rug, leaving all the slow queries in place. So it's better to keep the next_result() loop in place (especially because you have to check for errors/affected rows/etc. anyway, as sketched below) but speed up the queries themselves.
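For reference, a next_result() loop that actually checks each result might look like this (a minimal sketch; the error handling is deliberately simple):

$mysqli->multi_query($multiSQL);
do {
    if ($result = $mysqli->store_result()) {      // a SELECT-type result set
        // ... fetch rows here ...
        $result->free();
    } elseif ($mysqli->errno) {                   // this query in the batch failed;
        echo $mysqli->error;                      // MySQL stops executing the rest
        break;
    }
} while ($mysqli->more_results() && $mysqli->next_result());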

So it turns out that to solve the problem with next_result() you actually have to solve the ordinary problem of query speed. Here are some recommendations:

  1. For SELECT queries, it's the usual indexing/EXPLAIN analysis, already explained in other answers.
  2. For DML queries, especially those run in batches, there are other ways:

Speaking of Craig's case, it closely resembles the well-known problem of slow InnoDB writes. By default, the InnoDB engine is set up in a very cautious mode, where no subsequent write is performed until the engine has ensured that the previous one finished successfully. This makes writes awfully slow (something like only 10 queries/sec). The common workaround is to make all the writes at once. For INSERT queries there are plenty of methods:

  • you can use multiple values insert syntax
  • you can use LOAD DATA INFILE query
  • you can wrap all the queries in a transaction.

For updating and deleting, however, only a transaction remains a reliable way. So, as a universal solution, the following workaround can be offered:

 $multiSQL = "BEGIN;{$multiSQL}COMMIT;";
 $mysqli->multi_query($multiSQL);
 while ($mysqli->next_result()) {/* check results here */}

If that doesn't work or is inapplicable in your case, then I'd suggest swapping mysqli_multi_query() for single queries run in a loop, investigating and optimizing their speed, and then returning to multi_query().
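For instance, a throwaway profiling loop could look like this (a sketch, assuming $queries holds the batch split into individual statements):

foreach ($queries as $sql) {
    $start = microtime(true);
    if (!$mysqli->query($sql)) {
        echo "Failed: $sql -- {$mysqli->error}\n";
        continue;
    }
    printf("%.4f sec: %s\n", microtime(true) - $start, $sql);
}

The statements that dominate the total time are the ones worth optimizing before going back to multi_query().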

Your Common Sense
  • I've never seen a bounty on SO before. According to the SO bounty help page [http://stackoverflow.com/help/bounty], you don't have to be the asker to put a bounty on a question. If you know anyone who has tens of thousands of rep points, I give them my gratitude in advance for setting a bounty. Obviously, the minimum bounty of 50 points would really break my piggy bank and strip me of nearly every possible privilege. – mickmackusa Mar 20 '14 at 23:27
  • Heh. That's the only part of the answer you seem to have got interested in. – Your Common Sense Mar 21 '14 at 05:49
  • +1 for a great answer pointing out the real issue and possible 'side effects' of programming choices. – LSerni Mar 24 '14 at 13:05
  • "multiple values insert syntax" - thanks by doing this my inserts were sped up by 40 times. – Rog Jul 21 '17 at 19:03

To answer your question:

look before you jump

I expect your mysqli_more_results() call (the look before you jump) doesn't speed things up: if you have n results, you'll make (2*n)-1 calls to the database, whereas Craig makes n+1.

multiple connections

multi_query() executes asynchronously, so you'll just be adding connection overhead.

opening and closing db

Listen to Your Common Sense ;-) But don't lose track of what you're doing. Wrapping queries in a transaction will make them atomic: they all fail, or they all succeed. Sometimes that is required to keep the database from ever conflicting with your universe of discourse. But using transactions for speedups may have unwanted side effects. Consider the case where one of your queries violates a constraint: that will make the whole transaction fail. If the queries weren't a logical transaction in the first place and most of them should have succeeded, you'll have to find out which one went wrong and which ones have to be reissued, costing you more instead of delivering a speedup.
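To make the trade-off concrete, a per-batch transaction in mysqli might look like this (a sketch; the two statements are placeholders, and begin_transaction() needs PHP >= 5.5):

mysqli_report(MYSQLI_REPORT_ERROR | MYSQLI_REPORT_STRICT); // make mysqli throw on failure
$mysqli->begin_transaction();
try {
    $mysqli->query("UPDATE child SET parent = 5 WHERE id = 1");
    $mysqli->query("DELETE FROM parent WHERE id = 4");     // may violate a FK constraint
    $mysqli->commit();                                     // every statement succeeded
} catch (mysqli_sql_exception $e) {
    $mysqli->rollback();                                   // one failure undoes all of them
}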

Sebastien's queries actually look like they should be part of some bigger transaction that contains the deletion or updating of the parents.

Instead, try and remember

there is no spoon

In your examples, there was no need for multiple queries. The INSERT ... VALUES form takes multiple tuples for VALUES. So instead of preparing one statement and wrapping its repeated executions in a transaction, as Your Common Sense suggests, you could prepare a single statement and have it executed and auto-committed. As per the mysqli manual, this saves you a bunch of round-trips.

So make a SQL statement of the form:

INSERT INTO r (a, b, c) VALUES (?, ?, ?), (?, ?, ?), ...

and bind and execute it. mysqldump --opt does it, so why don't we? The MySQL reference manual has a section on statement optimization; look in its DML section for INSERT and UPDATE queries. But understanding why --opt does what it does is a good start.
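In PHP, that statement could be built up and bound along these lines (a sketch; $rows and the column types are made-up examples, and argument unpacking needs PHP >= 5.6):

$rows = [[1, 'x', 10], [2, 'y', 20], [3, 'z', 30]];       // example tuples
$placeholders = implode(', ', array_fill(0, count($rows), '(?, ?, ?)'));
$stmt = $mysqli->prepare("INSERT INTO r (a, b, c) VALUES $placeholders");

$types = '';
$values = [];
foreach ($rows as $row) {
    $types .= 'isi';                                       // int, string, int per tuple
    foreach ($row as $value) {
        $values[] = $value;
    }
}
$stmt->bind_param($types, ...$values);                     // flatten tuples into one bind
$stmt->execute();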

the underestimated value of preparing a statement

To me, the real value of prepared statements is not that you can execute them multiple times, but the automatic input escaping. For a measly single extra client-server round-trip, you save yourself from SQL injection. SQL injection is a serious point of attention, especially when you're using multi_query: it tells MySQL to expect multiple queries and execute them. So fail to escape properly and you're in for some fun:

(xkcd: "Exploits of a Mom" – the Little Bobby Tables comic)
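The difference boils down to this (a sketch; $name is untrusted user input and the students table is just the comic's example):

// Dangerous with multi_query: a crafted $name can smuggle in a second statement
$mysqli->multi_query("INSERT INTO students (name) VALUES ('$name')");

// Safer: the value travels separately from the SQL text and can never become SQL
$stmt = $mysqli->prepare("INSERT INTO students (name) VALUES (?)");
$stmt->bind_param('s', $name);
$stmt->execute();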

So my best practice would be:

  1. Do I really need multiple queries?
  2. If I do, escape them well, or prepare them!
Chris Wesseling
  • Although my answer does mention the particular case of repeated inserts, the *question* doesn't bear even a trace of it. And the matter of this question is too far from both of the topics you fancied discussing in your answer. – Your Common Sense Mar 28 '14 at 12:03
  • @YourCommonSense I've adjusted the answer to address the question more clearly. – Chris Wesseling Mar 28 '14 at 13:02
  • That's an interesting genre: a comment on another answer in the form of an answer. Although you make some points in it, you fail to answer the particular question explicitly. – Your Common Sense Mar 28 '14 at 13:07
  • @YourCommonSense Interesting... perhaps a wiki answer would be better form. I don't quite get why I fail *deliberately*. I'm not a native speaker. Do you mean it in a different sense than, say: „In TDD I deliberately make a test fail, *then* write the code to make it pass"? – Chris Wesseling Mar 28 '14 at 13:12
  • Neither am I. I changed the word already. The question is, in essence: why is next_result so slow? The answer is: it's not next_result that is to blame. – Your Common Sense Mar 28 '14 at 13:15
  • Thank you for highlighting the fallacies of my answer, by the way. – Your Common Sense Mar 28 '14 at 13:21
  • @YourCommonSense it's our lot in life ;) – Chris Wesseling Mar 28 '14 at 13:23
  • @ChrisWesseling Thanks for taking the time to post. To explain the reason for my general question... I am making the switch from mysql to mysqli for a client at work. The project has php saving pdf-submitted data across four tables. Programs are performing INSERT (some with a nested SELECT query to correct the pdf export value), SHOW, SELECT, UPDATE, DELETE queries -- some of which are not simplistic in form. I am currently using a mixture of single queries, multi_queries, and TRIGGERS. For my usage, "atomic" querying is fine; beneficial even. – mickmackusa Mar 30 '14 at 11:56
  • While I am already employing a few layers of injection defense, I am finding myself very compelled to try Your Common Sense's https://github.com/colshrapnel/safemysql for the sake of defense and DRYness. I guess I'll just have to apply it to some of the trickier queries to see if I can make it work for me. – mickmackusa Mar 30 '14 at 11:56