
I'm trying to store a query result in a temporary table for further processing.

create temporary table tmpTest
(
    a FLOAT,
    b FLOAT,
    c FLOAT
)
engine = memory;

insert into tmpTest
(
    select a,b,c from someTable
    where ...
);

But for some reason the insert takes up to a minute, whereas the subselect alone takes just a few seconds. Why would writing the data to a temporary table take so much longer than printing it to my SQL management tool's output?

UPDATE My setup: MySQL Cluster 7.3.2 with 8 ndb data nodes on Debian Linux and 1 SQL node (Windows Server 2012).

The table I'm running the select on is an ndb table.

I tried to find out whether the execution plan differs when using 'insert into ..', but both plans look the same: (sorry for the formatting, Stack Overflow doesn't have tables)

id  select_type     table       type    possible_keys   key     key_len ref                 rows        Extra
1   PRIMARY         <subquery3> ALL     \N              \N      \N      \N                  \N          \N
1   PRIMARY         foo         ref     PRIMARY         PRIMARY 3       <subquery3>.fooId   9747434     Using where
2   SUBQUERY        someTable   range   PRIMARY         PRIMARY 3       \N                  136933000   Using where with pushed condition; Using MRR; Using temporary; Using filesort
3   MATERIALIZED    tmpBar      ALL     \N              \N      \N      \N                  1000        \N

CREATE TABLE ... SELECT is slow, too. 47 seconds vs. 5 seconds without table insert/create.
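For reference, a CREATE TABLE ... SELECT equivalent of the statements above would look roughly like this (a sketch only; the filter is whatever the original WHERE clause was):

```sql
-- Create and fill the temporary table in a single statement,
-- instead of a separate CREATE followed by INSERT ... SELECT.
CREATE TEMPORARY TABLE tmpTest2 ENGINE = MEMORY
SELECT a, b, c
FROM someTable
WHERE ...;  -- same filter as in the question
```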

techdude
Ben
    You should be more specific and provide some data you're writing. Also `INSERT .. SELECT` syntax is different. Your sample will result in error. – Alma Do Oct 09 '13 at 10:21
  • good question. I don't really know how mysql reserves `memory` for itself. If it goes through the OS API, it might just *ask* for memory, which would then be served as either *RAM* or *hard drive* space, according to the system's status. See Windows' virtual bytes management. – Sebas Oct 09 '13 at 10:52
  • Have you tried a `CREATE TABLE ... SELECT`-type query for comparison? http://dev.mysql.com/doc/refman/5.0/en/create-table-select.html – xiankai Oct 09 '13 at 11:54
  • @AlmaDoMundo: why do you say my syntax is wrong? I don't get any errors. – Ben Oct 09 '13 at 12:48
  • @xiankai: I did :( Same result as INSERT INTO ... SELECT – Ben Oct 09 '13 at 12:49

3 Answers


I wrote a comment above, then stumbled across this as a workaround.

This will accomplish what you want to do.

SELECT * FROM aTable INTO OUTFILE '/tmp/atable.txt';
LOAD DATA INFILE '/tmp/atable.txt' INTO TABLE anotherTable;

Note that doing this means managing the files in /tmp in some way. If you try to SELECT data into an OUTFILE that already exists, you get an error, so you need to generate unique temporary file names and then run a cron job of some sort to clean them up.
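One way to get unique file names is to build the OUTFILE path dynamically. A sketch (the `aTable` name and `/tmp` path are just the examples from above; SELECT can be prepared, but in many MySQL versions LOAD DATA cannot, so the load step has to use the literal path, e.g. from application code):

```sql
-- Derive a unique dump file name so repeated runs don't collide
-- (SELECT ... INTO OUTFILE refuses to overwrite an existing file).
SET @outfile = CONCAT('/tmp/atable_', UUID(), '.txt');
SET @dump = CONCAT('SELECT * FROM aTable INTO OUTFILE ''', @outfile, '''');
PREPARE stmt FROM @dump;   -- SELECT is allowed in prepared statements
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
-- The matching LOAD DATA INFILE must then be issued with the same
-- path from the client side.
```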

I guess INFILE and OUTFILE behave differently here. If someone can shed some light on what is going on to explain MySQL's behavior, I would appreciate it.


Here is a better way than using INFILE / OUTFILE.

SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
INSERT INTO aTable SELECT ... FROM ...

Here is a relevant post to read :

How to improve INSERT INTO ... SELECT locking behavior
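Applied to the question's tables, that approach would look something like this (a sketch only; the WHERE clause stands in for the original filter):

```sql
-- Lower the isolation level for the next transaction so the
-- INSERT ... SELECT takes weaker locks on the source rows.
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;

INSERT INTO tmpTest (a, b, c)
SELECT a, b, c
FROM someTable
WHERE ...;  -- the original filter goes here
```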

Don Wool

I experienced the same issue and was playing around with subqueries, which actually solved it. If the select returns a huge number of rows, inserting the data takes very long. Example:

INSERT INTO b2b_customers (b2b_name, b2b_address, b2b_language)
SELECT customer_name, customer_address, customer_language
FROM customers
WHERE customer_name LIKE "%john%"
ORDER BY customer_created_date DESC
LIMIT 1

Using LIMIT in combination with INSERT is not a good option. So you could either use 2 separate queries for getting and inserting the data, or you can wrap the select in a subquery. Example:

INSERT INTO b2b_customers (b2b_name, b2b_address, b2b_language)
SELECT * FROM (
    SELECT customer_name, customer_address, customer_language
    FROM customers
    WHERE customer_name LIKE "%john%"
    ORDER BY customer_created_date DESC
    LIMIT 1
) sub1

That would be a fast solution without changing your script.

I'm not sure why it takes 0.01 seconds to run the subquery and 60 seconds to run the plain insert; without the LIMIT I get 1000+ results. In my case the subquery wrapper improved performance from 60 seconds to 0.01 seconds.

Ichigo Kurosaki
joevette
  • You're not sure why it takes a millisecond to **read** the data and 60 seconds to **write** 1000+ records? Maybe because reading is faster than writing? – Mjh Nov 03 '16 at 10:48
  • yes, that's true, but I'm writing only 1 row while using LIMIT 1. – joevette Nov 03 '16 at 13:01
  • `WHERE customer_name LIKE "%john%"` - this is a full table scan. Every single record is scanned and filtered to retrieve the ones containing `john`. That takes a while. Then you write 1 record. It's inefficient. – Mjh Nov 03 '16 at 13:06
  • so mysql is writing all data into a temporary table, and at the end it recognizes that I put a LIMIT 1 and writes only the first record into the table? That would also explain why my encapsulation with the subquery runs much faster. – joevette Nov 03 '16 at 14:30
  • It buffers the records it found (we can say it writes to temp table, yes) and then it uses only one record, correct. – Mjh Nov 03 '16 at 14:31
  • The results of the ``SELECT`` are buffered, so none of the records are inserted until the ``SELECT`` finishes. – kmoser May 17 '17 at 02:47

The reason has to do with how the computer reads and writes, and with the way a temp file works. The select is reading data from an indexed file on the hard drive, whereas the insert is writing to a temporary file. More RAM is required, and doing so is more difficult. As for why it takes a minute, I am not sure exactly, but I think the code might be slightly incorrect, which would contribute.

Ben Cartwright
  • My temporary table has its engine explicitly set to 'memory'. Doesn't this mean it should stay in RAM? My machine has 32GB RAM of which 39% are used. And my temporary table just has 1000 rows... – Ben Oct 09 '13 at 11:22
  • 3
    Did you ever get an answer to this? Or figure out how to improve the performance? I am getting the same behavior, where the SELECT takes sub-second but the INSERT takes 30 seconds. It's completely baffling. I wonder if it is a bug? To get around this, I tried creating a cursor on the data I was SELECTing and inserted one row at a time, and it was faster. Which is crazy. I will try a question similar to yours, but it appears that there is no answer to this. – Don Wool Nov 01 '13 at 09:07