8

Here's the deal:

  1. I create a connection conn using the DriverManager
  2. I set conn.autoCommit(false);
  3. Then I have PreparedStatement pStat = conn.prepareStatement(insert_string_with_parameter);
  4. I set several parameters with pStat.set... then I add the batch with pStat.addBatch();
  5. Every 10000 rows (I call addBatch() 10000 times), I call pStat.executeBatch();
  6. Not sure if needed but I call also pStat.clearBatch() right after

Even if all the above sounds good to me, this is SLOW!!!.

I have an average of only 35 records (only 8 columns total, only a technical auto-incrementing primary key and some not null constraints) per second. I calculate that it would take me a week to insert all my 20M rows...

Am I doing anything wrong?

How many rows should I try to add at every batch cycle? Are 10000 too many?

Marsellus Wallace
  • 15,881
  • 21
  • 79
  • 143
  • I tried also with 1000 rows at the time but I didn't see any significant improvement... – Marsellus Wallace Sep 01 '11 at 01:00
  • 6
    Please show SQL, schema and indicate indexes and triggers that are on the table(s) involved. – Bohemian Sep 01 '11 at 01:02
  • no indexes and no triggers. It's the simplest table ever... – Marsellus Wallace Sep 01 '11 at 01:15
  • What is the network connection like between you and the db server? If it's slow/error-prone/high ping then that could explain it. (Just let us know - there are work arounds) – Bohemian Sep 01 '11 at 01:22
  • The internet is good. Could it depend on a not so updated db driver? Unfortunately I don't have much control over that part... – Marsellus Wallace Sep 01 '11 at 01:28
  • 2
    What is the database? Even if a driver provides the interface for batch insert, it doesn't mean that the operation is actually done in batch mode. This is the case of Ingres where the INSERTs will be sent indivually (as regular INSERT) even if executeBatch() was used. – RealHowTo Sep 01 '11 at 02:16
  • Tag this question with the database you are using –  Sep 01 '11 at 12:18
  • I'm afraid I cannot disclose which database i'm using by now... @RealHowTo: that's good to know, I'll check with the vendor. Thanks – Marsellus Wallace Sep 01 '11 at 14:32
  • Did you compare JDBC's insert time to the time of SQL script execution? 20M rows is very large number, so on slower machines it could take a week and even a month! – Danubian Sailor Mar 02 '12 at 11:35

1 Answers1

1

If you happen to be using MySQL with a JDBC driver around version 5.1.7 you may be affected by a bug that slows down batch inserts. Updating to 5.1.10 or later should take care of it.

Knut Forkalsrud
  • 933
  • 1
  • 6
  • 15