0

This question is involved by this one: How to speed up insertion performance in PostgreSQL

So, I have java application which is doing a lot of (aprox. billion) INSERTs into PostgreSQL database. It opens several JDBC connections to the same DB for doing these inserts in parallel. As I read in mentioned question-answer:

INSERT or COPY in parallel from several connections. How many depends on your hardware's disk subsystem; as a rule of thumb, you want one connection per physical hard drive if using direct attached storage.

But in my case I have only one disk storage for my DB.

So, my question is: does it really have sence to open several connections in this case? Could it reduce perfomance instead of desired increasing due to I/O operations competitions?

For clarifying, here is the picture with actual postgresql processes load:

enter image description here

Community
  • 1
  • 1
Andremoniy
  • 31,241
  • 14
  • 110
  • 218
  • It depends on the disk, CPU, how much CPU work is involved in processing the data load, etc. *Test it and see*. In general I'd be surprised if you benefited from more than two concurrent `COPY` sessions unless you have a high performance SSD. If you insist on using `INSERT`s instead, then there are all sorts of overheads, so more may be helpful. Again, *test and see*. Or better, switch to `COPY`. – Craig Ringer Feb 05 '14 at 02:26
  • @CraigRinger I can not `COPY` because I'm reading `XML` files and pushing almost its content into `bytea` column. Preparing these files in new format doesn't seem to be possible. – Andremoniy Feb 05 '14 at 05:17
  • Yeah, you can, you just need to supply the input in the right format. If you aren't sure, try `COPY FROM` to dump a table you've loaded a little data into; also take a look at the documentation for details on the format. http://www.postgresql.org/docs/current/static/sql-copy.html . And remember you can use `COPY` via PgJDBC using the `CopyManager` API. – Craig Ringer Feb 05 '14 at 05:25

1 Answers1

-1

Since you mentioned INSERT in Java application, I'd assume (utilizing plain JDBC) COPY is not what you're looking for. Without using API like JPA or framework such as Spring-data, may I introduce addBatch() and executeBatch() in case you haven't heard of these:

/*
 the whole nine yards
 */
Connection c = ...;
PreparedStatement ps = c.prepareStatement("INSERT INTO table1(columnInt2,columnVarchar)VALUES(?,?)");

Then read data in a loop:

ps.setShort(1, someShortValue);
ps.setString(2, someStringValue);
ps.addBatch();  // one row at a time from human's perspective

When data of all rows are prepared:

ps.executeBatch();

May I also recommend:

高科技黑手
  • 1,232
  • 1
  • 12
  • 20
  • It's a very broad answer, which doesn't cover exactly my case and isn't related to `PostgreSQL`'s features. – Andremoniy Feb 06 '14 at 10:40