
The project requires storing binary data in a PostgreSQL database (a project requirement). For that purpose we made a table with the following columns:

id   : integer, primary key, generated by client
data : bytea, for storing client binary data
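
As a rough sketch in SQL (the table name client_data is just a placeholder; only the columns are as above):

CREATE TABLE client_data (
    id   integer PRIMARY KEY,  -- generated by the client
    data bytea                 -- client binary data
);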

The client is a C++ program running on Linux. The rows must be inserted (initialized with a chunk of binary data) and then updated (concatenating additional binary data to the data field). Simple tests have shown that this yields better performance.
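
Roughly, the pattern is (using the placeholder client_data table from the sketch above; $1 and $2 stand for parameters bound by the C++ client):

-- first chunk of a stream ($1 = id, $2 = binary chunk)
INSERT INTO client_data (id, data) VALUES ($1, $2);

-- each additional chunk is concatenated onto the existing value
UPDATE client_data SET data = data || $2 WHERE id = $1;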

Depending on your input, we will make the client use concurrent threads to insert / update data (with different DB connections), or a single thread with only one DB connection.

We don't have much experience with PostgreSQL, so could you help us with some pointers concerning possible bottlenecks, and whether using multiple threads to insert data is better than using a single thread?

Thank you :)

Edit 1:

More detailed information:

  • there will be only one client accessing the database, using only one Linux process
  • database and client are on the same high-performance server, but this should not matter; the client must be fast no matter the machine, without additional client configuration
  • we will get a new stream of data every 10 seconds; each stream will provide 16000 new bytes every 0.5 seconds (CBR, but we can use buffering and only do inserts every 4 seconds at most)
  • each stream will last anywhere between 10 seconds and 5 minutes

1 Answer


It makes extremely little sense that you would get better performance by inserting a row and then appending to it if you are using bytea.

PostgreSQL's MVCC design means that an UPDATE is logically equivalent to a DELETE and an INSERT. When you insert the row and then update it, what's happening is that the original tuple you inserted is marked as deleted and a new tuple is written that contains the concatenation of the old and added data.

I question your testing methodology - can you explain in more detail how you determined that insert-then-append was faster? It makes no sense.

Beyond that, I think this question is too broad as written to really say much of use. You've given no details or numbers: no estimates of binary data size, row counts, client counts, etc.

bytea insert performance is no different to any other insert performance tuning in PostgreSQL. All the same advice applies: Batch work into transactions, use multiple concurrent sessions (but not too many; rule of thumb is number_of_cpus + number_of_hard_drives) to insert data, avoid having transactions use each others' data so you don't need UPDATE locks, use async commit and/or a commit_delay if you don't have a disk subsystem with a safe write-back cache like a battery-backed RAID controller, etc.
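
As a generic illustration of the batching point, several statements wrapped in one transaction share a single commit (and so a single WAL flush); the table name and values below are only placeholders:

BEGIN;
INSERT INTO client_data (id, data) VALUES (101, '\xdeadbeef');
INSERT INTO client_data (id, data) VALUES (102, '\xfeedface');
COMMIT;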

Given the updated stats you provided in the main comments thread, the amount of data you want to consume sounds entirely practical with appropriate hardware and application design. Your peak load might be achievable even on a plain hard drive if you had to commit every block that came in, since it'd require about 60 transactions per second (up to 30 concurrent streams, each delivering one 16 kB block every 0.5 seconds). You could use a commit_delay to achieve group commit and significantly lower fsync() overhead, or even use synchronous_commit = off if you can afford to lose a time window of transactions in case of a crash.

With a write-back caching storage device like a battery-backed cache RAID controller or an SSD with reliable power-loss-safe cache, this load should be easy to cope with.

I haven't benchmarked different scenarios for this, so I can only speak in general terms. If designing this myself, I'd be concerned about checkpoint stalls with PostgreSQL, and would want to make sure I could buffer a bit of data. It sounds like you can, so you should be OK.

Here's the first approach I'd test, benchmark and load-test, as it's in my view probably the most practical:

One connection per data stream, synchronous_commit = off + a commit_delay.
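
A sketch of those settings (values are illustrative, not tuned; commit_delay and commit_siblings would usually go in postgresql.conf):

SET synchronous_commit = off;  -- per session; risks losing a short window of commits on crash, not corruption

-- in postgresql.conf:
--   commit_delay = 10000      # microseconds to wait so concurrent commits can group up
--   commit_siblings = 5       # only delay if at least this many other transactions are active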

INSERT each 16kb record into a staging table as it comes in (UNLOGGED or TEMPORARY if you can afford to lose incomplete records) and let Pg synchronize and group up commits. When each stream ends, read the byte arrays, concatenate them, and write the record to the final table.
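
A sketch of such a staging table, matching the temp_stream_table used below (the block_seq column is my addition, to preserve arrival order for reassembly):

CREATE UNLOGGED TABLE temp_stream_table (
    stream_id  integer NOT NULL,
    block_seq  integer NOT NULL,   -- arrival order of the 16kb blocks
    data_block bytea   NOT NULL,
    PRIMARY KEY (stream_id, block_seq)
);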

For absolutely best speed with this approach, implement a bytea_agg aggregate function for bytea as an extension module (and submit it to PostgreSQL for inclusion in future versions). In reality it's likely you can get away with doing the bytea concatenation in your application by reading the data out, or with the rather inefficient and nonlinearly scaling:

CREATE AGGREGATE bytea_agg(bytea) (SFUNC=byteacat,STYPE=bytea);

INSERT INTO final_table SELECT stream_id, bytea_agg(data_block) FROM temp_stream_table GROUP BY stream_id;
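
Note that an aggregate over an unordered SELECT gives no guarantee about concatenation order; assuming the block_seq column from the staging sketch above, an ordered variant would be:

INSERT INTO final_table
SELECT stream_id, bytea_agg(data_block ORDER BY block_seq)
FROM temp_stream_table
GROUP BY stream_id;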

You would want to be sure to tune your checkpointing behaviour, and if you were using an ordinary or UNLOGGED table rather than a TEMPORARY table to accumulate those 16kb records, you'd need to make sure it was being quite aggressively VACUUMed.
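
For example, per-table autovacuum settings for a non-temporary staging table might look like this (thresholds are illustrative only; checkpoint settings such as checkpoint_segments and checkpoint_completion_target live in postgresql.conf):

ALTER TABLE temp_stream_table SET (
    autovacuum_vacuum_scale_factor = 0.0,  -- don't wait for a fraction of the table to become dead
    autovacuum_vacuum_threshold    = 1000  -- vacuum after roughly 1000 dead rows
);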

  • But isn't an update to the `bytea` column eligible for a HOT update as it is not indexed? So it would not result in a "delete/insert". Not sure about TOAST as well. Would a bytea value that is completely "toasted" result in a delete/insert of the tuple when being updated? – a_horse_with_no_name Jan 26 '13 at 08:24
  • Added additional information to the question, will upload code sample Monday. – omittones Jan 26 '13 at 09:42
  • @a_horse_with_no_name A HOT update is still effectively a lightweight insert+delete. There's a new tuple written and an old one marked with an xmax. The difference with HOT is that *if there is free space on the page to write the new tuple*, and no indexed columns are touched by the update, it is possible to write a new tuple to the heap and avoid updating the indexes. A new tuple is still written, it isn't an overwrite, and it's only done if there's enough space to add both the old and new tuples. `TOAST` further complicates things by storing data out-of-line. – Craig Ringer Jan 26 '13 at 10:30
  • @omittones You're talking about consuming a lot of data. If a new stream comes in every 10 seconds and can last 10 seconds to 5 minutes, that's a maximum of 30 concurrent streams (300 seconds duration / 10 second start intervals), each delivering 32 megabytes per second of data. That's up to 960 *megabytes per second* of data. That's enormous. Is it possible I've misunderstood your explanation? Assuming the average stream duration was 2.5 minutes so average data rate was half peak rate that'd be 1.7 petabytes per hour. – Craig Ringer Jan 26 '13 at 10:36
  • @Craig: the question says 16000 **bytes** every 0.5s, not 16 megabytes – Daniel Vérité Jan 26 '13 at 12:21
  • @CraigRinger: it's 16000 **bytes**, or 16 kilobytes, not megabytes :) The scenario I described is peak usage. During normal workload, the usage will probably be quarter that or even less, and during night, even less than that. Problem is the client has to operate realtime, so minimum insertion rate HAS to be 32000 bytes per second per stream (there will NEVER be more than 30 streams). Plus, old data WILL be deleted (not sure how old though). – omittones Jan 26 '13 at 16:35
  • @DanielVérité Er. *blushes*. Yeah. That's less horrifying. Off by 10^3. – Craig Ringer Jan 26 '13 at 22:35