3

My understanding of an in-memory table is a table that will be created in memory and would resort to disk as little as possible, if at all. I am assuming that I have enough RAM to fit the table there, or at least most of it. I do not want to use an explicit function to load tables (like pg_prewarm) in memory, I just want the table to be there by default as soon as I issue a CREATE TABLE or CREATE TABLE AS select statement, unless memory is full or unless I indicate otherwise. I do not particularly care about logging to disk.

7 years ago, a similar question was asked here PostgreSQL equivalent of MySQL memory tables?. It has received 2 answers and one of them was a bit late (4 years later).

One answer says to create a RAM disk and to add a tablespace for it. Or to use an UNLOGGED table. Or to wait for global temporary tables. However, I do not have special hardware, I only have regular RAM - so I am not sure how to go about that. I can use UNLOGGED feature, but as I understand, there is still quite a bit of disk interaction involved (this is what I am trying to reduce) and I am not sure if tables will be loaded in memory by default. Furthermore, I do not see how global temporary spaces are related. My understanding of them is that they are just tables in spaces that can be shared.

Another answer recommends an in-memory column store engine. And to then use a function to load everything in memory. The issue I have with this approach is that the engine being referred to looks old and unmaintained and I cannot find any other. Also, I was hoping I wouldn't have to explicitly resort to using a 'load into memory' function, but instead that everything will happen by default.

I was just wondering how to get in-memory tables now in Postgres 12, 7 years later.

Zeruno
  • 986
  • 12
  • 27

1 Answers1

2

Postgres has not in-memory tables, and I have not any information about serious work on this topic now. If you need this feature, then you can use special in-memory databases like REDIS, MEMCACHED or MonetDB. There are FDW drivers to these databases. So you can create in-memory tables in specialized database and you can work with these tables from Postgres via foreign tables.

MySQL memory tables was necessary when there was only MyISAM engine, because this engine has very primitive work with IO and MySQL had not own buffers. Now, MySQL has InnoDB engine (with modern form of joins like other databases) and lot of arguments for using MySQL in-memory tables are obsolete. Against old MySQL Postgres has own buffers and doesn't bypass file system caches, so all RAM is available for your data and you have to do nothing. Ten years ago we had to use MySQL inmemory engine to have good enough performance. But after migration to Postgres we had better performance without necessary work with inmemory tables.

If you have lot of memory, then Postgres can use it by default - via file system cache.

Pavel Stehule
  • 33,657
  • 3
  • 68
  • 77
  • Well, I am in a situation where I must use Postgres and I am not particularly interested in MySQL. How would I use it by default on Posgres "via file system cache"? – Zeruno Mar 08 '20 at 15:54
  • 1
    @Zeruno - do nothing, postgres does it by everytime – Pavel Stehule Mar 08 '20 at 16:03
  • I am interested in creating table using CTAS syntax. Currently in PostgreSQL, this invokes disk IO, that is what I am trying to minimize because I have a lot of available memory. I am seeing your suggestion to use those FDWs from PostgreSQL, but my understanding is that they do not support CTAS? Is this correct? – Zeruno Mar 18 '20 at 13:16
  • @Zeruno - if there is lot of write operations, then Postgres has to write to disc. You can reduce writing with unlogged tables or temporary tables, but you cannot to eliminate writing. But unlogged or temp tables are not guarded by transaction log, so the number of write operations is significantly reduced. For FDW you can use COPY (SELECT) TO fdwtab or INSERT INTO fdwtab SELECT ... – Pavel Stehule Mar 18 '20 at 16:10
  • Thanks. I was wondering, since you are more familiar with the internals, would you know if the "intermediate" relations of the query (at the query optimizer code) invoke as much disk access? I am assuming these internal structures stay in memory for as long as possible and avoid/delay writing to disk (as I want), maybe sacrificing some guarantees. I wouldn't mind exploring the code further if it is the case and I can access these structures. – Zeruno Mar 18 '20 at 16:14
  • 1
    There are not any intermediate relations - Postgres has special structure - tuplestore. When this structure is lower, then work_mem, then data are buffered in memory. When data are higher, then are stored in temp files. – Pavel Stehule Mar 18 '20 at 16:24
  • Is there straightforward functionality to perform SQL over tuplestores? – Zeruno Mar 18 '20 at 16:33
  • You can use SELECT * FROM tab ORDER BY column. Sort is based on usage of tuplestore – Pavel Stehule Mar 18 '20 at 16:43