13

I recently migrated my rails app to PostgreSQL in order to take advantage of fulltext search.

Since the migration coincided with moving to a new webhost, the steps for migration were:

  1. Deploy the application and db:create/db:schema:load on the new server, with appropriate database.yml file
  2. mysqldump data only from existing MySQL production database
  3. import data into PostgreSQL database

The application is running successfully but the issue comes when trying to add new content to the database. For example, when I run the rake task to update my twitter feed:

PG::Error: ERROR:  duplicate key value violates unique constraint "twitter_feeds_pkey" DETAIL:  Key (id)=(3) already exists.

This also happens for all other models, creating new articles, users etc. In development I can see that posting the insert statement n+1 times will successfully save the record without error.

My question is: How do I tell PostgreSQL to start adding indexes sequentially from the existing data?

I've read the REINDEX page but don't think that is really the operation I'm looking for.

Peter Mellett
  • 532
  • 5
  • 18
  • 1
    Does the table contain serials / sequences , used as a PK? – wildplasser Dec 02 '12 at 14:37
  • 1
    This question should help: http://stackoverflow.com/questions/244243/how-to-reset-postgres-primary-key-sequence-when-it-falls-out-of-sync – mccannf Dec 02 '12 at 14:44
  • Damn, I should have known there was prior art, before digging it up from my "personal archive". – wildplasser Dec 02 '12 at 14:50
  • Thank you for the linked question, the language I failed to summon from my brain was 'primary key sequence is not in sync with my table rows' – Peter Mellett Dec 02 '12 at 16:12
  • 1
    This question is answered extremely well here: http://stackoverflow.com/questions/244243/how-to-reset-postgres-primary-key-sequence-when-it-falls-out-of-sync – superluminary Apr 22 '13 at 10:44

4 Answers4

29

In Rails you can use the command

ActiveRecord::Base.connection.reset_pk_sequence!('users')

to bring the primary key index for the User table in sync again.

18

If the schema contains serial or sequence columns, you should reset these to the max value that occurs in the corresponding column. (normally you should not import the serials from a file, but give them the freedom to autoincrement.)

For all imported tables you should identify the sequence fields and run the following code on them. (substitute your schema name for "sch", your table name for "mytable" and your id column name for "id")

WITH mx AS ( SELECT MAX(id) AS id FROM sch.mytable)
SELECT setval('sch.mytable_id_seq', mx.id) AS curseq
FROM mx
        ;
wildplasser
  • 38,231
  • 6
  • 56
  • 94
4

You can automate wildplasser's solution so that all sequences are synchronized with the current maximum value of their associated column:

do
$block$
declare 
  r        record;
  stmt     text;
  max_id   integer;
begin
  for r in (
              select *
              from (
                select table_schema, 
                       table_name, 
                       column_name, 
                       pg_get_serial_sequence(table_schema||'.'||table_name, column_name) as col_sequence
                from information_schema.columns
                where table_schema not in ('pg_catalog', 'information_schema')
              ) t
              where col_sequence is not null
        ) 
  loop
    stmt := 'select coalesce(max('||r.column_name||'), 0) + 1 from '||r.table_schema||'.'||r.table_name;
    execute stmt into max_id;
    raise notice 'Next ID for %.%.% is %', r.table_schema, r.table_name, r.column_name, max_id;
    perform setval(r.col_sequence, max_id); 
  end loop;
end;
$block$

Note that this will only work if the columns have been defined as serial, bigserial or have been made the "owner" of a sequence.

a_horse_with_no_name
  • 440,273
  • 77
  • 685
  • 758
  • Great. I had a similar thing handy but I did not want to spoil the OP's editing party ;-) BTW: you could add a few quote_identifier() functions to the names. – wildplasser Dec 02 '12 at 15:38
  • Thank you, I'm always a fan of automation. – Peter Mellett Dec 02 '12 at 16:17
  • @a_horse_with_no_name: I did not know about the pg_get_serial_sequence() function. Comes in handy, avoiding the joining of three(?) additional catalogs. – wildplasser Dec 02 '12 at 16:21
0

Another way is remove the primary key (id) from the columns (or don't dump the id). So your data would look like

INSERT INTO book (name, price) VALUES ('Alchemist' , 10);

instead of

INSERT INTO book (id, name, price) VALUES (1 , 'Alchemist' , 10);

This way, you won't have to reset the primary key after loading initial data

user
  • 15,863
  • 15
  • 90
  • 110