1

I'm working on a script to populate the database with data consisting millions of records, so ORM scripting (eg. using objects.create, objects.get_or_create) won't work for me. I quickly browsed over as to how should I go and write raw sql statements for my code, and I came across this link and as such I'm looking to replicate the "create_sqlite_db.py" code snippet for my project.

This is the code that I've written as of now, relevant to the problem:

models.py (location web(appname) -> models.py)

class Contact(SmartModel):
    c_id = CharField(primary_key=True, max_length=255, blank=False, null=False)
    duns = CharField(primary_key=False, max_length=255, blank=False, null=False)
    fullName = CharField(max_length=255)
    level = CharField(max_length=255)
    title = CharField(max_length=255)

populate_Contact_db.py (location web -> populate_Contact_db.py)

import sqlite3

conn = sqlite3.connect('lpdp2.db') #.db file located in projectname->lpdp2.db
c = conn.cursor()

#c.execute('''CREATE TABLE Contact (c_id TEXT, duns TEXT, fullName TEXT, level TEXT, title TEXT)''')

lst = list()

with open("filename.txt", "rb") as myfile:
    for line in myfile:
        col = line.strip().split("|")
        c_id = col[0]
        duns=col[1]
        name=col[8]
        level=col[11]
        job=col[12]
        lst.extend([(c_id, duns, name, level, job)])

    c.executemany("INSERT INTO web_contact VALUES (?,?,?,?,?)", lst)

conn.commit()
conn.close()

On running populate_Contact_db, I get the following error:

    ---------------------------------------------------------------------------
OperationalError                          Traceback (most recent call last)
C:\Users\Saurabh\workspace\local_mayorbrown\web\populate_Contact_db.py in <module>()
     18         lst.extend([(c_id, duns, name, level, job)])
     19
---> 20     c.executemany("INSERT INTO web_contact VALUES (?,?,?,?,?)", lst)
     21
     22 conn.commit()

OperationalError: table web_contact has 3 columns but 5 values were supplied

In this case I'm confused because I've already made migrations using makemigrations web -> migrate -> syncdb.

I've also tried:

c.executemany("INSERT INTO web_contact(c_id, duns, fullName, level, title) VALUES", lst)

In that case I'm getting the following error:

---------------------------------------------------------------------------
OperationalError                          Traceback (most recent call last)
C:\Users\Saurabh\workspace\local_mayorbrown\web\populate_Contact_db.py in <module>()
     18         lst.extend([(c_id, duns, name, level, job)])
     19
---> 20     c.executemany("INSERT INTO web_contact(c_id, duns, fullName, level, title) VALUES", lst)
     21
     22 conn.commit()

OperationalError: near "VALUES": syntax error

So I would like some suggestions/ corrections in my code, as to how I should rectify this issue and also whether I'm importing the database table properly in my code. Thanks!!!

CL.
  • 158,085
  • 15
  • 181
  • 214
user3466132
  • 239
  • 1
  • 4
  • 10
  • Just in case, there is a [`bulk_create`](https://docs.djangoproject.com/en/1.8/ref/models/querysets/#bulk-create) method on the ORM which allows insertion of many records in one query. So you could for example instanciate batchs of 1000 models and save them all at once. – Agate Oct 04 '15 at 22:24
  • Thanks for your suggestion. I know that bulk_create is the only feasible option, as per my understanding, in this case and I'm working on a script using that as well. However I think for 6 milliion records with 5 cols, even that would take hours (I'm just assuming here) whereas raw SQL will be faster and would take max an hour, so I was looking for a raw SQL script for insertion of data. – user3466132 Oct 04 '15 at 23:25
  • If you don't need anything fancy, just convert your file to CSV, and import it [directly in your database](http://stackoverflow.com/questions/2987433/how-to-import-csv-file-data-into-a-postgres-table) ? – Agate Oct 05 '15 at 11:44

0 Answers0