
I ran a small test of insertion speed against a sqlite3 database on my HDD. This is the code:

import sqlite3
import json
import time

db_conn = sqlite3.connect("test_database.db")
db_cursor = db_conn.cursor()
db_cursor.execute("CREATE TABLE IF NOT EXISTS DataTable (json_data TEXT)")

test_start = time.time()
for i in range(500):
    dict_to_add = {"number": str(i), "value": "sadhashdaszdas9dzas0ds9d69ad6adsadsasda6"}
    # One INSERT plus one commit (and therefore one disk sync) per row
    db_cursor.execute("INSERT INTO DataTable (json_data) VALUES (?)", (json.dumps(dict_to_add),))
    db_conn.commit()

test_end = time.time()
db_cursor.close()
db_conn.close()
print("Duration:", round(test_end - test_start, 2), "seconds.")

It took 41 seconds for these 500 tiny inserts. That's how I usually insert values into a database; on an SSD it seems decently fast, but on an HDD it is slow even for inserts this small.

Is there a way to make it go faster without risking invalid entries? (I read about PRAGMA synchronous = OFF, but if the process gets interrupted, that could leave invalid entries.)
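
From what I've read, a gentler compromise than synchronous = OFF would be WAL journal mode with synchronous = NORMAL: per the SQLite docs, a power loss can then roll back the most recent commits, but the database file itself should stay consistent. A minimal sketch of what I mean (untested on my setup, same file name as above):

import sqlite3

db_conn = sqlite3.connect("test_database.db")
# WAL + synchronous=NORMAL: fewer syncs per commit; a crash may lose the
# latest transactions but should not corrupt the database file.
db_conn.execute("PRAGMA journal_mode = WAL")
db_conn.execute("PRAGMA synchronous = NORMAL")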



EDIT
Here is what I ended up using: the executemany method with a single commit, which runs much faster (0.08 seconds).

def multitransaction_test():
    db_conn = sqlite3.connect("test_database.db")
    db_cursor = db_conn.cursor()
    db_cursor.execute("CREATE TABLE IF NOT EXISTS DataTable (json_data TEXT)")

    test_start = time.time()
    vals_to_insert = []
    for i in range(500):
        dict_to_add = {"number": str(i), "value": "sadhashdaszdas9dzas0ds9d69ad6adsadsasda6"}
        vals_to_insert.append((json.dumps(dict_to_add),))
        # Previously: one execute() plus one commit() per row
        #db_cursor.execute("INSERT INTO DataTable (json_data) VALUES(?)", (json.dumps(dict_to_add),))
        #db_conn.commit()

    # All 500 rows go in as a single transaction: one commit, one disk sync
    try:
        db_cursor.executemany("INSERT INTO DataTable (json_data) VALUES (?)", vals_to_insert)
        db_conn.commit()
    except sqlite3.Error:
        print("An exception during insertion!")
        db_conn.rollback()

    test_end = time.time()
    db_cursor.close()
    db_conn.close()
    print("Duration:", round(test_end - test_start, 2), "seconds.")
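
For completeness, a slightly tidier variant of the same batching idea: a sqlite3 connection can be used as a context manager, which commits the open transaction on success and rolls it back if the block raises, so the explicit try/except/rollback goes away. A minimal sketch (the "value" payload is just placeholder data here):

import json
import sqlite3

db_conn = sqlite3.connect("test_database.db")
db_conn.execute("CREATE TABLE IF NOT EXISTS DataTable (json_data TEXT)")
rows = [(json.dumps({"number": str(i), "value": "x" * 40}),) for i in range(500)]

with db_conn:  # commits on success, rolls back on an exception
    db_conn.executemany("INSERT INTO DataTable (json_data) VALUES (?)", rows)
db_conn.close()  # the with-block manages the transaction, not the connection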
  • **transaction** inserts are way faster: https://stackoverflow.com/a/5009740/575376 – juergen d Aug 17 '19 at 10:44
  • Also see [this question](https://stackoverflow.com/questions/1711631/improve-insert-per-second-performance-of-sqlite). – Shawn Aug 17 '19 at 10:55
  • 1
    [Here is a question](https://stackoverflow.com/questions/15856976/transactions-with-python-sqlite3) specific to sqlite and Python. Whatever you decide to do, the existing code could be immediately improved by moving `db_conn.commit()` out of the for loop. – C Perkins Aug 17 '19 at 16:45
  • 1
    Although general optimizations should help in all cases, if you really want to understand and get help for the different storage types, then don't give an exact number for one test, then just say "decently fast" for the other. Give precise times for multiple tests and with different number of records. Show times for both HDD and SSD for 500, 10000 and 100000 records. – C Perkins Aug 17 '19 at 16:55

0 Answers