0

I have a csv file for all the plane crash incidents since 1908. The csv header columns are,

Date,Time,Location,Operator,Flight #,Route,Type,Registration,cn/In,Aboard,Fatalities,Ground,Summary

My code is below, but this prints the entire file, is there a way using CSV or shelve module using which I can make a persistent db. End goal here is to do a slice and dice on the data, eg: sort by date, group by year etc.

#!/usr/bin/env python3.8
import sqlite3
import csv

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

cur.execute('DROP TABLE IF EXISTS air_disaster')
cur.execute('''
CREATE TABLE "air_disaster" (
    "Date1" TEXT,
    "Time1" TEXT,
    "Location1" TEXT,
    "Operator1" TEXT,
    "Flight_No" TEXT,
    "Route" TEXT,
    "Type1" TEXT,
    "Registration" TEXT,
    "cn" TEXT,
    "Aboard" REAL,
    "Fatalities" REAL,
    "ground" TEXT,
    "Summary" TEXT
)
''')

with  open("/Users/foobar/Downloads/Airplane_Crashes_and_Fatalities_Since_1908.csv") as csv_file:
    csv_reader = csv.DictReader(csv_file)
    for row in csv_reader:
 
        Date1 = row["Date"]
        Time1 = row["Time"]
        Location1 = row["Location"]
        Operator1 = row["Operator"]
        Flight_No = row["Flight #"]
        Route = row["Route"]
        Type1 = row["Type"]
        Registration = row["Registration"]
        cn = row["cn/In"]
        Aboard = row["Aboard"]
        Fatalities = row["Fatalities"]
        ground = row["Ground"]
        Summary = row["Summary"]
        cur.execute('''INSERT INTO air_disaster(Date1,Time1,Location1,Operator1,Flight_No,Route,Type1,Registration,cn,Aboard,Fatalities,ground,Summary)
        VALUES(?,?,?,?,?,?,?,?,?,?,?,?,?)''', (Date1,Time1,Location1,Operator1,Flight_No,Route,Type1,Registration,cn,Aboard,Fatalities,ground,Summary))
        conn.commit()
    (cur.execute('SELECT * from air_disaster where Fatalities > 1000'))

foobar666
  • 11
  • 2

1 Answers1

0

If you are open to using pandas, you could create a dataframe with read_csv(), and then save to a DB with to_sql(). You would also have the added benefit of having a rich environment for transforming or preprocessing the data before uploading it.

Gus
  • 115
  • 6