I have an upcoming project involving many technologies, and I am in the early stages of my research. I thought it would be best to get some other opinions on how to approach the problem.

The problem: I have a large CSV file (> 100,000 entries). I need to create an online HTML table from this file. The table needs to be searchable and sortable.

I’ve come up with two possible solutions:

1) Use a database

From what I’ve gathered, plenty of databases can import a CSV file to create a searchable data table. However, I do not know how things would differ between a standard SQL database (e.g. MySQL or PostgreSQL) and a NoSQL database like MongoDB. I am more familiar with NoSQL databases such as MongoDB and RethinkDB (for live data streaming), though I’m not an expert in them; I just have a working knowledge.

2) Parse the CSV to JSON

Parsing the CSV to JSON would force me to implement searching and sorting manually with something like jQuery. This does not seem inherently better than using a database.
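
To make that concrete, this is roughly what I picture for this option: a minimal sketch, assuming the csv-parse npm package and plain array methods for the search/sort (the file name and the "name" column are placeholders, since I don't know the CSV layout yet):

    const fs = require('fs');
    const { parse } = require('csv-parse/sync');

    // Parse the whole CSV into an array of objects keyed by the header row.
    const rows = parse(fs.readFileSync('data.csv'), { columns: true });

    // "Search": keep rows whose (hypothetical) name column contains the query.
    const query = 'smith';
    const matches = rows.filter(r => (r.name || '').toLowerCase().includes(query));

    // "Sort": order the matches by a column before rendering the HTML table.
    matches.sort((a, b) => a.name.localeCompare(b.name));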

Another facet of this problem is that the CSV file will be rewritten, possibly daily or weekly. How do I handle this? Would I need to drop the data from the old table in my database and recreate the table with the data from the new CSV?
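
For illustration, the drop-and-reload I have in mind would look something like this (a sketch only, assuming PostgreSQL with the pg client; the table name and file path are placeholders):

    const { Client } = require('pg');

    async function refresh() {
      const client = new Client(); // connection settings taken from environment variables
      await client.connect();
      try {
        await client.query('BEGIN');
        await client.query('TRUNCATE TABLE people'); // throw away the old rows
        // COPY reads the new CSV server-side; the file must be readable by the server.
        await client.query("COPY people FROM '/data/latest.csv' WITH (FORMAT csv, HEADER true)");
        await client.query('COMMIT'); // old and new data are swapped atomically
      } catch (err) {
        await client.query('ROLLBACK');
        throw err;
      } finally {
        await client.end();
      }
    }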

Note that I will most likely be using Node.js along with the libraries/APIs/frameworks that go with full-stack JavaScript development. However, I am not opposed to hearing solutions from other languages/stacks.

Also, I believe the right approach depends on the complexity of the CSV file (something I do not know yet).

Thanks in advance for any help! It is much appreciated.

1 Answer

If it's just one CSV and ONE flat table, I don't see how the choice between SQL and NoSQL matters.

For the CSV: if the records contain a unique key (or a combined unique key) identifying each individual record, keep a record of the key of the last record imported into the database. If the CSV file is only appended to, without any modification of the records imported previously, simply recording the size of the file at each import is enough.

Use this information to quickly seek to the end of the last imported record and create a delta CSV containing only the appended records. That delta can then be imported into the database easily with a command like COPY, e.g. "How to import CSV file data into a PostgreSQL table?"
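
A rough sketch of the append-only case, assuming the pg client and a small state file that remembers the byte offset reached by the previous import (paths and table name are placeholders):

    const fs = require('fs');
    const { Client } = require('pg');

    async function importDelta(csvPath, statePath, table) {
      const lastSize = fs.existsSync(statePath)
        ? Number(fs.readFileSync(statePath, 'utf8'))
        : 0;
      const newSize = fs.statSync(csvPath).size;
      if (newSize <= lastSize) return; // nothing appended since the last run

      // Copy only the appended bytes into a delta file.
      const deltaPath = csvPath + '.delta';
      const fd = fs.openSync(csvPath, 'r');
      const buf = Buffer.alloc(newSize - lastSize);
      fs.readSync(fd, buf, 0, buf.length, lastSize); // seek straight past the old end
      fs.closeSync(fd);
      fs.writeFileSync(deltaPath, buf);

      // Import the delta with COPY; only the very first run still contains the header row.
      const client = new Client();
      await client.connect();
      const header = lastSize === 0 ? ', HEADER true' : '';
      await client.query(`COPY ${table} FROM '${deltaPath}' WITH (FORMAT csv${header})`);
      await client.end();

      fs.writeFileSync(statePath, String(newSize)); // remember where this import stopped
    }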

Here is a JavaScript datagrid I think you may be interested in: https://github.com/6pac/SlickGrid/wiki/Examples
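
Wiring the imported rows into SlickGrid is roughly this (a sketch; the column definitions are placeholders for whatever headers your CSV turns out to have):

    const columns = [
      { id: 'name',  name: 'Name',  field: 'name',  sortable: true },
      { id: 'email', name: 'Email', field: 'email', sortable: true }
    ];
    const options = { enableCellNavigation: true, enableColumnReorder: false };

    // 'rows' is the array of row objects fetched from the database (or parsed CSV).
    const grid = new Slick.Grid('#myGrid', rows, columns, options);

    // Basic client-side sorting: re-order the data array and re-render the grid.
    grid.onSort.subscribe((e, args) => {
      rows.sort((a, b) => {
        const result = a[args.sortCol.field] < b[args.sortCol.field] ? -1 : 1;
        return args.sortAsc ? result : -result;
      });
      grid.invalidate();
    });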

Forgot to mention: once the CSV is imported, query results can be converted to JSON directly within PostgreSQL, e.g. "PostgreSQL return result set as JSON array?"
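
Something along these lines, again with the pg client and a placeholder table name:

    const { Client } = require('pg');

    async function tableAsJson() {
      const client = new Client();
      await client.connect();
      // json_agg collapses the selected rows into a single JSON array value.
      const res = await client.query(
        'SELECT json_agg(t) AS data FROM (SELECT * FROM people ORDER BY name) t'
      );
      await client.end();
      return res.rows[0].data; // ready to send to the browser for the HTML table
    }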
