I am building the back-end for my web app; it would act as an API for the front-end and it will be written in Python (Flask, to be precise).
After taking some decisions regarding design and implementation, I got to the database part. And I started thinking whether NoSQL data storage may be more appropriate for my project than traditional SQL databases. Following is a basic functionality description which should be handled by the database and then a list of pros and cons I could come up with regarding to which type of storage should I opt for. Finally some words about why I have considered RethinkDB over other NoSQL data storages.
Basic functionality of the API
The API consists of only a few models: Artist
, Song
, Suggestion
, User
and UserArtists
.
I would like to be able to add a User
with some associated data and link some Artist
s to it. I would like to add Song
s to Artist
s on request, and also generate a Suggestion
for a User
, which will contain an Artist
and a Song
.
Maybe one of the most important parts is that Artist
s will be periodically linked to User
s (and also Artist
s can be removed from the system -- hence from User
s too -- if they don't satisfy some criteria). Song
s will also be dynamically added to Artist
s. All this means is that User
s don't have a fixed set of Artist
s and nor do Artist
s have a fixed set of Song
s -- they will be continuously updating.
Pros
for NoSQL:
- Flexible schema, since not every
Artist
will have a FacebookID orSong
a SoundcloudID; - While a JSON API, I believe I would benefit from the fact that records are stored as JSON;
- I believe the number of
Song
s, but especiallySuggestion
s will raise quite a bit, hence NoSQL will do a better job here;
for SQL:
- It's fixed schema may come in handy with relations between models;
- Flask has support for SQLAlchemy which is very helpful in defining models;
Cons
for NoSQL:
- Relations are harder to implement and updating models transaction-like involves a bit of code;
- Flask doesn't have any wrapper or module to ease things, hence I will need to implement some kind of wrapper to help me make the code more readable while doing database operations;
- I don't have any certainty on how should I store my records, especially
UserArtist
s
for SQL:
- Operations are bulky, I have to define schemas, check whether columns have defaults, assign defaults, validate data, begin/commit transactions -- I believe it's too much of a hassle for something simple like an API;
Why RethinkDB?
I've considered RehinkDB for a possible implementation of NoSQL for my API because of the following:
- It looks simpler and more lightweight than other solutions;
- It has native Python support which is a big plus;
- It implements table joins and other things which could come in handy in my API, which has some relations between models;
- It is rather new, and I see a lot of implication and love from the community. There's also the will to continuously add new things that leverage database interaction.
All these being considered, I would be glad to hear any advice on whether NoSQL or SQL is more appropiate for my needs, as well as any other pro/con on the two, and of course, some corrections on things I haven't stated properly.