My requirement is to store data in S3 and query it using Amazon Redshift Spectrum. My data is modeled with one-to-many and many-to-many relationships. For example, consider the following SQL schema:
user (id, name)
user_phones (id, phone_type, user_id)
user_roles (id, role_type, user_id)
user_role_activities (id, type, user_role_id)
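To make the relationships concrete, here is a minimal sketch of the same model using Python's built-in sqlite3 module (not Redshift). The column types, the sample rows, and the helper names build_demo and roles_with_activities are assumptions for illustration only; the schema above lists just the column names. Foreign key enforcement is switched off explicitly so that a user_phones row can be inserted before its user row, which mirrors the out-of-order arrival described in the note below.

```python
import sqlite3

# Column types are assumptions; the original schema gives only column names.
SCHEMA = """
CREATE TABLE "user" (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE user_phones (
    id INTEGER PRIMARY KEY,
    phone_type TEXT,
    user_id INTEGER REFERENCES "user"(id)
);
CREATE TABLE user_roles (
    id INTEGER PRIMARY KEY,
    role_type TEXT,
    user_id INTEGER REFERENCES "user"(id)
);
CREATE TABLE user_role_activities (
    id INTEGER PRIMARY KEY,
    type TEXT,
    user_role_id INTEGER REFERENCES user_roles(id)
);
"""


def build_demo():
    """Create the four tables in memory and load hypothetical sample rows."""
    conn = sqlite3.connect(":memory:")
    # Declare the foreign keys but do not enforce them, so rows may arrive
    # in any order (child rows before their parent rows).
    conn.execute("PRAGMA foreign_keys = OFF")
    conn.executescript(SCHEMA)
    # The phone row is inserted before the user row it refers to.
    conn.execute("INSERT INTO user_phones VALUES (1, 'mobile', 10)")
    conn.execute("INSERT INTO user_role_activities VALUES (1, 'login', 100)")
    conn.execute("INSERT INTO user_roles VALUES (100, 'admin', 10)")
    conn.execute('INSERT INTO "user" VALUES (10, ?)', ("alice",))
    conn.commit()
    return conn


def roles_with_activities(conn):
    """Join user -> user_roles -> user_role_activities."""
    query = """
        SELECT u.name, r.role_type, a.type
        FROM "user" AS u
        JOIN user_roles AS r ON r.user_id = u.id
        JOIN user_role_activities AS a ON a.user_role_id = r.id
        ORDER BY a.id
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    print(roles_with_activities(build_demo()))
```

The JOIN works regardless of the order in which the rows were loaded, because it depends only on matching key values at query time.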
I need a good way to store this data in S3 so that I can easily expose it to Redshift through Redshift Spectrum and run JOIN queries on it.
NOTE: Data will be inserted into S3 on a scheduled basis, and Redshift should maintain the same foreign key constraints that I have in my model. Data may be inserted into S3 in any order; for example, user_phones data may arrive before the corresponding user data.
I am looking for a good approach to storing the data in S3 and querying it in Redshift.