Questions tagged [amazon-redshift-spectrum]

Using Amazon Redshift Spectrum, you can query and retrieve structured and semistructured data from files in Amazon S3 without having to load the data into Amazon Redshift tables. Redshift Spectrum queries employ massive parallelism to execute very fast against large datasets. Multiple clusters can concurrently query the same dataset in Amazon S3 without the need to make copies of the data for each cluster.

194 questions
0
votes
0 answers

How to store one-to-many entities data in S3 for Amazon Redshift Spectrum

My requirement is to store data in S3 and query that data using Amazon Redshift Spectrum. My data is modeled with one-to-many and many-to-many relationships. For example, consider the following SQL schema: user (id, name) user_phoes (id, phone_type,…
0
votes
1 answer

Unable to create external schema for Amazon Redshift Spectrum

I am trying to follow https://docs.aws.amazon.com/redshift/latest/dg/c-getting-started-using-spectrum.html to query S3 usage from Redshift via Athena. I am running into an error when attempting to create the schema in Step 3: "create external schema…
0
votes
1 answer

How to create an external table in Redshift Spectrum when the file location changes every day?

We are planning to source data from another AWS account's S3 bucket using Amazon Redshift Spectrum. However, the source team has informed us that the bucket key will change every day and that the latest data will be available at the key location with the latest timestamp. Can anyone…
Rajib Kar
0
votes
1 answer

SQL query on redshift to get the first and the last value

I have a data set like this. I need to write a query which gives me the below output for every SessionID and VisitID, it should sort based on the date_time Column and provide me with the First Category and the Last Category. I have used the…
0
votes
1 answer

how to specify s3 config for spectrify python package?

how to specify this s3_config object for python spectrify package ? from spectrify.export import RedshiftDataExporter RedshiftDataExporter(sa_engine, s3_config).export_to_csv('my_table')
0
votes
1 answer

Can an external table be created in Redshift in specific directories?

I have created an external table that reads the files of all the folders that are in the specified path using the following script: CREATE EXTERNAL TABLE spectrum.eventos_ne9 ( event_date varchar(300), event_timestamp varchar(300), event_name…
0
votes
0 answers

Redshift spectrum timestamp column issues

I have a few files in S3. I used the Glue Data Catalog to get the table definition. I have a field called log_time, and I manually set its data type to timestamp in the Glue catalog. Now when I query that table from Athena I can see the timestamp values correctly.…
0
votes
1 answer

Amazon Spectrum incremental load directly from string

I have taken a field such as 'filename Pro_180913_171842' from Spectrum. I tried a function in SQL like `select fields from spectrum.ex where cast(SPLIT_PART('filename Pro_180913_171842','Pro_',2)as …
0
votes
0 answers

Unloading & reloading data between S3 and Redshift with schema changes

I'm interested in setting up some automated jobs that will periodically export data from our Redshift instance and store it on S3, where ideally it will then be bubbled back up into Redshift via an external table running in Redshift Spectrum. One…
0
votes
1 answer

Redshift Spectrum Query - Request ran out of memory in the S3 query layer

I am trying to execute a query that groups on 26 columns. The data is stored in S3 in Parquet format, partitioned by day. The Redshift Spectrum query returns the error below. I have not been able to find any relevant AWS documentation on this. Request…
0
votes
1 answer

Amazon Redshift C# client to query data without ODBC/JDBC

Is there any way to fetch data from Amazon Redshift from C# without using JDBC/ODBC drivers?
Vinod Kumar
0
votes
1 answer

How to identify a person or id if it contains more than one row for a different column in SQL

I have a table in which a person contains same values multiple times in another column. For example: person product portal count indicator ----------------------------------------------- 1 10 5 2 y …
0
votes
1 answer

AWS Glue: How to ETL non-scalar JSON with varying schemas

Objective I have an S3 folder full of json files with varying schemas, including arrays (a dynamodb backup, as it happens). However, while the schemas vary, all files contain some common elements, such as 'id' or 'name', as well as nested arrays of…
0
votes
1 answer

Spectrum Same External Table Shows in Multiple Schemas (svv_external_tables)

It's a really simple test actually. I create a couple external schemas and create an external table in one of the schemas and then querying svv_external_tables shows the table exists in ALL schemas!! What am I missing? create external schema…
0
votes
1 answer

how to view data catalog table in S3 using redshift spectrum

I created an external schema for my database in AWS Glue. I can see the list of tables, but I cannot look into the JSON data. Redshift throws this error: [Amazon](500310) Invalid operation: S3 Query Exception (Fetch) Details: …
beni