Questions tagged [amazon-redshift-spectrum]

Using Amazon Redshift Spectrum, you can query and retrieve structured and semistructured data from files in Amazon S3 without having to load the data into Amazon Redshift tables. Redshift Spectrum queries employ massive parallelism to execute very fast against large datasets. Multiple clusters can concurrently query the same dataset in Amazon S3 without the need to make copies of the data for each cluster.

194 questions
0
votes
0 answers

Column names containing dots in Spectrum

I created a customers table with columns such as account_id.cust_id, account_id.ord_id, and so on. My create external table query was as follows: CREATE EXTERNAL TABLE spectrum.customers ( "account_id.cust_id" numeric, "account_id.ord_id" numeric ) row…
0
votes
2 answers

Presto equivalent for Redshift's PERCENTILE_DISC

Given a query below in Redshift: select distinct cast(joinstart_ev_timestamp as date) as session_date, PERCENTILE_DISC(0.02) WITHIN GROUP (ORDER BY join_time) over(partition by trunc(joinstart_ev_timestamp))/1000 as mini, median(join_time)…
Bhuvi007
  • 111
  • 2
  • 11
0
votes
1 answer

Can I convert CSV files sitting on Amazon S3 to Parquet format using Athena and without using Amazon EMR

I would like to convert the CSV data files that are currently sitting on Amazon S3 into Parquet format using Amazon Athena and push them back to Amazon S3 without any help from Amazon EMR. Is it possible to do this? Has anyone experienced…
0
votes
1 answer

How to create an external table for nested Parquet type in redshift spectrum

I know Redshift and Redshift Spectrum don't support nested types, but I want to know whether there is any trick to bypass that limitation and query our nested data in S3 with Redshift Spectrum. In this post the guy shows how we can do it for JSON…
Am1rr3zA
  • 5,777
  • 15
  • 66
  • 112
0
votes
1 answer

Cannot connect to aws redshift

I created a Redshift cluster in the AWS console. Then I went to the created cluster and, based on the information I got in the console, used it in SQL Workbench/J. To set up SQL Workbench/J I used the…
0
votes
2 answers

Spectrum in us-west-1 and Glue in us-west-2 is it possible?

I am using a Redshift cluster in us-west-1 (NCAL); the S3 file location is in us-west-1 (NCAL); the Glue Data Catalog is in us-west-2 (Oregon). When I try to query the table with select count(*) from spectrum_schema.table_name; I get the error below. [Code:…
0
votes
0 answers

How to specify row delimiter for Redshift Spectrum

I'm trying to mount csv files that have a CRLF as a row terminator, into Redshift Spectrum. However, it seems like I can only specify a single character as a row terminator. Does anyone know how to get around this?
0
votes
0 answers

data distribution in redshift for star schema model?

I have a big fact table with 2 billion rows and 19 dimensions (the product dimension is big at 450 million rows, another two dimensions are 100 million rows each, and the rest are small dimension tables). Can someone help me with data distribution for this scenario?
0
votes
1 answer

AWS Spectrum giving blank result for parquet files generated by AWS Glue

We are building an ETL pipeline with AWS Glue, and to optimise query performance we are storing data in Apache Parquet. Once the data is saved on S3 in Parquet format, we use AWS Spectrum to query it. We successfully tested the entire…
jimy
  • 4,652
  • 3
  • 30
  • 49
0
votes
2 answers

How to convert a varchar data type field to a timestamp with time zone type field in redshift?

I have a table where the timestamp is stored as a varchar. I need to convert it to timestamp with timezone but every time I get "Invalid Operation" error. The format of the field is: 2017-10-30 10:12:34:154 +1100 I tried the following: '2017-10-30…
0
votes
2 answers

How can I use Psycopg2 to add Partition in Redshift Spectrum

We have a Redshift Spectrum table built on top of S3 data - we are trying to automate the partition addition in this table - I can run the following ALTER statement in a redshift client or psql shell: ALTER TABLE analytics_spectrum.page_view ADD…
Hussain Bohra
  • 704
  • 7
  • 12
0
votes
1 answer

Execute COPY command on Redshift database from a Linux server outside AWS cluster

I want to load data into a Redshift database from Amazon S3 using the 'COPY' command. But I want to execute it from a shell/Perl script on a Linux machine outside the AWS cluster. I wanted to know if there is any Redshift client that can be…
-1
votes
1 answer

Lock table redshift

How can I do a real LOCK of a table when inserting in Redshift? I think it's like this, but I'm not sure, and the AWS documentation, as always, gives zero input: begin; lock table sku_stocks; insert into sku_stocks select facility_alias_id, facility_name, CAST(…
-1
votes
1 answer

How to load CDC into Redshift database?

Can anyone tell me about CDC/incremental load methods in Redshift using SQL? I know one method, upsert, but are there other methods, such as insert followed by delete, etc.?