Cannot copy CSV to Redshift

Question

I converted a DataFrame to CSV:

df.write.format('com.databricks.spark.csv')\
    .option("inferSchema", "true")\
    .option("delimiter", "|") \
    .save('res.csv')

The code above generates partitioned CSV output: 'res.csv' is a directory, not a single file, and it contains several files, including hidden checksum files (._SUCCESS.crc, .part-xxxxx.crc).
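
For illustration, here is a minimal sketch of how the entries in such an output directory could be told apart, assuming the directory is on the local filesystem. The helper name `split_spark_output` is hypothetical. It relies only on the usual naming convention: data in `part-*` files, checksums in hidden `.crc` files, plus a `_SUCCESS` marker.

```python
import os


def split_spark_output(directory):
    """Split the entries of a Spark output directory into data and sidecar files.

    Hypothetical helper: assumes data files are named part-* and that
    everything else (_SUCCESS, hidden .crc checksums) is metadata.
    """
    data_files, sidecar_files = [], []
    for name in sorted(os.listdir(directory)):
        if name.startswith("part-"):
            data_files.append(name)
        else:
            sidecar_files.append(name)
    return data_files, sidecar_files
```

Only the names in the first returned list hold rows. The sidecar files are binary or empty and are not valid delimited text.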

I uploaded res.csv to S3 (s3://path) and then tried to load it into Redshift using the copy command:

copy tableName from 's3://path'
CREDENTIALS 'aws_access_key_id=************;aws_secret_access_key=***********'
delimiter '|';

But it does not work. I found the following errors in the table 'stl_load_errors':

100 1 165722 2016-07-29 21:43:42 7490 1968765 s3://r630166/res.csv.gz/._SUCCESS.crc 1 0 crc
1216 Missing newline: Unexpected character 0x63 found at location 2

100 0 165722 2016-07-29 21:43:42 7490 1968765 s3://r630166/res.csv.gz/.part-00000.crc 1 0 crc
1216 Missing newline: Unexpected character 0x63 found at location 2

The copy command works when I point it at the exact file inside the res.csv directory that contains the data.
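
Since the load works when the exact data file is named, one option that might apply here is a Redshift manifest, which lists the objects to load explicitly instead of relying on a path prefix. The sketch below only builds the manifest JSON. The bucket name, the object keys, and the `build_manifest` helper are placeholders for illustration, not values from this setup.

```python
import json


def build_manifest(bucket, keys):
    """Return Redshift manifest JSON naming each S3 object to load.

    Hypothetical helper: `bucket` and `keys` are placeholders supplied
    by the caller, and only the listed objects appear in the manifest.
    """
    entries = [
        {"url": "s3://{}/{}".format(bucket, key), "mandatory": True}
        for key in keys
    ]
    return json.dumps({"entries": entries}, indent=2)


# Example with made-up names: list only the data files, not the .crc files.
print(build_manifest("my-bucket", ["res.csv/part-00000", "res.csv/part-00001"]))
```

The resulting file would be uploaded to S3 and referenced in the copy command together with the `manifest` option, so the checksum files are never read.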

When I first generated the CSV from the DataFrame, I thought res.csv would be a file. Is there a way to convert a DataFrame to a normal, single CSV file? — user1564657, Jul 29 '16 at 21:59
Does this help? [how to export a table dataframe in pyspark to csv?](https://stackoverflow.com/questions/31385363/how-to-export-a-table-dataframe-in-pyspark-to-csv) — John Rotenstein, Jul 30 '16 at 06:07
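
If a single ordinary CSV file is the goal, one possible approach (a sketch, not necessarily what the linked question recommends) is to concatenate the part files after Spark has written them. This assumes the output directory is on the local filesystem and that no header rows were written. The function name `merge_part_files` is made up for illustration.

```python
import glob
import os
import shutil


def merge_part_files(directory, destination):
    """Concatenate the part-* files in `directory` into one file.

    Assumes the part files have no header rows. Hidden .crc files and
    the _SUCCESS marker are skipped because they do not match part-*.
    Returns the number of part files that were merged.
    """
    part_paths = sorted(glob.glob(os.path.join(directory, "part-*")))
    with open(destination, "wb") as out:
        for path in part_paths:
            with open(path, "rb") as part:
                shutil.copyfileobj(part, out)
    return len(part_paths)
```

Within Spark itself, calling `coalesce(1)` on the DataFrame before writing is commonly used to get a single part file, although the output is still a directory containing that file.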
