I converted a DataFrame to CSV:
df.write.format('com.databricks.spark.csv')\
.option("inferSchema", "true")\
.option("delimiter", "|") \
.save('res.csv')
The above code generates a partitioned CSV output: 'res.csv' is a directory that contains several files, including checksum files (._SUCCESS.crc, .part-xxxxx.crc).
I uploaded res.csv to S3 (s3://path) and then tried to load it into Redshift using the COPY command:
copy tableName from 's3://path'
CREDENTIALS 'aws_access_key_id=************;aws_secret_access_key=***********'
delimiter '|'
But it does not work. I got the following errors from the table 'stl_load_errors':
100 1 165722 2016-07-29 21:43:42 7490 1968765 s3://r630166/res.csv.gz/._SUCCESS.crc 1 0 crc 1216 Missing newline: Unexpected character 0x63 found at location 2
100 0 165722 2016-07-29 21:43:42 7490 1968765 s3://r630166/res.csv.gz/.part-00000.crc 1 0 crc 1216 Missing newline: Unexpected character 0x63 found at location 2
The COPY command works when I point it to the exact file inside the res.csv directory that contains the data.
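To make that workaround concrete, here is a minimal sketch of picking out only the data files from the output directory, so that the checksum and marker files never reach S3. It is not part of my actual pipeline. It assumes the res.csv directory is on the local filesystem and that Spark names the data files part-xxxxx; the function name data_part_files is made up for illustration.

```python
import os


def data_part_files(directory):
    """Return paths of Spark data files in `directory`.

    Hypothetical helper: skips the _SUCCESS marker and the hidden
    .crc checksum files (._SUCCESS.crc, .part-xxxxx.crc).
    """
    selected = []
    for name in sorted(os.listdir(directory)):
        # Hidden files and .crc sidecars hold checksums, not CSV rows.
        if name.startswith(".") or name.endswith(".crc"):
            continue
        # Keep only files that follow the assumed part-xxxxx naming.
        if name.startswith("part-"):
            selected.append(os.path.join(directory, name))
    return selected
```

Calling data_part_files('res.csv') should then list only the files whose contents are pipe-delimited rows, which are the ones the COPY command accepted when I pointed at them directly.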