I wrote a DataFrame out as CSV:

df.write.format('com.databricks.spark.csv')\
    .option("inferSchema", "true")\
    .option("delimiter", "|") \
    .save('res.csv')

The code above produces partitioned CSV output: 'res.csv' is a directory that contains the data files plus several sidecar files (._SUCCESS.crc, .part-xxxxx.crc).
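To see which files in that directory actually hold rows, here is a minimal sketch. It assumes the output directory sits on the local filesystem and follows Spark's usual layout, where data lives in files named `part-*` and everything else (`_SUCCESS`, hidden `.crc` checksums) is bookkeeping. The helper name `data_files` is hypothetical, and the demo builds a throwaway directory rather than touching the real 'res.csv'.

```python
import tempfile
from pathlib import Path


def data_files(output_dir):
    """Return the part files in a Spark output directory, sorted by name.

    Assumption: data files are named "part-*"; markers such as _SUCCESS
    and hidden checksum files such as .part-00000.crc are skipped.
    """
    return sorted(
        p for p in Path(output_dir).iterdir()
        if p.is_file() and p.name.startswith("part-")
    )


if __name__ == "__main__":
    # Demo on a temporary directory that mimics the layout described above.
    with tempfile.TemporaryDirectory() as tmp:
        for name in ["_SUCCESS", "._SUCCESS.crc", "part-00000", ".part-00000.crc"]:
            (Path(tmp) / name).write_text("a|b\n")
        print([p.name for p in data_files(tmp)])  # ['part-00000']
```

Only the files this returns would need to go to S3; the checksum and marker files are not CSV data.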

I uploaded res.csv to S3 (s3://path) and then tried to load it into Redshift with the COPY command:

copy tableName from 's3://path'
CREDENTIALS 'aws_access_key_id=************;aws_secret_access_key=***********'
delimiter '|';

But it does not work. I got the following errors from the table 'stl_load_errors':

100 1 165722 2016-07-29 21:43:42 7490 1968765 s3://r630166/res.csv.gz/._SUCCESS.crc 1 0 crc
1216 Missing newline: Unexpected character 0x63 found at location 2

100 0 165722 2016-07-29 21:43:42 7490 1968765 s3://r630166/res.csv.gz/.part-00000.crc 1 0 crc
1216 Missing newline: Unexpected character 0x63 found at location 2

The COPY command works when I point it at the exact file inside the res.csv directory that contains the data.
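Since COPY succeeds on the data file alone, one possible workaround is to name the data files explicitly in a manifest. This is a sketch, not a confirmed fix. Redshift's COPY accepts a JSON manifest of the form `{"entries": [{"url": ..., "mandatory": ...}]}` when the `MANIFEST` option is given. The helper name `build_manifest` and the assumption that data files are named `part-*` are mine, and the S3 prefix is the placeholder from the question.

```python
import json


def build_manifest(s3_prefix, file_names):
    """Build a Redshift COPY manifest listing only the data files.

    Assumption: data files are named "part-*"; everything else
    (_SUCCESS, *.crc checksums) is left out of the manifest.
    """
    prefix = s3_prefix.rstrip("/")
    entries = [
        {"url": f"{prefix}/{name}", "mandatory": True}
        for name in file_names
        if name.startswith("part-")
    ]
    return {"entries": entries}


if __name__ == "__main__":
    names = ["_SUCCESS", "._SUCCESS.crc", "part-00000", ".part-00000.crc"]
    # "s3://path/res.csv" is a placeholder, as in the question.
    print(json.dumps(build_manifest("s3://path/res.csv", names), indent=2))
```

The resulting JSON would be uploaded to S3 and the COPY command pointed at that object with the `manifest` keyword added, so the .crc files are never read.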

lokusking
user1564657
