I have a JSON file stored in S3. Its URL looks like this:

https://s3-eu-region-1.amazonaws.com/dir-resources/sample.json

But when I pass the same URL to PySpark, it does not read the file.

path = "https://s3-eu-region-1.amazonaws.com/dir-resources/sample.json"
df = spark.read.json(path)

I am able to download the file through a browser, though.

Tom J Muthirenthi

1 Answer

Spark reads from S3 through a filesystem URI rather than the HTTPS URL. Assuming that dir-resources is the name of your bucket, you should be able to access the file with the following URI:

path = "s3://dir-resources/sample.json"

In some cases, depending on your Hadoop setup, you may have to use the s3n scheme instead:

path = "s3n://dir-resources/sample.json"
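
If you only have the HTTPS URL at hand, the conversion can be sketched roughly as below. This is a minimal sketch, not an official API: to_s3_uri is a hypothetical helper name, and it assumes a path-style URL in which the first path segment is the bucket. The scheme argument is also an assumption of this sketch, so you can try "s3", "s3n" or another scheme your cluster supports.

```python
from urllib.parse import urlparse


def to_s3_uri(https_url: str, scheme: str = "s3") -> str:
    """Convert a path-style S3 HTTPS URL to a <scheme>://bucket/key URI.

    Hypothetical helper: assumes the URL path is /<bucket>/<key>.
    """
    parsed = urlparse(https_url)
    # Drop the leading slash, then split the bucket from the object key.
    bucket, _, key = parsed.path.lstrip("/").partition("/")
    if not bucket or not key:
        raise ValueError(f"expected /<bucket>/<key> in URL path: {https_url}")
    return f"{scheme}://{bucket}/{key}"


path = to_s3_uri("https://s3-eu-region-1.amazonaws.com/dir-resources/sample.json")
print(path)  # s3://dir-resources/sample.json

# With a SparkSession that has S3 access configured, you would then call:
# df = spark.read.json(path)
```

Passing scheme="s3n" gives the second form shown above. Whichever scheme you use, the cluster still needs credentials and a matching S3 filesystem connector for the read to succeed.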

carrdelling