0

I created external schema for my database in aws glue. I can see the list of table but I cannot look into the json data. redshift throws me this errors.

[Amazon](500310) Invalid operation: S3 Query Exception (Fetch)
Details: 
 -----------------------------------------------
  error:  S3 Query Exception (Fetch)
  code:      15001
  context:   Task failed due to an internal error. Error occured during Ion/JSON extractor match: IERR_INVALID_SYNTAX

  query:     250284
  location:  dory_util.cpp:717
  process:   query2_124_250284 [pid=12336]
  -----------------------------------------------;
1 statement failed.

I dont want to create external tables because I will create a view combining the external tables in the data catalog in aws glue.

Just an update:

I used aws glue crawler in creating the tables in the data catalog. They are in json format. If I use a job that will upload this data in redshift they are loaded as flat file (except arrays) in redshift table.

Example of json data:

{
  "array": [
    1,
    2,
    3
  ],
  "boolean": true,
  "null": null,
  "number": 123,
  "object": {
    "a": "b",
    "c": "d",
    "e": "f"
  },
  "string": "Hello World"
}

If I upload them using a job in aws glue the output will be like (as table)

see image

Now, I have trmendous amount of tables crawled in data catalog. I am struggling creating the individual script of this tables that is why an amazon redshift spectrum external schema can be helpful.

However when I query the external table in the external schema I am getting the error as posted above. I do not encounter problems with external tables from the data catalog if they are loaded as csv but the format files I need to read in redshift spectrum should be in json.

Is it possible to view the external table in redshift spectrum the same format when it is loaded using a job?

beni
  • 83
  • 3
  • 10
  • try to share a lot more info to get help on this. include your sql, your definition etc – Jon Scott Jun 05 '18 at 16:47
  • thanks I edited my post – beni Jun 06 '18 at 05:07
  • I have the same problem but with parquet format – DJo Jul 02 '18 at 10:46
  • beni, The errors thrown by Redshift Spectrum may not always be accurate. I can only confirm the querying with JSON should work similar to other data formats. I will suggest to refer [this][https://forums.aws.amazon.com/thread.jspa?threadID=287753&tstart=0] – SunSmiles Feb 12 '19 at 20:10

1 Answers1

0

beni,
The errors thrown by RedShift Spectrum may not always be accurate. I can only confirm the querying with JSON should work similar to other data formats. By the way, the external table needs to be corrected through SQL client within spectrum database.

So, I will suggest to refer this and this to review your steps

SunSmiles
  • 186
  • 9