
The following is the error message generated when I try to run a Dataflow job. More specifically, I execute the job from a template that was created by running a flow in Dataprep.

The command I am running in Cloud Shell is:

gcloud dataflow jobs run_template \
  --gcs-location gs://[bucket]/templates/sample_template \
  --parameters \
     inputLocations=gs://[bucket]/input/input_file.csv, \
     outputLocations=gs://[bucket]/output/my_output
  • Error message:

"java.lang.RuntimeException: Cannot get value for location1"

  • Detailed description of the error:

"Caused by: com.fasterxml.jackson.core.JsonParseException: Unrecognized token 'gs': was expecting ('true', 'false' or 'null') at [Source: (String)"gs://[bucket]/input_file.csv"]

So, what is the correct command to run this job?

Note:
When I used inputFile and outputFile in --parameters, as mentioned in the documentation below, it threw an error. So I used inputLocations and outputLocations instead, which resolved that error.
https://cloud.google.com/dataflow/docs/guides/templates/executing-templates#example-1-custom-template-batch-job_1

1 Answer


The --parameters flag is a dictionary type, so you should not put spaces between the comma-separated parameters. Also, the inputLocations/outputLocations parameters take objects: you have to enclose each object in curly braces {}, quote its fields with "", and escape the commas. It's quite tricky to get this to work on the CLI. You can find references in this documentation, but a complete explanation of how to make it work was provided in this Stack Overflow answer.
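For illustration, a working version of the command could look like the sketch below. It assumes the template expects a one-entry JSON object keyed by location1 for each parameter (the key named in the error message); the job name and bucket are placeholders. The subcommand for running a classic template is gcloud dataflow jobs run followed by a job name, and the ^~^ prefix is gcloud's delimiter-escaping syntax (see gcloud topic escaping): it switches the --parameters item separator from a comma to ~, so characters inside the JSON values no longer clash with the dictionary parsing, while the single quotes stop the shell from stripping the double quotes:

    # my_dataprep_job and [bucket] are placeholders; the location1 key
    # is taken from the error message in the question
    gcloud dataflow jobs run my_dataprep_job \
        --gcs-location gs://[bucket]/templates/sample_template \
        --parameters '^~^inputLocations={"location1":"gs://[bucket]/input/input_file.csv"}~outputLocations={"location1":"gs://[bucket]/output/my_output"}'

With this form, the template receives a JSON object for each location parameter, so Jackson parses {"location1": "..."} instead of choking on a bare gs:// string, which is what produced the "Unrecognized token 'gs'" error. Escaping each comma and quote by hand also works, but the delimiter prefix is usually easier to get right.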

dhauptman