I'm getting an error in the Cloudera QuickStart VM I downloaded from http://www.cloudera.com/content/cloudera-content/cloudera-docs/DemoVMs/Cloudera-QuickStart-VM/cloudera_quickstart_vm.html.
I am trying a toy example from Tom White's Hadoop: The Definitive Guide book called map_temp.pig
, which "finds the maximum temperature by year".
I created a file called temps.txt
that contains (year, temperature, quality) entries on each line:
1950 0 1
1950 22 1
1950 -11 1
1949 111 1
Using the example code in the book, I typed the following Pig code into the Grunt terminal:
records = LOAD '/home/cloudera/Desktop/temps.txt'
AS (year:chararray, temperature:int, quality:int);
DUMP records;
After I typed DUMP records;
, I got the error:
2014-05-22 11:33:34,286 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open iterator for alias records. Backend error : org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application with id 'application_1400775973236_0006' doesn't exist in RM.
…
Details at logfile: /home/cloudera/Desktop/pig_1400782722689.log
I attempted to find out what was causing the error through a Google search: https://www.google.com/search?q=%22application+with+id%22+%22doesn%27t+exist+in+RM%22
.
The results there weren't helpful. For example, http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-troubleshoot-error-vpc.html mentioned this bug and said "To solve this problem, you must configure a VPC that includes a DHCP Options Set whose parameters are set to the following values..."
Amazon's suggested fix doesn't seem to be the problem because I'm not using using AWS.
EDIT:
I think the HDFS file path is correct.
[cloudera@localhost Desktop]$ ls
Eclipse.desktop gnome-terminal.desktop max_temp.pig temps.txt
[cloudera@localhost Desktop]$ pwd
/home/cloudera/Desktop