
Running the following Pig script fails with `ERROR 2017: Internal error creating job configuration.`

data = LOAD 'info.txt' USING PigStorage();

name_col_one = FOREACH data GENERATE $0 AS timeStamp, $1 AS one, $2 AS two, $3 AS info, $4 AS four, $5 AS five, $6 AS six, $7 AS seven, $8 AS eight, $9 AS nine, $10 AS ten, $11  AS eleven;

process_col_one = FOREACH name_col_one GENERATE FLATTEN(STRSPLIT(timeStamp,'\\s+',2)) AS (time:chararray, date:chararray), one, two;

new_timestamp = FOREACH process_col_one GENERATE CONCAT(date,CONCAT(' ',time)), one, two;

sys_info = FOREACH name_col_one GENERATE info;

split_  = FOREACH sys_info GENERATE REPLACE(info, '\\[', '') AS new_split;
split_again  = FOREACH split_ GENERATE REPLACE(new_split, ']', '\t') AS final_split;

others = FOREACH name_col_one GENERATE four, five, six, seven, eight, nine, ten, eleven;

r1 = RANK new_timestamp;
r2 = RANK split_again;
r3 = RANK others;

final = JOIN r1 BY rank_new_timestamp, r2 BY rank_split_again;
DUMP final;

Sample data in info.txt:

23:58:19 02/23/2015 good 1042559519 [Linux][Baseline][lrtp2nosqlprod1][FileSystem][/tmp] FileSystems/tmp\Use%=1% 9:5603 0 1

23:58:15 02/23/2015 good 1042559519 [Linux][Baseline][lrtp2nosqlprod1][FileSystem][/boot] FileSystems/boot\Use%=37% 3:5603 0 37

23:58:15 02/23/2015 good 1042559537 [Linux][Baseline][lrtp2nosqlprod1][Process][srmclient][SiSExclude] running 3:5599 running true no data 1 0 0

23:58:15 02/23/2015 good 1042559537 [Linux][Baseline][lrtp2nosqlprod1][Process][OSWatcher][SiSExclude] running, 2 processes 4:5599 running true no data 2 0 0

The relation new_timestamp swaps the time and date parts of the timestamp from the input data; split_again removes the square brackets in $3 and separates the values with '\t'.
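To check the field names that the JOIN relies on, here is a minimal sketch, assuming the script above has already been entered in the same Grunt session. RANK prepends a field named rank_ followed by the alias of the relation being ranked, so DESCRIBE should show which names are available as join keys.

```pig
-- Print the schemas of the ranked relations.
-- The prepended rank field is expected to be named rank_<alias>.
DESCRIBE r1;
DESCRIBE r2;
```

If the first field of each schema is rank_new_timestamp and rank_split_again respectively, the join keys in the script match the schemas.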

Pig Stack Trace
---------------
ERROR 2017: Internal error creating job configuration.

org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias final
    at org.apache.pig.PigServer.openIterator(PigServer.java:880)
    at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:774)
    at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:372)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:198)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:173)
    at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:69)
    at org.apache.pig.Main.run(Main.java:541)
    at org.apache.pig.Main.main(Main.java:156)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Caused by: org.apache.pig.PigException: ERROR 1002: Unable to store alias final
    at org.apache.pig.PigServer.storeEx(PigServer.java:982)
    at org.apache.pig.PigServer.store(PigServer.java:942)
    at org.apache.pig.PigServer.openIterator(PigServer.java:855)
    ... 12 more
Caused by: org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobCreationException: ERROR 2017: Internal error creating job configuration.
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.getJob(JobControlCompiler.java:873)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.compile(JobControlCompiler.java:298)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:190)
    at org.apache.pig.PigServer.launchPlan(PigServer.java:1322)
    at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1307)
    at org.apache.pig.PigServer.storeEx(PigServer.java:978)
    ... 14 more
Caused by: java.lang.NullPointerException
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.getJob(JobControlCompiler.java:817)
    ... 19 more
================================================================================

Any help is welcome. Thanks in advance.

Baxiz
  • Edit your post to add a sample of your input data and the complete stack trace please. Which version of Pig are you using? – Balduz Jun 05 '15 at 07:45
  • I can DUMP r1, r2 and r3. I get the error only when I dump final. – Baxiz Jun 05 '15 at 08:21
  • ok but can you please add it? And which version of Pig are you using? We need more information to be able to help you. – Balduz Jun 05 '15 at 08:23
  • It works for me using 0.14.0. Back in Pig 0.12.0, when Pig couldn't find a file, instead of showing a decent error message it displayed just what you posted... Are you sure you are using the correct path in the HDFS? If you type in your console `hadoop fs -cat info.txt` you see the contents of the file? – Balduz Jun 05 '15 at 08:47
  • Yes, I can see the file – Baxiz Jun 05 '15 at 08:50
  • Must be a bug that was fixed in 0.13 or 0.14 then. We will need to check the stack trace to see where is the error coming from. Can you please add it? – Balduz Jun 05 '15 at 09:00
  • [This bug](https://issues.apache.org/jira/browse/PIG-3985) seems to be the problem. Try running the script with multiquery disabled: `pig -no_multiquery -f your_script.pig` – Balduz Jun 05 '15 at 09:53
  • The mapReduceLauncher is stuck at 97% since then :P – Baxiz Jun 05 '15 at 10:35
  • did it end? Did it work? – Balduz Jun 05 '15 at 11:50
  • `final = JOIN r1 BY rank_new_timestamp, r2 BY rank_split_again;` Is it `rank_new_timestamp` or `new_timestamp`, and `rank_split_again` or `split_again`? – Aman Jun 05 '15 at 12:20
  • It is still running @Balduz It is rank_new_timestamp – Baxiz Jun 05 '15 at 12:23
  • For people who found this post when looking for [ERROR 1066: Unable to open iterator for alias](http://stackoverflow.com/questions/34495085/error-1066-unable-to-open-iterator-for-alias-in-pig-generic-solution) here is a [generic solution](http://stackoverflow.com/a/34495086/983722). – Dennis Jaheruddin Dec 28 '15 at 14:38

1 Answer


This problem has been reported before (https://issues.apache.org/jira/browse/PIG-3469) and has since been fixed, so try upgrading to the latest version of Pig.

It can also sometimes be fixed by specifying the full path to the input data file, e.g. '/home/user/doc/info.txt'.
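As a rough sketch of the path suggestion, the LOAD statement could be changed as below. The path is only the example from this answer and is an assumption about where info.txt actually lives; the `first_rows` alias is a hypothetical name used for illustration.

```pig
-- Example path from the answer; replace it with the real location of info.txt.
data = LOAD '/home/user/doc/info.txt' USING PigStorage();

-- Dump a handful of rows to confirm the file is readable
-- before running the RANK and JOIN steps.
first_rows = LIMIT data 5;
DUMP first_rows;
```

If this DUMP succeeds, the input path is resolved and any remaining failure comes from a later step in the script.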

jasonC