
For some reason, adding a filter to the script below causes a couple of errors. The console output shows Failed to read data from "...", and the log contains the following:

Backend error message
---------------------
java.lang.NullPointerException
    at org.apache.pig.builtin.Utf8StorageConverter.consumeTuple(Utf8StorageConverter.java:185)
    at org.apache.pig.builtin.Utf8StorageConverter.consumeBag(Utf8StorageConverter.java:94)
    at org.apache.pig.builtin.Utf8StorageConverter.bytesToBag(Utf8StorageConverter.java:331)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POCast.getNext(POCast.java:1562)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:334)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.processInput(POUserFunc.java:228)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:282)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:416)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POCast.getNext(POCast.java:3

Pig Stack Trace
---------------
ERROR 1066: Unable to open iterator for alias limited

org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias limited
    at org.apache.pig.PigServer.openIterator(PigServer.java:838)
    at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:696)
    at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:320)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:194)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:170)
    at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
    at org.apache.pig.Main.run(Main.java:604)
    at org.apache.pig.Main.main(Main.java:157)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
Caused by: java.io.IOException: Couldn't retrieve job.
    at org.apache.pig.PigServer.store(PigServer.java:902)
    at org.apache.pig.PigServer.openIterator(PigServer.java:813)
    ... 12 more

The code that I'm using is as follows:

--- Read the input 
records = LOAD 'data' AS (id1, id2, link, tags:bag{}, dates); 

counted = FOREACH records GENERATE (chararray) id1, (int) COUNT(tags) as amountOfTags;

filtered = FILTER counted BY amountOfTags > 0;

limited = limit filtered 10;

--- Save the result 
dump limited;

Everything works fine until I add the filtered = ... line and try to dump the result.

Can anyone tell me why?

Manjunath Ballur
  • Perhaps you should show some code. – Josh M Dec 18 '13 at 23:10
  • What code would you like to see? The code that I'm using is at the bottom – Ivo Van de Grift Dec 18 '13 at 23:23
  • There might be a problem with the bag loading. Can you provide a tiny sample input file, and also run DESCRIBE counted; DUMP counted; before the FILTER? – Ruslan Dec 19 '13 at 04:14
  • Can you check if `dump counted` gives the expected result? And as Ruslan said, can you provide any sample data? – psmith Apr 30 '14 at 12:45
  • What happens when you try to dump intermediate results? For people who found this post when looking for [ERROR 1066: Unable to open iterator for alias](http://stackoverflow.com/questions/34495085/error-1066-unable-to-open-iterator-for-alias-in-pig-generic-solution), here is a [generic solution](http://stackoverflow.com/a/34495086/983722). – Dennis Jaheruddin Dec 28 '15 at 14:48
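
The checks suggested in the comments could be sketched roughly as follows. This is only a minimal sketch, assuming the same 'data' file and schema as in the question; the aliases raw and raw_sample are hypothetical names added for illustration. Since the backend trace fails inside Utf8StorageConverter.bytesToBag, loading tags as a chararray may help show what the raw text looks like before Pig tries to parse it as a bag.

```pig
-- Same load and projection as in the question.
records = LOAD 'data' AS (id1, id2, link, tags:bag{}, dates);
counted = FOREACH records GENERATE (chararray) id1, (int) COUNT(tags) AS amountOfTags;

-- Check the schema Pig infers, then look at the values before any FILTER.
DESCRIBE counted;
DUMP counted;

-- Hypothetical extra check: read tags as plain text to inspect the raw field.
raw = LOAD 'data' AS (id1, id2, link, tags:chararray, dates);
raw_sample = LIMIT raw 10;
DUMP raw_sample;
```

If DUMP counted fails with the same NullPointerException, the problem is more likely in how the tags field is parsed than in the FILTER itself.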
