
I am trying to split a column out of originaldata and later join it back. To do that, I added a row ID to originaldata using RANK, and then separated one column from originaldata by concatenating it with the row ID:

originaldata = load '$input' using PigStorage('$delimiter');
rankedoriginaldata = rank originaldata;
numericdata = foreach rankedoriginaldata generate CONCAT($0,$split);

But I am not able to run this statement:

numericdata = foreach rankedoriginaldata generate CONCAT($0,$split);

Command

pig -x local -f seperator.pig -param input=data/StringNum.csv -param output=OUT/Numericfile -param delimiter="," -param split='$3'

It shows the following stack trace:

Pig Stack Trace
---------------
ERROR 1066: Unable to open iterator for alias numericdata

org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias numericdata
    at org.apache.pig.PigServer.openIterator(PigServer.java:838)
    at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:696)
    at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:320)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:194)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:170)
    at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
    at org.apache.pig.Main.run(Main.java:475)
    at org.apache.pig.Main.main(Main.java:157)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:601)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
Caused by: java.io.IOException: Job terminated with anomalous status FAILED
    at org.apache.pig.PigServer.openIterator(PigServer.java:830)
    ... 12 more
================================================================================

But when I run

numericdata = foreach originaldata generate CONCAT($0,$split);

I get the expected output.

Doubt: When loading data, does the order within a tuple change? Suppose we load data such as

1,4,6
3,8,9
2,4,5

What will the ordering be? Can the values get shuffled like this?

1,6,4
8,9,3...
USB
  • I am passing a field as an argument, i.e. if I pass $3 then I get the 4th field of the tuple: CONCAT($0,$3) – USB Apr 08 '14 at 11:04
  • For people who found this post when looking for [ERROR 1066: Unable to open iterator for alias](http://stackoverflow.com/questions/34495085/error-1066-unable-to-open-iterator-for-alias-in-pig-generic-solution) here is a [generic solution](http://stackoverflow.com/a/34495086/983722). – Dennis Jaheruddin Dec 28 '15 at 15:30

1 Answer


Try casting your arguments for CONCAT to chararray first:

numericdata = foreach rankedoriginaldata generate CONCAT((chararray)$0,(chararray)$split);

I think the cast is necessary because CONCAT expects two chararrays, whereas RANK produces a long (which you pass as $0 to CONCAT).

Concerning your doubt: the order of fields within your tuples is not going to change. The order of tuples in the relation may change, however.
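Since tuple order is not guaranteed, joining back on the row ID is safer than relying on line order. A minimal sketch of that idea follows, assuming the same `$input`, `$delimiter` and `$split` parameters as in the question; the names `rowid`, `colvalue`, `joined` and `sorted` are only illustrative. Note that RANK prepends the row ID as `$0`, so positional references in the ranked relation are shifted by one compared with `originaldata`.

```pig
-- Sketch only: rowid, colvalue, joined and sorted are illustrative names.
originaldata = LOAD '$input' USING PigStorage('$delimiter');
rankedoriginaldata = RANK originaldata;   -- prepends a long row ID as $0

-- Keep the row ID as a separate field next to the split-out column,
-- instead of concatenating both into one string.
numericdata = FOREACH rankedoriginaldata GENERATE $0 AS rowid, $split AS colvalue;

-- ... process numericdata here ...

-- Join back on the row ID rather than on the position of the tuple.
joined = JOIN rankedoriginaldata BY $0, numericdata BY rowid;

-- If a specific output order matters, sort explicitly before storing.
sorted = ORDER joined BY $0;
STORE sorted INTO '$output' USING PigStorage('$delimiter');
```

If the concatenated string is still needed, the casts shown above apply in the same way to `$0` and `$split`.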

Frederic
  • Yes Fred, thanks, it works. Why do we need to cast? What about the doubt I mentioned in the question? – USB Apr 08 '14 at 11:44
  • I am using Pig version 0.11.0-cdh4.6.0, so I am able to do a multiple concat: CONCAT($0,\t,$3). For that, should I convert the arguments to chararray? – USB Apr 08 '14 at 12:03
  • Sorry, I did not get you. What does "The order of tuples in the relation may change however" mean? – USB Apr 08 '14 at 12:04
  • If you have a relation with tuples `a1,a2\r\n b1,b2\r\n c1,c2` you cannot be sure to see the lines in the same order. The order of the fields within each line will not change. – Frederic Apr 08 '14 at 12:08
  • That means the lines can be shuffled, but fields are not shuffled within a line. Am I right? And any operation done on the bag can change the order, am I right? – USB Apr 09 '14 at 04:23