
I have a subclass (child) that extends a superclass (parent). I want to declare a general type for the Mapper's input value so that both the parent and the child are accepted as valid values, like this:

public static class MyMapper extends Mapper<..., MyParentClass, ..., ...>

I want MyChildClass, which extends MyParentClass, to be valid as well.

However, when I run the program and the value is an instance of the child class, I get an exception:

type mismatch in value from map: expected MyParentClass, recieved MyChildClass

How can I make both the child and the parent classes valid input/output values for the mapper?

Update:

package hipi.examples.dumphib;

import hipi.image.FloatImage;
import hipi.image.ImageHeader;
import hipi.imagebundle.mapreduce.ImageBundleInputFormat;
import hipi.util.ByteUtils;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

import java.io.IOException;
import java.util.Iterator;

public class DumpHib extends Configured implements Tool {

  public static class DumpHibMapper extends Mapper<ImageHeader, FloatImage, IntWritable, Text> {

    @Override
    public void map(ImageHeader key, FloatImage value, Context context) throws IOException, InterruptedException  {

      String outputStr = null;

      if (key == null) {
        outputStr = "Failed to read image header.";
      } else if (value == null) {
        outputStr = "Failed to decode image data.";
      } else {
        // Read the dimensions only after the null checks to avoid a NullPointerException.
        int imageWidth = value.getWidth();
        int imageHeight = value.getHeight();
        String camera = key.getEXIFInformation("Model");
        String hexHash = ByteUtils.asHex(ByteUtils.FloatArraytoByteArray(value.getData()));
        outputStr = imageWidth + "x" + imageHeight + "\t(" + hexHash + ")\t  " + camera;
      }

      context.write(new IntWritable(1), new Text(outputStr));
    }

  }

  public static class DumpHibReducer extends Reducer<IntWritable, Text, IntWritable, Text> {

    @Override
    public void reduce(IntWritable key, Iterable<Text> values, Context context) throws IOException, InterruptedException {
      for (Text value : values) {
        context.write(key, value);
      }
    }

  }

  public int run(String[] args) throws Exception {

    if (args.length < 2) {
      System.out.println("Usage: dumphib <input HIB> <output directory>");
      System.exit(0);
    }

    Configuration conf = new Configuration();

    Job job = Job.getInstance(conf, "dumphib");

    job.setJarByClass(DumpHib.class);
    job.setMapperClass(DumpHibMapper.class);
    job.setReducerClass(DumpHibReducer.class);

    job.setInputFormatClass(ImageBundleInputFormat.class);
    job.setOutputKeyClass(IntWritable.class);
    job.setOutputValueClass(Text.class);

    String inputPath = args[0];
    String outputPath = args[1];

    removeDir(outputPath, conf);

    FileInputFormat.setInputPaths(job, new Path(inputPath));
    FileOutputFormat.setOutputPath(job, new Path(outputPath));

    job.setNumReduceTasks(1);

    return job.waitForCompletion(true) ? 0 : 1;

  }

  private static void removeDir(String path, Configuration conf) throws IOException {
    Path output_path = new Path(path);
    FileSystem fs = FileSystem.get(conf);
    if (fs.exists(output_path)) {
      fs.delete(output_path, true);
    }
  }

  public static void main(String[] args) throws Exception {
    int res = ToolRunner.run(new DumpHib(), args);
    System.exit(res);
  }

}

FloatImage is a superclass, and I have a ChildFloatImage class that extends it. When the RecordReader returns a ChildFloatImage, the exception above is thrown.

Mosab Shaheen
  • Please post your mapper code if you can. – Amit Feb 10 '17 at 18:26
  • @Amit Could you check the code above. You can check also on any mapper using simple types like "Text" class and one class that extends it, and you will see that when the child class is returned an exception will be thrown. – Mosab Shaheen Feb 11 '17 at 08:22
  • Could you try using "? extends FloatImage" as your Generic type definition. Also I think the answer below will help you understand the Generic types and their usages. Here is one more resource for Generics and Inheritance understanding - https://docs.oracle.com/javase/tutorial/java/generics/inheritance.html – Amit Feb 11 '17 at 15:45
  • @Amit Dear If you write "Mapper<..., ? extends FloatImage, ...>" it will give you a compile error. Please try your suggestion on a working example and kindly inform me if it is working. – Mosab Shaheen Feb 12 '17 at 21:19
  • @Amit I answered below. Pls. take a look. – Mosab Shaheen Feb 23 '17 at 09:16

2 Answers


Background

The reason for this is that type erasure makes it impossible for Java to (at runtime) check that your MyMapper actually extends the correct type (in terms of the generic type parameters on Mapper).

Java basically compiles:

List<String> list = new ArrayList<String>();
list.add("Hi");
String x = list.get(0);

into

List list = new ArrayList();
list.add("Hi");
String x = (String) list.get(0);

Credits for this example go here.
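
To make the erasure point concrete, here is a minimal sketch (the class name ErasureDemo is made up for illustration) showing that two lists with different type arguments share a single runtime class, so the type arguments cannot be inspected at runtime:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of the code in the question.
class ErasureDemo {

    // Returns true when both lists have the same runtime class, i.e. the
    // type arguments String and Integer are not visible at runtime.
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<String>();
        List<Integer> numbers = new ArrayList<Integer>();
        return strings.getClass() == numbers.getClass();
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass()); // prints "true"
    }
}
```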

So you are passing in MyMapper where Java wants to see a Mapper<A, B, C, D> with specific A, B, C and D, which cannot be checked at runtime. So we have to force that check at compile time.

Solution

For all your custom subclasses, take the call you have now:

job.setMapperClass(DumpHibMapper.class);

and change it to use java.lang.Class#asSubclass, like this:

job.setMapperClass(DumpHibMapper.class.asSubclass(Mapper.class));
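
For reference, a small sketch of what Class#asSubclass itself does, using made-up Parent and Child classes rather than the Hadoop types: it returns the same Class object with a narrower static type, and throws ClassCastException when the class is not a subclass of the argument.

```java
// Hypothetical demo class; Parent and Child stand in for any class hierarchy.
class AsSubclassDemo {

    static class Parent {}

    static class Child extends Parent {}

    // asSubclass returns the very same Class object, only typed as Class<? extends Parent>.
    static boolean returnsSameClassObject() {
        Class<? extends Parent> narrowed = Child.class.asSubclass(Parent.class);
        return narrowed == Child.class;
    }

    // For an unrelated class the call fails at runtime with ClassCastException.
    static boolean rejectsUnrelatedClass() {
        try {
            String.class.asSubclass(Parent.class);
            return false;
        } catch (ClassCastException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(returnsSameClassObject());
        System.out.println(rejectsUnrelatedClass());
    }
}
```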
DeiDei
Armin Braun
  • Thanks for reply, but actually the exception: "type mismatch in value from map: expected FloatImage, recieved MyChildFloatImage" is related to the "FloatImage" not to the "DumpHibMapper". So I don't think we should fix "DumpHibMapper" rather we should make the IS-A (Child/parent) relationship, related to "FloatImage", accepted. What do you suggest dear? – Mosab Shaheen Feb 12 '17 at 21:24
  • I answered below. Pls. take a look. – Mosab Shaheen Feb 23 '17 at 09:16

The solution I followed is to create a container/wrapper class that delegates all the required functions to the original object, as follows:

import hipi.image.FloatImage;

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import org.apache.hadoop.io.BinaryComparable;
import org.apache.hadoop.io.RawComparator;
import org.apache.hadoop.io.Writable;

// Wraps a FloatImage (or any subclass of it) and delegates comparison and serialization to it.
public class FloatImageContainer implements Writable, RawComparator<BinaryComparable> {

    private FloatImage floatImage;

    public FloatImage getFloatImage() {
        return floatImage;
    }

    public void setFloatImage(FloatImage floatImage) {
        this.floatImage = floatImage;
    }

    // No-argument constructor, needed so Hadoop can instantiate the value before calling readFields.
    public FloatImageContainer() {
        this.floatImage = new FloatImage();
    }

    public FloatImageContainer(FloatImage floatImage) {
        this.floatImage = floatImage;
    }

    @Override
    public int compare(BinaryComparable o1, BinaryComparable o2) {
        // Delegate object comparison to the wrapped image.
        return floatImage.compare(o1, o2);
    }

    @Override
    public int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        // Delegate raw byte comparison to the wrapped image.
        return floatImage.compare(b1, s1, l1, b2, s2, l2);
    }

    @Override
    public void write(DataOutput out) throws IOException {
        // Serialize the wrapped image.
        floatImage.write(out);
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        // Deserialize into the wrapped image.
        floatImage.readFields(in);
    }

}

And in the Mapper:

public static class MyMapper extends Mapper<..., FloatImageContainer, ..., ...> {

In this case both FloatImage and ChildFloatImage can be encapsulated in FloatImageContainer, and you avoid the inheritance problem in Hadoop because only one class, FloatImageContainer, is used directly, and it is neither a parent nor a child of any other class.
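
A stripped-down sketch of the same idea that runs without Hadoop or HIPI. Image, ChildImage and ImageContainer are made-up stand-ins for FloatImage, ChildFloatImage and FloatImageContainer, and passesExactCheck is only my assumption of how the framework compares the runtime class of a value with the configured class (an exact match, not an instanceof test):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical demo class; none of these names come from Hadoop or HIPI.
class ContainerSketch {

    // Stand-in for FloatImage: serializes a single field.
    static class Image {
        int width;
        void write(DataOutput out) throws IOException { out.writeInt(width); }
        void readFields(DataInput in) throws IOException { width = in.readInt(); }
    }

    // Stand-in for ChildFloatImage.
    static class ChildImage extends Image {}

    // Stand-in for FloatImageContainer: delegates serialization to the wrapped image.
    static final class ImageContainer {
        private final Image image;
        ImageContainer(Image image) { this.image = image; }
        Image getImage() { return image; }
        void write(DataOutput out) throws IOException { image.write(out); }
        void readFields(DataInput in) throws IOException { image.readFields(in); }
    }

    // Assumed model of the framework's check: exact class equality.
    static boolean passesExactCheck(Object value, Class<?> declared) {
        return value.getClass() == declared;
    }

    // Writes a wrapped ChildImage and reads it back through a fresh container.
    static int roundTripWidth(int width) {
        try {
            ChildImage child = new ChildImage();
            child.width = width;
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            new ImageContainer(child).write(new DataOutputStream(bytes));

            ImageContainer restored = new ImageContainer(new Image());
            restored.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
            return restored.getImage().width;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Under that assumption a bare ChildImage fails the check against Image, while the container always passes because its runtime class never changes. Note that the restored container in roundTripWidth holds a plain Image, so only fields the parent knows how to read are recovered.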

Mosab Shaheen
  • This looks good at first glance, and I was able to get my mapper working: it happily writes the data for the reducer. But when there is more than one child class and the children have additional properties, will the default constructor still work when reading the fields back in a reducer? The container has no idea which child it might be at runtime, so deserialization would read back only the parent's properties and miss the child's. Please let me know your thoughts. – Gyanendra Dwivedi Jun 22 '17 at 22:54