14

I have some pretty standard code which takes in a serialized object from a stream, which basically looks like this:

  Object getObjectFromStream(InputStream is) throws IOException, ClassNotFoundException {
    ObjectInputStream ois = new ObjectInputStream(is);
    return ois.readObject();
  }

I then have a file in my resources folder, so on my development machine, I can either reference it as a File, or as a JarResource:

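  // Option 1: open the file directly from the source tree
  InputStream is = new FileInputStream("/home/.../src/main/resources/serializedObjects/testObject");
  // Option 2: load it as a class path resource
  InputStream is = this.getClass().getResourceAsStream("/serializedObjects/testObject");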

In my head, both should do the exact same thing. As it happens however, both resolve to a valid (non-null) stream, but the FileInputStream correctly returns an Object from my getObjectFromStream(InputStream) method, while the getResourceAsStream version throws this exception:

  java.io.StreamCorruptedException: invalid stream header: EFBFBDEF
    at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:800)
    at java.io.ObjectInputStream.<init>(ObjectInputStream.java:297)

Mostly, I would like to know how to fix this, but I'd also appreciate an understanding of the difference between the two InputStreams ...

barryred
  • I suggest you change the name of the file to check that it is reading the same file in both cases. Don't forget to close your stream when you have finished with it. ;) – Peter Lawrey Mar 24 '11 at 15:55
  • Are you using ant? Check out [this post](http://www.coderanch.com/t/278717/Streams/java/StreamCorruptedException-invalid-stream-header). – Bala R Mar 24 '11 at 15:56
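
Following up on the comment about closing the stream, here is a minimal sketch of how the reading method could close its streams using try-with-resources (Java 7 or later). The class name `SafeObjectReader` and the method name `readAndClose` are made up for illustration and are not from the original question.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

public class SafeObjectReader {

    // Hypothetical helper: reads one object, then closes both streams.
    // Declaring the raw stream as its own resource means it is closed even
    // when the ObjectInputStream constructor rejects the stream header.
    static Object readAndClose(InputStream is) throws IOException, ClassNotFoundException {
        try (InputStream in = is;
             ObjectInputStream ois = new ObjectInputStream(in)) {
            return ois.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        // Serialize a String into memory so the example needs no files.
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(buffer)) {
            oos.writeObject("hello");
        }
        Object restored = readAndClose(new ByteArrayInputStream(buffer.toByteArray()));
        System.out.println(restored); // prints "hello"
    }
}
```

Closing the stream does not fix the corrupted header, but it avoids leaking a file handle when the exception is thrown.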

5 Answers

18

EFBFBD is the UTF-8 encoding of the Unicode replacement character U+FFFD, so it looks like the file was passed through some encoding conversion process.

Maven is a likely suspect, especially its resource filtering feature.
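
To see where those bytes come from, here is a small sketch that decodes the standard serialization header (AC ED 00 05) as UTF-8 text and encodes it again. It assumes the conversion step behaves like a plain UTF-8 round trip, which is a simplification of whatever the build tool actually does; the class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

public class HeaderCorruptionDemo {

    // Treats binary data as UTF-8 text and writes it back out as UTF-8.
    // Byte sequences that are not valid UTF-8 are replaced with U+FFFD.
    static byte[] roundTripThroughUtf8(byte[] original) {
        String decoded = new String(original, StandardCharsets.UTF_8);
        return decoded.getBytes(StandardCharsets.UTF_8);
    }

    // Formats bytes as space-separated upper-case hex pairs.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Magic number and version that begin every Java serialization stream.
        byte[] header = {(byte) 0xAC, (byte) 0xED, 0x00, 0x05};
        System.out.println(toHex(header));                       // AC ED 00 05
        System.out.println(toHex(roundTripThroughUtf8(header))); // EF BF BD EF BF BD 00 05
    }
}
```

Neither 0xAC nor 0xED followed by 0x00 is valid UTF-8, so each is replaced by U+FFFD, which is written back as EF BF BD. The first four bytes of the result are the EFBFBDEF reported in the exception.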

axtavt
  • I am indeed using Maven's resource filtering system. I'll see if I can disable it for those files... – barryred Mar 24 '11 at 16:03
  • @barryred: See the warning at the end of http://maven.apache.org/plugins/maven-resources-plugin/examples/filter.html – axtavt Mar 24 '11 at 16:04
  • Yep - that's definitely the problem alright, removed the filtering, and it works, just have to figure out selective filtering now, but I'm 90% there. Thanks. – barryred Mar 24 '11 at 16:14
  • Git did this to some files for me as well. – Drew Noakes May 20 '12 at 11:09
7

In your case it was Maven that was messing with your files. However, I hit the same error for a different reason, so I am documenting it here, as this is the only useful search result on Google.

I was saving serialised objects as data sets for unit tests, and storing them in version control. Whether this was a good idea or not is up for debate, but another time.

The files started with:

AC ED 00 05 ...

After I stored them in Git, they became:

EF BF BD EF BF BD 00 05 ...

This causes the error:

java.io.StreamCorruptedException: invalid stream header: EFBFBDEF
    at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:782)
    at java.io.ObjectInputStream.<init>(ObjectInputStream.java:279)

Git not only changes these opening bytes, but many bytes throughout the file. It's attempting to convert between Windows and Unix style line endings. The heuristic being used to identify whether the file contains text is failing.

The solution was to add a .gitattributes file that specified some files to exclude from this processing:

*.bytes -crlf

I also ensured my .git/config file has the following:

[core]
    autocrlf = false

With those changes, I deleted the index and forced a reset:

rm .git/index
git reset      # force rescan of the index
git status     # any files listed here will experience changes
git add -u
git commit -m "Line ending normalisation changes."

Hope that helps someone out. I'm not a guru of Git, so it may be that some of these steps are not needed, but they worked for me.

Drew Noakes
6

Excluding the binary file extensions from filtering in the maven-resources-plugin configuration worked for me:

        <plugin>
          <artifactId>maven-resources-plugin</artifactId>
          <version>2.5</version>
          <configuration>
            <encoding>UTF-8</encoding>
            <nonFilteredFileExtensions>
              <nonFilteredFileExtension>xls</nonFilteredFileExtension>
              <nonFilteredFileExtension>xlsx</nonFilteredFileExtension>
              <nonFilteredFileExtension>jrxml</nonFilteredFileExtension>
              <nonFilteredFileExtension>jasper</nonFilteredFileExtension>
            </nonFilteredFileExtensions>
          </configuration>
        </plugin>
Hari
1

One issue is that Maven tries to filter everything in the resources folder. Put the binary files in a separate folder and then instruct Maven not to filter it.

<resources>
   <resource>
       <directory>${basedir}/bin</directory>
       <filtering>false</filtering>
       <includes>
           <include>**/*</include>
       </includes>
   </resource>
</resources>
Pankrates
0

There should be no difference at all -- the path you're using for getResourceAsStream() must be finding some other file. Search for other files stored as serializedObjects/testObject and see whether one turns up. Remember that a relative path given to FileInputStream is resolved against the current directory, while getResourceAsStream() resolves against the class path.
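
One way to check which file is actually being read is to print the URL the class loader resolves and test whether the data starts with the serialization header. This is only a sketch: the class name `ResourceCheck` and its helper method are made up for illustration, and the resource path is the one from the question.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectStreamConstants;
import java.net.URL;

public class ResourceCheck {

    // Returns true if the stream begins with the Java serialization
    // magic number (0xACED) followed by the stream version (5).
    static boolean startsWithSerializationHeader(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        try {
            short magic = data.readShort();
            short version = data.readShort();
            return magic == ObjectStreamConstants.STREAM_MAGIC
                    && version == ObjectStreamConstants.STREAM_VERSION;
        } catch (EOFException e) {
            // Fewer than four bytes available: not a serialization stream.
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        // Resource path taken from the question; adjust for your project.
        String name = "/serializedObjects/testObject";
        URL location = ResourceCheck.class.getResource(name);
        System.out.println("Resolved to: " + location);
        if (location != null) {
            try (InputStream in = location.openStream()) {
                System.out.println("Valid header: " + startsWithSerializationHeader(in));
            }
        }
    }
}
```

If the resolved URL points at the build output rather than the source tree and the header check fails, the file was most likely altered when it was copied during the build, not read from a different location.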

Ernest Friedman-Hill