Sorry if I'm missing something obvious here...but please take a look at this code snippet:

String readString;
String writeString = "O hai world.";
BufferedReader br = new BufferedReader(
    new InputStreamReader( 
        new ByteArrayInputStream(writeString.getBytes()),
        "UTF-8"),
    1024);
readString = br.readLine();
System.out.println("readString: " + readString);

I'd expect this to print "readString: null", since I thought the BufferedReader would hit EOF before detecting a valid EOL, but instead it prints "readString: O hai world.". This seems contrary to what the Javadoc for BufferedReader says readLine() will do:

Reads a line of text. A line is considered to be terminated by any one of a line feed ('\n'), a carriage return ('\r'), or a carriage return followed immediately by a linefeed.

Returns: A String containing the contents of the line, not including any line-termination characters, or null if the end of the stream has been reached

I can't see any reason why my string would be re-interpreted to terminate with '\n' and/or '\r'...can someone please illuminate me? Thanks!

EDIT: To provide some context, I'm trying to write JUnit tests to validate a Reader class that I wrote that's designed to read on System.in. Using ByteArrayInputStreams seemed like a reasonable way to simulate System.in (see this relevant SO post).

When my Reader captures a line, it currently relies on BufferedReader.readLine(). For my purposes, my Reader's lines MUST all have been terminated with '\n' or '\r'; encountering EOF without an EOL should not resolve into a valid line. So I guess my question(s) at this point are really as follows (I'll try to test these myself in greater detail when I have time, but hoping you smart folks can help me out):

  • Is BufferedReader.readLine() broken/misdocumented? Or is ByteArrayInputStream returning something erroneous when its byte array is exhausted?
  • Is this method of testing my Reader erroneous, and should I expect readLine() to function properly when used against System.in? I'm inclined to believe the answer to this is yes.
  • Are there better ways to simulate System.in for unit testing?
  • If I need to strictly discriminate against '\n' and '\r' when reading from an InputStream, am I better off writing my own readLine() method? I'd be very surprised if this is the case.

Thanks again!

jasterm007
  • You gave it a string, you got back a string. What's not to love ;)? – paulsm4 Jul 20 '12 at 20:43
  • Shouldn't readLine() be blocking for an EOL character? Did I magically insert one somewhere? – jasterm007 Jul 20 '12 at 20:45
  • +1 because after reading the docs, I agree with you. The readLine() documentation should also mention that a line can be terminated by EOF. – goat Jul 20 '12 at 20:47
  • It is an interesting question. I wonder if it would behave the same if read from a file with that content. – Austin Heerwagen Jul 20 '12 at 20:48
  • Thanks, @rambocoder and austin-heerwagen. I guess I'll try to dig a little deeper and see if ByteArrayInputStream is maybe returning an EOL when its byte array is exhausted, or if readLine() is actually broken/misdocumented. I'll leave the question open until I find something conclusive, or if someone else comes up with something. – jasterm007 Jul 20 '12 at 21:28
  • @njhwang: You really should have mentioned this in your original post: "I was trying to test if my Reader will successfully read a line correctly if I feed it part of a line, delay, then give it the rest of the line, including EOL". See my response below. – paulsm4 Jul 21 '12 at 16:40

5 Answers

The ByteArrayInputStream doesn't return an EOL when it's exhausted. Its read() only returns -1, which signals EOF.

The thing is that the BufferedReader buffers everything it reads from the input stream, and if EOF (-1) is encountered before any EOL character shows up, readLine() returns the string buffered up to that point. It returns null only when EOF is hit with nothing buffered.

So, if you want to be very strict, you can say that readLine() is either broken according to the current documentation or that it should be documented differently if this was the intended behavior.

In my opinion, considering that the last line in a stream doesn't have to end with an EOL character (EOF being enough), the current behavior of readLine() is correct, i.e., a line was read because EOF was encountered. So, the documentation should be changed.
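For what it's worth, this behavior is easy to check in isolation. Here is a minimal sketch (the class and method names are made up for illustration) that collects everything readLine() returns for a few in-memory inputs:

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Illustrative demo class, not part of the original question.
final class ReadLineAtEofDemo {

    // Collects every value readLine() returns before it reports end of stream (null).
    static List<String> readAllLines(String input) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(new InputStreamReader(
                new ByteArrayInputStream(input.getBytes(StandardCharsets.UTF_8)),
                StandardCharsets.UTF_8))) {
            String line;
            while ((line = br.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            // Not expected for an in-memory stream; rethrown unchecked to keep the demo short.
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        System.out.println(readAllLines("O hai world."));   // no EOL at all
        System.out.println(readAllLines("O hai world.\n")); // EOL-terminated
        System.out.println(readAllLines(""));               // empty stream
    }
}
```

The unterminated input and the '\n'-terminated input should both yield a single line, and the empty input none, which suggests readLine() treats EOF as a terminator only when something has actually been buffered.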

Pierre
Razvan
  • Thanks. Do you happen to have a suggestion for how to perform unit tests on something that's supposed to rely on System.in input? I was trying to test if my Reader will successfully read a line correctly if I feed it part of a line, delay, then give it the rest of the line, including EOL. Otherwise, I'll post a separate question. – jasterm007 Jul 20 '12 at 22:08
  • If I understand correctly, you want to simulate a user who pauses while typing something into System.in and then continues. The only way I can think of to simulate this would be to extend ByteArrayInputStream and override the read method so that it occasionally (or at least once) sleeps before returning the next byte. – Razvan Jul 20 '12 at 22:27
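A rough sketch of that idea, assuming a single pause at a fixed byte offset is enough for the test. PausingInputStream, hasPaused() and readFirstLine() are hypothetical names, not JDK classes or methods, and only the read() and read(byte[], int, int) paths are covered:

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: hands out the bytes before pauseAt, sleeps once,
// then hands out the rest.
final class PausingInputStream extends ByteArrayInputStream {
    private final int pauseAt;
    private final long pauseMillis;
    private boolean paused;

    PausingInputStream(byte[] data, int pauseAt, long pauseMillis) {
        super(data);
        this.pauseAt = pauseAt;
        this.pauseMillis = pauseMillis;
    }

    synchronized boolean hasPaused() {
        return paused;
    }

    @Override
    public synchronized int read() {
        pauseIfDue();
        return super.read();
    }

    @Override
    public synchronized int read(byte[] b, int off, int len) {
        pauseIfDue();
        if (pos < pauseAt) {
            len = Math.min(len, pauseAt - pos); // never read across the pause point
        }
        return super.read(b, off, len);
    }

    // Sleeps exactly once, the first time the read position reaches pauseAt.
    private void pauseIfDue() {
        if (!paused && pos >= pauseAt) {
            paused = true;
            try {
                Thread.sleep(pauseMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for callers
            }
        }
    }

    // Reads one line through a BufferedReader, the same way the question's code does.
    static String readFirstLine(InputStream in) {
        try {
            return new BufferedReader(
                    new InputStreamReader(in, StandardCharsets.UTF_8)).readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = "partial line\nnext".getBytes(StandardCharsets.UTF_8);
        PausingInputStream in = new PausingInputStream(data, 7, 500L);
        System.out.println(readFirstLine(in)); // spans the simulated pause
    }
}
```

Passing an instance to System.setIn(...) before constructing the class under test should make it look like a user who stops typing partway through a line.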

I would imagine that this would block if you were reading from a true stream (e.g. a network socket). But since the underlying input is an array, the reader knows that the true end of data has been reached, so blocking is unnecessary because no new data is forthcoming; blocking would be the wrong course of action. Returning null when actual data was read would also be the wrong thing to do.
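To see the blocking case with something closer to a live stream, here is a sketch using PipedInputStream/PipedOutputStream, where a writer thread sends half a line, waits, then sends the rest with an EOL. BlockingReadLineDemo and readLineAcrossDelay are illustrative names, and the 200 ms delay is arbitrary:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Illustrative demo class, not part of the original answer.
final class BlockingReadLineDemo {

    // Returns what readLine() yields when the line arrives in two pieces.
    static String readLineAcrossDelay(long delayMillis) {
        try {
            PipedOutputStream out = new PipedOutputStream();
            PipedInputStream in = new PipedInputStream(out);

            Thread writer = new Thread(() -> {
                try (PipedOutputStream o = out) {
                    o.write("O hai ".getBytes(StandardCharsets.UTF_8));
                    o.flush();                 // wake the reader for the first half
                    Thread.sleep(delayMillis); // the stream stays open, so no EOF yet
                    o.write("world.\n".getBytes(StandardCharsets.UTF_8));
                    o.flush();
                } catch (IOException | InterruptedException e) {
                    throw new IllegalStateException(e);
                }
            });
            writer.start();

            try (BufferedReader br = new BufferedReader(
                    new InputStreamReader(in, StandardCharsets.UTF_8))) {
                return br.readLine(); // blocks until the EOL arrives
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("readString: " + readLineAcrossDelay(200L));
    }
}
```

Here readLine() should not return until the "\n" shows up, because the pipe is still open and more data may follow. If the writer closed the pipe without sending an EOL, I'd expect the partial text to come back, just as with the byte array.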

Dmitry B.
  • I agree that blocking would be the wrong course of action when the ByteArrayInputStream returns EOF upon a read. But I disagree with null being the incorrect return value. The point of a BufferedReader is for it to buffer information until it is valid to return it. In this case, readLine() should have buffered "O hai world." and then returned null when no EOL was encountered, since a line was not actually read. – jasterm007 Jul 20 '12 at 21:17
  • Think of a text editor. Even if a text file doesn't have the trailing EOL, you'll still see the last line of text when you open the file. Had readLine() been implemented the way you suggest, you couldn't use it to implement a text editor. Changing the documentation to include EOF as a line-terminating condition is, IMO, what's needed here. – Dmitry B. Jul 20 '12 at 21:35
  • Mm well that seems like a use case where BufferedReader.readLine() should let you read all lines in the file but the last, whereupon it would return null since there's an incomplete line, and then you could use your favorite variant of BufferedReader.read() to get you the rest of the way. In any case, thanks for confirming that you think the documentation does not reflect reality. – jasterm007 Jul 20 '12 at 21:50
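If the strict behavior discussed in these comments is really what's needed, one option is a small wrapper rather than a different BufferedReader. A minimal sketch, assuming lines may end in '\n', '\r', or "\r\n"; StrictLineReader and remainder() are hypothetical names:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical wrapper: a line counts only if an EOL was actually seen.
final class StrictLineReader {
    private final Reader in;
    private final StringBuilder pending = new StringBuilder();
    private boolean skipLf; // true right after '\r', so a following '\n' is swallowed

    StrictLineReader(Reader in) {
        this.in = new BufferedReader(in); // buffering keeps single-char reads cheap
    }

    // Returns the next EOL-terminated line, or null if EOF arrives first.
    String readLine() throws IOException {
        int c;
        while ((c = in.read()) != -1) {
            if (skipLf) {
                skipLf = false;
                if (c == '\n') {
                    continue; // second half of a "\r\n" pair
                }
            }
            if (c == '\n' || c == '\r') {
                skipLf = (c == '\r');
                String line = pending.toString();
                pending.setLength(0);
                return line;
            }
            pending.append((char) c);
        }
        return null; // EOF before EOL: the partial text stays in 'pending'
    }

    // Text read since the last EOL, i.e. the unterminated tail (possibly empty).
    String remainder() {
        return pending.toString();
    }

    public static void main(String[] args) throws IOException {
        StrictLineReader r = new StrictLineReader(new StringReader("O\nhai\nworld."));
        String line;
        while ((line = r.readLine()) != null) {
            System.out.println("line: " + line);
        }
        System.out.println("unterminated tail: " + r.remainder());
    }
}
```

The skipLf flag avoids peeking ahead after a '\r', so the reader shouldn't block waiting for a character that may never come on an interactive stream. The unterminated tail is kept rather than thrown away, so the caller can decide what to do with it.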

I believe you want a "Robot" to emulate keystrokes for testing purposes:

This class is used to generate native system input events for the purposes of test automation, self-running demos, and other applications where control of the mouse and keyboard is needed. The primary purpose of Robot is to facilitate automated testing of Java platform implementations.

Here's an article that discusses it further:

paulsm4

What would you expect to happen with this version of your code?

String readString;
String writeString = "O\nhai\nworld.";
BufferedReader br = new BufferedReader(
    new InputStreamReader( 
        new ByteArrayInputStream(writeString.getBytes()),
        "UTF-8"),
    1024);
while (true) {
    readString = br.readLine();
    if (readString == null) break;
    System.out.println("readString: " + readString);
}
  • 1) Print "O", 2) print "hai", 3) break. Instead, it prints "O" then "hai" then "world.". – jasterm007 Jul 20 '12 at 21:53
  • If "3) break" then you would never get "world.". This data would have disappeared forever. The point of readLine() is just to break the data up into nice chunks, or lines. But you should still get all the data back, and only then get null. Makes sense? Yes, the doc could be clearer. –  Jul 20 '12 at 22:01
  • I still disagree, but it seems like I'm in the minority here. If readLine() worked like the docs say it does, then it would return null, and I would expect BufferedReader to still have the data buffered. As I said to dmitry-beransky, I feel like at that point you could use a regular read() to get the rest of the data until you hit EOF. Inconvenient for a lot of use cases, I guess, but very troublesome for the unit tests that I'm trying to build that simulate delayed System.in input. – jasterm007 Jul 20 '12 at 22:12
  • @njhwang - Have you considered that the documentation didn't "say it" ... because they *assumed* you'd simply "know" it? Let's say I'm writing a manual for your car. Step 1: put key in ignition. Is my manual "broken" because I didn't say "open the door and get in" first? I guess we could debate it ... – paulsm4 Jul 21 '12 at 16:36

The only alternative to what it does now is to throw the final incomplete line away. Not desirable.

user207421