33

This is really a curiosity more than a problem...

Why doesn't the Scanner class have a nextChar() method? It seems like it should, considering that it has methods such as next(), nextInt(), and nextLine().

I realize you can simply do the following:

char userChar = in.next().charAt(0); // first character of the next whitespace-delimited token
System.out.println(userChar);

But why not have a nextChar() method?

Katana24
  • As a workaround you could try `next(".")`. – Joachim Sauer Sep 11 '13 at 16:11
  • Looking at the source, it appears that `next(".")` will skip over delimiters first (whitespace by default). Maybe that's what is desired. If not, you'll need to use `sc.delimiter()` to save the current delimiter pattern, `sc.useDelimiter(??)` to set it to something that won't match anything (maybe an empty pattern, but I haven't tested it); _then_ `next(".")`; then `sc.useDelimiter` to restore the previous delimiter. – ajb Sep 11 '13 at 16:47
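
The save, swap, and restore sequence described in the comment above can be sketched as follows. This is a minimal sketch, not a tested recipe from the commenters: `nextRawChar` is a hypothetical helper name, and it assumes that an empty delimiter pattern makes `Scanner` treat every character (whitespace included) as its own token. It calls `next()` rather than `next(".")` because `.` does not match line terminators by default.

```java
import java.util.Scanner;
import java.util.regex.Pattern;

public class RawCharDemo {
    // Hypothetical helper (not part of the Scanner API): returns the next
    // character without skipping whitespace, then restores the delimiter.
    public static char nextRawChar(Scanner sc) {
        Pattern saved = sc.delimiter();   // remember the current delimiter
        try {
            sc.useDelimiter("");          // assumption: empty delimiter => one-char tokens
            return sc.next().charAt(0);
        } finally {
            sc.useDelimiter(saved);       // always restore the previous delimiter
        }
    }

    public static void main(String[] args) {
        Scanner sc = new Scanner("12 x y");
        int number = sc.nextInt();        // normal token-based read
        char space = nextRawChar(sc);     // the space right after "12"
        char letter = nextRawChar(sc);    // 'x'
        String rest = sc.next();          // default delimiter is back in effect
        System.out.println(number + "[" + space + "][" + letter + "]" + rest);
    }
}
```

Restoring the delimiter in a `finally` block keeps later `nextInt()` or `next()` calls behaving as before, even if the raw read throws because the input is exhausted.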

5 Answers

15

The reason is that the Scanner class is designed for reading whitespace-separated tokens. It's a convenience class that wraps an underlying input stream. Before Scanner, all you could do was read single bytes, which is a big pain if you want to read words or lines. With Scanner you pass in System.in, and it performs a number of read() operations to tokenize the input for you. Reading a single character is a more basic operation. Source

To read a single character you can use (char) System.in.read(), which reads one byte from the stream.

Frithjof
  • Yes, you can use it, but the result could be rather confusing because `Scanner` will read ahead (and buffer) characters ... depending on the preceding `Scanner` API calls. – Stephen C Apr 13 '21 at 01:28
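
The read-ahead effect mentioned in this comment can be observed with a small experiment. The sketch below uses an in-memory stream instead of `System.in` so that it is repeatable; the class and method names are made up for illustration. It relies on the assumption that the input is short enough to fit into the Scanner's internal buffer in one read.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

public class ReadAheadDemo {
    // Returns what a raw read() on the stream yields after a Scanner
    // wrapped around the same stream has handed out one token.
    public static int rawReadAfterScan(String input) throws IOException {
        InputStream in = new ByteArrayInputStream(input.getBytes(StandardCharsets.UTF_8));
        Scanner sc = new Scanner(in, "UTF-8");
        sc.next();          // returns the first token, but buffers more than that
        return in.read();   // reads whatever the Scanner left in the stream
    }

    public static void main(String[] args) throws IOException {
        // One might expect 32 (the space after "a"); for this short input the
        // Scanner has already drained the stream, so read() reports end of stream.
        System.out.println(rawReadAfterScan("a b")); // prints -1
    }
}
```

So mixing `System.in.read()` with calls on a `Scanner` that wraps `System.in` can silently lose characters to the Scanner's buffer.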
4

According to the javadoc, a Scanner does not seem to be intended for reading single characters. You attach a Scanner to an InputStream (or some other source) and it parses the input for you. It can also strip off unwanted characters, so you can easily read numbers, lines, etc. When you only need the individual characters of your input, use an InputStreamReader, for example.
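
A minimal sketch of the InputStreamReader approach, assuming the input is UTF-8 encoded. The class name `CharReaderDemo` and the helper `readChars` are illustrative only; an in-memory stream stands in for `System.in` or a file.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

public class CharReaderDemo {
    // Reads the stream one char at a time and joins the chars with '|'
    // so that the individual reads are visible in the result.
    public static String readChars(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
            int c;
            while ((c = reader.read()) != -1) {   // -1 signals end of stream
                sb.append((char) c).append('|');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // "\u00e9" (e with acute accent) takes two bytes in UTF-8,
        // but the Reader decodes it into a single char.
        byte[] bytes = "h\u00e9 y".getBytes(StandardCharsets.UTF_8);
        System.out.println(readChars(new ByteArrayInputStream(bytes)));
    }
}
```

Unlike `(char) System.in.read()`, which casts a single byte, the Reader applies the charset, so multi-byte characters arrive as proper `char` values, and whitespace is returned like any other character.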

Marc Hauptmann
2

To get a definitive reason, you'd need to ask the designer(s) of that API.

But one possible reason is that the intent of a (hypothetical) nextChar would not fit into the scanning model very well.

  • If nextChar() behaved like read() on a Reader and simply returned the next unconsumed character from the scanner, then it would behave inconsistently with the other next<Type> methods. These skip over delimiter characters before they attempt to parse a value.

  • If nextChar() behaved like (say) nextInt, then:

    • the delimiter skipping would be "unexpected" for some folks, and

    • there is the issue of whether it should accept a single "raw" character, or a sequence of digits that are the numeric representation of a char, or maybe even support escaping or something [1].

No matter what choice they made, some people wouldn't be happy. My guess is that the designers decided to stay away from the tarpit.
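
To make the two candidate behaviours concrete, here is a rough sketch of both on top of the existing Scanner API. Both method names are hypothetical. The read()-like variant uses `findWithinHorizon`, which ignores delimiters; the nextInt-like variant uses `next()`, which skips delimiters and consumes the whole token.

```java
import java.util.NoSuchElementException;
import java.util.Scanner;

public class NextCharSemantics {
    // Variant A (hypothetical): behaves like Reader.read(), no delimiter skipping.
    public static char nextCharRaw(Scanner sc) {
        // "(?s)." matches any single character, line terminators included;
        // a horizon of 1 limits the search to the very next character.
        String s = sc.findWithinHorizon("(?s).", 1);
        if (s == null) {
            throw new NoSuchElementException("no more input");
        }
        return s.charAt(0);
    }

    // Variant B (hypothetical): behaves like nextInt(), skips delimiters first.
    // Note that the rest of the token is consumed and lost.
    public static char nextCharToken(Scanner sc) {
        return sc.next().charAt(0);
    }

    public static void main(String[] args) {
        Scanner a = new Scanner("  ab c");
        System.out.println("[" + nextCharRaw(a) + "]");   // prints "[ ]"

        Scanner b = new Scanner("  ab c");
        System.out.println("[" + nextCharToken(b) + "]"); // prints "[a]"
        System.out.println(b.next());                     // prints "c"; the 'b' is gone
    }
}
```

Neither variant is obviously the right one, which is the point the answer makes.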


[1] I would vote strongly for the raw character approach ... but the point is that there are alternatives that need to be analysed, etc.

Stephen C
1

The Scanner class is built on the logic implemented in the String next(Pattern) method. Additional API methods such as nextDouble() or nextFloat() supply the pattern internally.

The class description says:

A simple text scanner which can parse primitive types and strings using regular expressions.

A Scanner breaks its input into tokens using a delimiter pattern, which by default matches whitespace. The resulting tokens may then be converted into values of different types using the various next methods.

From the description it could be said that someone forgot about char, as it is certainly a primitive type.

But the concept of the class is to find patterns, and a char has no pattern; it is just the next character. IMHO this is the reasoning that led to nextChar not being implemented.
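
The token-plus-pattern model can be seen directly with `next(String pattern)`: the pattern has to match the whole next token, so `next(".")` only succeeds when that token is exactly one character long. A small sketch, with `tryNextChar` as a hypothetical helper name:

```java
import java.util.InputMismatchException;
import java.util.Scanner;

public class TokenPatternDemo {
    // Hypothetical helper: returns the next token if it is a single
    // character, or null if the token is longer (the token is not consumed).
    public static String tryNextChar(Scanner sc) {
        try {
            return sc.next(".");
        } catch (InputMismatchException e) {
            return null;   // the whole token did not match "."
        }
    }

    public static void main(String[] args) {
        Scanner sc = new Scanner("x 42 hello");
        System.out.println(tryNextChar(sc));    // prints "x"
        System.out.println(sc.next("\\d+"));    // prints "42"
        System.out.println(tryNextChar(sc));    // prints "null": "hello" is five chars
        System.out.println(sc.next());          // prints "hello": it was left in place
    }
}
```

This is why a character read does not map cleanly onto the existing design: the unit Scanner works with is the token, not the character.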

If you need to read a file char by char, you can use a more efficient class.

1

I would imagine that it has to do with encoding. A char is 16 bits (2 bytes), and some encodings use one byte for a character whereas others use two or even more. When Java was originally designed, it was assumed that any Unicode character would fit in 2 bytes, whereas now a Unicode character can require up to 4 bytes (as in UTF-32). There is no way for Scanner to represent such a code point in a single char.

You can specify an encoding to Scanner when you construct an instance, and if none is provided, it will use the platform character set. But this still doesn't handle Unicode characters that need more than 16 bits (supplementary characters), since they cannot be represented as a single char primitive. So you would end up getting inconsistent results.
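
The size limitation is easy to check. The sketch below uses U+1F600, a code point outside the 16-bit range, purely as an example; the class and helper names are illustrative.

```java
import java.util.Scanner;

public class SupplementaryDemo {
    // Number of Java chars needed to hold the given Unicode code point.
    public static int charsNeeded(int codePoint) {
        return Character.charCount(codePoint);
    }

    public static void main(String[] args) {
        int codePoint = 0x1F600;                       // outside the 16-bit range
        String s = new String(Character.toChars(codePoint));

        System.out.println(charsNeeded('A'));          // prints 1
        System.out.println(charsNeeded(codePoint));    // prints 2
        System.out.println(s.length());                // prints 2 (a surrogate pair)
        System.out.println(s.codePointCount(0, s.length())); // prints 1

        // A String-returning method can carry both halves of the pair;
        // a char-returning method could only return one of them.
        String token = new Scanner(s + " x").next();
        System.out.println(token.length());            // prints 2
        System.out.println(Character.isHighSurrogate(token.charAt(0))); // prints true
    }
}
```

A `nextChar()` returning `char` would therefore have to hand back half of a surrogate pair for such input, whereas `next()` returns the complete pair as a String.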

Vivin Paliath
  • I don't get it. If `Scanner` can't retrieve characters because it can't figure out the encoding, how can it implement _any_ of its scan methods? After all, those methods have to look at characters, no? – ajb Sep 11 '13 at 16:28
  • Did you mean methods where it returns `String`? The problem is that if you have a 4-byte unicode character, how would you represent that as a `char`? If it was a `String`, it can internally be represented as a `char` array with two `char`s inside it. But there is no way to get a meaningful response from `nextChar` if you're dealing with 3 or 4 byte unicode characters. – Vivin Paliath Sep 11 '13 at 16:30
  • As far as I know, a `Reader` is responsible for dealing with the encoding. Have a look at http://docs.oracle.com/javase/1.5.0/docs/api/index.html?java/io/InputStream.html – Marc Hauptmann Sep 11 '13 at 16:33
  • @VivinPaliath If you use `sc.next(".")` as Joachim suggested, it will return a 1-character `String` if there are any characters to return. If `Scanner` can't return a character because of encoding issues, it won't be able to return a 1-character String either. – ajb Sep 11 '13 at 16:36
  • @ajb A one "character" String is still internally composed of `char[]`. If you were reading in a UTF-32 character, what should `nextChar` return? `String` can figure out that it needs 4 bytes per character if it is encoded as `UTF-32` (because you can specify the encoding to `String`). – Vivin Paliath Sep 11 '13 at 16:39
  • @ajb Also, see [here](http://www.oracle.com/technetwork/articles/javase/supplementary-142654.html); it's a good read. If you're working on a character-by-character basis, you will have to deal with handling supplementary characters properly. If you are dealing with `char[]` or `CharSequence` (which is what `String` is dealing with) you won't have to worry as much. – Vivin Paliath Sep 11 '13 at 16:42
  • @VivinPaliath I looked at the `Scanner` source, and Marc is right. `Scanner` expects to be working with `char` values, which **by definition** are 16-bit Unicode characters. If a `Readable` is returning data where multiple `chars` represent some single Unicode character, the `Readable` is violating its contract. If you want a scanner that works with 32-bit characters you can't use `Scanner`. If you work in C, you have to get used to misusing types that look like one thing to mean something else. But please, not in Java. – ajb Sep 11 '13 at 17:02
  • @ajb It is 16-bit in Java because of a historical goof-up; they should have gone with 32-bit. `Scanner` internally uses a `CharBuffer`, which uses a `char[]` array internally. This is exactly my point: you cannot represent UTF-32 with a single `char` in Java, which is probably why `Scanner` doesn't have a `nextChar`. A `Readable` reads bytes into a `CharBuffer` which can hold multiple `char`s, so a UTF-32 codepoint is not an issue. The semantics of `nextChar` are encoding-dependent. – Vivin Paliath Sep 11 '13 at 17:20
  • @ajb I just realized we were talking past each other. I've removed the bit about "not knowing the encoding". I can see that it is problematic and makes the answer confusing, and it wasn't relevant anyway. I was talking about encoding being relevant as far as UTF-32 is concerned, not encoding in general. – Vivin Paliath Sep 11 '13 at 17:23