22

I'm designing a public interface (API) for a package. I wonder whether I should generally use CharSequence instead of String (I'm mainly talking about the public interfaces).

Are there any drawbacks of doing so? Is it considered a good practice?

What about using it for identifier-like purposes (when the value is matched against a set in a hash-based container)?

Basil Bourque
vbence
  • That depends on what the API is supposed to do. Usually people want to act on `String`s, but they might want to be able to pass in a `StringBuilder`, so being more permissive is nice. But if you're going to need to copy the value into a String in your implementation, then you might have created an API that is slow by design. – Mark Peters Nov 05 '12 at 15:03
  • For more discussion, see also the Question, [CharSequence VS String in Java?](http://stackoverflow.com/q/1049228/642706) and its duplicate, [Exact difference between CharSequence and String in java](http://stackoverflow.com/q/11323962/642706). And my [class diagram](http://i.stack.imgur.com/PIFk9.png). – Basil Bourque Jul 01 '15 at 18:40
  • possible duplicate of [Choosing between CharSequence and String for an API](http://stackoverflow.com/questions/8445311/choosing-between-charsequence-and-string-for-an-api) – Basil Bourque Jul 01 '15 at 18:42
  • @BasilBourque I think this one has better quality answers. – vbence Jul 02 '15 at 07:10

5 Answers

33

CharSequence is rarely used in general-purpose libraries. It should usually be used when your main use case is string handling (manipulation, parsing, ...).

Generally speaking you can do anything with a CharSequence that you could do with a String (trivially, since you can convert every CharSequence into a String). But there's one important difference: A CharSequence is not guaranteed to be immutable! Whenever you handle a String and inspect it at two different points in time, you can be sure that it will have the same value every time.

But for a CharSequence that's not necessarily true. For example someone could pass a StringBuilder into your method and modify it while you do something with it, which can break a lot of sane code.

Consider this pseudo-code:

public Object frobnicate(CharSequence something) {
  Object o = getFromCache(something);
  if (o == null) {
    o = computeValue(something);
    putIntoCache(o, something);
  }
  return o;
}

This looks harmless enough, and if you had used String here it would mostly work (except maybe that the value might be calculated twice). But if something is a CharSequence then its content could change between the getFromCache call and the computeValue call. Or worse: between the computeValue call and the putIntoCache call!

Therefore: only accept CharSequence if there are big advantages and you know the drawbacks.

If you accept CharSequence you should document how your API handles mutable CharSequence objects. For example: "Modifying an argument while the method executes results in undefined behaviour."
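One way to sidestep the hazard above is to take a single String snapshot at the start of the method and use only that snapshot afterwards. The following is a minimal sketch, not the answer's actual code: the class name `FrobnicateCache`, the helper `isCached`, and the length-based `computeValue` are hypothetical stand-ins chosen for illustration.

```java
import java.util.HashMap;
import java.util.Map;

final class FrobnicateCache {
    private final Map<String, Integer> cache = new HashMap<>();

    // Hypothetical stand-in for computeValue(): simply the length of the text.
    private static Integer computeValue(String s) {
        return s.length();
    }

    Integer frobnicate(CharSequence something) {
        // Snapshot once. Lookup, computation and insertion all see this
        // immutable String, whatever the caller does to the argument later.
        String key = something.toString();
        return cache.computeIfAbsent(key, FrobnicateCache::computeValue);
    }

    // Hypothetical helper, only here to make the cache state observable.
    boolean isCached(String key) {
        return cache.containsKey(key);
    }

    public static void main(String[] args) {
        FrobnicateCache c = new FrobnicateCache();
        StringBuilder sb = new StringBuilder("abc");
        System.out.println(c.frobnicate(sb));     // 3
        sb.append("def");                         // caller mutates the builder afterwards
        System.out.println(c.isCached("abc"));    // true
        System.out.println(c.isCached("abcdef")); // false
    }
}
```

For a String argument, `toString()` returns the same object, so the snapshot costs nothing; for a StringBuilder it copies the characters. The snapshot does not protect against another thread modifying the argument while `toString()` itself is running.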

Joachim Sauer
  • *For example someone could pass a StringBuilder into your method and modify it while you do something with it* Couldn't this be said about many classes, notably `List`? Should my library code only accept `ImmutableList`? Yeah, somebody could change the list while your method is working on it, but they'd be dumb to. And if it risks your invariants, then do a defensive copy before validating the input. – Mark Peters Nov 05 '12 at 15:17
  • @MarkPeters: yes, this is generally true. The *big* difference here is that for `String` the assumption that it never changes is pretty hard-coded into every Java developer's brain (while most assume a `List` to be mutable). So when switching from a `String` to `CharSequence` they might miss the fact that a `CharSequence` doesn't necessarily share that nice property with a `String`. – Joachim Sauer Nov 05 '12 at 15:21
  • i'd say another primary use case for CharSequence over String is "large" sequences of characters, as the CharSequence impl could potentially be working with data which is not all in memory at the same time. – jtahlborn Nov 05 '12 at 15:39
  • @jtahlborn: that's right, but I consider that a subset of "string manipulation". – Joachim Sauer Nov 05 '12 at 15:40
  • @JoachimSauer - i don't follow what you mean about "string manipulation". why would i prefer CharSequence over String for "string manipulation", considering the CharSequence interface itself does not provide any modification methods. – jtahlborn Nov 05 '12 at 16:54
  • @jtahlborn: more precisely I mean "string handling", i.e. for example if you do output or append to some log file, then accepting a `CharSequence` might be useful (to be able to log a `StringBuilder`, for example). A parser is another example of something that should accept a `CharSequence`: it doesn't need everything in-memory, it only needs to be able to iterate over each character. – Joachim Sauer Nov 05 '12 at 16:59
  • I wish I could retract my up-vote. The first comment by Mark Peters is absolutely correct. `CharSequence` should definitely be used instead of `String` for public API. Just as with `List`, `Set`, and `Map`, we know to consider mutability. The only reason we see `String` so often in various APIs is that `CharSequence` did not exist until Java 4. Java was originally rushed out the door during the early Internet frenzy, and many oversights were made; implementing a concrete class like `String` without an interface like `CharSequence` was one of those oversights. – Basil Bourque May 26 '19 at 00:51
6

This depends on what you need. I'd like to point out two advantages of String, however.

From CharSequence's documentation:

Each object may be implemented by a different class, and there is no guarantee that each class will be capable of testing its instances for equality with those of the other. It is therefore inappropriate to use arbitrary CharSequence instances as elements in a set or as keys in a map.

Thus, whenever you need a Map or reliable equals/hashCode, you need to copy instances into a String (or whatever).
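The identifier-matching case from the question illustrates this. Below is a minimal sketch, with the class name `CharSequenceKeys`, both helper methods, and the sample identifier `user-42` made up for the example: a StringBuilder does not override `equals`/`hashCode`, so looking it up in a hash-based set of Strings fails unless it is first converted to a String.

```java
import java.util.HashSet;
import java.util.Set;

final class CharSequenceKeys {
    // Unreliable: depends on equals/hashCode of whatever implementation is passed in.
    static boolean containsRaw(Set<CharSequence> ids, CharSequence id) {
        return ids.contains(id);
    }

    // Reliable: normalise the argument to a String before the lookup.
    static boolean containsAsString(Set<String> ids, CharSequence id) {
        return ids.contains(id.toString());
    }

    public static void main(String[] args) {
        Set<CharSequence> raw = new HashSet<>();
        raw.add("user-42");
        Set<String> strings = new HashSet<>();
        strings.add("user-42");

        // Same characters as the stored key, but a different implementing class.
        CharSequence id = new StringBuilder("user-42");

        System.out.println(containsRaw(raw, id));          // false
        System.out.println(containsAsString(strings, id)); // true
    }
}
```

So an API can still accept CharSequence for identifiers, as long as the implementation converts to String at the boundary and pays for that copy.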

Moreover, I think CharSequence does not explicitly mention that implementations must be immutable. You may need to do defensive copying which may slow down your implementations.

Matthias Meid
5

Java CharSequence is an interface. As the API says, CharSequence is implemented by the CharBuffer, Segment, String, StringBuffer, and StringBuilder classes. So if you want to access or accept your API from all these classes then CharSequence is your choice. If not then String is very good for a public API because it is very easy and everybody knows about it. Remember CharSequence only gives you 4 methods, so if you are accepting a CharSequence object through a method, then your input manipulation ability will be limited.

Sajith Janaprasad
  • I disagree with your closing sentence. One of the methods on CharSequence is `toString()`, so anything that can be done with a String can be done with an arbitrary CharSequence too (just call toString() on it and use whatever manipulation ability you were thinking of). – Andrzej Doyle Nov 05 '12 at 15:10
  • @AndrzejDoyle: `toString()` might be really expensive for a particular implementation though. Most implementations need to copy the entire sequence of characters into a new array. If your first step is to obtain a `String` from the `CharSequence`, you're providing flexibility at the cost of hiding the performance hit. There's not much value in that, and might as well take a `String` and let the user do the conversion so they're well aware of the penalty. – Mark Peters Nov 05 '12 at 15:13
  • @AndrzejDoyle: Mark Peters has answered for me and I'd like to add one thing to it. You are suggesting using the `CharSequence.toString()` method to manipulate the input further, so why not just accept your input as a `String`? Then you don't have to convert your `CharSequence` to a `String`. – Sajith Janaprasad Nov 05 '12 at 15:34
  • "So if you want to access or accept your API from all these classes then `CharSequence` is your choice." I think that sentence is backwards; your API should accept CharSequence, not the other way around. "...then String is very good for a public API because it is very easy". Really? I think there is plenty to learn from the String API (if just because of regular expressions). Final note: only two of those methods of `CharSequence` *may* give you advantages over `String`: `subSequence` and `charAt`. – Maarten Bodewes Feb 10 '16 at 14:56
4

If a parameter is conceptually a sequence of chars, use CharSequence.

A string is technically a sequence of chars, but most often we don't think of it like that; a string is more atomic / holistic, we don't usually care about individual chars.

Think about int - though an int is technically a sequence of bits, we don't usually care about individual bits. We manipulate ints as atomic things.

So if the main work you are going to do on a parameter is to iterate through its chars, use CharSequence. If you are going to manipulate the parameter as an atomic thing, use String.
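As a small illustration of the "iterate through its chars" case, here is a sketch of a method that only needs `length()` and `charAt()`. The class `DigitCounter` and method `countDigits` are hypothetical names; the point is that callers can pass a String, a StringBuilder, or a CharBuffer without converting first.

```java
import java.nio.CharBuffer;

final class DigitCounter {
    // Only walks the characters one by one, so CharSequence is sufficient.
    static int countDigits(CharSequence text) {
        int count = 0;
        for (int i = 0; i < text.length(); i++) {
            if (Character.isDigit(text.charAt(i))) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countDigits("a1b22"));                         // 3
        System.out.println(countDigits(new StringBuilder("2012-11-05"))); // 8
        System.out.println(countDigits(CharBuffer.wrap("no digits")));    // 0
    }
}
```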

irreputable
0

You can implement CharSequence to hold your passwords, because the usage of String is discouraged for that purpose. The implementation should have a dispose method that wipes out the plain text data.
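A rough sketch of such a class might look like the following. The name `ErasableChars` and all design details are assumptions made for illustration, not an established API. In particular, `toString()` here returns a placeholder instead of the characters, which deliberately deviates from the CharSequence contract so that the secret never ends up in an immutable String; whether that trade-off is acceptable depends on the code consuming the sequence.

```java
import java.util.Arrays;

final class ErasableChars implements CharSequence {
    private final char[] chars;

    // Copies the input so the caller can wipe its own array independently.
    ErasableChars(char[] source) {
        this.chars = Arrays.copyOf(source, source.length);
    }

    @Override
    public int length() {
        return chars.length;
    }

    @Override
    public char charAt(int index) {
        return chars[index];
    }

    @Override
    public CharSequence subSequence(int start, int end) {
        if (start < 0 || end > chars.length || start > end) {
            throw new IndexOutOfBoundsException();
        }
        char[] slice = Arrays.copyOfRange(chars, start, end);
        try {
            return new ErasableChars(slice);
        } finally {
            Arrays.fill(slice, '\0'); // wipe the temporary copy
        }
    }

    // Assumption: hide the content, because a String cannot be wiped later.
    @Override
    public String toString() {
        return "[protected]";
    }

    // Overwrites the plain text data with zero characters.
    void dispose() {
        Arrays.fill(chars, '\0');
    }

    public static void main(String[] args) {
        char[] typed = {'s', '3', 'c', 'r', '3', 't'};
        ErasableChars password = new ErasableChars(typed);
        Arrays.fill(typed, '\0');                       // caller wipes its own copy
        System.out.println(password.length());          // 6
        System.out.println(password.charAt(0));         // s
        password.dispose();
        System.out.println(password.charAt(0) == '\0'); // true
    }
}
```

Any API that calls `toString()` on the argument will only see the placeholder, so this approach works only with consumers that read the characters through `charAt()`.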

Community
SpaceTrucker