9

I have read several books and seen several blogs discussing how returning an empty collection is better than returning null. I completely understand trying to avoid the check, but I don't understand why returning an empty collection is better than returning null. For example:

public class Dog{

   private List<String> bone;

   public List<String> get(){
       return bone;
   }

}

vs

 public class Dog{

   private List<String> bone;

   public List<String> get(){
       if(bone == null){
        return Collections.emptyList();
       }
       return bone;
   }

}

Example one will throw a NullPointerException when the caller uses the returned null, and example two will throw an UnsupportedOperationException when the caller tries to add to the returned list, but both are very generic exceptions. What makes one better or worse than the other?

Also a third option would be to do something like this:

 public class Dog{

   private List<String> bone;

   public List<String> get(){
       if(bone == null){
        return new ArrayList<String>();
       }
       return bone;
   }

}

but the problem with this is that you're adding unexpected behavior to your code which others may have to maintain.

I'm really looking for a solution to this predicament. Many people on blogs tend to just say it is better without a detailed explanation as to why. If returning an immutable list is best practice I am OK doing it, but I would like to understand why it is better.

  • 1
    If there is an empty array you can always access it and iterate over it. No need to check for nulls. It's easier. But whether you should use it is an opinion-based issue. – Sami Kuhmonen Apr 21 '16 at 20:13
  • `the problem with this is that you're adding unexpected behavior to your code` I don't understand. Could you elaborate? – markspace Apr 21 '16 at 20:14
  • Why do you say of the last option _but the problem with this is that you're adding unexpected behavior to your code which others may have to maintain._? Personally I usually go for this last option, since the caller doesn't need to check for null and can just traverse the returned collection. In the javadoc you can note that an empty list is returned if nothing is found, etc. – uniknow Apr 21 '16 at 20:16
  • I think it depends on what users will do with the collection. If you're returning a 'read-only' collection, I guess I don't see how you expect to get an UnsupportedOperationException? (What code are you assuming in your caller?) – JVMATL Apr 21 '16 at 20:17
  • Returning an empty collection is an example of the [Null Object Pattern](https://en.wikipedia.org/wiki/Null_object_pattern). – Andy Turner Apr 21 '16 at 20:25
  • 1
    Note that it is frequently undesirable for classes to expose mutable references to their innards, since they lose control over what then happens to those values. If you want to be able to "add bones", it is better to provide an `addBone(String)` method; the `get()` method is better returning either an immutable value or a mutable copy. – Andy Turner Apr 21 '16 at 20:27
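
The last comment's suggestion can be sketched roughly as follows. This is only an illustration, not code from the thread: the class name `GuardedDog` and the methods `getBones()` and `copyOfBones()` are hypothetical names, and only `addBone(String)` comes from the comment itself.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical variant of Dog that keeps control of its internal list.
class GuardedDog {
    // Eagerly initialized, so the field is never null.
    private final List<String> bones = new ArrayList<>();

    // The only way for callers to change the internal list.
    void addBone(String bone) {
        bones.add(bone);
    }

    // Read-only view: callers can iterate, but add/remove throw
    // UnsupportedOperationException.
    List<String> getBones() {
        return Collections.unmodifiableList(bones);
    }

    // Mutable copy: callers may modify it without affecting this object.
    List<String> copyOfBones() {
        return new ArrayList<>(bones);
    }
}
```

Which of the two accessors to expose depends on whether callers are expected to modify what they receive; documenting that choice in the javadoc avoids the surprise either way.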

5 Answers

9

If you return an empty collection (but not necessarily Collections.emptyList()), you avoid surprising downstream consumers of this method with an unintentional NPE.

This is preferable to returning null because:

  • The consumer doesn't have to guard against it
  • The consumer can operate on the collection irrespective of how many elements are in it

I say not necessarily Collections.emptyList() since, as you point out, you're trading one runtime exception for another in that adding to this list will be unsupported and once again surprise the consumer.
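
To make the trade-off concrete, here is a small sketch (the class and method names are made up for illustration). Iterating over an empty list needs no null check, but `Collections.emptyList()` rejects modification while a fresh `ArrayList` accepts it:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class EmptyListDemo {
    // Counts elements without any null check; this is safe only because
    // the caller is promised a non-null list.
    static int countBones(List<String> bones) {
        int count = 0;
        for (String bone : bones) {
            count++;
        }
        return count;
    }

    // Returns true if the list refuses to accept a new element.
    static boolean rejectsAdd(List<String> bones) {
        try {
            bones.add("femur");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(countBones(Collections.emptyList())); // prints 0
        System.out.println(rejectsAdd(Collections.emptyList())); // prints true
        System.out.println(rejectsAdd(new ArrayList<>()));       // prints false
    }
}
```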

The ideal solution to this: eager initialization of the field.

private List<String> bone = new ArrayList<>();

The next-best solution: have get() return an Optional and let the caller decide what to do when no value is present. Instead of throwing, the caller could also fall back to an empty collection if so desired.

Dog dog = new Dog();
// orElseThrow takes a Supplier of the exception, not the exception itself
dog.get().orElseThrow(() -> new IllegalStateException("Dog has no bones??"));
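
The snippet above assumes that `get()` returns an `Optional<List<String>>`. A minimal sketch of what that could look like (the class names `OptionalDog` and `OptionalDogDemo` and the `setBones` method are hypothetical, added only so the example is self-contained):

```java
import java.util.Collections;
import java.util.List;
import java.util.Optional;

class OptionalDog {
    // May be null, meaning "no bones have been recorded".
    private List<String> bone;

    void setBones(List<String> bones) {
        this.bone = bones;
    }

    // Optional.ofNullable turns a null field into Optional.empty().
    Optional<List<String>> get() {
        return Optional.ofNullable(bone);
    }
}

class OptionalDogDemo {
    public static void main(String[] args) {
        OptionalDog dog = new OptionalDog();
        // Fall back to an empty list instead of throwing.
        List<String> bones = dog.get().orElse(Collections.emptyList());
        System.out.println(bones.isEmpty()); // prints true
    }
}
```

The return type makes the "might be absent" case visible in the signature, so the caller has to choose between throwing and supplying a default.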
Makoto
  • 1
    I agree that eager initialization would be better, but then some might debate that initializing a list before it's used is using up memory for no good reason (albeit a very small amount) . –  Apr 21 '16 at 20:29
  • 2
    @Adam: I'd *gladly* trade a little bit of memory here to avoid a stupid and completely irresponsible NPE elsewhere. – Makoto Apr 21 '16 at 20:30
  • Thank you for this, makes sense =) –  Apr 21 '16 at 20:56
  • But what if a little bit becomes a lot? Say for example you are serializing/deserializing hundreds of thousands of objects (maybe millions) where the lists have been eagerly initialized. How would the strategy change? This is more hypothetical but I'm just curious. –  Apr 21 '16 at 21:54
  • @Adam: At that time you should consider formally profiling your application. Chances are, if there are hundreds of thousands of objects floating around at once, there's more than one bottleneck (and, one that would prove more fruitful to fix than a simple eager initialization). – Makoto Apr 21 '16 at 21:58
2

Because the alternative to returning an empty collection is generally returning null; and then callers have to add guards against NullPointerException. If you return an empty collection that class of error is mitigated. In Java 8+ there is also an Optional type, which can serve the same purpose without a Collection.

Elliott Frisch
  • Right, but my point is that if you try to add something to an immutable list you are still going to get an UnsupportedOperationException, so you have to guard against that, so I'm wondering where the benefit lies. –  Apr 21 '16 at 20:23
  • 2
    Who says an empty collection **must** be an *immutable* list? If you are planning on adding items to it, don't return an *immutable* list. – Elliott Frisch Apr 21 '16 at 20:24
1

You're confused about this because you're not initializing bone in the first place. If you create a new List<String> in your constructor, you'll find the check in get() is unnecessary.

Though if you approach the question mathematically, you would want to reserve returning null values for collections that don't exist as opposed to collections that are simply empty.

Nonexistent collections don't have much practical application in programming, but this is still a good way of thinking about collections.

Miles Smith
0

I don't think I understand why you object to empty collections, but I'll point out in the meantime that I think your code needs improvement. Maybe that's the issue?

Avoid unnecessary null checks in your own code:

public class Dog{

   private List<String> bone = new ArrayList<>();

   public List<String> get(){
       return bone;
   }
}

Or consider not creating a new list each time:

 public class Dog{

   private List<String> bone;

   public List<String> get(){
       if(bone == null){
        // emptyList() is the type-safe form; the raw EMPTY_LIST constant causes an unchecked warning
        return Collections.emptyList();
       }
       return bone;
   }
}
markspace
  • Your first approach was better. Your second approach surprises one in that all of a sudden, the `get` method of a class instantiates a variable. – Makoto Apr 21 '16 at 20:18
0

The following answer could be of interest regarding your question: should-functions-return-null-or-an-empty-object.

Summarized:


Returning null is usually the best idea if you intend to indicate that no data is available.

An empty object implies data has been returned, whereas returning null clearly indicates that nothing has been returned.

Additionally, returning a null will result in a null exception if you attempt to access members in the object, which can be useful for highlighting buggy code - attempting to access a member of nothing makes no sense. Accessing members of an empty object will not fail meaning bugs can go undiscovered.


Also from clean code:


The problem with using null is that the person using the interface doesn't know if null is a possible outcome, and whether they have to check for it, because there's no not null reference type.


From Martin Fowler's Special Case pattern


Nulls are awkward things in object-oriented programs because they defeat polymorphism. Usually you can invoke foo freely on a variable reference of a given type without worrying about whether the item is the exact type or a sub-class. With a strongly typed language you can even have the compiler check that the call is correct. However, since a variable can contain null, you may run into a runtime error by invoking a message on null, which will get you a nice, friendly stack trace.

If it's possible for a variable to be null, you have to remember to surround it with null test code so you'll do the right thing if a null is present. Often the right thing is same in many contexts, so you end up writing similar code in lots of places - committing the sin of code duplication.

Nulls are a common example of such problems and others crop up regularly. In number systems you have to deal with infinity, which has special rules for things like addition that break the usual invariants of real numbers. One of my earliest experiences in business software was with a utility customer who wasn't fully known, referred to as "occupant." All of these imply altering the usual behavior of the type.

Instead of returning null, or some odd value, return a Special Case that has the same interface as what the caller expects.
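
As a rough illustration of the Special Case idea (this sketch is not taken from Fowler's text; all type and method names here are hypothetical), the "occupant" customer could be modeled as an object that implements the same interface as a real customer:

```java
import java.util.HashMap;
import java.util.Map;

interface Customer {
    String name();
    boolean isKnown();
}

class RegisteredCustomer implements Customer {
    private final String name;

    RegisteredCustomer(String name) {
        this.name = name;
    }

    public String name() { return name; }
    public boolean isKnown() { return true; }
}

// Special Case: stands in for a customer we have no record of.
class UnknownCustomer implements Customer {
    public String name() { return "occupant"; }
    public boolean isKnown() { return false; }
}

class CustomerDirectory {
    private final Map<String, Customer> byId = new HashMap<>();

    void register(String id, Customer customer) {
        byId.put(id, customer);
    }

    // Never returns null: unknown ids map to the Special Case object.
    Customer find(String id) {
        return byId.getOrDefault(id, new UnknownCustomer());
    }
}
```

Callers can invoke `name()` on whatever `find` returns without a null test, which is the same benefit an empty collection gives over a null one.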


And finally from Billion Dollar Mistake!


I call it my billion-dollar mistake. It was the invention of the null reference in 1965. At that time, I was designing the first comprehensive type system for references in an object oriented language (ALGOL W).

My goal was to ensure that all use of references should be absolutely safe, with checking performed automatically by the compiler. But I couldn't resist the temptation to put in a null reference, simply because it was so easy to implement.

This has led to innumerable errors, vulnerabilities, and system crashes, which have probably caused a billion dollars of pain and damage in the last forty years.

In recent years, a number of program analysers like PREfix and PREfast in Microsoft have been used to check references, and give warnings if there is a risk they may be non-null. More recent programming languages like Spec# have introduced declarations for non-null references. This is the solution, which I rejected in 1965.

Tony Hoare


I hope this provides enough reasons why returning an empty collection or a special return value is regarded as better than returning null.

Community
uniknow
  • This really isn't *your* answer, though. You've copied it verbatim from at least two other places. – Makoto Apr 21 '16 at 21:59