31.05.2015 / By Tomasz Perek

Null, NullPointerException and dealing with it

Null is evil

The concept of the null reference is sometimes referred to as “The Billion Dollar Mistake”. This pejorative description was coined by Sir Anthony Hoare (you can learn more here), probably best known for developing Quicksort but, ironically, also the man who first introduced null references, in ALGOL W. But why is null actually A Bad Thing?

First of all, let’s give a simplistic definition of the man on trial. Null is a reference that points nowhere. It indicates that there is no value where the reference points. Despite being called the root of all evil, it is quite a natural concept. It is easy to imagine and to justify, simply because sometimes there is no result. Finding a substring in a string? None found? Here you go, a null.

Getting a value from a map with a key that’s not really there? Another example. Yet null is bad, mostly because there’s no way to tell that a reference is null until you explicitly check it and, especially in object-oriented languages, you can’t safely perform any operation on it until you do the check.

Why? Because you’d reach for (or: dereference) data that resides… exactly, nowhere, or no-one-knows-where. Most languages hence throw an equivalent of Java’s NullPointerException (NPE for short). Practice shows that it can be raised in many more places than we anticipate, and this is where those billion-dollar losses come from.
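To make the failure mode concrete, here is a minimal Java sketch of the map lookup mentioned above. The class name, the CAPITALS table and both helper methods are made up for this illustration; the only library behavior relied on is that Map.get returns null for a missing key.

```java
import java.util.HashMap;
import java.util.Map;

final class NullDereferenceDemo {
    // Hypothetical lookup table, used only for this illustration.
    static final Map<String, String> CAPITALS = new HashMap<>();
    static {
        CAPITALS.put("Poland", "Warsaw");
    }

    // Unsafe: Map.get returns null for a missing key, so length() may throw an NPE.
    static int unsafeNameLength(String country) {
        return CAPITALS.get(country).length();
    }

    // Safe: the explicit check that nothing in the language forces us to write.
    static int safeNameLength(String country) {
        String capital = CAPITALS.get(country);
        return capital == null ? 0 : capital.length();
    }

    public static void main(String[] args) {
        System.out.println(safeNameLength("Poland"));   // prints 6
        System.out.println(safeNameLength("Atlantis")); // prints 0
        try {
            unsafeNameLength("Atlantis");
        } catch (NullPointerException e) {
            System.out.println("NPE: there was nothing to dereference");
        }
    }
}
```

Both methods compile equally happily, which is the core of the problem: the type String says nothing about whether a value is actually there.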

This was recognized as a problem some time ago, and different languages address it with different techniques. Let’s walk through them.

Static code analysis

There are tools for most popular languages that analyze the code and look for places where you could possibly dereference null: Cppcheck for C++, FindBugs for Java and Code Contracts for C#/VB.NET, to name a few.

Besides finding possible NPEs, they also do many other helpful things. But as far as NPEs are concerned, the point is not only to find places in the code that should be preceded with if(x == null) {... but also to use the means these tools provide to avoid having to write such checks at all.

A programmer can code in a null-safe way and declare the intention of being null-free using annotations such as @NonNull, or @Nullable where the threat is known and expected. IDEs and the tools mentioned above can then help you avoid NPEs by instantly providing hints or even marking the code as erroneous.

Nullable and non-nullable types

Some languages, like Ceylon, Fantom or Kotlin, forbid a variable of any type to hold null unless it is explicitly declared as a nullable type. In Kotlin it would look like this:

https://gist.github.com/tomaszperek/c1dccd3ab3f759ada671

In Ceylon:

https://gist.github.com/tomaszperek/773eec4534f6f23e1781

Paired with this approach comes a set of operators for dealing with nullability. One of the most important is the safe dereference operator, ?., without which the concept of nullable types wouldn’t be so shiny. Just a short reminder: what happens when you call a method on null? An NPE, exactly.

This is where ?. comes in. When you replace the good old dot with it, the language handles the null check for you, and if the reference happens to be null, the result of the whole expression is also null. In Groovy this is just a convenience, but in Ceylon or Kotlin you can’t fully interact with nullable types without it.

Ceylon:

https://gist.github.com/tomaszperek/81993df5ce35e9afc519

Kotlin:

https://gist.github.com/tomaszperek/50febde2bd4d9acdd767
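For readers coming from Java, which has no ?. operator, the sketch below shows roughly what a chain such as person?.address?.city stands for. Person, Address and both methods are hypothetical names invented for this example; the second variant uses java.util.Optional only to imitate the operator and is not a claim about how Kotlin or Ceylon implement it.

```java
import java.util.Optional;

final class SafeDereferenceDemo {
    // Hypothetical domain types, made up for this illustration.
    static final class Address {
        final String city; // may be null
        Address(String city) { this.city = city; }
    }

    static final class Person {
        final Address address; // may be null
        Person(Address address) { this.address = address; }
    }

    // Hand-written equivalent of person?.address?.city:
    // any null link in the chain makes the whole result null.
    static String cityByHand(Person person) {
        if (person == null) return null;
        if (person.address == null) return null;
        return person.address.city;
    }

    // The same chain imitated with Optional: map() skips its function
    // when no value is present and yields empty when the function returns null.
    static String cityViaOptional(Person person) {
        return Optional.ofNullable(person)
                .map(p -> p.address)
                .map(a -> a.city)
                .orElse(null);
    }

    public static void main(String[] args) {
        Person complete = new Person(new Address("Krakow"));
        Person homeless = new Person(null);
        System.out.println(cityByHand(complete));      // prints Krakow
        System.out.println(cityByHand(homeless));      // prints null
        System.out.println(cityViaOptional(homeless)); // prints null
    }
}
```

The hand-written version grows by one check per link in the chain, which is exactly the noise the operator removes.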

Another one, possibly more famous, probably because of its name, is the so-called ‘Elvis operator’. Its more formal name is the ‘null coalescing operator’. While it’s more of a general-purpose tool, it makes null-safety checks and safety belts much clearer and more compact. Here’s how it works in Kotlin:

https://gist.github.com/tomaszperek/7a72bd3e43f007f9fced

It also exists in languages that don’t feature non-nullable types, like Groovy (where it was first called ‘Elvis’), and also JavaScript and Ruby (where its role is played by the || operator, with the caveat that || falls back on any falsey value, not only on null). Basically, the operator provides a default value that becomes the result of the expression when its left-hand side is null.
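Java has no Elvis operator either, so as a rough sketch the same "default when null" behavior can be written with the ternary operator or with Optional. The class and method names below are invented for this example.

```java
import java.util.Optional;

final class ElvisDemo {
    // Hand-written equivalent of Kotlin's `value ?: fallback`.
    static String orDefault(String value, String fallback) {
        return value != null ? value : fallback;
    }

    // The same idea expressed with Optional.
    static String orDefaultViaOptional(String value, String fallback) {
        return Optional.ofNullable(value).orElse(fallback);
    }

    public static void main(String[] args) {
        System.out.println(orDefault("Elvis", "anonymous")); // prints Elvis
        System.out.println(orDefault(null, "anonymous"));    // prints anonymous
        // Unlike || in JavaScript, an empty string is not replaced: only null is.
        System.out.println(orDefault("", "anonymous").isEmpty()); // prints true
    }
}
```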

Monadic approach, the Option type

The other approach modern languages take to the null problem is to have a special type that encapsulates an optional value. It is especially popular in functional languages, but Java and C++ also have it, although it’s neither idiomatic nor widely accepted there (at least not yet). Let’s see how different languages define it.

Haskell:

https://gist.github.com/tomaszperek/622a40edcd7dd7b114d8

OCaml:

https://gist.github.com/tomaszperek/619051ca4e4e4d0b3c78

Scala:

https://gist.github.com/tomaszperek/e69dae9556813c2d29c2

It’s sometimes referred to as the monadic approach, because Option happens to satisfy the definition of a monad, and functional programmers just love monads so much.

Some opponents of this approach claim that the Option type doesn’t address all possible issues and that there are still holes and the possibility of an NPE. This is especially true when it comes to Scala and its interaction with Java.

There are three reasons. First, you need to use Option explicitly. Second, null is unfortunately still present in Scala, so val x: Option[Int] = null is perfectly valid. Third, you can still construct the grotesque Some(null), which is, well… grotesque. You’d probably want to model things otherwise. This charge is less valid in languages that have no null: programmers of Standard ML (let’s suppose such exist) or Haskell are not much concerned with this problem.

But there’s much more to the monadic approach than just dealing with NPEs, and that is monadic composition. Monads in general, and Option in particular, give programmers many ways to elegantly apply functions to them, compose flows of operations and so on.

In other words, they not only inform you that there can be a null and force you to deal with it, but also provide an abundance of ways to do so. The idiomatic way of dealing with options is not to pattern match or explicitly check for the presence of a value, but to use map(), flatMap() and all the other functions Option provides, unlike nullable types with their relatively primitive constructs such as ?. and ?:.

Let’s see some examples in Scala:

https://gist.github.com/tomaszperek/6b10b81514d256899b3f
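Since Java’s Optional was mentioned earlier, here is a small sketch of the same compositional style using it. The parseInt and safeDivide helpers and the compute pipeline are hypothetical names invented for this example; map and flatMap are the real java.util.Optional methods.

```java
import java.util.Optional;

final class OptionalCompositionDemo {
    // Hypothetical parser: empty when the text is not a valid integer.
    static Optional<Integer> parseInt(String text) {
        try {
            return Optional.of(Integer.parseInt(text.trim()));
        } catch (NumberFormatException e) {
            return Optional.empty();
        }
    }

    // Hypothetical division that reports "no result" instead of throwing.
    static Optional<Integer> safeDivide(int dividend, int divisor) {
        return divisor == 0 ? Optional.empty() : Optional.of(dividend / divisor);
    }

    // Parse the input, divide 100 by it, double the quotient.
    // An absent value at any step short-circuits the rest of the chain.
    static Optional<Integer> compute(String input) {
        return Optional.ofNullable(input)
                .flatMap(OptionalCompositionDemo::parseInt)
                .flatMap(n -> safeDivide(100, n))
                .map(quotient -> quotient * 2);
    }

    public static void main(String[] args) {
        System.out.println(compute("5"));   // prints Optional[40]
        System.out.println(compute("0"));   // prints Optional.empty
        System.out.println(compute("abc")); // prints Optional.empty
        System.out.println(compute(null));  // prints Optional.empty
    }
}
```

Note that flatMap is used for steps that themselves return an Optional, and map for plain functions; no step contains an explicit presence check.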

In the debate on the Option type versus nullable types, the example of a cache is often raised. Suppose you want to cache values fetched from a database, and you now ask your cache (which takes the form of, say, a Java / Scala map) for some value.

How would you model the difference between “I haven’t cached this value yet” and “I did cache it and it happens to be NULL in the db”? The Option type lets you express this as Option[Option[T]], whereas nullable types have no T?? equivalent.
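As a sketch of that cache, written here with Java’s Optional rather than Scala’s Option: the outer Optional answers “is it cached?”, the inner one answers “was the database value non-NULL?”. The class, its methods and the sample keys are all made up for this illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

final class NestedOptionalCacheDemo {
    // A stored Optional.empty() means "we looked it up and the db holds NULL".
    private final Map<String, Optional<String>> cache = new HashMap<>();

    // valueFromDb may be null, mirroring a NULL column.
    void remember(String key, String valueFromDb) {
        cache.put(key, Optional.ofNullable(valueFromDb));
    }

    // Outer Optional: cached or not. Inner Optional: db value present or NULL.
    Optional<Optional<String>> lookup(String key) {
        return Optional.ofNullable(cache.get(key));
    }

    static String describe(Optional<Optional<String>> result) {
        return result
                .map(inner -> inner.map(v -> "cached: " + v).orElse("cached: NULL in db"))
                .orElse("not cached yet");
    }

    public static void main(String[] args) {
        NestedOptionalCacheDemo cache = new NestedOptionalCacheDemo();
        cache.remember("nick:1", "tomek");
        cache.remember("nick:2", null);
        System.out.println(describe(cache.lookup("nick:1"))); // prints cached: tomek
        System.out.println(describe(cache.lookup("nick:2"))); // prints cached: NULL in db
        System.out.println(describe(cache.lookup("nick:3"))); // prints not cached yet
    }
}
```

With a plain Map<String, String> and nulls, the second and third lookups would be indistinguishable without an extra containsKey call.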

Nil-punning

Some languages support the concept of truthy and falsey values. A simplified definition would say that a truthy value is something that evaluates to true when passed as a condition to if or while, as opposed to a falsey value, which is treated as false.

But you can take it even further. Clojure, a modern Lisp dialect running on the JVM, also assigns nil (Clojure’s name for null) an additional meaning, which is ‘empty seq’ or, depending on context, ‘empty map’. This opens the gate for some shortcuts and interesting patterns.

A few examples:

https://gist.github.com/tomaszperek/ac79bf0c79a5a461e45c

In Clojure, nils flow back and forth through the code and (provided it is pure, native Clojure code) are most of the time treated as first-class citizens and do no harm. As expected, problems appear most often when it comes to interacting with Java.

Summary

As you can see, programming languages take quite a few approaches to dealing with NullPointerException and its kin. To a great extent, the way you prevent NPEs depends on the language you are using, but perhaps what you have read here will influence which language you learn next.

Addressing “The Billion Dollar Mistake” is in fact quite a fundamental thing and tells a lot about a language and its design. Anyway, good luck, and stay away from NPEs ;)

Notes

In the comments to this article, Gavin King pointed out that Ceylon’s nullable types are actually union types, where T? is just syntactic sugar for T | Null. That makes them somewhat different from Kotlin’s nullable types, but in my opinion, conceptually and within the scope of nullability, they still behave similarly, so I didn’t delve into the details there.

For anyone interested in them, though, I recommend reading the excellent Ceylon documentation on the subject, because there’s obviously more to the Ceylon type system than simply adding non-nullability to Java.

In another comment, Bartek Andrzejczak reminded me that another design pattern is often used in OOP: the “Null Object Pattern”. It indeed looks like a topic that I could spend more time on. Readers interested in it could start with the Wikipedia article, which covers the pattern quite nicely.

Thanks for feedback!
