Monday, October 6, 2008

Avoiding Nulls

One of the reasons I like Scala better than Java is that I believe it promotes better programming practices. Avoiding null is one of these practices.

In Java, null is often used as a marker to indicate a missing value for an object. For example, System.getProperty returns null if the property is not defined.

In Scala, the accepted way to do this is to use an instance of the Option class.

The Option Class

Scala's Option class has two subtypes: the singleton object None and the class Some. Option itself is an abstract class, so it can't be instantiated directly. Thus a value of type Option must be either None or an instance of Some (well, it is possible for it to be null, but that's generally considered bad form). None is used in the same way as null is typically used in Java, to indicate a missing value. An instance of Some contains a value, which is the data of interest.

Using Option has these advantages over using null to indicate a missing value:
  • A None value unambiguously means the optional value is missing. A null value may mean a missing value or it may mean the variable was not initialized.
  • Because Option has explicit methods for getting at the actual value, you have to think about the possibility that the value is not there and how to handle that case, so you are less likely to write code that mistakenly assumes a value is present.
  • If you do write code that assumes a value is present and it runs when there is none, the NoSuchElementException you get from calling get on None is more specific than a NullPointerException, and so should be easier to interpret, track down, and fix.
  • Option contains an assortment of methods to get at or manipulate the optional value, which make for more concise coding.
Let's take a look at that last point in more detail.

How would we want to define getProperty in Scala? We would of course make it return Option[String] rather than a String value which might be null.

Here is a code sample that shows how a Scala implementation of getProperty could be used if that hypothetical method returned an Option that can contain a String (remember that Scala uses [] for type parameters, the way Java uses <> in generics):
val opt : Option[String] = getProperty("MY_PROPERTY")
opt match {
    case None => println("MY_PROPERTY is not defined")
    case Some(value) => println("MY_PROPERTY is "+value)
}
The Option class has methods such as isEmpty, isDefined, get, and getOrElse that can be used to test and use the contained value. It also has some more interesting methods such as map and filter that are less familiar to imperative programmers but can be used in very nice ways, especially when more than one level of Option is involved.
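
As a minimal sketch of the simpler accessors just listed, here is how isDefined, isEmpty, get and getOrElse behave on a present and a missing value. The getProperty here is a stand-in that reads from a fixed Map, and the object name OptionAccessors and the helper names are made up for illustration.

```scala
object OptionAccessors {
  // Hypothetical stand-in for the getProperty discussed above: it looks the key
  // up in a fixed Map rather than in the real system properties.
  private val props = Map("MY_PROPERTY" -> "hello")
  def getProperty(key: String): Option[String] = props.get(key)

  val present: Option[String] = getProperty("MY_PROPERTY")
  val missing: Option[String] = getProperty("NO_SUCH_PROPERTY")

  // getOrElse supplies a fallback, so no explicit emptiness test is needed.
  def describe(opt: Option[String]): String = opt.getOrElse("(not defined)")

  // Calling get on None throws NoSuchElementException, not NullPointerException.
  def getFailsOnNone: Boolean =
    try { missing.get; false }
    catch { case _: NoSuchElementException => true }
}
```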

For example, say we want to have a method that returns an Int value for a property, so we define a method getIntProperty that takes a String property name argument and returns Option[Int]. In other words, it returns None if the property is not defined, or Some[Int] (such as Some(123)) if the property is defined. Assuming we already have the getProperty method mentioned above that returns Option[String], we can build our getIntProperty method from getProperty plus a method that accepts an Option[String] and parses it as an integer to produce an Option[Int]. Let's call this method parseOptionalInt. With this parseOptionalInt, our getIntProperty method is simple:
def getIntProperty(s:String) : Option[Int] = parseOptionalInt(getProperty(s))
We could write parseOptionalInt like this:
def parseOptionalInt(sOpt:Option[String]) : Option[Int] = {
    if (sOpt.isEmpty)
        return None
    else
        return Some(sOpt.get.toInt)
}
Or, using pattern matching, we could write it like this:
def parseOptionalInt(sOpt:Option[String]) : Option[Int] = {
    sOpt match {
        case None => None
        case Some(s) => Some(s.toInt)
    }
}
But there is another interesting way of writing it, using the map method.

Using map and flatMap

Using the map method of Option, our method definition looks like this:
def parseOptionalInt(sOpt:Option[String]) : Option[Int] = sOpt map (_.toInt)
Using map allows us to reduce the body of our parseOptionalInt method from four lines of code down to one line.

The map method on Option has the property that None always maps to None. It only applies the mapping function to Some values. The flatMap and filter methods have this same behavior. This allows you to chain operations together, and if any of them generates None from its input, the result will be None.
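
A small sketch of that behavior, using only the standard Option methods. The object name NonePropagation and the helper half are invented for this illustration.

```scala
object NonePropagation {
  val some: Option[String] = Some("42")
  val none: Option[String] = None

  // map transforms the contained value; None passes through untouched.
  val mappedSome: Option[Int] = some.map(_.toInt)   // Some(42)
  val mappedNone: Option[Int] = none.map(_.toInt)   // None, toInt is never called

  // filter keeps a Some only if the predicate holds for its value.
  val kept: Option[Int]    = mappedSome.filter(_ > 0)     // Some(42)
  val dropped: Option[Int] = mappedSome.filter(_ > 100)   // None

  // flatMap is for functions that themselves return an Option.
  def half(n: Int): Option[Int] = if (n % 2 == 0) Some(n / 2) else None
  val halved: Option[Int]      = mappedSome.flatMap(half)                 // Some(21)
  val halvedTwice: Option[Int] = mappedSome.flatMap(half).flatMap(half)   // None, 21 is odd
}
```

Once any step in the chain produces None, every later step is skipped and the final result is None.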

As a more extended example of this chaining, assume we have the following functions available to us, each of which returns None if it can't load the requested item:
//Like Java's System.getProperty
def getSystemProperty(key:String) : Option[String]

//Given a filename, load a .properties file into a Scala Map
def loadPropertyFile(filename:String) : Option[Map[String,String]]
We also take advantage of Map.get, which returns an Option.

Here's what we want to do: read a System property called PROPFILE that is the name of a properties file, load that properties file, read the value of the TIMEOUT property from it and convert it to an integer. If the System property is not set, or the file does not exist, or the property does not exist in the file, then return a default value of 60. Here's the Scala code to do that:
val x = (getSystemProperty("PROPFILE")
    flatMap loadPropertyFile
    flatMap (_.get("TIMEOUT"))
    map (_.toInt)
    getOrElse 60)
Because Option implements map, flatMap and filter, we can also use the for syntax on Options, as with any of the Scala classes that implement those functions.
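
As a sketch of what that looks like, here is the TIMEOUT lookup written with for. The two loader functions are stubbed with in-memory Maps so the example runs on its own. The stub data, the object name TimeoutLookup, and the sysKey parameter are assumptions made for illustration, not part of the functions described above.

```scala
object TimeoutLookup {
  // Hypothetical stubs standing in for the two functions assumed above.
  private val systemProps = Map("PROPFILE" -> "app.properties")
  private val files = Map("app.properties" -> Map("TIMEOUT" -> "30"))

  def getSystemProperty(key: String): Option[String] = systemProps.get(key)
  def loadPropertyFile(filename: String): Option[Map[String, String]] = files.get(filename)

  // Each <- line is a flatMap step; the yield is the final map step.
  def timeout(sysKey: String): Int = {
    val result: Option[Int] =
      for {
        filename <- getSystemProperty(sysKey)
        props    <- loadPropertyFile(filename)
        text     <- props.get("TIMEOUT")
      } yield text.toInt
    result.getOrElse(60)
  }
}
```

If any of the three generators produces None, the whole for expression is None and getOrElse supplies the default of 60.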

Legacy Java

Java has a lot of methods that return null or accept null as an argument to indicate a missing value. Scala can easily call these methods directly and use null in this way, but it would be nicer if those Java methods could take and return instances of Option instead.

We could implement an Option class in Java, with subclasses Some and None as in Scala and many of the Option methods such as isEmpty, isDefined, get, and getOrElse. Unfortunately, we can't implement map and filter as elegantly as in Scala because Java does not have function literals, although this may change in a future version of Java.

The other approach is to write wrapper functions in Scala that call the Java functions and translate between null values and None values. There are two halves to this: accepting an Option and passing the contents of that Option or null to Java, and accepting a value that might be null from Java and converting it to an Option that might be None.

I implemented these two conversions by creating an object SomeOrNone with an apply method that creates an Option and an implicit conversion to convert from an Option to a raw value or null.
object SomeOrNone {
    class OptionOrNull[T](x:Option[T]) {
        def getOrNull():T = if (x.isDefined) x.get else null.asInstanceOf[T]
    }
    implicit def optionOrNull[T](x:Option[T]) = new OptionOrNull(x)
    def apply[T](x:T) = if (x==null) None else Some(x)
}
The application includes the following line to pick up the implicit conversion:
import SomeOrNone.optionOrNull
With the implicit conversion in scope, the getOrNull method can be applied to any Option, making it easy to convert from the Option to a value or null when passing an argument to a legacy Java method.

On the return side, SomeOrNone can be applied as a function to the value returned by the Java method in order to get an Option that will be None if the value was null.
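
Here is a sketch of both directions used in-line against two real Java APIs that use null this way: java.util.HashMap.get, which returns null for a missing key, and the java.io.File(String, String) constructor, which accepts a null parent. The SomeOrNone object is repeated so the example stands alone, with explicit return types and a scala.language.implicitConversions import added for newer compilers. The object InlineUse and the method fileIn are hypothetical names.

```scala
import scala.language.implicitConversions

object SomeOrNone {
  class OptionOrNull[T](x: Option[T]) {
    def getOrNull(): T = if (x.isDefined) x.get else null.asInstanceOf[T]
  }
  implicit def optionOrNull[T](x: Option[T]): OptionOrNull[T] = new OptionOrNull(x)
  def apply[T](x: T): Option[T] = if (x == null) None else Some(x)
}

object InlineUse {
  import SomeOrNone.optionOrNull

  val javaMap = new java.util.HashMap[String, String]()
  javaMap.put("host", "localhost")

  // Return side: wrap a possibly-null result in an Option.
  val host: Option[String] = SomeOrNone(javaMap.get("host"))
  val port: Option[String] = SomeOrNone(javaMap.get("port"))

  // Argument side: unwrap an Option into a value or null for the Java call.
  def fileIn(parent: Option[String]): java.io.File =
    new java.io.File(parent.getOrNull(), "a.txt")
}
```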

With this simple helper class, we can now trivially write the Scala version of getProperty that we mentioned above:
def getProperty(key:String) = SomeOrNone(System.getProperty(key))
As an example of passing in a value taken from an Option, here is how we would call the two-argument version of Java's System.getProperty. Its second argument is a default value, which may be null, to be returned if the specified property is not found:
import SomeOrNone.optionOrNull
def getProperty(key:String, dflt:Option[String]) = SomeOrNone(System.getProperty(key, dflt.getOrNull))
We can write wrapper methods as shown above, or we can just use the SomeOrNone and getOrNull functions in-line when calling a legacy Java method from our Scala code. Either way, we need never deal with those pesky null-as-missing-value markers again.
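
For readers on later Scala versions: from Scala 2.8 onward, the standard library covers both halves directly. Option(x) yields None when x is null, and the orNull method on Option returns the contained value or null. A sketch of the two getProperty wrappers written that way follows. The object name StdLibNulls is made up for this example.

```scala
object StdLibNulls {
  // Option(x) is None when x is null and Some(x) otherwise, like SomeOrNone(x).
  def getProperty(key: String): Option[String] =
    Option(System.getProperty(key))

  // orNull plays the role of getOrNull; it only compiles for types that can hold null.
  def getProperty(key: String, dflt: Option[String]): Option[String] =
    Option(System.getProperty(key, dflt.orNull))
}
```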

Other Blogs

Here are some other blogs about Scala's Option type or dealing with nulls:
  • David Pollack (2007-04-13) on how Option is used in his Lift web framework, including how Option fits in to for comprehensions.
  • Tony Morris (2008-01-16) on using higher-order functions with Option to replace pattern matching.
  • Debasish (2008-03-11) comparing to the Maybe Monad.
  • Daniel Wellman (2008-03-30) on using Option to avoid NullPointerExceptions.
  • Daniel Spiewak (2008-04-07), including a Java implementation of Option. Read the comments for some good info (and links) about Option as a Monad (which Daniel mentions at the start of his article, but he does not discuss that aspect much), and a pointer to another Java implementation.
  • Luke Plant (2008-05-07) on why Haskell Maybe is better than null pointers.
  • Ted Neward (2008-06-27) on Option as a container, with a sidebar comparing it to C# 2.0 nullable types.
  • Stephan Schmidt (2008-08-06) on a hack to implement an Option monad in Java.
  • Tom Adams (2008-08-20) comparing the handling of nulls in Java and Scala.

3 comments:

James Iry said...

getOrNull can be written more cleanly with "getOrElse". Here I've unwrapped it from the object wrapper, but same idea.

scala> def getOrNull[T](o:Option[T]) : T = o getOrElse null.asInstanceOf[T]
getOrNull: [T](Option[T])T


Fair warning, this kind of getOrNull doesn't work properly for "AnyVals" due to the peculiar way casting null to an AnyVal works.

scala> getOrNull(Some(45))
res1: Int = 45

That's fine, but watch what happens with None

scala> val x : Option[Int] = None
x: Option[Int] = None

scala> getOrNull(x)
res2: Int = 0

One alternative is to constrain the types getOrNull works on, thus getting rid of the cast

scala> def getOrNull[T >: Null](o:Option[T]) : T = o getOrElse null

The AnyVals are still a bit peculiar, but not quite so bad.

scala> getOrNull(Some(45))
res3: Any = 45

scala> getOrNull(x)
res4: Any = null

Jim McBeath said...

James: thanks for pointing out how it works for an AnyVal. I had not tried that, and it is an interesting corner case to know about.

Given that I called my method getOrNull and that it is intended for calls to methods that can accept a null, I should probably define it only to accept an AnyRef:

def getOrNull[T<:AnyRef](o:Option[T]):T = o getOrElse null.asInstanceOf[T]

James Iry said...

Indeed. Or put both upper and lower bounds to get rid of the cast.

def getOrNull[T >: Null <: AnyRef](o:Option[T]) = o getOrElse null