Chapter 1. Zero to Sixty: Introducing Scala

Why Scala?

Today’s enterprise and Internet applications must balance a number of concerns. They must be implemented quickly and reliably. New features must be added in short, incremental cycles. Beyond simply providing business logic, applications must support secure access, persistence of data, transactional behavior, and other advanced features. Applications must be highly available and scalable, requiring designs that support concurrency and distribution. Applications are networked and provide interfaces for both people and other applications to use.

To meet these challenges, many developers are looking for new languages and tools. Venerable standbys like Java, C#, and C++ are no longer optimal for developing the next generation of applications.

If You Are a Java Programmer…

Java was officially introduced by Sun Microsystems in May of 1995, at the advent of widespread interest in the Internet. Java was immediately hailed as an ideal language for writing browser-based applets, where a secure, portable, and developer-friendly application language was needed. The reigning language of the day, C++, was not suitable for this domain.

Today, Java is more often used for server-side applications. It is one of the most popular languages in use for the development of web and enterprise applications.

However, Java was a child of its time. Now it shows its age. In 1995, Java provided a syntax similar enough to C++ to entice C++ developers, while avoiding many of that language’s deficiencies and “sharp edges”. Java adopted the most useful ideas for the development problems of its era, such as object-oriented programming (OOP), while discarding more troublesome techniques, such as manual memory management. These design choices struck an excellent balance that minimized complexity and maximized developer productivity, while trading off performance compared to natively-compiled code. While Java has evolved since its birth, many people believe it has grown too complex without adequately addressing some newer development challenges.

Developers want languages that are more succinct and flexible to improve their productivity. This is one reason why so-called “scripting” languages like Ruby and Python have become more popular recently.

The never-ending need to scale is driving architectures towards pervasive concurrency. However, Java’s concurrency model, which is based on synchronized access to shared, mutable state, results in complex and error-prone programs.

While the Java language is showing its age, the Java Virtual Machine (JVM) on which it runs continues to shine. The optimizations performed by today’s JVM are extraordinary, allowing byte code to outperform natively-compiled code in many cases. Today, many developers believe that using the JVM with new languages is the path forward. Sun is embracing this trend by employing many of the lead developers of JRuby and Jython, which are JVM ports of Ruby and Python, respectively.

The appeal of Scala for the Java developer is that it gives you a newer, more modern language, while leveraging the JVM’s amazing performance and the wealth of Java libraries that have been developed for over a decade.

If You Are a Ruby, Python, etc. Programmer…

Dynamically typed languages like Ruby, Python, Groovy, JavaScript, and Smalltalk offer very high productivity due to their flexibility, powerful metaprogramming, and elegance.

Despite their productivity advantages, dynamic languages may not be the best choices for all applications, particularly for very large code bases and high-performance applications. There is a longstanding, spirited debate in the programming community about the relative merits of dynamic vs. static typing. Many of the points of comparison are somewhat subjective. We won’t go through all the arguments here, but we will offer a few thoughts for consideration.

Optimizing the performance of a dynamic language is more challenging than for a static language. In a static language, optimizers can exploit the type information to make decisions. In a dynamic language, fewer such clues are available for the optimizer, making optimization choices harder. While recent advancements in optimizations for dynamic languages are promising, they lag behind the state of the art for static languages. So, if you require very high performance, static languages are probably a safer choice.

Static languages can also benefit the development process. IDE features like autocompletion (sometimes called code sense) are easier to implement for static languages, again because of the extra type information available. The more explicit type information in static code promotes better “self-documentation”, which can be important for communicating intent among developers, especially as a project grows.

When using a static language, you have to think about appropriate type choices more often, which forces you to weigh design choices more carefully. While this may slow down daily design decisions, thinking through the types in the application can result in a more coherent design over time.

Another small benefit of static languages is the extra checking the compiler performs. We think this advantage is often oversold, as type mismatch errors are a small fraction of the runtime errors you typically see. The compiler can’t find logic errors, which are far more significant. Only a comprehensive, automated test suite can find logic errors. For dynamically typed languages, the tests must cover possible type errors, too. If you are coming from a dynamically typed language, you may find that your test suites are a little smaller, as a result, but not that much smaller.

Many developers who find static languages too verbose often blame static typing for the verbosity when the real problem is a lack of type inference. In type inference, the compiler infers the types of values based on the context. For example, the compiler will recognize that x = 1 + 3 means that x must be an integer. Type inference reduces verbosity significantly, making the code feel more like code written in a dynamic language.
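
As a small taste of what this looks like in Scala (the subject of the rest of this book), the comments below show the types the compiler infers; none of them has to be written out.

val x = 1 + 3                        // inferred as Int
val greeting = "Hello, " + "World"   // inferred as java.lang.String
val numbers = List(1, 2, 3)          // inferred as List[Int]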

We have worked with both static and dynamic languages, at various times. We find both kinds of languages compelling for different reasons. We believe the modern software developer must master a range of languages and tools. Sometimes, a dynamic language will be the right tool for the job. At other times, a static language like Scala is just what you need.

Introducing Scala

Scala is a language that addresses the major needs of the modern developer. It is a statically typed, mixed-paradigm, JVM language with a succinct, elegant, and flexible syntax, a sophisticated type system, and idioms that promote scalability from small, interpreted scripts to large, sophisticated applications. That’s a mouthful, so let’s look at each of those ideas in more detail.

Statically Typed

As we described in the previous section, a statically typed language binds the type to a variable for the lifetime of that variable. In contrast, dynamically typed languages bind the type to the actual value referenced by a variable, meaning that the type of a variable can change along with the value it references.
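
To make the distinction concrete, here is a small Scala fragment; the last assignment is rejected at compile time, whereas a dynamically typed language would happily rebind the variable.

var count = 1        // count is bound to the type Int for its lifetime
count = 2            // fine: still an Int
// count = "two"     // compile error: type mismatch; found String, required Int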

Of the newer JVM languages, Scala is one of the few that is statically typed, and it is the best known among them.

Mixed Paradigm - Object Oriented Programming

Scala fully supports Object Oriented Programming (OOP). Scala improves upon Java’s support for OOP with the addition of traits, a clean way of implementing classes using mixin composition. Scala’s traits work much like Ruby’s modules. If you’re a Java programmer, think of traits as unifying interfaces with their implementations.
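
As a quick preview, here is a minimal sketch of a trait mixed into a class; the Logging trait and Service class are our own illustrative names, not library types.

trait Logging {
  def log(message: String) = println("LOG: " + message)   // a concrete method in a trait
}

class Service extends Logging {          // mix in the trait's implementation
  def run() = log("Service is running")
}

new Service().run()    // prints: LOG: Service is running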

In Scala, everything is really an object. Unlike Java, Scala has no primitive types; all numeric types are true objects. However, for optimal performance, Scala uses the underlying primitive types of the runtime whenever possible. Also, Scala does not support “static” or class-level members of types, since they are not associated with an actual instance. Instead, Scala provides a singleton object construct for those cases where exactly one instance of a type is needed.

Mixed Paradigm - Functional Programming

Scala fully supports Functional Programming (FP). FP is a programming paradigm that is older than OOP, but it has been sheltered in the ivory towers of academia until recently. Interest in FP is increasing because of the ways it simplifies certain design problems, especially concurrency. “Pure” functional languages don’t allow for any mutable state, thereby avoiding the need for synchronization on shared access to mutable state. Instead, programs written in pure functional languages communicate by passing messages between concurrent, autonomous processes. Scala supports this model with its Actors library, but it allows for both mutable and immutable variables.

Functions are “first-class” citizens in FP, meaning they can be assigned to variables, passed to other functions, etc., just like other values. This feature promotes composition of advanced behavior using primitive operations. Because Scala adheres to the dictum that everything is an object, functions are themselves objects in Scala.
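
For example, in Scala a function can be assigned to a variable and passed to another function, as this small sketch shows.

val double = (i: Int) => i * 2       // a function value, assigned to a variable
val numbers = List(1, 2, 3)
println(numbers.map(double))         // the function is passed to map: List(2, 4, 6)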

Scala also offers closures, a feature that dynamic languages like Python and Ruby have adopted from the functional programming world, and one sadly absent from recent versions of Java. Closures are functions that reference variables from the scope enclosing the function definition. That is, the variables aren’t passed in as arguments or defined as local variables within the function. A closure "closes around" these references, so the function invocation can safely refer to the variables even when the variables have gone out of scope! Closures are such a powerful abstraction that object systems and fundamental control structures are often implemented using them.
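
Here is a minimal closure sketch: the returned function literal refers to factor, which is neither an argument of the literal nor a local variable inside it.

def multiplier(factor: Int) = (i: Int) => i * factor   // the literal closes over factor

val triple = multiplier(3)
println(triple(5))   // prints 15; factor has gone out of scope, but the closure still sees it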

A JVM and .NET Language

While Scala is primarily known as a JVM language, meaning that Scala generates JVM byte code, a .NET version of Scala is also under development that generates CLR byte code. When we refer to the underlying “runtime”, we will usually discuss the JVM, but most of what we will say applies equally to both runtimes. When we discuss JVM-specific details, they generalize to the .NET version, except where noted.

The Scala compiler uses clever techniques to map Scala extensions to valid byte-code idioms. From Scala, you can easily invoke byte code that originated as Java source (for the JVM) or C# source (for .NET). Conversely, you can invoke Scala code from Java, C#, etc. Running on the JVM and CLR allows the Scala developer to leverage available libraries and to interoperate with other languages hosted on those runtimes.

A Succinct, Elegant, and Flexible Syntax

Java syntax can be verbose. Scala uses a number of techniques to minimize unnecessary syntax, making Scala code as succinct as code in most dynamically typed languages. Type inference minimizes the need for explicit type information in many contexts. Declarations of types and functions are very concise.

Scala allows function names to include non-alphanumeric characters. Combined with some syntactic sugar, this feature permits the user to define methods that look and behave like operators. As a result, libraries outside the core of the language can feel “native” to users.
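
For instance, a hypothetical Complex number class (our own sketch, not a library type) can define a method named +, which clients then call with familiar operator syntax.

class Complex(val real: Double, val imaginary: Double) {
  def +(that: Complex) =                    // a method whose name is the symbol +
    new Complex(real + that.real, imaginary + that.imaginary)
  override def toString = real + "+" + imaginary + "i"
}

val sum = new Complex(1.0, 2.0) + new Complex(3.0, 4.0)   // same as calling the + method
println(sum)    // prints: 4.0+6.0i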

A Sophisticated Type System

Scala extends the type system of Java with more flexible generics and a number of more advanced typing constructs. The type system can be intimidating at first, but most of the time you won’t need to worry about the advanced constructs. Type inference helps by automatically inferring type signatures, so that the user doesn’t have to provide trivial type information manually. When you need them, though, the advanced type features provide you with greater flexibility for solving design problems in a type-safe way.

Scalable - Architectures

Scala is designed to scale from small, interpreted scripts to large, distributed applications. Scala provides four language mechanisms that promote scalable composition of systems: 1) explicit selftypes, 2) abstract type members and generics, 3) nested classes, and 4) mixin composition using traits.

No other language provides all these mechanisms. Together, they allow applications to be constructed from reusable “components” in a type-safe and succinct manner. As we will see, many common design patterns and architectural techniques like dependency injection are easy to implement in Scala without the boilerplate code or lengthy XML configuration files that can make Java development tedious.
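
As a rough sketch of the flavor of this kind of composition (all the names here are our own inventions), a trait can declare the services it requires through a self type, and the concrete “wiring” happens when the pieces are mixed together.

trait Repository {
  def save(item: String): Unit               // an abstract service
}

trait InMemoryRepository extends Repository {
  def save(item: String) = println("saved: " + item)
}

trait OrderService {
  self: Repository =>                        // self type: a Repository must be mixed in
  def placeOrder(item: String) = save(item)  // uses the required service
}

// Compose the pieces; no XML configuration or boilerplate factories needed.
val service = new OrderService with InMemoryRepository
service.placeOrder("Programming Scala")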

Scalable - Performance

Because Scala code runs on the JVM and the CLR, it benefits from all the performance optimizations provided by those runtimes and all the third-party tools that support performance and scalability, such as profilers, distributed cache libraries, clustering mechanisms, etc. If you trust Java’s and C#'s performance, you can trust Scala’s performance. Of course, some particular constructs in the language and some parts of the library may perform significantly better or worse than alternative options in other languages. As always, you should profile your code and optimize it when necessary.

It might appear that OOP and FP are incompatible. In fact, a design philosophy of Scala is that OOP and FP are more synergistic than opposed. The features of one approach can enhance the other.

In FP, functions have no side effects and variables are immutable, while in OOP, mutable state and side effects are common, even encouraged. Scala lets you choose the approach that best fits your design problems. Functional programming is especially useful for concurrency, since it eliminates the need to synchronize access to mutable state. However, “pure” FP can be restrictive. Some design problems are easier to solve with mutable objects.

The name Scala is a contraction of the words scalable language. While this suggests that the pronunciation should be scale-ah, the creators of Scala actually pronounce it scah-lah, like the Italian word for “stairs”. That is, the two “a’s” are pronounced the same.

Scala was started by Martin Odersky in 2001. Martin is a professor in the School of Computer and Communication Sciences at the Ecole Polytechnique Fédérale de Lausanne (EPFL). He spent his graduate years working in the group headed by Niklaus Wirth, of Pascal fame. Martin worked on Pizza, an early functional language on the JVM. He later worked on GJ, a prototype of what later became Generics in Java, with Philip Wadler of Haskell fame. Martin was hired by Sun Microsystems to produce the reference implementation of javac, the Java compiler that ships with the Java Developer Kit (JDK) today.

Martin Odersky’s background and experience are evident in the language. As you learn Scala, you come to understand that it is the product of carefully considered design decisions, exploiting the state of the art in type theory, OOP and FP. Martin’s experience with the JVM is evident in Scala’s elegant integration with that platform. The synthesis it creates between OOP and FP is an excellent "best of both worlds" solution.

The Seductions of Scala

Today, our industry is fortunate to have a wide variety of language options. The power, flexibility, and elegance of dynamically typed languages have made them very popular again. Yet, the wealth of Java and .NET libraries and the performance of the JVM and CLR meet many practical needs for enterprise and Internet projects.

Scala is compelling because it feels like a dynamically typed scripting language, due to its succinct syntax and type inference. Yet, Scala gives you all the benefits of static typing, a modern object model, functional programming, and an advanced type system. These tools let you build scalable, modular applications that can reuse legacy Java and .NET APIs and leverage the performance of the JVM and CLR.

Scala is a language for professional developers. Compared to languages like Java and Ruby, Scala is a more difficult language to master, because it requires competency with OOP, FP, and static typing to use it most effectively. It is tempting to prefer the relative simplicity of dynamically-typed languages. Yet, this simplicity can be deceptive. In a dynamically typed language, it is often necessary to use metaprogramming features to implement advanced designs. While metaprogramming is powerful, using it well takes experience and the resulting code tends to be hard to understand, maintain, and debug. In Scala, many of the same design goals can be achieved in a type-safe manner by exploiting its type system and mixin composition through traits.

We feel that the extra effort required day to day to use Scala will promote more careful reflection about your designs. Over time, this discipline will yield more coherent, modular, and maintainable applications. Fortunately, you don’t need all of the sophistication of Scala all of the time. Much of your code will have the simplicity and clarity of code written in your favorite dynamically-typed language.

An alternative strategy is to combine several, simpler languages, e.g., Java for object-oriented code and Erlang for functional, concurrent code. Such a decomposition can work, but only if your system decomposes cleanly into such discrete parts and your team can manage a heterogeneous environment. Scala is attractive for situations where a single, all-in-one language is preferred. That said, Scala code can happily coexist with other languages, especially on the JVM or .NET.

Installing Scala

To get up and running as quickly as possible, this section describes how to install the command line tools for Scala, which are all you need to work with the examples in the book. For details on using Scala in various editors and IDEs, see Integration with IDEs in Chapter 14, Scala Tools, Libraries and IDE Support. The examples used in this book were written and compiled using Scala version 2.7.5.final, the latest release at the time of this writing, and “nightly builds” of Scala version 2.8.0, which may be finalized by the time you read this.

Note

Version 2.8 introduces many new features, which we will highlight throughout the book.

We will work with the JVM version of Scala in this book. First, you must have Java 1.4 or greater installed (1.5 or greater is recommended). If you need to install Java, go to http://www.java.com/en/download/manual.jsp and follow the instructions to install Java on your machine.

The official Scala web site is http://www.scala-lang.org/. To install Scala, go to the downloads page, http://www.scala-lang.org/downloads. Download the installer for your environment and follow the instructions on the downloads page.

The easiest cross-platform installer is the IzPack installer. Download the Scala jar file, either scala-2.7.5.final-installer.jar or scala-2.8.0.N-installer.jar, where N is the latest release of the 2.8.0 version. Go to the download directory in a terminal window, and install Scala with the java command. Assuming you downloaded scala-2.8.0.final-installer.jar, run the following command, which will guide you through the process.

java -jar scala-2.8.0.final-installer.jar

Tip

On Mac OS X, the easiest route to a working Scala installation is via MacPorts. Follow the installation instructions at http://www.macports.org/, then sudo port install scala. You’ll be up and running in a few minutes.

Throughout this book, we will use the symbol scala-home to refer to the “root” directory of your Scala installation.

Note

On Unix, Linux, and Mac OS X systems, you will need to run this command as the root user or using the sudo command if you want to install Scala under a system directory, e.g., scala-home = /usr/local/scala-2.8.0.final.

As an alternative, you can download and expand the compressed tar file (e.g., scala-2.8.0.final.tgz) or zip file (scala-2.8.0.final.zip). On Unix-like systems, expand the compressed file into a location of your choosing. Afterwards, add the scala-home/bin subdirectory in the new directory to your PATH. For example, if you installed into /usr/local/scala-2.8.0.final, then add /usr/local/scala-2.8.0.final/bin to your PATH.

To test your installation, run the following command on a command line.

scala -version

We’ll learn more about the scala command-line tool later. You should get something like the following output:

Scala code runner version 2.8.0.final -- Copyright 2002-2009, LAMP/EPFL

Of course, the version number you see will be different if you installed a different release. From now on, when we show command output that contains the version number, we’ll show it as version 2.8.0.final.

Congratulations, you have installed Scala! If you get an error message along the lines of scala: command not found, make sure your environment’s PATH is set properly to include the correct bin directory.

Note

Scala versions 2.7.X and earlier are compatible with JDK 1.4 and later. Scala version 2.8 drops 1.4 compatibility. Note that Scala uses many JDK classes as its own, for example, the String class. On .NET, Scala uses the corresponding .NET classes.

You can also find downloads for the API documentation and the sources for Scala itself on the same downloads page.

For More Information

As you explore Scala, you will find other useful resources that are available on http://scala-lang.org. You will find links for development support tools and libraries, tutorials, the language specification [ScalaSpec2009], and academic papers that describe features of the language.

The documentation for the Scala tools and APIs is especially useful. You can browse the API at http://www.scala-lang.org/docu/files/api/index.html. This documentation was generated using the scaladoc tool, analogous to Java’s javadoc tool. See the section called “The scaladoc Command Line Tool” in Chapter 14, Scala Tools, Libraries and IDE Support for more information.

You can also download a compressed file of the API documentation for local browsing using the appropriate link on the downloads page, http://www.scala-lang.org/downloads, or you can install it with the sbaz package tool, as follows.

sbaz install scala-devel-docs

sbaz is installed in the same bin directory as the scala and scalac command-line tools. The installed documentation also includes details on the scala tool chain (including sbaz) and code examples. For more information on the Scala command-line tools and other resources, see Chapter 14, Scala Tools, Libraries and IDE Support.

A Taste of Scala

It’s time to whet your appetite with some real Scala code. In the following examples, we’ll describe just enough of the details so you understand what’s going on. The goal is to give you a sense of what programming in Scala is like. We’ll explore the details of the features in subsequent chapters.

You can run our first example in one of two ways: interactively, or as a “script”.

Let’s start with the interactive mode. Start the scala interpreter by typing scala and the return key on your command line. You’ll see the following output. (Some of the version numbers may vary.)

Welcome to Scala version 2.8.0.final (Java ...).
Type in expressions to have them evaluated.
Type :help for more information.

scala>

The last line is the prompt that is waiting for your input. The interactive mode of the scala command is very convenient for experimentation (see the section called “The scala Command Line Tool” in Chapter 14, Scala Tools, Libraries and IDE Support for more details). An interactive interpreter like this is called a REPL: Read, Evaluate, Print, Loop.

Type in the following two lines of code.

val book = "Programming Scala"
println(book)

The actual input and output should look like the following.

scala> val book = "Programming Scala"
book: java.lang.String = Programming Scala

scala> println(book)
Programming Scala

scala>

The first line uses the val keyword to declare a read-only variable named book. Note that the output returned from the interpreter shows you the type and value of book. This can be very handy for understanding complex declarations. The second line prints the value of book, which is “Programming Scala”.

Tip

Experimenting with the scala command in the interactive mode (REPL) is a great way to learn the details of Scala.

Many of the examples in this book can be executed in the interpreter like this. However, it’s often more convenient to use the second option we mentioned, writing Scala scripts in a text editor or IDE and executing them with the same scala command. We’ll do that for most of the remaining examples in this chapter.

In your text editor of choice, save the Scala code in the following example to a file named upper1-script.scala in a directory of your choosing.

// code-examples/IntroducingScala/upper1-script.scala

class Upper {
  def upper(strings: String*): Seq[String] = {
    strings.map((s:String) => s.toUpperCase())
  }
}

val up = new Upper
Console.println(up.upper("A", "First", "Scala", "Program"))

This Scala script converts strings to upper case.

By the way, that’s a comment on the first line (with the name of the source file for the code example). Scala follows the same comment conventions as Java, C#, C++, etc. A // comment goes to the end of a line, while a /* comment */ can cross line boundaries.

To run this script, go to a command window, change to the same directory and run the following command.

scala upper1-script.scala

The file is interpreted, meaning it is compiled and executed in one step. You should get the following output:

Array(A, FIRST, SCALA, PROGRAM)

In this example, the upper method in the Upper class (no pun intended) converts the input strings to upper case and returns them in an array. The last line in the example converts four strings and prints the resulting Array.

Let’s examine the code in detail, so we can begin to learn Scala syntax. There are a lot of details in just six lines of code! We’ll explain the general ideas here. All the ideas used in this example will be explained more thoroughly in later sections of the book.

In the example, the Upper class begins with the class keyword. The class body is inside the outermost curly braces {…}.

The upper method definition begins on the second line with the def keyword, followed by the method name and an argument list, the return type of the method, an equals sign ‘=’, then the method body.

The argument list in parentheses is actually a variable-length argument list of Strings, indicated by the String* type following the colon. That is, you can pass in as many comma-separated strings as you want (including an empty list). These strings are stored in a parameter named strings. Inside the method, strings is actually an Array.

Note

When explicit type information for variables is written in the code, these type annotations follow the colon after the item name (i.e., Pascal-like syntax). Why doesn’t Scala follow Java conventions? Recall that type information is often inferred in Scala (unlike Java), meaning we don’t always show type annotations explicitly. The item: type convention is easier for the compiler to parse unambiguously than Java’s type item convention when the colon and the type annotation are omitted and only item remains.

The method return type appears after the argument list. In this case, the return type is Seq[String], where Seq ("sequence") is a particular kind of collection. It is a parameterized type (like a generic type in Java), parameterized here with String. Note that Scala uses square brackets […] for parameterized types, while Java uses angle brackets <…>.

Note

Scala allows angle brackets to be used in method names, e.g., naming a “less than” method < is common. So, to avoid ambiguities, Scala uses square brackets instead for parameterized types. They can’t be used in method names. Allowing < and > in method names is why Scala doesn’t follow Java’s convention for angle brackets.

The body of the upper method comes after the equals sign ‘=’. Why an equals sign? Why not just curly braces {…}, like in Java? Because semicolons, function return types, method argument lists, and even the curly braces are sometimes omitted, using an equals sign prevents several possible parsing ambiguities. Using an equals sign also reminds us that even functions are values in Scala, which is consistent with Scala’s support of functional programming, described in more detail in Chapter 8, Functional Programming in Scala.

The method body calls the map method on the strings array, which takes a function literal as an argument. Function literals are “anonymous” functions. They are similar to lambdas, closures, blocks, or procs in other languages. In Java, you would have to use an anonymous inner class here that implements a method defined by an interface, etc.

In this case, we passed in the following function literal.

(s:String) => s.toUpperCase()

It takes an argument list with a single String argument named s. The body of the function literal is after the “arrow”, =>. It calls toUpperCase() on s. The result of this call is returned by the function literal. In Scala, the last expression in a function is the return value, although you can have return statements elsewhere, too. The return keyword is optional here and it is rarely used, except when returning out of the middle of a block (e.g., in an if statement).

Note

The value of the last expression is the default return value of a function. No return is required.

So, map passes each String in strings to the function literal and builds up a new collection with the results returned by the function literal.

To exercise the code, we create a new Upper instance and assign it to a variable named up. As in Java, C#, and similar languages, the syntax new Upper creates a new instance. The up variable is declared as a read-only “value” using the val keyword.

Finally, we call the upper method on a list of strings, and print out the result with Console.println(…), which is equivalent to Java’s System.out.println(…).

We can actually simplify our script even further. Consider this simplified version of the script.

// code-examples/IntroducingScala/upper2-script.scala

object Upper {
  def upper(strings: String*) = strings.map(_.toUpperCase())
}

println(Upper.upper("A", "First", "Scala", "Program"))

This code does exactly the same thing, but with a third fewer characters.

On the first line, Upper is now declared as an object, which is a singleton. We are declaring a class, but the Scala runtime will only ever create one instance of Upper. (You can’t write new Upper, for example.) Scala uses objects for situations where other languages would use “class-level” members, like statics in Java. We don’t really need more than one instance here, so a singleton is fine.

Note

Why doesn’t Scala support statics? Since everything is an object in Scala, the object construct keeps this policy consistent. Java’s static methods and fields are not tied to an actual instance.

Note that this code is fully thread safe. We don’t declare any variables that might cause thread-safety issues. The API methods we use are also thread-safe. Therefore, we don’t need multiple instances. A singleton object works fine.

The implementation of upper on the second line is also simpler. Scala can usually infer the return type of the method (but not the types of the method arguments), so we drop the explicit declaration. Also, because there is only one expression in the method body, we drop the braces and put the entire method definition on one line. The equals sign before the method body tells the compiler, as well as the human reader, where the method body begins.

We have also exploited a short-hand for the function literal. Previously we wrote it as follows.

(s:String) => s.toUpperCase()

We can shorten it to the following expression.

_.toUpperCase()

Because map takes one argument, a function, we can use the “placeholder” indicator _ instead of a named parameter. That is, the _ acts like an anonymous variable, to which each string will be assigned before toUpperCase is called. Note that the String type is inferred for us, too. As we will see, Scala uses _ as a “wild card” in several contexts.
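
A few more illustrative one-liners (not part of the upper-case example) show the same placeholder shorthand with other collection methods.

List(1, 2, 3).map(_ * 2)           // List(2, 4, 6)
List(1, 2, 3).filter(_ % 2 == 1)   // List(1, 3)
List("a", "b").foreach(println(_)) // prints each element on its own line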

You can also use this short-hand syntax in some more complex function literals, as we will see in Chapter 3, Rounding Out the Essentials.

On the last line, using an object rather than a class simplifies the code. Instead of creating an instance with new Upper, we can just call the upper method on the Upper object directly (note how this looks like the syntax you would use when calling static methods in a Java class).

Finally, Scala automatically imports many methods for I/O, like println, so we don’t need to call Console.println(). We can just use println by itself. (See the section called “The Predef Object” in Chapter 7, The Scala Object System for details on the types and methods that are automatically imported or defined.)

Let’s do one last refactoring; let’s convert the script into a compiled, command-line tool.

// code-examples/IntroducingScala/upper3.scala

object Upper {
  def main(args: Array[String]) = {
    args.map(_.toUpperCase()).foreach(printf("%s ",_))
    println("")
  }
}

Now the upper method has been renamed main. Because Upper is an object, this main method works exactly like a static main method in a Java class. It is the entry point to the Upper application.

Note

In Scala, main must be a method in an object. (In Java, main must be a static method in a class.) The command line arguments for the application are passed to main in an array of strings, e.g., args: Array[String].

The first line inside the main method uses the same short-hand notation for map that we just examined.

args.map(_.toUpperCase())...

The call to map returns a new collection. We iterate through it with foreach. We use a _ placeholder shortcut again in another function literal that we pass to foreach. In this case, each string in the collection is passed as an argument to printf.

...foreach(printf("%s ",_))

To be clear, these two uses of ‘_’ are completely independent of each other. Method chaining and function-literal shorthands, as in this example, can take some getting used to, but once you are comfortable with them, they yield very readable code with minimal use of temporary variables.

The last line in main adds a final linefeed to the output.

This time, you must first compile the code to a JVM .class file using scalac.

scalac upper3.scala

You should now have a file named Upper.class, just as if you had compiled a Java class.

Note

You may have noticed that the compiler did not complain when the file was named upper3.scala and the object was named Upper. Unlike Java, the file name doesn’t have to match the name of the type with public scope. (We’ll explore the visibility rules in the section called “Visibility Rules” in Chapter 5, Basic Object-Oriented Programming in Scala.) In fact, unlike Java, you can have as many public types in a single file as you want. Furthermore, the directory location of a file doesn’t have to match the package declaration. However, you can certainly follow the Java conventions, if you want to.

Now, you can execute this command for any list of strings. Here is an example.

scala -cp . Upper Hello World!

The -cp . option adds the current directory to the search “class path”. You should get the following output.

 HELLO WORLD!

Therefore, we have met the requirement that a programming language book must start with a “hello world” program.

A Taste of Concurrency

There are many reasons to be seduced by Scala. One reason is the Actors API included in the Scala library, which is based on the robust Actors concurrency model built into Erlang [Haller2007]. Here is an example to whet your appetite.

In the Actor model of concurrency [Agha1987], independent software entities called actors share no state information with each other. Instead, they communicate by exchanging messages. By eliminating the need to synchronize access to shared, mutable state, it is far easier to write robust, concurrent applications.

In this example, instances in a geometric Shape hierarchy are sent to an actor for drawing on a display. Imagine a scenario where a rendering “farm” generates scenes in an animation. As the rendering of a scene is completed, the shape “primitives” that are part of the scene are sent to an actor for a display subsystem.

To begin, we define a Shape class hierarchy.

// code-examples/IntroducingScala/shapes.scala

package shapes {
  class Point(val x: Double, val y: Double) {
    override def toString() = "Point(" + x + "," + y + ")"
  }

  abstract class Shape() {
    def draw(): Unit
  }

  class Circle(val center: Point, val radius: Double) extends Shape {
    def draw() = println("Circle.draw: " + this)
    override def toString() = "Circle(" + center + "," + radius + ")"
  }

  class Rectangle(val lowerLeft: Point, val height: Double, val width: Double)
        extends Shape {
    def draw() = println("Rectangle.draw: " + this)
    override def toString() =
      "Rectangle(" + lowerLeft + "," + height + "," + width + ")"
  }

  class Triangle(val point1: Point, val point2: Point, val point3: Point)
        extends Shape {
    def draw() = println("Triangle.draw: " + this)
    override def toString() =
      "Triangle(" + point1 + "," + point2 + "," + point3 + ")"
  }
}

The Shape class hierarchy is defined in a shapes package. You can declare the package using Java syntax, but Scala also supports a syntax similar to C#'s “namespace” syntax, where the entire declaration is scoped using curly braces, as used here. The Java-style package declaration syntax is far more commonly used, however, being both compact and readable.

The Point class represents a two-dimensional point on a plane. Note the argument list after the class name. Those are constructor parameters. In Scala, the whole class body is the constructor, so you list the arguments for the primary constructor after the class name and before the class body. (We’ll see how to define auxiliary constructors in the section called “Constructors in Scala” in Chapter 5, Basic Object-Oriented Programming in Scala.) Because we put the val keyword before each parameter declaration, each parameter is automatically converted to a read-only field with a public reader method of the same name. That is, when you create a Point instance, e.g., point, you can read the fields using point.x and point.y. If you want mutable fields, then use the keyword var. We’ll explore variable declarations and the val and var keywords in the section called “Variable Declarations” in Chapter 2, Type Less, Do More.

The body of Point defines one method, an override of the familiar toString method in Java (like ToString in C#). Note that Scala, like C#, requires the override keyword whenever you override a concrete method. Unlike C#, you don’t have to use a virtual keyword on the original concrete method. In fact, there is no virtual keyword in Scala. As before, we omit the curly braces "{…}" around the body of toString, since it has only one expression.

Shape is an abstract class. Abstract classes in Scala are similar to those in Java and C#. We can’t instantiate instances of abstract classes, even when all their field and method members are concrete.

In this case, Shape declares an abstract draw method. We know it is abstract because it has no body. No abstract keyword is required on the method. Abstract methods in Scala are just like abstract methods in Java and C#. (See the section called “Overriding Members of Classes and Traits” in Chapter 6, Advanced Object-Oriented Programming In Scala for more details.)

The draw method returns Unit, which is a type that is roughly equivalent to void in C-derived languages like Java, etc. (See the section called “The Scala Type Hierarchy” in Chapter 7, The Scala Object System for more details.)

Circle is declared as a concrete subclass of Shape. It defines the draw method to simply print a message to the console. Circle also overrides toString.

Rectangle is also a concrete subclass of Shape that defines draw and overrides toString. For simplicity, we assume it is not rotated relative to the X and Y axes. Hence, all we need is one point (the lower left-hand corner will do), plus the height and width of the rectangle.

Triangle follows the same pattern. It takes three Points as its constructor arguments.

The draw methods in Circle, Rectangle, and Triangle all use this. As in Java and C#, this is how an instance refers to itself. In this context, where this is the right-hand side of a String concatenation expression (using the plus sign), this.toString is invoked implicitly.

Note

Of course, in a real application, you would not implement drawing in “domain model” classes like this, since the implementations would depend on details like the operating system platform, graphics API, etc. We will see a better design approach when we discuss traits in Chapter 4, Traits.

Now that we have defined our shapes types, let’s return to actors. We define an Actor that receives “messages” that are shapes to draw.

// code-examples/IntroducingScala/shapes-actor.scala

package shapes {
  import scala.actors._
  import scala.actors.Actor._

  object ShapeDrawingActor extends Actor {
    def act() {
      loop {
        receive {
          case s: Shape => s.draw()
          case "exit"   => println("exiting..."); exit
          case x: Any   => println("Error: Unknown message! " + x)
        }
      }
    }
  }
}

The actor is declared to be part of the shapes package. Next, we have two import statements.

The first import statement imports all the types in the scala.actors package. In Scala, the underscore _ is used the way the star * is used in Java.

Note

Because * is a valid character for a function name, it can’t be used as the import wild card. Instead, _ is reserved for this purpose.

All the methods and public fields from Actor are imported. These are not static imports from the Actor type, as they would be in Java. Rather, they are imported from an object that is also named Actor. The class and object can have the same name, as we will see in the section called “Companion Objects” in Chapter 6, Advanced Object-Oriented Programming In Scala.

Our actor, ShapeDrawingActor, is an object that extends Actor (the type, not the object). The act method is overridden to do the unique work of the actor. Because act is an abstract method in Actor, we don’t need the override keyword. Our actor loops indefinitely, waiting for incoming messages.

During each pass in the loop, the receive method is called. It blocks until a new message arrives. Why is the code after receive enclosed in curly braces ({…}) and not parentheses ((…))? We will learn later that there are cases where this substitution is allowed and it is quite useful (see Chapter 3, Rounding Out the Essentials). For now, what you need to know is that the expressions inside the braces constitute a single function literal that is passed to receive. This function literal does a pattern match on the message instance to decide how to handle the message. Because of the case clauses, it looks like a typical switch statement in Java, for example, and the behavior is very similar.

The first case does a type comparison with the message. (There is no explicit variable for the message instance in the code; it is inferred.) If the message is of type Shape, the first case matches. The message instance is cast to a Shape and assigned to the variable s, then the draw method is called on it.

If the message is not a Shape, the second case is tried. If the message is the string "exit", the actor prints a message and terminates execution. Actors should usually have a way to exit gracefully!

The last case clause handles any other message instance, thereby functioning as the default case. The actor reports an error and then drops the message. Any is the parent of all types in the Scala type hierarchy, like Object is the root type in Java and other languages. Hence, this case clause will match any message of any type. Pattern matching is eager; we have to put this case clause at the end, so it doesn’t consume the messages we are expecting!

Recall that we declared draw as an abstract method in Shape and we implemented draw in the concrete subclasses. Hence, the code in the first case statement invokes a polymorphic operation.

Finally, here is a script that uses the ShapeDrawingActor actor.

// code-examples/IntroducingScala/shapes-actor-script.scala

import shapes._

ShapeDrawingActor.start()

ShapeDrawingActor ! new Circle(new Point(0.0,0.0), 1.0)
ShapeDrawingActor ! new Rectangle(new Point(0.0,0.0), 2, 5)
ShapeDrawingActor ! new Triangle(new Point(0.0,0.0),
                                 new Point(1.0,0.0),
                                 new Point(0.0,1.0))
ShapeDrawingActor ! 3.14159

ShapeDrawingActor ! "exit"

The shapes in the shapes package are imported.

The ShapeDrawingActor actor is started. By default, it runs in its own thread (there are alternatives we will discuss in Chapter 9, Robust, Scalable Concurrency with Actors), waiting for messages.

Five messages are sent to the actor, using the syntax actor ! message. The first message sends a Circle instance. The actor “draws” the circle. The second message sends a Rectangle message. The actor “draws” the rectangle. The third message does the same thing for a Triangle. The fourth message sends a Double that is approximately equal to Pi. This is an unknown message for the actor, so it just prints an error message. The final message sends an “exit” string, which causes the actor to terminate.

To try out the actor example, start by compiling the first two files. You can get the sources from the O’Reilly download site (see the section called “Getting the Code Examples” in the Preface for details) or you can create them yourself.

Use the following command to compile the files.

 scalac shapes.scala shapes-actor.scala

While the source file names and locations don’t have to match the file contents, you will notice that the generated class files are written to a shapes directory and there is one class file for each class we defined. The class file names and locations must conform to the JVM requirements.

Now you can run the script to see the actor in action.

 scala -cp . shapes-actor-script.scala

You should see the following output.

Circle.draw: Circle(Point(0.0,0.0),1.0)
Rectangle.draw: Rectangle(Point(0.0,0.0),2.0,5.0)
Triangle.draw: Triangle(Point(0.0,0.0),Point(1.0,0.0),Point(0.0,1.0))
Error: Unknown message! 3.14159
exiting...

For more on Actors, see Chapter 9, Robust, Scalable Concurrency with Actors.

Recap and What’s Next

We made the case for Scala and got you started with two sample Scala programs, one of which gave you a taste of Scala’s Actors library for concurrency. Next, we’ll dive into more Scala syntax, emphasizing various keystroke-economical ways of getting lots of work done.

Chapter 2. Type Less, Do More

In This Chapter

We ended the last chapter with a few “teaser” examples of Scala code. This chapter discusses uses of Scala that promote succinct, flexible code. We’ll discuss organization of files and packages, importing other types, variable declarations, miscellaneous syntax conventions and a few other concepts. We’ll emphasize how the concise syntax of Scala helps you work better and faster.

Scala’s syntax is especially useful when writing scripts. Separate compile and run steps aren’t required for simple programs that have few dependencies on libraries outside of what Scala provides. You compile and run such programs in one shot with the scala command. If you’ve downloaded the example code for the book, many of the smaller examples can be run using the scala command, e.g., scala filename.scala. See the README.txt files in each chapter’s code examples for more details. See also the section called “Command Line Tools” in Chapter 14, Scala Tools, Libraries and IDE Support for more information about using the scala command.

Semicolons

You may have already noticed that there were very few semicolons in the code examples in the previous chapter. You can use semicolons to separate statements and expressions, as in Java, C, PHP, and similar languages. In most cases, though, Scala behaves like many scripting languages in treating the end of the line as the end of a statement or an expression. When a statement or expression is too long for one line, Scala can usually infer when you are continuing on to the next line, as shown in this example.

// code-examples/TypeLessDoMore/semicolon-example-script.scala

// Trailing equals sign indicates more code on next line
def equalsign = {
  val reallySuperLongValueNameThatGoesOnForeverSoYouNeedANewLine =
    "wow that was a long value name"

  println(reallySuperLongValueNameThatGoesOnForeverSoYouNeedANewLine)
}

// Trailing opening curly brace indicates more code on next line
def equalsign2(s: String) = {
  println("equalsign2: " + s)
}

// Trailing comma, operator, etc. indicates more code on next line
def commas(s1: String,
           s2: String) = {
  println("comma: " + s1 +
          ", " + s2)
}

When you want to put multiple statements or expressions on the same line, you can use semicolons to separate them. We used this technique in the ShapeDrawingActor example in the section called “A Taste of Concurrency” in Chapter 1, Zero to Sixty: Introducing Scala.

case "exit" => println("exiting..."); exit

This code could also be written as follows.

...
case "exit" =>
  println("exiting...")
  exit
...

You might wonder why you don’t need curly braces ({…}) around the two statements after the case … => line. You can put them in if you want, but the compiler knows when you’ve reached the end of the “block” when it finds the next case clause or the curly brace (}) that ends the enclosing block for all the case clauses.

Omitting optional semicolons means fewer characters to type and fewer characters to clutter your code. Breaking separate statements onto their own lines increases your code’s readability.

Variable Declarations

Scala allows you to decide whether a variable is immutable (read-only) or not (read-write) when you declare it. An immutable “variable” is declared with the keyword val (think value object).

val array: Array[String] = new Array(5)

To be more precise, the array reference cannot be changed to point to a different Array, but the array itself can be modified, as shown in the following scala session.

scala> val array: Array[String] = new Array(5)
array: Array[String] = Array(null, null, null, null, null)

scala> array = new Array(2)
<console>:5: error: reassignment to val
       array = new Array(2)
             ^

scala> array(0) = "Hello"

scala> array
res3: Array[String] = Array(Hello, null, null, null, null)

scala>

An immutable val must be initialized (that is, defined) when it is declared.

A mutable variable is declared with the keyword var.

scala> var stockPrice: Double = 100.
stockPrice: Double = 100.0

scala> stockPrice = 10.
stockPrice: Double = 10.0

scala>

Scala also requires you to initialize a var when it is declared. You can assign a new value to a var as often as you want. Again, to be precise, the stockPrice reference can be changed to point to a different Double object (e.g., 10.). In this case, the object that stockPrice refers to can’t be changed, because Doubles in Scala are immutable.

There are a few exceptions to the rule that you must initialize vals and vars when they are declared. Both keywords can be used with constructor parameters; in that case, the mutable or immutable variables are initialized when an object is instantiated. Both keywords can also be used to declare “abstract” (uninitialized) variables in abstract types. Finally, derived types can override vals declared inside parent types. We’ll discuss these exceptions in Chapter 5, Basic Object-Oriented Programming in Scala.
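
Here is a small sketch of two of those exceptions, using our own illustrative names: a val constructor parameter and an abstract val that a concrete subclass must define.

abstract class Account(val id: String) {   // val constructor parameter: initialized at instantiation
  val interestRate: Double                 // abstract val: no initializer here
}

class SavingsAccount(id: String) extends Account(id) {
  val interestRate = 0.05                  // the concrete subclass supplies the value
}

println(new SavingsAccount("abc-123").interestRate)   // 0.05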

Scala encourages you to use immutable values whenever possible. As we will see, this promotes better object-oriented design and it is consistent with the principles of “pure” functional programming. It may take some getting used to, but you’ll find a newfound confidence in your code when it is written in an immutable style.

Note

The var and val keywords only specify if the reference can be changed to refer to a different object (var) or not (val). They don’t specify whether or not the object they reference is mutable.

Method Declarations

We saw several examples in Chapter 1, Zero to Sixty: Introducing Scala of how to define methods, which are functions that are members of a class. Method definitions start with the def keyword, followed by optional argument lists, a colon character ‘:’ and the return type of the method, an equals sign ‘=’, and finally the method body. Methods are implicitly declared “abstract” if you leave off the equals sign and method body. The enclosing type is then itself abstract. We’ll discuss abstract types in more detail in Chapter 5, Basic Object-Oriented Programming in Scala.

We said “argument lists”, plural: Scala lets you define more than one argument list for a method. This is required for currying methods, which we’ll discuss in the section called “Currying” in Chapter 8, Functional Programming in Scala. It is also very useful for defining your own domain-specific languages (DSLs), as we’ll see in Chapter 11, Domain-Specific Languages in Scala. Note that each argument list is surrounded by parentheses and the arguments are separated by commas.
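
As a quick sketch (with our own method name), here is a method with two argument lists; the trailing underscore form creates a partially applied function, a technique Chapter 8 explores.

def greet(greeting: String)(name: String) =
  println(greeting + ", " + name + "!")

greet("Hello")("Scala")           // Hello, Scala!

val hello = greet("Hello") _      // supply only the first list: a function String => Unit
hello("World")                    // Hello, World!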

If a method body has more than one expression, you must surround it with curly braces {…}. You can omit the braces if the method body has just one expression.

Method Default and Named Arguments (Scala Version 2.8)

Many languages let you define default values for some or all of the arguments to a method. Consider the following script with a StringUtil object that lets you join a list of strings with a user-specified separator.

// code-examples/TypeLessDoMore/string-util-v1-script.scala
// Version 1 of "StringUtil".

object StringUtil {
  def joiner(strings: List[String], separator: String): String =
    strings.mkString(separator)

  def joiner(strings: List[String]): String = joiner(strings, " ")
}
import StringUtil._  // Import the joiner methods.

println( joiner(List("Programming", "Scala")) )

There are actually two “overloaded” joiner methods. The second one uses a single space as the “default” separator. Having two methods seems a bit wasteful. It would be nice if we could eliminate the second joiner method and declare that the separator argument in the first joiner has a default value. In fact, in Scala version 2.8, you can now do this.

// code-examples/TypeLessDoMore/string-util-v2-v28-script.scala
// Version 2 of "StringUtil" for Scala v2.8 only.

object StringUtil {
  def joiner(strings: List[String], separator: String = " "): String =
    strings.mkString(separator)
}
import StringUtil._  // Import the joiner methods.

println(joiner(List("Programming", "Scala")))

There is another alternative for earlier versions of Scala. You can use implicit arguments, which we will discuss in the section called “Implicit Function Parameters” in Chapter 8, Functional Programming in Scala.
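
For a rough idea of what that looks like (our own variation on joiner, with the details deferred to Chapter 8), the separator becomes an implicit parameter that the compiler fills in from an implicit value in scope.

object StringUtil {
  def joiner(strings: List[String])(implicit separator: String): String =
    strings.mkString(separator)
}

implicit val defaultSeparator = " "    // picked up automatically when the second list is omitted

println(StringUtil.joiner(List("Programming", "Scala")))         // uses " "
println(StringUtil.joiner(List("Programming", "Scala"))(", "))   // pass the separator explicitly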

Scala version 2.8 offers another enhancement for method argument lists, named arguments. We could actually write the last line of the previous example in several ways. All of the following println statements are functionally equivalent.

println(joiner(List("Programming", "Scala")))
println(joiner(strings = List("Programming", "Scala")))
println(joiner(List("Programming", "Scala"), " "))   // #1
println(joiner(List("Programming", "Scala"), separator = " ")) // #2
println(joiner(strings = List("Programming", "Scala"), separator = " "))

Why is this useful? First, if you choose good names for the method arguments, then your calls to those methods document each argument with a name. For example, compare the two lines with comments #1 and #2. In the first line, it may not be obvious what the second argument, " ", is for. In the second line, we supply the name separator, which suggests the purpose of the argument.

The second benefit is that you can specify the parameters in any order when you specify them by name. Combined with default values, you can write code like the following.

// code-examples/TypeLessDoMore/user-profile-v28-script.scala
// Scala v2.8 only.

object OptionalUserProfileInfo {
  val UnknownLocation = ""
  val UnknownAge = -1
  val UnknownWebSite = ""
}

class OptionalUserProfileInfo(
  location: String = OptionalUserProfileInfo.UnknownLocation,
  age: Int         = OptionalUserProfileInfo.UnknownAge,
  webSite: String  = OptionalUserProfileInfo.UnknownWebSite)

println( new OptionalUserProfileInfo )
println( new OptionalUserProfileInfo(age = 29) )
println( new OptionalUserProfileInfo(age = 29, location="Earth") )

OptionalUserProfileInfo represents all the “optional” user profile data in your next Web 2.0 social networking site. It defines default values for all its fields. The script creates instances with zero or more named parameters. The order of those parameters is arbitrary.

The examples we have shown use constant values as the defaults. Most languages with default argument values only allow constants or other values that can be determined at parse time. However, in Scala, any expression can be used as the default, as long as it can compile where it is used. For example, an expression could not refer to an instance field that will be computed inside the class or object body, but it could invoke a method on a singleton object.
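
To make this concrete, here is a small sketch (the Timestamp object and log method are hypothetical, and it requires Scala version 2.8) where the default value is computed by calling a method on a singleton object each time the method is invoked without that argument.

object Timestamp {
  def now: String = new java.util.Date().toString
}

def log(message: String, time: String = Timestamp.now): Unit =
  println(time + ": " + message)

log("application started")                  // default evaluated at the call site
log("fixed time", "2009-05-01 12:00:00")    // explicit value overrides the default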

Finally, another constraint on named parameters is that once you provide a name for a parameter in a method invocation, the rest of the parameters after it must also be named. For example, new OptionalUserProfileInfo(age = 29, "Earth") would not compile, because the second argument is not named.

We’ll see another useful example of named and default arguments when we discuss case classes in the section called “Case Classes” in Chapter 6, Advanced Object-Oriented Programming In Scala.

Nesting Method Definitions

Method definitions can also be nested. Here is an implementation of a factorial calculator, where we use a conventional technique of calling a second, nested method to do the work.

// code-examples/TypeLessDoMore/factorial-script.scala

def factorial(i: Int): Int = {
  def fact(i: Int, accumulator: Int): Int = {
    if (i <= 1)
      accumulator
    else
      fact(i - 1, i * accumulator)
  }

  fact(i, 1)
}

println( factorial(0) )
println( factorial(1) )
println( factorial(2) )
println( factorial(3) )
println( factorial(4) )
println( factorial(5) )

The second method calls itself recursively, passing an accumulator parameter, where the result of the calculation is “accumulated”. Note that we return the accumulated value when the counter i reaches 1. (We’re ignoring invalid negative integers. The function actually returns 1 for i < 0.) After the definition of the nested method, factorial calls it with the passed-in value i and the initial accumulator value of 1.

Like a local variable declaration in many languages, a nested method is only visible inside the enclosing method. If you try to call fact outside of factorial, you will get a compiler error.

Did you notice that we use i as a parameter name twice, first in the factorial method and again in the nested fact method? As in many languages, the use of i as a parameter name for fact “shadows” the outer use of i as a parameter name for factorial. This is fine, because we don’t need the outer value of i inside fact. We only use it the first time we call fact, at the end of factorial.

What if we need to use a variable that is defined outside a nested method? Consider this contrived example.

// code-examples/TypeLessDoMore/count-to-script.scala

def countTo(n: Int): Unit = {
  def count(i: Int): Unit = {
    if (i <= n) {
      println(i)
      count(i + 1)
    }
  }
  count(1)
}

countTo(5)

Note that the nested count method uses the n value that is passed as a parameter to countTo. There is no need to pass n as an argument to count. Because count is nested inside countTo, n is visible to it.

The declaration of a field (member variable) can be prefixed with keywords indicating the visibility, just as in languages like Java and C#. Similarly the declaration of a non-nested method can be prefixed with the same keywords. We will discuss the visibility rules and keywords in the section called “Visibility Rules” in Chapter 5, Basic Object-Oriented Programming in Scala.
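
As a preview, here is a small sketch (with hypothetical names; the full rules come in Chapter 5) showing visibility keywords on a field and a method.

class Account(private var balance: Double) {   // balance is visible only inside Account
  protected def audit(msg: String): Unit =     // visible to Account and its subclasses
    println("AUDIT: " + msg)

  def deposit(amount: Double): Unit = {        // no keyword: public by default
    balance += amount
    audit("deposited " + amount)
  }
}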

Inferring Type Information

Statically-typed languages can be very verbose. Consider this typical declaration in Java.

import java.util.Map;
import java.util.HashMap;
...
Map<Integer, String> intToStringMap = new HashMap<Integer, String>();

We have to specify the type parameters <Integer, String> twice. (Scala uses the term type annotations for explicit type declarations like HashMap<Integer, String>.)

Scala supports type inference (see, for example, [TypeInference] and [Pierce2002]). The language’s compiler can discern quite a bit of type information from the context, without explicit type annotations. Here’s the same declaration rewritten in Scala, with inferred type information.

import java.util.Map
import java.util.HashMap
...
val intToStringMap: Map[Integer, String] = new HashMap

Recall from Chapter 1 that Scala uses square brackets ([…]) for generic type parameters. We specify Map[Integer, String] on the left-hand side of the equals sign. (We are sticking with Java types for the example.) On the right-hand side, we instantiate the actual type we want, a HashMap, but we don’t have to repeat the type parameters.

For completeness, suppose we don’t actually care if the instance is of type Map (the Java interface type). It can be of type HashMap for all we care.

import java.util.Map
import java.util.HashMap
...
val intToStringMap2 = new HashMap[Integer, String]

This declaration requires no type annotations on the left-hand side because all of the type information needed is on the right-hand side. The compiler automatically makes intToStringMap2 a HashMap[Integer,String].

Type inference is used for methods, too. In most cases, the return type of the method can be inferred, so the ‘:’ and return type can be omitted. However, type annotations are required for all method parameters.
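
For example, in this small sketch the return types are inferred, while the parameter types must still be annotated.

def twice(i: Int) = 2 * i                        // return type Int is inferred
def greeting(name: String) = "Hello, " + name    // return type String is inferred

println( twice(21) )          // 42
println( greeting("Scala") )  // Hello, Scala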

Pure functional languages like Haskell (see, e.g., [O'Sullivan2009]) use type inference algorithms like Hindley-Milner (see [Spiewak2008] for an easily digested explanation). Code written in these languages requires type annotations less often than Scala code does, because Scala’s type inference algorithm has to support object-oriented typing as well as functional typing. To summarize, explicit type annotations are required in Scala for all method parameters, for val and var declarations that aren’t given an initial value (i.e., abstract members), and for method return types in the following cases: when the method uses an explicit return, when the method is recursive, when the method is overloaded and one overload calls another, and when the inferred return type would be more general than you intended.

Note

The Any type is the root of the Scala type hierarchy (see the section called “The Scala Type Hierarchy” in Chapter 7, The Scala Object System for more details). If a block of code returns a value of type Any unexpectedly, chances are good that the type inferencer couldn’t figure out what type to return, so it chose the most generic possible type.

Let’s look at examples where explicit declarations of method return types are required. In the following script, the upCase method has a conditional return statement for zero-length strings.

// code-examples/TypeLessDoMore/method-nested-return-script.scala
// ERROR: Won't compile until you put a String return type on upCase.

def upCase(s: String) = {
  if (s.length == 0)
    return s    // ERROR - forces return type of upCase to be declared.
  else
    s.toUpperCase()
}

println( upCase("") )
println( upCase("Hello") )

Running this script gives you the following error.

... 6: error: method upCase has return statement; needs result type
        return s
         ^

You can fix this error by changing the first line of the method to the following.

def upCase(s: String): String = {

Actually, for this particular script, an alternative fix is to remove the return keyword from that line. The return keyword is not needed for the code to work properly; we included it only to illustrate our point.
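
That is, the following sketch of the alternative compiles without an explicit return type, because there is no return keyword and both branches of the if evaluate to a String.

def upCase(s: String) = {
  if (s.length == 0)
    s                   // no "return", so the result type can be inferred
  else
    s.toUpperCase()
}

println( upCase("") )       // prints an empty line
println( upCase("Hello") )  // HELLO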

Recursive methods also require an explicit return type. Recall our factorial method in the section called “Nesting Method Definitions”, previously in this chapter. Let’s remove the : Int return type on the nested fact method.

// code-examples/TypeLessDoMore/method-recursive-return-script.scala
// ERROR: Won't compile until you put an Int return type on "fact".

def factorial(i: Int) = {
  def fact(i: Int, accumulator: Int) = {
    if (i <= 1)
      accumulator
    else
      fact(i - 1, i * accumulator)  // ERROR
  }

  fact(i, 1)
}

Now it fails to compile.

... 9: error: recursive method fact needs result type
            fact(i - 1, i * accumulator)
             ^

Overloaded methods can sometimes require an explicit return type. When one such method calls another, we have to add a return type to the one doing the calling, as in this example.

// code-examples/TypeLessDoMore/method-overloaded-return-script.scala
// Version 1 of "StringUtil" (with a compilation error).
// ERROR: Won't compile: needs a String return type on the second "joiner".

object StringUtil {
  def joiner(strings: List[String], separator: String): String =
    strings.mkString(separator)

  def joiner(strings: List[String]) = joiner(strings, " ")   // ERROR
}
import StringUtil._  // Import the joiner methods.

println( joiner(List("Programming", "Scala")) )

The two joiner methods concatenate a List of strings together. The first method also takes an argument for the separator string. The second method calls the first with a “default” separator of a single space.

If you run this script, you get the following error.

... 9: error: overloaded method joiner needs result type
def joiner(strings: List[String]) = joiner(strings, "")

Since the second joiner method calls the first, it requires an explicit String return type. It should look like this.

  def joiner(strings: List[String]): String = joiner(strings, " ")

The final scenario can be subtle: a more general return type is inferred than you expected. You usually see this error when you assign a value returned from a function to a variable with a more specific type. For example, you were expecting a String, but the function inferred an Any for the returned object. Let’s see a contrived example that reflects a bug where this scenario can occur.

// code-examples/TypeLessDoMore/method-broad-inference-return-script.scala
// ERROR: Won't compile. Method actually returns List[Any], which is too "broad".

def makeList(strings: String*) = {
  if (strings.length == 0)
    List(0)  // #1
  else
    strings.toList
}

val list: List[String] = makeList()  // ERROR

Running this script returns the following error.

...11: error: type mismatch;
 found   : List[Any]
 required: List[String]
val list: List[String] = makeList()
                          ^

We intended for makeList to return a List[String], but when strings.length equals zero, we returned List(0), incorrectly “assuming” that this expression is the correct way to create an empty list. In fact, we returned a List[Int] with one element, 0. We should have returned List(). Since the else expression returns a List[String], the result of strings.toList, the inferred return type for the method is the closest common super type of List[Int] and List[String], which is List[Any]. Note that the compilation error doesn’t occur in the function definition. We only see it when we attempt to assign the value returned from makeList to a List[String] variable.

In this case, fixing the bug is the solution. Alternatively, when there isn’t a bug, it may be that the compiler just needs the “help” of an explicit return type declaration. Investigate the method that appears to return the unexpected type. In our experience, you often find that you modified that method (or another one in the call path) in such a way that the compiler now infers a more general return type than necessary. Add the explicit return type in this case.

Another way to prevent these problems is to always declare return types for methods, especially when defining methods for a public API. Let’s revisit our StringUtil example and see why explicit declarations are a good idea (adapted from [Smith2009a]).

Here is our StringUtil “API” again with a new method, toCollection.

// code-examples/TypeLessDoMore/string-util-v3.scala
// Version 3 of "StringUtil" (for all versions of Scala).

object StringUtil {
  def joiner(strings: List[String], separator: String): String =
    strings.mkString(separator)

  def joiner(strings: List[String]): String = strings.mkString(" ")

  def toCollection(string: String) = string.split(' ')
}

The toCollection method splits a string on spaces and returns an Array containing the substrings. The return type is inferred, which is a potential problem, as we will see. The method is somewhat contrived, but it will illustrate our point. Here is a client of StringUtil that uses this method.

// code-examples/TypeLessDoMore/string-util-client.scala

import StringUtil._

object StringUtilClient {
  def main(args: Array[String]) = {
    args foreach { s => toCollection(s).foreach { x => println(x) } }
  }
}

If you compile these files with scalac, you can run the client as follows.

$ scala -cp ... StringUtilClient "Programming Scala"
Programming
Scala

Note

For the -cp … class path argument, use the directory where scalac wrote the class files, which defaults to the current directory (i.e., use -cp .). If you used the build process in the downloaded code examples, the class files are written to the build directory (using scalac -d build ...). In this case, use -cp build.

Everything is fine at this point, but now imagine that the code base has grown. StringUtil and its clients are now built separately and bundled into different jars. Imagine also that the maintainers of StringUtil decide to return a List instead of an Array.

object StringUtil {
  ...

  def toCollection(string: String) = string.split(' ').toList  // changed!
}

The only difference is the final call to toList that converts the computed Array to a List. You recompile StringUtil and redeploy its jar. Then you run the same client, without recompiling it first.

$ scala -cp ... StringUtilClient "Programming Scala"
java.lang.NoSuchMethodError: StringUtil$.toCollection(...
  at StringUtilClient$$anonfun$main$1.apply(string-util-client.scala:6)
  at StringUtilClient$$anonfun$main$1.apply(string-util-client.scala:6)
...

What happened? When the client was compiled, StringUtil.toCollection returned an Array. Then toCollection was changed to return a List. In both versions, the method return type was inferred. Therefore, the client should have been recompiled, too.

However, had an explicit return type of Seq been declared, which is a parent for both Array and List, then the implementation change would not have forced a recompilation of the client.
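
For example, toCollection could have been declared with an explicit, more general return type, along these lines (a sketch showing only the changed method; the Array returned by split is usable as a Seq, either directly or through an implicit conversion, depending on the Scala version).

object StringUtil {
  // ...

  def toCollection(string: String): Seq[String] = string.split(' ')  // explicit return type
}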

Note

When developing APIs that are built separately from their clients, declare method return types explicitly and use the most general return type you can. This is especially important when APIs declare abstract methods (see, e.g., Chapter 4, Traits).

There is another scenario to watch for when using declarations of collections like val map = Map(), as in this example.

val map = Map()

map.update("book", "Programming Scala")
... 3: error: type mismatch;
 found   : java.lang.String("book")
 required: Nothing
map.update("book", "Programming Scala")
            ^

What happened? The type parameters of the generic type Map were inferred as [Nothing,Nothing] when the map was created. (We’ll discuss Nothing in the section called “The Scala Type Hierarchy” in Chapter 7, The Scala Object System, but its name is suggestive!) We attempted to insert an incompatible key/value pair of types String and String. Call it a Map to nowhere! The solution is to parameterize the initial map declaration, e.g., val map = Map[String, String](), or to specify initial values so the map’s type parameters are inferred, e.g., val map = Map("Programming" -> "Scala").
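
Here is a sketch of both fixes.

val map1 = Map[String, String]()            // explicit type parameters
val map2 = Map("Programming" -> "Scala")    // type parameters inferred from the entry

println( map1 + ("book" -> "Programming Scala") )   // adding an entry now type checks
println( map2 )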

Finally, there is a subtle behavior with inferred return types that can cause unexpected and baffling results [ScalaTips]. Consider the following example scala session.

scala> def double(i: Int) { 2 * i }
double: (Int)Unit

scala> println(double(2))
()

Why did the second command print () instead of 4? Look carefully at what the scala interpreter said the first command returned: double: (Int)Unit. We defined a method named double that takes an Int argument and returns Unit. The method doesn’t return an Int as we would expect.

The cause of this unexpected behavior is a missing equals sign in the method definition. Here is the definition we actually intended.

scala> def double(i: Int) = { 2 * i }
double: (Int)Int

scala> println(double(2))
4

Note the equals sign before the body of double. Now, the output says we have defined double to return an Int and the second command does what we expect it to do.

There is a reason for this behavior. Scala regards a method with an equals sign before the body as a function definition, and a function always returns a value in functional programming. On the other hand, when Scala sees a method body without the leading equals sign, it assumes the programmer intended the method to be a “procedure” definition, one that is executed only for its side effects and returns Unit. In practice, it is more likely that the programmer simply forgot to insert the equals sign!

Warning

When the return type of a method is inferred and you don’t use an equals sign before the opening brace for the method body, Scala infers a Unit return type, even when the last expression in the method is a value of another type.

By the way, where did that () come from that was printed before we fixed the bug? It is actually the real name of the singleton instance of the Unit type! (This name is a functional programming convention.)

Literals

Often, a new object is initialized with a literal value, such as val book = "Programming Scala". Let’s discuss the kinds of literal values supported by Scala. Here, we’ll limit ourselves to lexical syntax literals. We’ll cover literal syntax for functions (used as values, not member methods), tuples, and certain types like Lists and Maps, as we come to them.

Integer Literals

Integer literals can be expressed in decimal, hexadecimal, or octal. The details are summarized in Table 2.1, “Integer literals.”.

Table 2.1. Integer literals.

Kind          Format                                                            Examples
Decimal       0, or a nonzero digit followed by zero or more digits (0-9)       0, 1, 321
Hexadecimal   0x followed by one or more hexadecimal digits (0-9, A-F, a-f)     0xFF, 0x1a3b
Octal         0 followed by one or more octal digits (0-7)                      013, 077


For Long literals, it is necessary to append the L or l character at the end of the literal. Otherwise, an Int is used. The valid values for an integer literal are bounded by the type of the variable to which the value will be assigned. Table 2.2, “Ranges of allowed values for integer literals (boundaries are inclusive).” defines the limits, which are inclusive.

Table 2.2. Ranges of allowed values for integer literals (boundaries are inclusive).

Target Type   Minimum (inclusive)   Maximum (inclusive)
Long          −2^63                 2^63 − 1
Int           −2^31                 2^31 − 1
Short         −2^15                 2^15 − 1
Char          0                     2^16 − 1
Byte          −2^7                  2^7 − 1


A compile-time error occurs if an integer literal number is specified that is outside these ranges, as in the following examples.

scala> val i = 12345678901234567890
<console>:1: error: integer number too large
       val i = 12345678901234567890
scala> val b: Byte = 128
<console>:4: error: type mismatch;
 found   : Int(128)
 required: Byte
       val b: Byte = 128
                     ^

scala> val b: Byte = 127
b: Byte = 127

Floating Point Literals

Floating point literals are expressions with zero or more digits, followed by a period ., followed by zero or more digits. If there are no digits before the period, i.e., the number is less than 1.0, then there must be one or more digits after the period. For Float literals, append the F or f character at the end of the literal. Otherwise, a Double is assumed. You can optionally append a D or d for a Double.

Floating point literals can be expressed with or without exponentials. The format of the exponential part is e or E, followed by an optional + or -, followed by one or more digits.

Here are some example floating point literals.

0.
.0
0.0
3.
3.14
.14
0.14
3e5
3E5
3.E5
3.e5
3.e+5
3.e-5
3.14e-5
3.14e-5f
3.14e-5F
3.14e-5d
3.14e-5D

Float consists of all IEEE 754 32-bit, single-precision binary floating point values. Double consists of all IEEE 754 64-bit, double-precision binary floating point values.

Warning

To avoid parsing ambiguities, you must have at least one space after a floating point literal, if it is followed by a token that starts with a letter. Also, the expression 1.toString returns the integer value 1 as a string, while 1. toString uses the operator notation to invoke toString on the floating point literal 1..

Boolean Literals

The boolean literals are true and false. The type of the variable to which they are assigned will be inferred to be Boolean.

scala> val b1 = true
b1: Boolean = true

scala> val b2 = false
b2: Boolean = false

Character Literals

A character literal is either a printable Unicode character or an escape sequence, written between single quotes. A character with Unicode value between 0 and 255 may also be represented by an octal escape, a backslash \ followed by a sequence of up to three octal characters. It is a compile time error if a backslash character in a character or string literal does not start a valid escape sequence.

Here are some examples.

'A'
'\u0041'  // 'A' in Unicode
'\n'
'\012'    // '\n' in octal
'\t'

The valid escape sequences are shown in Table 2.3, “Character escape sequences.”.

Table 2.3. Character escape sequences.

Sequence   Unicode   Meaning
\b         \u0008    backspace BS
\t         \u0009    horizontal tab HT
\n         \u000a    line feed LF
\f         \u000c    form feed FF
\r         \u000d    carriage return CR
\"         \u0022    double quote "
\'         \u0027    single quote '
\\         \u005c    backslash \

String Literals

A string literal is a sequence of characters enclosed in double quotes or triples of double quotes, i.e., """…""".

For string literals in double quotes, the allowed characters are the same as the character literals. However, if a double quote " character appears in the string, it must be “escaped” with a \ character. Here are some examples.

"Programming\nScala"
"He exclaimed, \"Scala is great!\""
"First\tSecond"

The string literals bounded by triples of double quotes are also called multi-line string literals. These strings can cover several lines; the line feeds will be part of the string. They can include any characters, including one or two double quotes together, but not three together. They are useful for strings with \ characters that don’t form valid Unicode or escape sequences, like the valid sequences listed in Table 2.3, “Character escape sequences.”. Regular expressions are a typical example, which we’ll discuss in Chapter 3, Rounding Out the Essentials. However, if escape sequences appear, they aren’t interpreted.

Here are three example strings.

"""Programming\nScala"""
"""He exclaimed, "Scala is great!" """
"""First line\n
Second line\t

Fourth line"""

Note that we had to add a space before the trailing """ in the second example to prevent a parse error. Trying to escape the second " that ends the "Scala is great!" quote, i.e., "Scala is great!\", doesn’t work.

Copy and paste these strings into the scala interpreter. Do the same for the previous string examples. How are they interpreted differently?

Symbol Literals

Scala supports symbols, which are interned strings, meaning that two symbols with the same “name”, i.e., the same character sequence, will actually refer to the same object in memory. Symbols are used less often in Scala than in some other languages, like Ruby, Smalltalk, and Lisp. They are useful as map keys instead of strings.

A symbol literal is a single quote ', followed by a letter, followed by zero or more digits and letters. Note that an expression like '1 is invalid, because the compiler thinks it is an incomplete character literal.

A symbol literal 'id is a shorthand for the expression scala.Symbol("id").
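
For example, this small sketch shows that two symbols with the same name are the same interned object.

val s1 = 'book
val s2 = scala.Symbol("book")
println( s1 == s2 )   // true: both refer to the same interned symbol
println( s1.name )    // book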

Note

If you want to create a symbol that contains whitespace, use e.g., scala.Symbol(" Programming Scala "). All the whitespace is preserved.

Tuples

How many times have you wanted to return two or more values from a method? In many languages, like Java, you only have a few options, none of which is very appealing. You could pass in parameters to the method that will be modified for all or some of the “return” values, which is ugly. Or, you could declare some small “structural” class that holds the two or more values, then return an instance of that class.

Scala supports tuples, a grouping of two or more items, usually created with the literal syntax of a comma-separated list of the items inside parentheses, e.g., (x1, x2, …). The types of the xi elements are unrelated to each other; you can mix and match types. These literal “groupings” are instantiated as scala.TupleN instances, where the N is the number of items in the tuple. The Scala API defines separate TupleN classes for N between 1 and 22, inclusive. Tuple instances are immutable, first-class values, so you can assign them to variables, pass them as values, and return them from methods.

The following example demonstrates the use of tuples.

// code-examples/TypeLessDoMore/tuple-example-script.scala

def tupleator(x1: Any, x2: Any, x3: Any) = (x1, x2, x3)

val t = tupleator("Hello", 1, 2.3)
println( "Print the whole tuple: " + t )
println( "Print the first item:  " + t._1 )
println( "Print the second item: " + t._2 )
println( "Print the third item:  " + t._3 )

val (t1, t2, t3) = tupleator("World", '!', 0x22)
println( t1 + " " + t2 + " " + t3 )

Running this script with scala produces the following output.

Print the whole tuple: (Hello,1,2.3)
Print the first item:  Hello
Print the second item: 1
Print the third item:  2.3
World ! 34

The tupleator method simply returns a “3-tuple” with the input arguments. The first statement that uses this method assigns the returned tuple to a single variable t. The next four statements print t in various ways. The first print statement calls Tuple3.toString, which wraps parentheses around the item list. The following three statements print each item in t separately. The expression t._N retrieves the Nth item, starting at 1, not 0 (this choice follows functional programming conventions).

The last two lines show that we can use a tuple expression on the left-hand side of the assignment. We declare three vals, t1, t2, and t3, to hold the individual items in the tuple. In essence, the tuple items are extracted automatically.

Notice how we mixed types in the tuples. You can see the types more clearly if you use the interactive mode of the scala command, which we introduced in Chapter 1, Zero to Sixty: Introducing Scala.

Invoke the scala command with no script argument. At the scala> prompt, enter val t = ("Hello",1,2.3) and see that you get the following result, which shows you the types of each element in the tuple.

scala> val t = ("Hello",1,2.3)
t: (java.lang.String, Int, Double) = (Hello,1,2.3)

It’s worth noting that there’s more than one way to define a tuple. We’ve been using the more common parenthesized syntax, but you can also use the arrow operator between two values, as well as special factory methods on the tuple-related classes.

scala> 1 -> 2
res0: (Int, Int) = (1,2)

scala> Tuple2(1, 2)
res1: (Int, Int) = (1,2)

scala> Pair(1, 2)
res2: (Int, Int) = (1,2)

Option, Some, and None: Avoiding nulls

We’ll discuss the standard type hierarchy for Scala in the section called “The Scala Type Hierarchy” in Chapter 7, The Scala Object System. However, three useful classes to understand now are the Option class and its two subclasses, Some and None.

Most languages have a special keyword or object that’s assigned to reference variables when there’s nothing else for them to refer to. In Java, this is null; in Ruby, it’s nil. In Java, null is a keyword, not an object, and thus it’s illegal to call any methods on it. But this is a confusing choice on the language designer’s part. Why return a keyword when the programmer expects an object?

To be more consistent with the goal of making everything an object, as well as to conform with functional programming conventions, Scala encourages you to use the Option type for variables and function return values when they may or may not refer to a value. When there is no value, use None, an object that is a subclass of Option. When there is a value, use Some, which wraps the value. Some is also a subclass of Option.

Note

None is declared as an object, not a class, because we really only need one instance of it. In that sense, it’s like the null keyword, but it is a real object with methods.

You can see Option, Some, and None in action in the following example, where we create a map of state capitals in the United States.

// code-examples/TypeLessDoMore/state-capitals-subset-script.scala

val stateCapitals = Map(
  "Alabama" -> "Montgomery",
  "Alaska"  -> "Juneau",
  // ...
  "Wyoming" -> "Cheyenne")

println( "Get the capitals wrapped in Options:" )
println( "Alabama: " + stateCapitals.get("Alabama") )
println( "Wyoming: " + stateCapitals.get("Wyoming") )
println( "Unknown: " + stateCapitals.get("Unknown") )

println( "Get the capitals themselves out of the Options:" )
println( "Alabama: " + stateCapitals.get("Alabama").get )
println( "Wyoming: " + stateCapitals.get("Wyoming").getOrElse("Oops!") )
println( "Unknown: " + stateCapitals.get("Unknown").getOrElse("Oops2!") )

The convenient -> syntax for defining name-value pairs to initialize a Map will be discussed in the section called “The Predef Object” in Chapter 7, The Scala Object System. For now, we want to focus on the two groups of println statements, where we show what happens when you retrieve the values from the map. If you run this script with the scala command, you’ll get the following output.

Get the capitals wrapped in Options:
Alabama: Some(Montgomery)
Wyoming: Some(Cheyenne)
Unknown: None
Get the capitals themselves out of the Options:
Alabama: Montgomery
Wyoming: Cheyenne
Unknown: Oops2!

The first group of println statements invoke toString implicitly on the instances returned by get. We are calling toString on Some or None instances, because the values returned by Map.get are automatically wrapped in a Some, when there is a value in the map for the specified key. Note that the Scala library doesn’t store the Some in the map, it wraps the value in a Some upon retrieval. Conversely, when we ask for a map entry that doesn’t exist, the None object is returned, rather than null. This occurred in the last println of the three.

The second group of println statements go a step further. After calling Map.get, they call get or getOrElse on each Option instance to retrieve the value it contains. Option.get requires that the Option is not empty, that is, the Option instance must actually be a Some. In this case, get returns the value wrapped by the Some, as demonstrated in the println where we print the capital of Alabama. However, if the Option is actually None, then None.get throws a NoSuchElementException.

We also show the alternative method, getOrElse, in the last two println statements. This method returns either the value in the Option, if it is a Some instance, or it returns the second argument we passed to getOrElse, if it is a None instance. In other words, the second argument to getOrElse functions as the default return value.

So, getOrElse is the more defensive of the two methods. It avoids a potential thrown exception. We’ll discuss the merits of alternatives like get vs. getOrElse in the section called “Exceptions and the Alternatives” in Chapter 13, Application Design.

Note that because the Map.get method returns an Option, it automatically documents the fact that there may not be an item matching the specified key. The map handles this situation by returning a None. Most languages would return null (or the equivalent) when there is no “real” value to return. You learn from experience to expect a possible null. Using Option makes the behavior more explicit in the method signature, so it’s more self-documenting.

Also, thanks to Scala’s static typing, you can’t make the mistake of attempting to call a method on a value that might actually be null. While this mistake is easy to do in Java, it won’t compile in Scala because you must first extract the value from the Option. So, the use of Option strongly encourages more resilient programming.

Because Scala runs on the JVM and .NET and because it must interoperate with other libraries, Scala has to support null. Still, you should avoid using null in your code. Tony Hoare, who invented the null reference in 1965 while working on an object-oriented language called ALGOL W, called its invention his “billion dollar mistake” [Hoare2009]. Don’t contribute to that figure.

So, how would you write a method that returns an Option? Here is a possible implementation of get that could be used by a concrete subclass of Map (Map.get itself is abstract). For a more sophisticated version, see the implementation of get in scala.collection.immutable.HashMap in the Scala library source code distribution.

def get(key: A): Option[B] = {
  if (contains(key))
    new Some(getValue(key))
  else
    None
}

The contains method is also defined for Map. It returns true if the map contains a value for the specified key. The getValue method is intended to be an internal method that retrieves the value from the underlying storage, whatever it is.

Note how the value returned by getValue is wrapped in a Some[B], where the type B is inferred. However, if the call to contains(key) returns false, then the object None is returned.

You can use this same idiom when your methods return an Option. We’ll explore other uses for Option in subsequent sections. Its pervasive use in Scala code makes it an important concept to grasp.
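
Here is one more sketch, with a hypothetical findCapital method, showing the typical shape of a method that returns an Option.

def findCapital(state: String): Option[String] =
  if (state == "Alaska") Some("Juneau") else None

println( findCapital("Alaska").getOrElse("Unknown") )   // Juneau
println( findCapital("Oz").getOrElse("Unknown") )       // Unknown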

Organizing Code in Files and Namespaces

Scala adopts the package concept that Java uses for namespaces, but Scala offers a more flexible syntax. Just as file names don’t have to match the type names, the package structure does not have to match the directory structure. So, you can define packages in files independent of their “physical” location.

The following example defines a class MyClass in a package com.example.mypkg using the conventional Java syntax.

// code-examples/TypeLessDoMore/package-example1.scala

package com.example.mypkg

class MyClass {
  // ...
}

The next, contrived example defines packages using the nested package syntax in Scala, which is similar to the namespace syntax in C# and the use of modules as namespaces in Ruby.

// code-examples/TypeLessDoMore/package-example2.scala

package com {
  package example {
    package pkg1 {
      class Class11 {
        def m = "m11"
      }
      class Class12 {
        def m = "m12"
      }
    }

    package pkg2 {
      class Class21 {
        def m = "m21"
        def makeClass11 = {
          new pkg1.Class11
        }
        def makeClass12 = {
          new pkg1.Class12
        }
      }
    }

    package pkg3.pkg31.pkg311 {
      class Class311 {
        def m = "m21"
      }
    }
  }
}

Two packages pkg1 and pkg2 are defined under the com.example package. A total of three classes are defined between the two packages. The makeClass11 and makeClass12 methods in Class21 illustrate how to reference a type in the “sibling” package, pkg1. You can also reference these classes by their full paths, com.example.pkg1.Class11 and com.example.pkg1.Class12, respectively.

The package pkg3.pkg31.pkg311 shows that you can “chain” several packages together in one clause. It is not necessary to use a separate package clause for each package.

Following the conventions of Java, the root package for Scala’s library classes is named scala.

Warning

Scala does not allow package declarations in scripts that are executed directly with the scala interpreter. The reason has to do with the way the interpreter converts statements in scripts to valid Scala code before compiling to byte code. See the section called “The scala Command Line Tool” in Chapter 14, Scala Tools, Libraries and IDE Support for more details.

Importing Types and Their Members

To use declarations in packages, you have to import them, just as you do in Java and similarly for other languages. However, compared to Java, Scala greatly expands your options. The following example illustrates several ways to import Java types.

// code-examples/TypeLessDoMore/import-example1.scala

import java.awt._
import java.io.File
import java.io.File._
import java.util.{Map, HashMap}

You can import all types in a package, using the underscore _ as a wild card, as shown on the first line. You can also import individual Scala or Java types, as shown on the second line.

Java uses the “star” character * as the wild card for matching all types in a package or all static members of a type when doing “static imports”. In Scala, this character is allowed in method names, so _ is used as a wild card, as we saw previously.

As shown on the third line, you can import all the static methods and fields in Java types. If java.io.File were actually a Scala object, as discussed previously, then this line would import the fields and methods from the object.

Finally, you can selectively import just the types you care about. On the fourth line, we import just the java.util.Map and java.util.HashMap types from the java.util package. Compare this one-line import statement with the two-line import statements we used in our first example in the section called “Inferring Type Information”. They are functionally equivalent.

The next example shows more advanced options for import statements.

// code-examples/TypeLessDoMore/import-example2-script.scala

def writeAboutBigInteger() = {

  import java.math.BigInteger.{
    ONE => _,
    TEN,
    ZERO => JAVAZERO }

  // ONE is effectively undefined
  // println( "ONE: "+ONE )
  println( "TEN: "+TEN )
  println( "ZERO: "+JAVAZERO )
}

writeAboutBigInteger()

This example demonstrates two features. First, we can put import statements almost anywhere we want, not just at the top of the file, as required by Java. This feature allows us to scope the imports more narrowly. For example, we can’t reference the imported BigInteger definitions outside the scope of the method. Another advantage of this feature is that it puts an import statement closer to where the imported items are actually used.

The second feature shown is the ability to rename imported items. First, the java.math.BigInteger.ONE constant is renamed to the underscore wild card. This effectively makes it invisible and unavailable to the importing scope. This is a useful technique when you want to import everything except a few particular items.

Next, the java.math.BigInteger.TEN constant is imported without renaming, so it can be referenced simply as TEN.

Finally, the java.math.BigInteger.ZERO constant is given the alias JAVAZERO.

Aliasing is useful if you want to give the item a more convenient name or you want to avoid ambiguities with other items in scope that have the same name.
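
For example, a common use of aliasing (shown here as a sketch) is avoiding a clash between java.util.List and Scala’s own List.

import java.util.{List => JList, ArrayList => JArrayList}

val names: JList[String] = new JArrayList[String]()
names.add("Programming")
names.add("Scala")
println( names )           // [Programming, Scala]
println( List(1, 2, 3) )   // Scala's List is still available under its usual name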

Imports are Relative

There’s one other important thing to know about imports: they are relative. Note the comments for the following imports:

// code-examples/TypeLessDoMore/relative-imports.scala

import scala.collection.mutable._
import collection.immutable._         // Since "scala" is already imported
import _root_.scala.collection.jcl._  // full path from real "root"

package scala.actors {
  import remote._                     // We're in the scope of "scala.actors"
}

Note that the last import statement nested in the scala.actors package scope is relative to that scope.

The [ScalaWiki] has other examples at http://scala.sygneca.com/faqs/language#how-do-i-import.

It’s fairly rare that you’ll have problems with relative imports, but they can cause surprises, especially if you are accustomed to languages like Java, where imports are absolute. If you get a mystifying compiler error that a package wasn’t found, check that the path in the import statement is relative to the enclosing package or to a previous import, or add the _root_. prefix. Also, you might see an IDE or other tool insert an import _root_… statement in your code. Now you know what it means.

Warning

Remember that import statements are relative, not absolute. To create an absolute path, start with _root_.

Abstract Types And Parameterized Types

We mentioned in the section called “A Taste of Scala” in Chapter 1, Zero to Sixty: Introducing Scala that Scala supports parameterized types, which are very similar to generics in Java. (We could use the two terms interchangeably, but it’s more common to use “parameterized types” in the Scala community and “generics” in the Java community.) The most obvious difference is in the syntax, where Scala uses square brackets ([…]), while Java uses angle brackets (<…>).

For example, a list of strings would be declared as follows.

val languages: List[String] = ...

There are other important differences with Java’s generics, which we’ll explore in the section called “Understanding Parameterized Types” in Chapter 12, The Scala Type System.

For now, we’ll mention one other useful detail that you’ll encounter before we can explain it in depth in Chapter 12, The Scala Type System. If you look at the declaration of scala.List in the Scaladocs, you’ll see that the declaration is written as … class List[+A]. The ‘+’ in front of the A means that List[B] is a subtype of List[A] for any B that is a subtype of A. If there is a ‘-’ in front of a type parameter, then the relationship goes the other way, Foo[B] would be a supertype of Foo[A], if the declaration is Foo[-A].
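
Here is a quick sketch of what the ‘+’ (covariance) annotation buys you: because List[+A] is covariant, a List[String] can be used wherever a List[Any] is expected.

val strings: List[String] = List("Programming", "Scala")
val anys: List[Any] = strings   // allowed, because String is a subtype of Any
println( anys )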

Scala supports another type abstraction mechanism called abstract types, used in many functional programming languages, such as Haskell. Abstract types were also considered for inclusion in Java when generics were adopted. We want to introduce them now, because you’ll see many examples of them before we dive into their details in Chapter 12, The Scala Type System. For a very detailed comparison of these two mechanisms, see [Bruce1998].

Abstract types can be applied to many of the same design problems for which parameterized types are used. However, while the two mechanisms overlap, they are not redundant. Each has strengths and weaknesses for certain design problems.

Here is an example that uses an abstract type.

// code-examples/TypeLessDoMore/abstract-types-script.scala

import java.io._

abstract class BulkReader {
  type In
  val source: In
  def read: String
}

class StringBulkReader(val source: String) extends BulkReader {
  type In = String
  def read = source
}

class FileBulkReader(val source: File) extends BulkReader {
  type In = File
  def read = {
    val in = new BufferedInputStream(new FileInputStream(source))
    val numBytes = in.available()
    val bytes = new Array[Byte](numBytes)
    in.read(bytes, 0, numBytes)
    new String(bytes)
  }
}

println( new StringBulkReader("Hello Scala!").read )
println( new FileBulkReader(new File("abstract-types-script.scala")).read )

Running this script with scala produces the following output.

Hello Scala!
import java.io._

abstract class BulkReader {
...

The BulkReader abstract class declares three abstract members, a type named In, a val field source, and a read method. As in Java, instances in Scala can only be created from concrete classes, which must have definitions for all members.

The derived classes, StringBulkReader and FileBulkReader, provide concrete definitions for these abstract members. We’ll cover the details of class declarations in Chapter 5, Basic Object-Oriented Programming in Scala and the particulars of overriding member declarations in the section called “Overriding Members of Classes and Traits” in Chapter 6, Advanced Object-Oriented Programming In Scala.

For now, note that the type field works very much like a type parameter in a parameterized type. In fact, we could rewrite this example as follows, where we show only what would be different.

abstract class BulkReader[In] {
  val source: In
  ...
}

class StringBulkReader(val source: String) extends BulkReader[String] {...}

class FileBulkReader(val source: File) extends BulkReader[File] {...}

Just as for parameterized types, if we define the In type to be String, then the source field must also be defined as a String. Note that the StringBulkReader's read method simply returns the source field, while the FileBulkReader's read method reads the contents of the file.

As demonstrated by [Bruce1998], parameterized types tend to be best for collections, which is how they are most often used in Java code, while abstract types are most useful for type “families” and other type scenarios.

We’ll explore the details of Scala’s abstract types in Chapter 12, The Scala Type System. For example, we’ll see how to constrain the possible concrete types that can be used.

Reserved Words

Table 2.4, “Reserved Words.” lists the reserved words in Scala, which we sometimes call “keywords”, and briefly describes how they are used [ScalaSpec2009].

Table 2.4. Reserved Words.

Word Description See …

abstract

Makes a declaration abstract. Unlike Java, the keyword is usually not required for abstract members.

the section called “Class and Object Basics”, (Chapter 5, Basic Object-Oriented Programming in Scala)

case

Start a case clause in a match expression.

the section called “Pattern Matching” (Chapter 3, Rounding Out the Essentials)

catch

Start a clause for catching thrown exceptions.

the section called “Using try, catch, and finally Clauses” (Chapter 3, Rounding Out the Essentials)

class

Start a class declaration.

the section called “Class and Object Basics” (Chapter 5, Basic Object-Oriented Programming in Scala)

def

Start a method declaration.

the section called “Method Declarations” (Chapter 2, Type Less, Do More)

do

Start a do … while loop.

the section called “Other Looping Constructs” (Chapter 3, Rounding Out the Essentials)

else

Start an else clause for an if clause.

the section called “Scala if Statements” (Chapter 3, Rounding Out the Essentials)

extends

Indicates that the class or trait that follows is the parent type of the class or trait being declared.

the section called “Parent Classes” (Chapter 5, Basic Object-Oriented Programming in Scala)

false

Boolean false.

the section called “The Scala Type Hierarchy” (Chapter 7, The Scala Object System)

final

Applied to a class or trait to prohibit deriving child types from it. Applied to a member to prohibit overriding it in a derived class or trait.

the section called “Attempting to Override final Declarations” (Chapter 6, Advanced Object-Oriented Programming In Scala)

finally

Start a clause that is executed after the corresponding try clause, whether or not an exception is thrown by the try clause.

the section called “Using try, catch, and finally Clauses” (Chapter 3, Rounding Out the Essentials)

for

Start a for comprehension (loop).

the section called “Scala for Comprehensions” (Chapter 3, Rounding Out the Essentials)

forSome

Used in existential type declarations to constrain the allowed concrete types that can be used.

the section called “Existential Types” (Chapter 12, The Scala Type System)

if

Start an if clause.

the section called “Scala if Statements” (Chapter 3, Rounding Out the Essentials)

implicit

Marks a method as eligible to be used as an implicit type converter. Marks a method parameter as optional, as long as a type-compatible substitute object is in the scope where the method is called.

the section called “Implicit Conversions” (Chapter 8, Functional Programming in Scala)

import

Import one or more types or members of types into the current scope.

the section called “Importing Types and Their Members” (this chapter)

lazy

Defer evaluation of a val.

the section called “Lazy Vals” (Chapter 8, Functional Programming in Scala)

match

Start a pattern matching clause.

the section called “Pattern Matching” (Chapter 3, Rounding Out the Essentials)

new

Create a new instance of a class.

the section called “Class and Object Basics” (Chapter 5, Basic Object-Oriented Programming in Scala)

null

Value of a reference variable that has not been assigned a value.

the section called “The Scala Type Hierarchy” (Chapter 7, The Scala Object System)

object

Start a singleton declaration; a class with only one instance.

the section called “Classes and Objects: Where Are the Statics?” (Chapter 5, Basic Object-Oriented Programming in Scala)

override

Override a concrete member of a class or trait, as long as the original is not marked final.

the section called “Overriding Members of Classes and Traits” (Chapter 6, Advanced Object-Oriented Programming In Scala)

package

Start a package scope declaration.

the section called “Organizing Code in Files and Namespaces” (Chapter 2, Type Less, Do More)

private

Restrict visibility of a declaration.

the section called “Visibility Rules” (Chapter 5, Basic Object-Oriented Programming in Scala)

protected

Restrict visibility of a declaration.

the section called “Visibility Rules” (Chapter 5, Basic Object-Oriented Programming in Scala)

requires

Deprecated. Was used for self typing.

the section called “The Scala Type Hierarchy” (Chapter 7, The Scala Object System)

return

Return from a function

the section called “A Taste of Scala” (Chapter 1, Zero to Sixty: Introducing Scala)

sealed

Applied to a parent class to require all directly derived classes to be declared in the same source file.

the section called “Case Classes” (Chapter 6, Advanced Object-Oriented Programming In Scala)

super

Analogous to this, but binds to the parent type.

the section called “Overriding Abstract and Concrete Methods” (Chapter 6, Advanced Object-Oriented Programming In Scala)

this

How an object refers to itself. The method name for auxiliary constructors.

the section called “Class and Object Basics” (Chapter 5, Basic Object-Oriented Programming in Scala)

throw

Throw an exception.

the section called “Using try, catch, and finally Clauses” (Chapter 3, Rounding Out the Essentials)

trait

A mixin module that adds additional state and behavior to an instance of a class

Chapter 4, Traits

try

Start a block that may throw an exception.

the section called “Using try, catch, and finally Clauses” (Chapter 3, Rounding Out the Essentials)

true

Boolean true.

the section called “The Scala Type Hierarchy” (Chapter 7, The Scala Object System)

type

Start a type declaration

the section called “Abstract Types And Parameterized Types” (Chapter 2, Type Less, Do More)

val

Start a read-only “variable” declaration.

the section called “Variable Declarations” (Chapter 2, Type Less, Do More)

var

Start a read/write variable declaration.

the section called “Variable Declarations” (Chapter 2, Type Less, Do More)

while

Start a while loop.

the section called “Other Looping Constructs” (Chapter 3, Rounding Out the Essentials)

with

Include the trait that follows in the class being declared or the object being instantiated.

Chapter 4, Traits

yield

Return an element in a for comprehension that becomes part of a sequence.

the section called “Yielding” (Chapter 3, Rounding Out the Essentials)

_

A place holder, used in imports, function literals, etc.

Many

:

Separator between identifiers and type annotations.

the section called “A Taste of Scala” (Chapter 1, Zero to Sixty: Introducing Scala)

=

Assignment

the section called “A Taste of Scala” (Chapter 1, Zero to Sixty: Introducing Scala)

=>

Used in function literals to separate the argument list from the function body.

the section called “Function Literals and Closures” (Chapter 8, Functional Programming in Scala)

<-

Used in for comprehensions in generator expressions.

the section called “Scala for Comprehensions” (Chapter 3, Rounding Out the Essentials)

<:

Used in parameterized and abstract type declarations to constrain the allowed types.

the section called “Type Bounds” (Chapter 12, The Scala Type System)

<%

Used in parameterized and abstract type “view bounds” declarations.

the section called “Type Bounds” (Chapter 12, The Scala Type System)

>:

Used in parameterized and abstract type declarations to constrain the allowed types.

the section called “Type Bounds” (Chapter 12, The Scala Type System)

#

Used in type projections

the section called “Type Projections” (Chapter 12, The Scala Type System)

@

Marks an annotation

the section called “Annotations” (Chapter 13, Application Design)

⇒

(Unicode \u21D2) Same as =>.

the section called “Function Literals and Closures” (Chapter 8, Functional Programming in Scala)

←

(Unicode \u2190) Same as <-.

the section called “Scala for Comprehensions” (Chapter 3, Rounding Out the Essentials)


Notice that break and continue are not listed. These control keywords don’t exist in Scala. Instead, Scala encourages you to use functional programming idioms that are usually more succinct and less error prone. We’ll discuss alternative approaches when we discuss for loops (see the section called “Generator Expressions” in Chapter 3, Rounding Out the Essentials).
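
For example, where you might reach for break in another language, a collection method such as takeWhile often expresses the same intent, as in this sketch.

val numbers = List(1, 2, 3, 10, 4, 5)
numbers.takeWhile(n => n < 10).foreach(n => println(n))   // prints 1, 2, 3, then stops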

Some Java methods use names that are reserved by Scala, e.g., java.util.Scanner.match. To avoid a compilation error, surround the name with single back quotes, e.g., java.util.Scanner.`match`.

Recap and What’s Next

We covered several ways that Scala’s syntax is concise, flexible, and productive. We also described many Scala features. In the next chapter, we will round out some Scala essentials before we dive into Scala’s support for object-oriented programming and functional programming.

Chapter 3. Rounding Out the Essentials

Before we dive into Scala’s support for object-oriented and functional programming, let’s finish our discussion of the essential features you’ll use in most of your programs.

Operator? Operator?

An important fundamental concept in Scala is that all operators are actually methods. Consider this most basic of examples:

// code-examples/Rounding/one-plus-two-script.scala

1 + 2

That plus sign between the numbers? It’s a method. First, Scala allows non-alphanumeric method names. You can call methods +, -, $, or whatever you desire. Second, this expression is identical to 1 .+(2). (We put a space after the 1 because 1. would be interpreted as a Double.) When a method takes one argument, Scala lets you drop both the period and the parentheses, so the method invocation looks like an operator invocation. This is called “infix” notation, where the operator is between the instance and the argument. We’ll find out more about this shortly.

Similarly, a method with no arguments can be invoked without the period. This is called “postfix” notation.

Ruby and Smalltalk programmers should now feel right at home. As users of those languages know, these simple rules have far-reaching benefits when it comes to creating programs that flow naturally and elegantly.

So, what characters can you use in identifiers? Here is a summary of the rules for identifiers, used for method and type names, variables, etc. For the precise details, see [ScalaSpec2009]. Scala allows all the printable ASCII characters, such as letters, digits, the underscore ‘_’, and the dollar sign ‘$’, with the exceptions of the “parenthetical” characters, ‘(’, ‘)’, ‘[’, ‘]’, ‘{’, ‘}’, and the “delimiter” characters ‘`’, ‘'’, ‘"’, ‘.’, ‘;’, and ‘,’. Scala allows the other characters between \u0020-\u007F that are not in the sets above, such as mathematical symbols and “other” symbols. These remaining characters are called operator characters and they include characters such as ‘/’, ‘<’, etc.

Reserved Words Can’t Be Used
As in most languages, you can’t reuse reserved words for identifiers. We listed the reserved words in the section called “Reserved Words” in Chapter 2, Type Less, Do More. Recall that some of them are combinations of operator and punctuation characters. For example, a single underscore (‘_’) is a reserved word!

Plain Identifiers - Combinations of Letters, Digits, ‘$’, ‘_’ and Operators
Like Java and many languages, a plain identifier can begin with a letter or underscore, followed by more letters, digits, underscores, and dollar signs. Unicode equivalent characters are also allowed. However, like Java, Scala reserves the dollar sign for internal use, so you shouldn’t use it in your own identifiers. After an underscore, you can have either letters and digits or a sequence of operator characters. The underscore is important. It tells the compiler to treat all the characters up to the next whitespace as part of the identifier. For example val xyz_++= = 1 assigns the variable xyz_++= the value 1, while the expression val xyz++= = 1 won’t compile, because the “identifier” could also be interpreted as xyz ++=, which looks like an attempt to append something to xyz. Similarly, if you have operator characters after the underscore, you can’t mix them with letters and digits. This restriction prevents ambiguous expressions like this: abc_=123. Is that an identifier abc_=123 or an assignment of the value 123 to abc_?

Plain Identifiers - Operators
If an identifier begins with an operator character, the rest of the characters must be operator characters.

“Back-quote” Literals
An identifier can also be an arbitrary string (subject to platform limitations) between two back quote characters, e.g., val `this is a valid identifier` = "Hello World!". Recall that this syntax is also the way to invoke a method on a Java or .NET class when the method’s name is identical to a Scala reserved word, e.g., java.net.Proxy.`type`().

Pattern Matching Identifiers
In pattern matching expressions, tokens that begin with a lower-case letter are parsed as variable identifiers, while tokens that begin with an upper-case letter are parsed as constant identifiers. This restriction prevents some ambiguities because of the very succinct variable syntax that is used, e.g., no val keyword is present.

Syntactic Sugar

Once you know that all operators are methods, it’s easier to reason about unfamiliar Scala code. You don’t have to worry about special cases when you see new operators. Recall that when working with Actors in the section called “A Taste of Concurrency” in Chapter 1, Zero to Sixty: Introducing Scala, we used an exclamation point (!) to send a message to an Actor. Now you know that the ! is just another method, as are the other handy shortcut operators you can use to talk to Actors. Similarly, Scala’s XML library provides the \ and \\ operators to dive into document structures. These are just methods on the scala.xml.NodeSeq class.

This flexible method naming gives you the power to write libraries that feel like a natural extension of Scala itself. You could write a new math library with numeric types that accept addition, subtraction, and all the usual mathematical operators. You could write a new concurrent messaging layer that behaves just like Actors. The possibilities are constrained only by Scala’s method naming limitations.

Caution

Just because you can doesn’t mean you should. When designing your own libraries and APIs in Scala, keep in mind that obscure punctuation-based operator names are hard for programmers to remember. Overusing them can give your code a “line noise” quality that hurts readability. Stick to conventions and err on the side of spelling method names out when a shortcut doesn’t come readily to mind.

Methods Without Parentheses and Dots

To facilitate a variety of readable programming styles, Scala is flexible about the use of parentheses in methods. If a method takes no parameters, you can define it without parentheses. Callers must invoke the method without parentheses. If you add empty parentheses, then callers may optionally add parentheses. For example, the size method for List has no parentheses, so you write List(1, 2, 3).size. If you try List(1, 2, 3).size(), you’ll get an error. However, the length method for String does have parentheses in its definition, so both "hello".length() and "hello".length will compile.

The convention in the Scala community is to omit parentheses when calling a method that has no side-effects. So, asking for the size of a sequence is fine without parentheses, but defining a method that transforms the elements in the sequence should be written with parentheses. This convention signals a potentially tricky method for users of your code.
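As a hedged sketch of how these rules play out (the Counter class here is hypothetical), compare the following definitions and call sites:

class Counter {
  private var count = 0
  def increment() = count += 1   // parentheses: this method has a side effect
  def size = count               // no parentheses: a side-effect-free accessor
}

val c = new Counter
c.increment()       // allowed
c.increment         // also allowed, because the definition has empty parentheses
println(c.size)     // allowed
// println(c.size())   // won't compile: size was defined without parentheses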

It’s also possible to omit the dot (period) when calling a parameterless method or one that takes only one argument. With this in mind, our List(1, 2, 3).size example above could be written as:

// code-examples/Rounding/no-dot-script.scala

List(1, 2, 3) size

Neat, but confusing. When does this syntactical flexibility become useful? When chaining method calls together into expressive, self-explanatory “sentences” of code:

// code-examples/Rounding/no-dot-better-script.scala

def isEven(n: Int) = (n % 2) == 0

List(1, 2, 3, 4) filter isEven foreach println

As you might guess, running the above produces the output:

2
4

Scala’s liberal approach to parentheses and dots on methods provides one building block for writing Domain-Specific Languages. We’ll learn more about them after a brief discussion of operator precedence.

Precedence Rules

So, if an expression like 2.0 * 4.0 / 3.0 * 5.0 is actually a series of method calls on Doubles, what are the operator precedence rules? Here they are in order from lowest to highest precedence [ScalaSpec2009].

  • all letters
  • |
  • ^
  • &
  • < >
  • = !
  • :
  • + -
  • * / %
  • all other special characters

Characters on the same line have the same precedence. An exception is =: when it is used for assignment, it has the lowest precedence.

Since * and / have the same precedence, the two lines in the following scala session behave the same.

scala> 2.0 * 4.0 / 3.0 * 5.0
res2: Double = 13.333333333333332

scala> (((2.0 * 4.0) / 3.0) * 5.0)
res3: Double = 13.333333333333332

In a sequence of left-associative method invocations, they simply bind in left-to-right order. “Left-associative” you say? In Scala, any method with a name that ends with a colon ‘:’ actually binds to the right, while all other methods bind to the left. For example, you can prepend an element to a List using the :: method (called “cons”, short for “constructor”).

scala> val list = List('b', 'c', 'd')
list: List[Char] = List(b, c, d)

scala> 'a' :: list
res4: List[Char] = List(a, b, c, d)

The second expression is equivalent to list.::('a'). In a sequence of right-associative method invocations, they bind from right to left. What about a mixture of left-binding and right-binding expressions?

scala> 'a' :: list ++ List('e', 'f')
res5: List[Char] = List(a, b, c, d, e, f)

(The ++ method appends two lists.) In this case, List('e', 'f') is appended to list, then 'a' is prepended to create the final list. It’s usually better to add parentheses to remove any potential uncertainty.
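For example, continuing the session above, the same expression with explicit parentheses groups the calls the way the compiler does and produces the same result:

scala> 'a' :: (list ++ List('e', 'f'))
res6: List[Char] = List(a, b, c, d, e, f)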

Tip

Any method whose name ends with a : binds to the right, not the left.

Finally, note that when you use the scala command, either interactively or with scripts, it may appear that you can define “global” variables and methods outside of types. This is actually an illusion; the interpreter wraps all definitions in an anonymous type before generating JVM or .NET CLR byte code.

Domain-Specific Languages

Domain-Specific Languages, or DSLs, provide a convenient syntactical means for expressing goals in a given problem domain. For example, SQL provides just enough of a programming language to handle the problems of working with databases, making it a domain-specific language.

While some DSLs like SQL are self-contained, it’s become popular to implement DSLs as subsets of full-fledged programming languages. This allows programmers to leverage the entirety of the host language for edge cases that the DSL does not cover, and saves the work of writing lexers, parsers, and the other building blocks of a language.

Scala’s rich, flexible syntax makes writing DSLs a breeze. Consider this example of a style of test writing called Behavior-Driven Development [BDD] using the Specs library (see the section called “Specs”).

// code-examples/Rounding/specs-script.scala
// Example fragment of a Specs script. Doesn't run standalone

"nerd finder" should {
  "identify nerds from a List" in {
    val actors = List("Rick Moranis", "James Dean", "Woody Allen")
    val finder = new NerdFinder(actors)
    finder.findNerds mustEqual List("Rick Moranis", "Woody Allen")
  }
}

Notice how much this code reads like English: “this should test that in the following scenario”, “this value must equal that value”, and so forth. This example uses the superb Specs library, which effectively provides a DSL for the behavior-driven development testing and engineering methodology. By making maximum use of Scala’s liberal syntax and rich methods, Specs test suites are readable even by non-developers.

This is just a taste of the power of DSLs in Scala. We’ll see other examples later and learn how to write our own as we get more advanced (see Chapter 11, Domain-Specific Languages in Scala).

Scala if Statements

Even the most familiar language features are supercharged in Scala. Let’s have a look at the lowly if statement. As in most every language, Scala’s if evaluates a conditional expression, then proceeds to a block if the result is true or branches to an alternate block if the result is false. A simple example:

// code-examples/Rounding/if-script.scala

if (2 + 2 == 5) {
  println("Hello from 1984.")
} else if (2 + 2 == 3) {
    println("Hello from Remedial Math class?")
} else {
  println("Hello from a non-Orwellian future.")
}

What’s different in Scala is that if and almost all other statements are actually expressions themselves. So, we can assign the result of an if expression, as shown in this example:

// code-examples/Rounding/assigned-if-script.scala

val configFile = new java.io.File(".myapprc")

val configFilePath = if (configFile.exists()) {
  configFile.getAbsolutePath()
} else {
  configFile.createNewFile()
  configFile.getAbsolutePath()
}

Note that if statements are expressions, meaning they have values. In this example, the value configFilePath is the result of an if expression that handles the case of a configuration file not existing internally, then returns the absolute path to that file. This value can now be reused throughout an application, and the if expression won’t be re-evaluated when the value is used.

Because if statements are expressions in Scala, there is no need for the special-case ternary conditional expressions that exist in C-derived languages. You won’t see x ? doThis() : doThat() in Scala. Scala provides a mechanism that’s just as powerful and more readable.
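For example, here is a minimal sketch of the Scala equivalent of a ternary expression:

val n = 42
val parity = if (n % 2 == 0) "even" else "odd"   // what C would write as: n % 2 == 0 ? "even" : "odd"
println(parity)                                  // prints "even"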

What if we omit the else clause in the previous example? Typing the code in the scala interpreter will tell us what happens.

scala> val configFile = new java.io.File("~/.myapprc")
configFile: java.io.File = ~/.myapprc

scala> val configFilePath = if (configFile.exists()) {
     |   configFile.getAbsolutePath()
     | }
configFilePath: Unit = ()

scala>

Note that configFilePath is now Unit. (It was String before.) Type inference picks a type that works for all possible outcomes of the if expression. Since one possible outcome is that no value is produced (when the condition is false and there is no else clause), Unit is the only possibility.
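If you want to keep the String type, give the expression an else branch so that every outcome produces a value. A minimal sketch, assuming an empty path is an acceptable default:

val configFile = new java.io.File(".myapprc")

val configFilePath = if (configFile.exists()) {
  configFile.getAbsolutePath()
} else {
  ""   // a String default, so the whole expression is inferred as String
}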

Scala for Comprehensions

Another familiar control structure that’s particularly feature-rich in Scala is the for loop, referred to in the Scala community as a for comprehension or for expression. This corner of the language deserves at least one fancy name, because it can do some great party tricks.

Actually, the term comprehension comes from functional programming. It expresses the idea that we are traversing a set of some kind, “comprehending” what we find, and computing something new from it.

A Dog-Simple Example

Let’s start with a basic for expression:

// code-examples/Rounding/basic-for-script.scala

val dogBreeds = List("Doberman", "Yorkshire Terrier", "Dachshund",
                     "Scottish Terrier", "Great Dane", "Portuguese Water Dog")

for (breed <- dogBreeds)
  println(breed)

As you might guess, this code says “for every element in the List dogBreeds, create a temporary variable called breed with the value of that element, then print it.” Think of the <- operator as an arrow directing elements of a collection, one-by-one, to the scoped variable by which we’ll refer to them inside the for expression. The left-arrow operator is called a generator, so named because it’s generating individual values from a collection for use in an expression.

Filtering

What if we want to get more granular? Scala’s for expressions allow for filters that let us specify which elements of a collection we want to work with. So to find all Terriers in our list of dog breeds, we could modify the above example to the following:

// code-examples/Rounding/filtered-for-script.scala

val dogBreeds = List("Doberman", "Yorkshire Terrier", "Dachshund",
                     "Scottish Terrier", "Great Dane", "Portuguese Water Dog")
for (breed <- dogBreeds
  if breed.contains("Terrier")
) println(breed)

To add more than one filter to a for expression, separate the filters with semicolons:

// code-examples/Rounding/double-filtered-for-script.scala

val dogBreeds = List("Doberman", "Yorkshire Terrier", "Dachshund",
                     "Scottish Terrier", "Great Dane", "Portuguese Water Dog")
for (breed <- dogBreeds
  if breed.contains("Terrier");
  if !breed.startsWith("Yorkshire")
) println(breed)

You’ve now found all the Terriers that don’t hail from Yorkshire, and hopefully learned just how useful filters can be in the process.

Yielding

What if, rather than printing your filtered collection, you needed to hand it off to another part of your program? The yield keyword is your ticket to generating new collections with for expressions. In the following example, note that we’re wrapping up the for expression in curly braces, as we would when defining any block.

Tip

for expressions may be defined with parentheses or curly braces, but using curly braces means you don’t have to separate your filters with semicolons. Most of the time, you’ll prefer using curly braces when you have more than one filter, assignment, etc.

// code-examples/Rounding/yielding-for-script.scala

val dogBreeds = List("Doberman", "Yorkshire Terrier", "Dachshund",
                     "Scottish Terrier", "Great Dane", "Portuguese Water Dog")
val filteredBreeds = for {
  breed <- dogBreeds
  if breed.contains("Terrier")
  if !breed.startsWith("Yorkshire")
} yield breed

Every time through the for expression, the filtered result is yielded as a value named breed. These results accumulate with every run, and the resulting collection is assigned to the value filteredBreeds (as we did with if statements above). The type of the collection resulting from a for-yield expression is inferred from the type of the collection being iterated over. In this case, filteredBreeds is of type List[String], since it is a subset of the dogBreeds list, which is also of type List[String].
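The yielded expression can also transform each element. In this sketch, the result is still a List[String], but with upper-cased names:

val dogBreeds = List("Doberman", "Yorkshire Terrier", "Dachshund",
                     "Scottish Terrier", "Great Dane", "Portuguese Water Dog")
val upcasedTerriers = for {
  breed <- dogBreeds
  if breed.contains("Terrier")
} yield breed.toUpperCase

upcasedTerriers.foreach(println)   // YORKSHIRE TERRIER, SCOTTISH TERRIER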

Expanded Scope

One final useful feature of Scala’s for comprehensions is the ability to define variables inside the first part of your for expressions that can be used in the latter part. This is best illustrated with an example:

// code-examples/Rounding/scoped-for-script.scala

val dogBreeds = List("Doberman", "Yorkshire Terrier", "Dachshund",
                     "Scottish Terrier", "Great Dane", "Portuguese Water Dog")
for {
  breed <- dogBreeds
  upcasedBreed = breed.toUpperCase()
} println(upcasedBreed)

Note that upcasedBreed is defined without a val keyword, yet you can use it within the body of your for expression. This approach is ideal for transforming elements in a collection as you loop through them.
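Generators, definitions, and filters can all be combined in a single for expression, as in this small sketch:

val dogBreeds = List("Doberman", "Yorkshire Terrier", "Dachshund",
                     "Scottish Terrier", "Great Dane", "Portuguese Water Dog")
for {
  breed <- dogBreeds
  upcasedBreed = breed.toUpperCase()
  if upcasedBreed.endsWith("TERRIER")
} println(upcasedBreed)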

Finally, in the section called “Options and For Comprehensions” in Chapter 13, Application Design, we’ll see how using Options with for comprehensions can greatly reduce code size by eliminating unnecessary “null” and “missing” checks.

Other Looping Constructs

Scala has several other looping constructs.

Scala while Loops

Familiar in many languages, the while loop executes a block of code as long as a condition is true. For example, the following code prints out a complaint once a day until the next Friday the 13th has arrived:

// code-examples/Rounding/while-script.scala
// WARNING: This script runs for a LOOOONG time!

import java.util.Calendar

def isFridayThirteen(cal: Calendar): Boolean = {
  val dayOfWeek = cal.get(Calendar.DAY_OF_WEEK)
  val dayOfMonth = cal.get(Calendar.DAY_OF_MONTH)

  // Scala returns the result of the last expression in a method
  (dayOfWeek == Calendar.FRIDAY) && (dayOfMonth == 13)
}

while (!isFridayThirteen(Calendar.getInstance())) {
  println("Today isn't Friday the 13th. Lame.")
  // sleep for a day
  Thread.sleep(86400000)
}

You can find a table of the conditional operators that work in while loops below.

Scala do-while Loops

Like the while loop above, a do-while loop executes some code while a conditional expression is true. The only difference is that a do-while checks whether the condition is true after running the block. To count up to ten, we could write this:

// code-examples/Rounding/do-while-script.scala

var count = 0

do {
  count += 1
  println(count)
} while (count < 10)

As it turns out, there’s a more elegant way to loop through collections in Scala, as we’ll see in the next section.

Generator Expressions

Remember the arrow operator (<-) from the discussion above about for loops? We can put it to work here, too. Let’s clean up the do-while example above:

// code-examples/Rounding/generator-script.scala

for (i <- 1 to 10) println(i)

Yup, that’s all that’s necessary. This clean one-liner is possible because of Scala’s RichInt class. An implicit conversion is invoked by the compiler to convert the 1, an Int, into a RichInt. (We’ll discuss these conversions in the section called “The Scala Type Hierarchy” in Chapter 7, The Scala Object System and in the section called “Implicit Conversions” in Chapter 8, Functional Programming in Scala.) RichInt defines a to method that takes another integer and returns an instance of Range.Inclusive. That is, Inclusive is a nested class in the Range companion object (a concept we introduced briefly in Chapter 1, Zero to Sixty: Introducing Scala; see Chapter 6, Advanced Object-Oriented Programming In Scala for details). This subclass of the class Range inherits a number of methods for working with sequences and iterable data structures, including those necessary to use it in a for loop.

By the way, if you wanted to count from 1 up to, but not including, 10, you could use until instead of to, for example for (i <- 1 until 10).
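Written out explicitly, both forms are ordinary method calls on the converted Int; the following sketch shows the equivalence:

val inclusive = 1 to 10      // the same as (1).to(10): 1, 2, ..., 10
val exclusive = 1 until 10   // the same as (1).until(10): 1, 2, ..., 9

for (i <- inclusive) print(i + " ")
println()
for (i <- exclusive) print(i + " ")
println()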

This should paint a clearer picture of how Scala’s internal libraries compose to form easy-to-use language constructs.

Note

When working with loops in most languages, you can break out of a loop or continue the iterations. Scala doesn’t have either of these statements, but when writing idiomatic Scala code, they’re not necessary. Use conditional expressions to test if a loop should continue, or make use of recursion. Better yet, filter your collections ahead of time to eliminate complex conditions within your loops. However, because of demand for it, Scala version 2.8 includes support for break, implemented as a library method, rather than a built-in break keyword.

Conditional Operators

Scala borrows most of the conditional operators from Java and its predecessors. You’ll find the following in if statements, while loops, and everywhere else conditions apply.

Table 3.1. Conditional Operators

&& (and): The values on the left and right of the operator are true. The right-hand side is only evaluated if the left-hand side is true.

|| (or): At least one of the values on the left or right is true. The right-hand side is only evaluated if the left-hand side is false.

> (greater than): The value on the left is greater than the value on the right.

>= (greater than or equals): The value on the left is greater than or equal to the value on the right.

< (less than): The value on the left is less than the value on the right.

<= (less than or equals): The value on the left is less than or equal to the value on the right.

== (equals): The value on the left is the same as the value on the right.

!= (not equal): The value on the left is not the same as the value on the right.


Note that && and || are “short-circuiting” operators. They stop evaluating expressions as soon as the answer is known.
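Short-circuiting is more than an optimization; it lets the left-hand side guard the right-hand side, as in this sketch:

val s: String = null

// s.length is never evaluated when s is null, so no NullPointerException is thrown.
if (s != null && s.length > 0) println("non-empty string")
else println("null or empty string")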

We’ll discuss object equality in more detail in the section called “Equality of Objects” in Chapter 6, Advanced Object-Oriented Programming In Scala. For example, we’ll see that == has a different meaning in Scala vs. Java. Otherwise, these operators should all be familiar, so let’s move on to something new and exciting.

Pattern Matching

An idea borrowed from functional languages, pattern matching is a powerful yet concise way to make a programmatic choice between multiple conditions. Pattern matching is the familiar case statement from your favorite C-like language, but on steroids. In the typical case statement you’re limited to matching against values of ordinal types, yielding trivial expressions like this: "in the case that i is 5, print a message; in the case that i is 6, exit the program". With Scala’s pattern matching, your cases can include types, wild-cards, sequences, and even deep inspections of an object’s variables.

A Simple Match

To begin with, let’s simulate flipping a coin by matching the value of a boolean:

// code-examples/Rounding/match-boolean-script.scala

val bools = List(true, false)

for (bool <- bools) {
  bool match {
    case true => println("heads")
    case false => println("tails")
    case _ => println("something other than heads or tails (yikes!)")
  }
}

It looks just like a C-style case statement, right? The only difference is the last case with the underscore ‘_’ wild card. It matches anything not defined in the cases above it, so it serves the same purpose as the default keyword in Java and C# switch statements.

Pattern matching is eager; the first match wins. So, if you try to put a case _ clause before any other case clauses, the compiler will throw an “unreachable code” error on the next clause, because nothing will get past the default clause!

Tip

Use case _ for the default, “catch-all” match.

What if we want to work with matches as variables?

Variables in Matches

// code-examples/Rounding/match-variable-script.scala

import scala.util.Random

val randomInt = new Random().nextInt(10)

randomInt match {
  case 7 => println("lucky seven!")
  case otherNumber => println("boo, got boring ol' " + otherNumber)
}

In this example, we assign the wild card case to a variable called otherNumber, then print it in the subsequent expression. If we generate a seven, we’ll extoll that number’s virtues. Otherwise, we’ll curse fate for making us suffer an unlucky number.

Matching on Type

These simple examples don’t even begin to scratch the surface of Scala’s pattern matching features. Let’s try matching based on type:

// code-examples/Rounding/match-type-script.scala

val sundries = List(23, "Hello", 8.5, 'q')

for (sundry <- sundries) {
  sundry match {
    case i: Int => println("got an Integer: " + i)
    case s: String => println("got a String: " + s)
    case f: Double => println("got a Double: " + f)
    case other => println("got something else: " + other)
  }
}

Here we pull each element out of a List whose elements can be of Any type; in this case it contains an Int, a String, a Double, and a Char. For the first three of those types, we let the user know specifically which type we got and what the value was. When we get something else (the Char), we just let the user know the value. We could add elements of other types to the list and they’d be caught by the other wild card case.

Matching on Sequences

Since working in Scala often means working with sequences, wouldn’t it be handy to be able to match against the length and contents of lists and arrays? The following example does just that, testing two lists to see if they contain four elements, the second of which is the integer 3.

// code-examples/Rounding/match-seq-script.scala

val willWork = List(1, 3, 23, 90)
val willNotWork = List(4, 18, 52)
val empty = List()

for (l <- List(willWork, willNotWork, empty)) {
  l match {
    case List(_, 3, _, _) => println("Four elements, with the 2nd being '3'.")
    case List(_*) => println("Any other list with 0 or more elements.")
  }
}

In the second case, we’ve used a special wild card pattern to match a List of any size, even zero elements, and any element values. You can use this pattern at the end of any sequence match to remove length as a condition.

Recall that we mentioned the “cons” method for List, ::. The expression a :: list prepends a to a list. You can also use this operator to extract the head and tail of a list.

// code-examples/Rounding/match-list-script.scala

val willWork = List(1, 3, 23, 90)
val willNotWork = List(4, 18, 52)
val empty = List()

def processList(l: List[Any]): Unit = l match {
  case head :: tail =>
    format("%s ", head)
    processList(tail)
  case Nil => println("")
}

for (l <- List(willWork, willNotWork, empty)) {
  print("List: ")
  processList(l)
}

The processList method matches on the List argument l. It may look strange to start the method definition like the following.

def processList(l: List[Any]): Unit = l match {
  ...
}

Hopefully hiding the details with the ellipsis makes the meaning a little clearer. The processList method is actually one statement that crosses several lines.

It first matches on head :: tail, where head will be assigned the first element in the list and tail will be assigned the rest of the list. That is, we’re extracting the head and tail from the list using ::. When this case matches, it prints the head and calls processList recursively to process the tail.

The second case matches the empty list, Nil. It prints an end of line and terminates the recursion.

Matching on Tuples (and Guards)

Alternately, if we just wanted to test that we have a tuple of two items, we could do a tuple match:

// code-examples/Rounding/match-tuple-script.scala

val tupA = ("Good", "Morning!")
val tupB = ("Guten", "Tag!")

for (tup <- List(tupA, tupB)) {
  tup match {
    case (thingOne, thingTwo) if thingOne == "Good" =>
        println("A two-tuple starting with 'Good'.")
    case (thingOne, thingTwo) =>
        println("This has two things: " + thingOne + " and " + thingTwo)
  }
}

In the second case in this example, we’ve extracted the values inside the tuple to scoped variables, then reused these variables in the resulting expression.

In the first case we’ve added a new concept: guards. The if condition after the tuple is a guard. The guard is evaluated when matching, but only extracting any variables in the preceding part of the case. Guards provide additional granularity when constructing cases. In this example, the only difference between the two patterns is the guard expression, but that’s enough for the compiler to differentiate them.

Tip

Recall that the cases in a pattern match are evaluated in order. For example, if your first case is broader than your second case, the second case will never be reached. (Unreachable cases will cause a compiler error.) You may include a “default” case at the end of a pattern match, either using the underscore wild card character or a meaningfully-named variable. When using a variable, it should have no explicit type or it should be declared as Any, so it can match anything. On the other hand, try to design your code to avoid a catch-all clause by ensuring it only receives specific items that are expected.

Matching on Case Classes

Let’s try a deep match, examining the contents of objects in our pattern match.

// code-examples/Rounding/match-deep-script.scala

case class Person(name: String, age: Int)

val alice = new Person("Alice", 25)
val bob = new Person("Bob", 32)
val charlie = new Person("Charlie", 32)

for (person <- List(alice, bob, charlie)) {
  person match {
    case Person("Alice", 25) => println("Hi Alice!")
    case Person("Bob", 32) => println("Hi Bob!")
    case Person(name, age) =>
      println("Who are you, " + age + " year-old person named " + name + "?")
  }
}

Poor Charlie gets the cold shoulder, as we can see in the output for the above example:

Hi Alice!
Hi Bob!
Who are you, 32 year-old person named Charlie?

We first define a case class, a special type of class that we’ll learn more about in the section called “Case Classes” in Chapter 6, Advanced Object-Oriented Programming In Scala. For now, it will suffice to say that a case class allows for very terse construction of simple objects with some pre-defined methods. Our pattern match then looks for Alice and Bob by inspecting the values passed to the constructor of the Person case class. Charlie falls through to the catch-all case; even though he has the same age value as Bob, we’re matching on the name property as well.

This type of pattern match becomes extremely useful when working with Actors, as we’ll see later on. Case classes are frequently sent to Actors as messages, and deep pattern matching on an object’s contents is a convenient way to "parse" those messages.
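As a hedged preview (using the scala.actors library of this era and hypothetical message types, not the book’s drawing example), sending and matching case-class messages looks roughly like this:

import scala.actors.Actor._

case class Draw(shape: String)
case object Exit

val drawingService = actor {
  loop {
    receive {
      case Draw(shape) => println("drawing a " + shape)   // deep match on the message contents
      case Exit        => println("shutting down"); exit()
    }
  }
}

drawingService ! Draw("circle")
drawingService ! Exit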

Matching on Regular Expressions

Regular expressions are convenient for extracting data from strings that have an informal structure, but are not “structured data” (that is, in a format like XML or JSON, for example). Commonly referred to as regexes, regular expressions are a feature of nearly all modern programming languages. They provide a terse syntax for specifying complex matches, one which is typically translated into a state machine behind the scenes for optimum performance.

Regexes in Scala should contain no surprises if you’ve used them in other programming languages. Let’s see an example.

// code-examples/Rounding/match-regex-script.scala

val BookExtractorRE = """Book: title=([^,]+),\s+authors=(.+)""".r
val MagazineExtractorRE = """Magazine: title=([^,]+),\s+issue=(.+)""".r

val catalog = List(
  "Book: title=Programming Scala, authors=Dean Wampler, Alex Payne",
  "Magazine: title=The New Yorker, issue=January 2009",
  "Book: title=War and Peace, authors=Leo Tolstoy",
  "Magazine: title=The Atlantic, issue=February 2009",
  "BadData: text=Who put this here??"
)

for (item <- catalog) {
  item match {
    case BookExtractorRE(title, authors) =>
      println("Book \"" + title + "\", written by " + authors)
    case MagazineExtractorRE(title, issue) =>
      println("Magazine \"" + title + "\", issue " + issue)
    case entry => println("Unrecognized entry: " + entry)
  }
}

We start with two regular expressions, one for records of books and another for records of magazines. Calling .r on a string turns it into a regular expression; we use raw (triple-quoted) strings here to avoid having to double-escape backslashes. Should you find the .r transformation method on strings unclear, you can also define regexes by creating new instances of the Regex class, as in: new Regex("""\W""").

Notice that each of our regexes defines two capture groups, connoted by parentheses. Each group captures the value of a single field in the record, such as a book’s title or author. Regexes in Scala translate those capture groups to extractors. Every match sets a field to the captured result; every miss is set to null.

What does this mean in practice? If the text fed to the regular expression matches, case BookExtractorRE(title, authors) will assign the first capture group to title and the second to authors. We can then use those values on the right-hand side of the case clause, as we have in the above example. The variable names title and authors within the extractor are arbitrary; matches from capture groups are simply assigned from left to right, and you can call them whatever you’d like.

That’s regexes in Scala in nutshell. The scala.util.matching.Regex class supplies several handy methods for finding and replacing matches in strings, both all occurrences of a match and just the first occurrence, so be sure to make use of them.

What we won’t cover in this section is the details of writing regular expressions. Scala’s Regex class uses the underlying platform’s regular expression APIs (that is, Java’s or .NET’s). Consult references on those APIs for the hairy details, as they may be subtly different than the regex support in your language of choice.

Binding Nested Variables in Case Clauses

Sometimes you want to bind a variable to an object enclosed in a match, where you are also specifying match criteria on the nested object. Suppose we modify the previous example so we’re matching on the key-value pairs from a map. We’ll store our same Person objects as the values and use an employee id as the key. We’ll also add another attribute to Person, a role field that points to an instance from a type hierarchy.

// code-examples/Rounding/match-deep-pair-script.scala

class Role
case object Manager extends Role
case object Developer extends Role

case class Person(name: String, age: Int, role: Role)

val alice = new Person("Alice", 25, Developer)
val bob = new Person("Bob", 32, Manager)
val charlie = new Person("Charlie", 32, Developer)

for (item <- Map(1 -> alice, 2 -> bob, 3 -> charlie)) {
  item match {
    case (id, p @ Person(_, _, Manager)) => format("%s is overpaid.\n", p)
    case (id, p @ Person(_, _, _)) => format("%s is underpaid.\n", p)
  }
}

The case objects are just singleton objects like we’ve seen before, but with the special case behavior. We’re most interested in the embedded p @ Person(…) inside the case clause. We’re matching on particular kinds of Person objects inside the enclosing tuple. We also want to assign the Person to a variable p, so we can use it for printing.

Person(Alice,25,Developer) is underpaid.
Person(Bob,32,Manager) is overpaid.
Person(Charlie,32,Developer) is underpaid.

If we weren’t using matching criteria in Person itself, we could just write p: Person. For example, the previous match clause could be written this way.

item match {
  case (id, p: Person) => p.role match {
    case Manager => format("%s is overpaid.\n", p)
    case _ => format("%s is underpaid.\n", p)
  }
}

Note that the p @ Person(…) syntax gives us a way to flatten this nesting of match statements into one statement. It is analogous to using “capture groups” in a regular expression to pull out substrings we want, instead of splitting the string in several successive steps to extract the substrings we want. Use whichever technique you prefer.

Using try, catch, and finally Clauses

Through its use of functional constructs and strong typing, Scala encourages a coding style that lessens the need for exceptions and exception handling. But where Scala interacts with Java, exceptions are still prevalent.

Note

Scala does not have checked exceptions, like Java. Even Java’s checked exceptions are treated as unchecked by Scala. There is also no throws clause on method declarations. However, there is a @throws annotation that is useful for Java interoperability. See the section called “Annotations” in Chapter 13, Application Design.
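As a brief, hedged sketch of the @throws annotation mentioned in the note (the ConfigReader class is hypothetical), a method that may throw a checked Java exception can be declared like this:

import java.io.IOException

class ConfigReader(path: String) {
  @throws(classOf[IOException])   // lets Java callers catch or declare the exception
  def read(): String = {
    // ... code that may throw an IOException while reading the file at 'path' ...
    "file contents"
  }
}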

Thankfully, Scala treats exception handling as just another pattern match, allowing us to make smart choices when presented with a multiplicity of potential exceptions. Let’s see this in action:

// code-examples/Rounding/try-catch-script.scala

import java.util.Calendar

val then = null
val now = Calendar.getInstance()

try {
  now.compareTo(then)
} catch {
  case e: NullPointerException => println("One was null!"); System.exit(-1)
  case unknown => println("Unknown exception " + unknown); System.exit(-1)
} finally {
  println("It all worked out.")
  System.exit(0)
}

In the above example, we explicitly catch the NullPointerException thrown when trying to compare a Calendar instance with null. We also define unknown as a catch-all case, just to be safe. If we weren’t hard-coding this program to fail, the finally block would be reached and the user would be informed that everything worked out just fine.

Note

You can use an underscore (Scala’s standard wild card character) as a placeholder to catch any type of exception (really, to match any case in a pattern matching expression). However, you won’t be able to refer to the exception in the subsequent expression. Name the exception variable if you need it, for example, if you need to print the exception as we do in the catch-all case of the previous example. e or ex are fine names.

Pattern matching aside, Scala’s treatment of exception handling should be familiar to those fluent in Java, Ruby, Python, and most other mainstream languages. And yes, you throw an exception by writing throw new MyBadException(…). That’s all there is to it.

Concluding Remarks on Pattern Matching

Pattern matching is a powerful and elegant way of extracting information from objects, when used appropriately. Recall from Chapter 1, Zero to Sixty: Introducing Scala that we highlighted the synergy between pattern matching and polymorphism. Most of the time, you want to avoid the problems of “switch” statements that know a class hierarchy, because they have to be modified every time the hierarchy is changed.

In our drawing actor example, we used pattern matching to separate different “categories” of messages, but we used polymorphism to draw the shapes sent to it. We could change the Shape hierarchy and the actor code would not require changes.

Pattern matching is also useful for the design problem where you need to get at data inside an object, but only in special circumstances. One of the unintended consequences of the JavaBeans [JavaBeansSpec] specification was that it encouraged people to expose fields in their objects through getters and setters. This should never be a default decision. Access to “state information” should be encapsulated and exposed only in ways that make logical sense for the type, as viewed from the abstraction it exposes.

Instead, consider using pattern matching for those “rare” times when you need to extract information in a controlled way. As we will see in the section called “Unapply” in Chapter 6, Advanced Object-Oriented Programming In Scala, the pattern matching examples we have shown use unapply methods defined to extract information from instances. These methods let you extract that information while hiding the implementation details. In fact, the information returned by an unapply method might be a transformation of the actual information in the instance.

Finally, when designing pattern matching statements, be wary of relying on a default case clause. Under what circumstances would “none of the above” be the correct answer? It may indicate that the design should be refined so you know more precisely all the possible matches that might occur. We’ll learn one technique that helps when we discuss sealed class hierarchies in the section called “Sealed Class Hierarchies” in Chapter 7, The Scala Object System.

Enumerations

Remember our examples above involving various breeds of dog? In thinking about the types in these programs, we might want a top-level Breed type that keeps track of a number of breeds. Such a type is called an enumerated type, and the values it contains are called enumerations.

While enumerations are a built-in part of many programming languages, Scala takes a different route and implements them as a class in its standard library. This means there is no special syntax for enumerations in Scala, as in Java and C#. Instead, you just define an object that extends the Enumeration class. Hence, at the byte code level, there is no connection between Scala enumerations and the enum constructs in Java and C#.

Here is an example:

// code-examples/Rounding/enumeration-script.scala

object Breed extends Enumeration {
  val doberman = Value("Doberman Pinscher")
  val yorkie = Value("Yorkshire Terrier")
  val scottie = Value("Scottish Terrier")
  val dane = Value("Great Dane")
  val portie = Value("Portuguese Water Dog")
}

// print a list of breeds and their IDs
println("ID\tBreed")
for (breed <- Breed) println(breed.id + "\t" + breed)

// print a list of Terrier breeds
println("\nJust Terriers:")
Breed.filter(_.toString.endsWith("Terrier")).foreach(println)

When run, you’ll get the following output:

ID      Breed
0       Doberman Pinscher
1       Yorkshire Terrier
2       Scottish Terrier
3       Great Dane
4       Portuguese Water Dog

Just Terriers:
Yorkshire Terrier
Scottish Terrier

We can see that our Breed enumerated type contains several variables of type Value, as in the following example.

val doberman = Value("Doberman Pinscher")

Each declaration is actually calling a method named Value that takes a string argument. We use this method to assign a long-form breed name to each enumeration value, which is what the Value.toString method returned in the output above.

Note that there is no namespace collision between the type and the method that are both named Value. There are other overloaded versions of the Value method. One of them takes no arguments, another takes an Int ID value, and another takes both an Int and a String. These Value methods return a Value object and add the new value to the enumeration’s collection of values.

In fact, Scala’s Enumeration class supports the usual methods for working with collections, so we can easily iterate through the breeds with a for loop and filter them by name. The output above also demonstrated that every Value in an enumeration is automatically assigned a numeric identifier, unless you call one of the Value methods where you specify your own ID value explicitly.

You’ll often want to give your enumeration values human readable names, as we did here. However, sometimes you may not need them. Here’s another enumeration example adapted from the scaladoc entry for Enumeration.

// code-examples/Rounding/days-enumeration-script.scala

object WeekDay extends Enumeration {
  type WeekDay = Value
  val Mon, Tue, Wed, Thu, Fri, Sat, Sun = Value
}
import WeekDay._

def isWorkingDay(d: WeekDay) = ! (d == Sat || d == Sun)

WeekDay filter isWorkingDay foreach println

Running this script with scala yields the following output.

Main$$anon$1$WeekDay(0)
Main$$anon$1$WeekDay(1)
Main$$anon$1$WeekDay(2)
Main$$anon$1$WeekDay(3)
Main$$anon$1$WeekDay(4)

When a name isn’t assigned using one of the Value methods that takes a String argument, Value.toString prints the name of the type that is synthesized by the compiler, along with the ID value that was generated automatically.

Note that we imported WeekDay._. This brought each enumeration value, e.g., Mon, Tue, etc., into scope. Otherwise, you would have to write WeekDay.Mon, WeekDay.Tue, etc.

Also, the import brought the type alias type WeekDay = Value into scope, which we used as the type of the argument for the isWorkingDay method. If you don’t define a type alias like this, you would have to declare the method as def isWorkingDay(d: WeekDay.Value).

Since Scala enumerations are just regular objects, you could use any object with vals to represent different “enumeration values”. However, extending Enumeration has several advantages. It automatically manages the values as a collection that you can iterate over and filter, as in our examples, and it automatically assigns a unique integer id to each value.

Case classes (see the section called “Case Classes” in Chapter 6, Advanced Object-Oriented Programming In Scala) are often used instead of enumerations in Scala, because the “use case” for them often involves pattern matching. We’ll revisit this topic in the section called “Enumerations vs. Pattern Matching” in Chapter 13, Application Design.

Recap and What’s Next

We’ve covered a lot of ground in this chapter. We learned how flexible Scala’s syntax can be, and how it facilitates the creation of Domain-Specific Languages. Then we explored Scala’s enhancements to looping constructs and conditional expressions. We experimented with different uses for pattern matching, a powerful improvement on the familiar case-switch statement. Finally, we learned how to encapsulate values in enumerations.

You should now be prepared to read a fair bit of Scala code, but there’s plenty more about the language to put in your tool belt. In the next four chapters, we’ll explore Scala’s approach to object-oriented programming, starting with traits.


Chapter 4. Traits

Introducing Traits

Before we dive into object-oriented programming, there’s one more essential feature of Scala that you should get acquainted with: traits. Understanding the value of this feature requires a little backstory.

In Java, a class can implement an arbitrary number of interfaces. This model is very useful for declaring that a class exposes multiple abstractions. Unfortunately, it has one major drawback.

For many interfaces, much of the functionality can be implemented with “boilerplate” code that will be valid for all classes that use the interface. Java provides no built-in mechanism for defining and using such reusable code. Instead, Java programmers must use ad hoc conventions to reuse implementation code for a given interface. In the worst case, the developer just copies and pastes the same code into every class that needs it.

Often, the implementation of an interface has members that are unrelated (“orthogonal”) to the rest of the instance’s members. The term mixin is often used for such focused and potentially reusable parts of an instance that could be independently maintained.

Have a look at the following code for a button in a graphical user interface, which uses callbacks for “clicks”.

// code-examples/Traits/ui/button-callbacks.scala

package ui

class ButtonWithCallbacks(val label: String,
    val clickedCallbacks: List[() => Unit]) extends Widget {

  require(clickedCallbacks != null, "Callback list can't be null!")

  def this(label: String, clickedCallback: () => Unit) =
    this(label, List(clickedCallback))

  def this(label: String) = {
    this(label, Nil)
    println("Warning: button has no click callbacks!")
  }

  def click() = {
    // ... logic to give the appearance of clicking a physical button ...
    clickedCallbacks.foreach(f => f())
  }
}

There’s a lot going on here. The primary constructor takes a label argument and a list of callbacks that are invoked when the button’s click method is invoked. We’ll explore this class in greater detail in Chapter 5, Basic Object-Oriented Programming in Scala. For now we want to focus on one particular problem. Not only does ButtonWithCallbacks handle behaviors essential to buttons (like clicking), it also handles notification of click events by invoking the callback functions. This goes against the Single Responsibility Principle [Martin2003], a means to the design goal of separation of concerns. We would like to separate the button-specific logic from the callback logic, such that each logical component becomes simpler, more modular, and more reusable. The callback logic is a good example of a mixin.

This separation is difficult to do in Java, even if we define an interface for the callback behavior. We still have to embed the implementation code in the class somehow, compromising modularity. The only other alternative is to use a specialized tool like aspect-oriented programming (AOP, see [AOSD]), as implemented by AspectJ [AspectJ], an extension of Java. AOP is primarily designed to separate the implementations of “pervasive” concerns that are repeated throughout an application. It seeks to modularize these concerns, yet enable the fine-grained “mixing” of their behaviors with other concerns, including the core domain logic of the application, either at build or run time.

Traits as Mixins

Scala provides a complete mixin solution, called traits. In our example, we can define the callback abstraction in a trait, as in a Java interface, but we can also implement the abstraction in the trait (or a derived trait). We can declare classes that “mix in” the trait, much the way you can declare classes that implement an interface in Java. However, in Scala we can even mix-in traits at the same time we create instances. That is, we don’t have to declare a class first that mixes in all the traits we want. So, Scala traits preserve separation of concerns while giving us the ability to compose behavior on demand.

If you come from a Java background, you can think of traits as interfaces with optional implementations. Other languages provide constructs that are similar to traits, such as modules in Ruby, for example.

Let’s use a trait to separate the callback handling from the button logic. We’ll generalize our approach a little bit. Callbacks are really a special case of the Observer Pattern [GOF1995]. So, let’s create a trait that implements this pattern, then use it to handle callback behavior. To simplify things, we’ll start with a single callback that counts the number of button clicks.

First, let’s define a simple Button class.

// code-examples/Traits/ui/button.scala

package ui

class Button(val label: String) extends Widget {
  def click() = {
    // Logic to give the appearance of clicking a button...
  }
}

Here is the parent class, Widget.

// code-examples/Traits/ui/widget.scala

package ui

abstract class Widget

The logic for managing callbacks (i.e., the clickedCallbacks list) is omitted, as are the two auxiliary constructors. Only the button’s label field and click method remain. The click method now only cares about the visual appearance of a “physical” button being clicked. Button has only one concern, handling the “essence” of being a button.

Here is a trait that implements the logic of the Observer Pattern.

// code-examples/Traits/observer/observer.scala

package observer

trait Subject {
  type Observer = { def receiveUpdate(subject: Any) }

  private var observers = List[Observer]()
  def addObserver(observer:Observer) = observers ::= observer
  def notifyObservers = observers foreach (_.receiveUpdate(this))
}

Except for the trait keyword, Subject looks like a normal class. Subject defines all the members it declares. Traits can declare abstract members, concrete members, or both, just as classes can (see the section called “Overriding Members of Classes and Traits” in Chapter 6, Advanced Object-Oriented Programming In Scala for more details). Also like classes, traits can contain nested trait and class definitions and classes can contain nested trait definitions.

The first line defines a type for an Observer. This is a structural type of the form { def receiveUpdate(subject:Any) }. Structural types specify only the structure a subtype must support; you could think of them as “anonymous” types.

In this case, the structural type is defined by a method with a particular signature. Any type that has a method with this signature can be used as an observer. We’ll learn more about structural types in Chapter 12, The Scala Type System. If you’re wondering why we didn’t use Subject as the type of the argument, instead of Any, we’ll revisit that issue in the section called “Self-Type Annotations and Abstract Type Members” in Chapter 13, Application Design.

The main thing to notice for now is how this structural type minimizes the coupling between the Subject trait and any potential users of the trait.
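For example, here is a hedged sketch (with a hypothetical ClickLogger class) showing that any type with a conforming receiveUpdate method can be registered, without implementing a shared interface:

import observer._

class ClickLogger {
  def receiveUpdate(subject: Any) = println("update from: " + subject)
}

object LoggedSubject extends Subject          // any object or class can mix in the trait

LoggedSubject.addObserver(new ClickLogger)    // compiles: ClickLogger conforms structurally
LoggedSubject.notifyObservers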

Note

Subject is still coupled by the name of the method in Observer through the structural type, i.e., to a method named receiveUpdate. There are several ways we can reduce this remaining coupling. We’ll see how in the section called “Overriding Abstract Types” in Chapter 6, Advanced Object-Oriented Programming In Scala.

Next, we declare a list of observers. We make it a var, rather than a val, because List is immutable, so we must create a new list when an observer is added using the addObserver method.

We’ll discuss Scala List's more in the section called “The Scala Type Hierarchy” in Chapter 7, The Scala Object System and also in Chapter 8, Functional Programming in Scala. For now, notice that addObserver uses the the list cons “operator” method (::) to prepend an observer to the list of observers. The scala compiler is smart enough to turn the following statement,

observers ::= observer

into this statement,

observers = observer :: observers

Note that we wrote observer :: observers, with the existing observers list on the right hand side. Recall that any method that ends with : binds to the right. So, the previous statement is equivalent to the following statement.

observers = observers.::(observer)

The notifyObservers method iterates through the observers, using the foreach method and calls receiveUpdate on each one. (Note that we are using the “infix” operator notation instead of observers.foreach.) We use the placeholder ‘_’ to shorten the following expression,

(obs) => obs.receiveUpdate(this)

into this expression,

_.receiveUpdate(this)

This expression is actually the body of an “anonymous function”, called a function literal in Scala. This is similar to a lambda and similar constructs used in many other languages. Function literals and the related concept of a closure are discussed in the section called “Function Literals and Closures” in Chapter 8, Functional Programming in Scala.

In Java, the foreach method would probably take an interface and you would pass an instance of a class that implements the interface (e.g., the way Comparable is typically used).

In Scala, the List[A].foreach method expects an argument of type (A) => Unit, which is a function taking an instance of type A, where A represents the type of the elements of the list (Observer, in this case), and returning Unit (like void in Java).

Note

We chose to use a var with immutable Lists for the observers in this example. We could have used a val with a mutable type, like ListBuffer. That choice would make a little more sense for a real application, but we wanted to avoid the distraction of explaining new library classes.

Once again, we learned a lot of Scala from a small example. Now let’s put our Subject trait to use. Here is ObservableButton, which subclasses Button and mixes in Subject.

// code-examples/Traits/ui/observable-button.scala

package ui
import observer._

class ObservableButton(name: String) extends Button(name) with Subject {
  override def click() = {
    super.click()
    notifyObservers
  }
}

We start by importing everything in the observer package, using the ‘_’ wild card. Actually, we have only defined the Subject trait in the package.

The new class uses the with keyword to add the Subject trait to the class. ObservableButton overrides the click method. Using the super keyword (see the section called “Overriding Abstract and Concrete Methods” in Chapter 6, Advanced Object-Oriented Programming In Scala), it first invokes the “superclass” method, Button.click, then it notifies the observers. Since the new click method overrides Button's concrete implementation, the override keyword is required.

The with keyword is analogous to Java’s implements keyword for interfaces. You can specify as many traits as you want, each with its own with keyword.

A class can extend a trait and a trait can extend a class. In fact, our Widget class above could have been declared to be a trait.
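Here is a hedged sketch (with a hypothetical Logging trait) of a class that mixes in several traits; extends introduces the parent, and each additional trait gets its own with:

package ui
import observer._

trait Logging {
  def log(message: String) = println("LOG: " + message)
}

class LoggingObservableButton(name: String)
    extends Button(name) with Subject with Logging {

  override def click() = {
    super.click()
    log("clicked " + label)
    notifyObservers
  }
}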

Note

If you declare a class that uses one or more traits and it doesn’t extend another class, you must use the extends keyword for the first trait listed.

If you don’t use extends for the first trait, e.g., you write the following.

// ERROR:
class ObservableButton(name: String) with Button(name) with Subject {...}

You’ll get an error like this.

... error: ';' expected but 'with' found.
       class ObservableButton(name: String) with Button(name) with Subject {...}
                                            ^

The error should really say “with found, but extends expected.”

To demonstrate this code, let’s start with a class for observing button clicks that simply counts the number of clicks.

// code-examples/Traits/ui/button-count-observer.scala

package ui
import observer._

class ButtonCountObserver {
  var count = 0
  def receiveUpdate(subject: Any) = count += 1
}

Finally, let’s write a test that exercises all these classes. We will use the Specs library (discussed in the the section called “Specs” section of Chapter 14, Scala Tools, Libraries and IDE Support) to write a Behavior-Driven Development ([BDD]) “specification” that exercises the combined Button and Subject types.

// code-examples/Traits/ui/button-observer-spec.scala

package ui
import org.specs._
import observer._

object ButtonObserverSpec extends Specification {
  "A Button Observer" should {
    "observe button clicks" in {
      val observableButton = new ObservableButton("Okay")
      val buttonObserver = new ButtonCountObserver
      observableButton.addObserver(buttonObserver)

      for (i <- 1 to 3) observableButton.click()
      buttonObserver.count mustEqual 3
    }
  }
}

If you downloaded the code examples from the O’Reilly site, then you can follow the directions in its README files for building and running the examples in this chapter. The output of the specs “target” of the build should include the following text.

Specification "ButtonCountObserverSpec"
  A Button Observer should
  + observe button clicks

Total for specification "ButtonCountObserverSpec":
Finished in 0 second, 10 ms
1 example, 1 expectation, 0 failure, 0 error

Notice that the strings "A Button Observer should" and "observe button clicks" correspond to strings in the example. The output of a Specs run provides a nice summary of the requirements for the items being tested, assuming good choices were made for the strings.

The body of the test creates an “Okay” ObservableButton and a ButtonCountObserver, then registers the observer with the button. The button is clicked three times, using a for loop. The last line requires the observer’s count to equal 3. If you are accustomed to using an XUnit-style TDD tool, like JUnit [JUnit] or ScalaTest [ScalaTestTool] (see also the section called “ScalaTest” in Chapter 14, Scala Tools, Libraries and IDE Support), then the last line is equivalent to the following JUnit assertion.

assertEquals(3, buttonObserver.count)

Note

The Specs library (see the section called “Specs”) and the ScalaTest library (see the section called “ScalaTest”) both support behavior-driven development [BDD], a style of test-driven development [TDD] that emphasizes the “specification” role of tests.

Suppose we need only one ObservableButton instance. In that case, we don’t actually have to declare a class that subclasses Button with Subject. We can mix in the trait when we create the instance.

The next example shows a revised Specs file that instantiates a Button with Subject mixed in as part of the declaration.

// code-examples/Traits/ui/button-observer-anon-spec.scala

package ui
import org.specs._
import observer._

object ButtonObserverAnonSpec extends Specification {
  "A Button Observer" should {
    "observe button clicks" in {
      val observableButton = new Button("Okay") with Subject {
        override def click() = {
          super.click()
          notifyObservers
        }
      }

      val buttonObserver = new ButtonCountObserver
      observableButton.addObserver(buttonObserver)

      for (i <- 1 to 3) observableButton.click()
      buttonObserver.count mustEqual 3
    }
  }
}

The revised declaration of observableButton actually creates an anonymous class in which we override the click method, as before. The main difference with creating anonymous classes in Java is that we can incorporate traits in this process. Java does not let you implement a new interface while instantiating a class.

Finally, note that the inheritance hierarchy for an instance can be complex if it mixes in traits that extend other traits, etc. We’ll discuss the details of the hierarchy in the section called “Linearization of an Object’s Hierarchy” in Chapter 7, The Scala Object System.

Stackable Traits

There are a couple of refinements we can do to improve the reusability of our work and to make it easier to use more than one trait at a time, i.e., to “stack” them.

First, let’s introduce a new trait, Clickable, an abstraction for any widget that responds to clicks.

// code-examples/Traits/ui2/clickable.scala

package ui2

trait Clickable {
  def click()
}

Note

We’re starting with a new package, ui2, to make it easier to keep older and newer versions of the examples distinct in the downloadable code.

The Clickable trait looks just like a Java interface; it is completely abstract. It defines a single, abstract method, click. The method is abstract because it has no body. If Clickable were a class, we would have to add the abstract keyword in front of the class keyword. This is not necessary for traits.
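For comparison, here is a hypothetical sketch of the same abstraction written as a class; because the click method has no body, the class must be declared abstract.

abstract class AbstractClickable {
  def click()  // no body, so the enclosing class must be declared abstract
}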

Here is the refactored button, which uses the trait.

// code-examples/Traits/ui2/button.scala

package ui2

import ui.Widget

class Button(val label: String) extends Widget with Clickable {
  def click() = {
    // Logic to give the appearance of clicking a button...
  }
}

This code is like Java code that implements a Clickable interface.

When we previously defined ObservableButton (in the section called “Traits as Mixins”), we overrode Button.click to notify the observers. We had to duplicate that logic in ButtonObserverAnonSpec when we declared observableButton as a Button instance that mixed in the Subject trait directly. Let’s eliminate this duplication.

When we refactor the code this way, we realize that we don’t really care about observing buttons; we care about observing clicks. Here is a trait that focuses solely on observing Clickable.

// code-examples/Traits/ui2/observable-clicks.scala

package ui2
import observer._

trait ObservableClicks extends Clickable with Subject {
  abstract override def click() = {
    super.click()
    notifyObservers
  }
}

The ObservableClicks trait extends Clickable and mixes in Subject. It then overrides the click method with an implementation that looks almost the same as the overridden method shown in the section called “Traits as Mixins”. The important difference is the abstract keyword.

Look closely at this method. It calls super.click(), but what is super in this case? At this point, super could only be Clickable, which declares but does not define click, or Subject, which has no click method at all. So, super can’t be bound, at least not yet.

In fact, super will be bound when this trait is mixed into an instance that defines a concrete click method, such as Button. Therefore, we need an abstract keyword on ObservableClicks.click to tell the compiler (and the reader) that click is not yet fully implemented, even though ObservableClicks.click has a body.
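To see what goes wrong without the keyword, consider the following hypothetical variation, which is not part of the book’s examples. The compiler rejects it, because click calls super.click, which is still abstract in Clickable.

// Won't compile: click calls super.click, which has no concrete
// implementation in the parents of this trait, so the method must be
// declared with both abstract and override.
trait BrokenObservableClicks extends Clickable with Subject {
  override def click() = {
    super.click()
    notifyObservers
  }
}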

Note

Apart from its use in declaring abstract classes, the abstract keyword is required on a trait method only when the method has a body but calls a super method that has no concrete implementation in the trait’s parents.

Let’s use the ObservableClicks trait with Button and its concrete click method in a Specs test.

// code-examples/Traits/ui2/button-clickable-observer-spec.scala

package ui2
import org.specs._
import observer._
import ui.ButtonCountObserver

object ButtonClickableObserverSpec extends Specification {
  "A Button Observer" should {
    "observe button clicks" in {
      val observableButton = new Button("Okay") with ObservableClicks
      val buttonClickCountObserver = new ButtonCountObserver
      observableButton.addObserver(buttonClickCountObserver)

      for (i <- 1 to 3) observableButton.click()
      buttonClickCountObserver.count mustEqual 3
    }
  }
}

Compare this code to ButtonObserverAnonSpec. We instantiate a Button with the ObservableClicks trait mixed in, but now there is no override of click required. Hence, this client of Button doesn’t have to worry about properly overriding click. The hard work is already done by ObservableClicks. The desired behavior is composed declaratively when needed.

Let’s finish our example by adding a second trait. The JavaBeans specification [JavaBeansSpec] has the idea of “vetoable” events, where listeners for changes to a JavaBean can veto the change. Let’s implement something similar with a trait that vetoes more than a set number of clicks.

// code-examples/Traits/ui2/vetoable-clicks.scala

package ui2
import observer._

trait VetoableClicks extends Clickable {
  val maxAllowed = 1  // default
  private var count = 0

  abstract override def click() = {
    if (count < maxAllowed) {
      count += 1
      super.click()
    }
  }
}

Once again, we override the click method. As before, the override must be declared abstract. The maximum allowed number of clicks defaults to 1. You might wonder what we mean by “defaults” here, since the field is declared to be a val and there is no constructor defined to initialize it to another value. We’ll revisit these questions in the section called “Overriding Members of Classes and Traits” in Chapter 6, Advanced Object-Oriented Programming In Scala.
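As a preview of that discussion, here is a hedged sketch of one way a client could supply a different value, by overriding the val when the trait is mixed in. (The lenientButton value is hypothetical.)

// Allow two clicks instead of the default of one.
val lenientButton = new Button("Okay") with ObservableClicks with VetoableClicks {
  override val maxAllowed = 2
}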

This trait also declares a count variable to keep track of the number of clicks seen. It is declared private, so it is invisible outside the trait (see the section called “Visibility Rules” in Chapter 5, Basic Object-Oriented Programming in Scala). The overridden click method increments count and calls super.click(), but only while count is less than maxAllowed.

Here is a Specs object that demonstrates ObservableClicks and VetoableClicks working together. Note that a separate with keyword is required for each trait, as opposed to using one keyword and separating the names with commas, as Java does for implements clauses.

// code-examples/Traits/ui2/button-clickable-observer-vetoable-spec.scala

package ui2
import org.specs._
import observer._
import ui.ButtonCountObserver

object ButtonClickableObserverVetoableSpec extends Specification {
  "A Button Observer with Vetoable Clicks" should {
    "observe only the first button click" in {
      val observableButton =
          new Button("Okay") with ObservableClicks with VetoableClicks
      val buttonClickCountObserver = new ButtonCountObserver
      observableButton.addObserver(buttonClickCountObserver)

      for (i <- 1 to 3) observableButton.click()
      buttonClickCountObserver.count mustEqual 1
    }
  }
}

The expected observer count is 1. The observableButton is declared as follows.

new Button("Okay") with ObservableClicks with VetoableClicks

We can infer that the click override in VetoableClicks is called before the click override in ObservableClicks. Loosely speaking, since our anonymous class doesn’t define click itself, the method lookup proceeds right to left, as declared. It’s actually more complicated than that, as we’ll see later in the section called “Linearization of an Object’s Hierarchy” in Chapter 7, The Scala Object System.

In the meantime, what happens if we use the traits in the reverse order?

// code-examples/Traits/ui2/button-vetoable-clickable-observer-spec.scala

package ui2
import org.specs._
import observer._
import ui.ButtonCountObserver

object ButtonVetoableClickableObserverSpec extends Specification {
  "A Vetoable Button with Click Observer" should {
    "observe all the button clicks, even when some are vetoed" in {
      val observableButton =
          new Button("Okay") with VetoableClicks with ObservableClicks
      val buttonClickCountObserver = new ButtonCountObserver
      observableButton.addObserver(buttonClickCountObserver)

      for (i <- 1 to 3) observableButton.click()
      buttonClickCountObserver.count mustEqual 3
    }
  }
}

Now the expected observer count is 3. ObservableClicks now has precedence over VetoableClicks, so the count of clicks is incremented, even when some clicks are subsequently vetoed!
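The following comment sketch summarizes the two call chains, as far as we have described them; the precise rules are the linearization rules covered in Chapter 7.

// new Button("Okay") with ObservableClicks with VetoableClicks
//   click -> VetoableClicks.click   (vetoes after maxAllowed clicks)
//         -> ObservableClicks.click (only if not vetoed)
//         -> Button.click, then the observers are notified
//
// new Button("Okay") with VetoableClicks with ObservableClicks
//   click -> ObservableClicks.click (always runs)
//         -> VetoableClicks.click   (vetoes after maxAllowed clicks)
//         -> Button.click (only if not vetoed); the observers are
//            notified in every case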

So, the order of declaration matters; keep it in mind to prevent unexpected behavior when traits interact with each other. Another lesson is that splitting objects into too many fine-grained traits can obscure the order of execution in your code!

Breaking up your application into small, focused traits is a powerful way to create reusable, scalable abstractions and “components”. Complex behaviors can be built up through declarative composition of traits. We will explore this idea in greater detail in the section called “Scalable Abstractions” in Chapter 13, Application Design.

Constructing Traits

Traits don’t support auxiliary constructors, nor do they accept an argument list for the primary constructor (the body of the trait). Traits can extend classes or other traits. However, they can’t pass arguments to the parent class constructor (not even literal values), so a trait can only extend a class that has a no-argument primary or auxiliary constructor.
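Here is a minimal sketch of that restriction, using hypothetical classes.

class NoArgParent                 // has a no-argument primary constructor
class ArgParent(val arg: Int)     // requires a constructor argument

trait OkTrait extends NoArgParent       // OK
// trait BadTrait extends ArgParent(1)  // ERROR: a trait can't pass constructor arguments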

However, like classes, the body of a trait is executed every time an instance is created that uses the trait, as demonstrated by the following script.

// code-examples/Traits/trait-construction-script.scala

trait T1 {
  println( "  in T1: x = " + x )
  val x=1
  println( "  in T1: x = " + x )
}
trait T2 {
  println( "  in T2: y = " + y )
  val y="T2"
  println( "  in T2: y = " + y )
}

class Base12 {
  println( "  in Base12: b = " + b )
  val b="Base12"
  println( "  in Base12: b = " + b )
}
class C12 extends Base12 with T1 with T2 {
  println( "  in C12: c = " + c )
  val c="C12"
  println( "  in C12: c = " + c )
}
println( "Creating C12:" )
new C12
println( "After Creating C12" )

Running this script with the scala command yields the following output.

Creating C12:
  in Base12: b = null
  in Base12: b = Base12
  in T1: x = 0
  in T1: x = 1
  in T2: y = null
  in T2: y = T2
  in C12: c = null
  in C12: c = C12
After Creating C12

Notice the order of invocation of the class and trait constructors. Since the declaration of C12 is extends Base12 with T1 with T2, the order of construction for this simple class hierarchy is left to right, starting with the base class Base12, followed by the traits T1 and T2, and ending with the C12 constructor body. (For constructing arbitrarily-complex hierarchies, see the section called “Linearization of an Object’s Hierarchy” in Chapter 7, The Scala Object System.)

So, while you can’t pass construction parameters to traits, you can initialize fields with default values or leave them abstract. We actually saw this before in our Subject trait, where the Subject.observers field was initialized to an empty list.

If a concrete field in a trait does not have a suitable default value, there is no “fail-safe” way to initialize it. All the alternative approaches require some ad hoc steps by users of the trait, which is error-prone because they might do it wrong or forget to do it at all. Perhaps the field should be left abstract, so that classes or other traits that use this trait are forced to define the value appropriately. We’ll discuss overriding abstract and concrete members in detail in Chapter 6, Advanced Object-Oriented Programming In Scala.
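Here is a hedged sketch of the abstract-field approach, using hypothetical names; the class that mixes in the trait must supply the value. (We assume the ui.Button class from Chapter 4 is in scope.)

trait HasTooltip {
  val tooltip: String                    // abstract: no sensible default
  def describe = "Tooltip: " + tooltip
}

class TooltipButton(name: String, val tooltip: String)
    extends Button(name) with HasTooltip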

Another solution is to move that field to a separate class, where the construction process can guarantee that the correct initialization data is supplied by the user. It might be that the whole trait should actually be a class instead, so you can define a constructor for it that initializes the field.

Class or Trait?

When considering whether a “concept” should be a trait or a class, keep in mind that traits as mixins make the most sense for “adjunct” behavior. If you find that a particular trait is used most often as a parent of other classes, so that the child classes behave as the parent trait, then consider defining the trait as a class instead, to make this logical relationship more clear. (We said behaves as, rather than is a, because the former is the more precise definition of inheritance, based on the Liskov Substitution Principle - see [Martin2003], for example.)

Tip

Avoid concrete fields in traits that can’t be initialized to suitable default values. Use abstract fields instead or convert the trait to a class with a constructor. Of course, stateless traits don’t have any issues with initialization.

It’s a general principle of good object-oriented design that an instance should always be in a known valid state, starting from the moment the construction process finishes.

Recap and What’s Next

In this chapter, we learned how to use traits to encapsulate and share cross-cutting concerns between classes. We covered when and how to use traits, how to "stack" multiple traits, and the rules for initializing values within traits.

In the next chapter, we explore how the fundamentals of object-oriented programming work in Scala. Even if you’re an old hand at object-oriented programming, you’ll want to read the next several chapters to understand the particulars of Scala’s approach to OOP.


Chapter 5. Basic Object-Oriented Programming in Scala

Scala is an object-oriented language like Java, Python, Ruby, Smalltalk, and others. If you’re coming from the Java world, you’ll notice some notable improvements over the limitations of Java’s object model.

We assume you have some prior experience with object-oriented programming (OOP), so we will not discuss the basic principles here, although some common terms and concepts are discussed in the Glossary. See [Meyer1997] for a detailed introduction to OOP, see [Martin2003] for a recent treatment of OOP principles in the context of “agile software development”, see [GOF1995] to learn about design patterns, and see [WirfsBrock2003] for a discussion of object-oriented design concepts.

Class and Object Basics

Let’s review the terminology of OOP in Scala.

Note

We saw previously that Scala has the concept of a declared object, which we’ll dig into in the section called “Classes and Objects: Where Are the Statics?”. We’ll use the term instance to refer to a class instance generically, meaning either an object or an instance of a class, to avoid the potential for confusion between these two concepts.

Classes are declared with the keyword class. We will see later that additional keywords can also be used, like final to prevent creation of derived classes and abstract to indicate that the class can’t be instantiated, usually because it contains or inherits member declarations without providing concrete definitions for them.

An instance can refer to itself using the this keyword, just as in Java and similar languages.

Following Scala’s convention, we use the term method for a function that is tied to an instance. Some other object-oriented languages use the term “member function”. Method definitions start with the def keyword.

Like Java, but unlike Ruby and Python, Scala allows overloaded methods. Two or more methods can have the same name as long as their full signatures are unique. The signature includes the method name and the list of parameters with their types; two overloads can’t differ only in their return type.
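For example, here is a small, hypothetical object with two valid overloads of the same method name.

object Logger {
  def log(message: String) = println(message)
  def log(message: String, level: Int) = println(level + ": " + message)
}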

However, there is an exception to this rule due to type erasure, which is a feature of the JVM only, but is used by Scala on both the JVM and .NET platforms, to minimize incompatibilities. Suppose two methods are identical except that one takes a parameter of type List[String] while the other takes a parameter of type List[Int], as in the following example.

// code-examples/BasicOOP/type-erasure-wont-compile.scala
// WON'T COMPILE

object Foo {
  def bar(list: List[String]) = list.toString
  def bar(list: List[Int]) = list.size.toString
}

You’ll get a compilation error on the second method because the two methods will have an identical signature after type erasure.

Warning

The scala interpreter will let you type in both methods. It simply drops the first version. However, if you try to load the previous example using the :load file command, you’ll get the same error scalac raises.

We’ll discuss type erasure in more detail in Chapter 12, The Scala Type System.

Also by convention, we use the term field for a variable that is tied to an instance. The term attribute is often used in other languages (like Ruby). Note that the state of an instance is the union of all the values currently represented by the instance’s fields.

As we discussed in the section called “Variable Declarations” in Chapter 2, Type Less, Do More, read-only (“value”) fields are declared using the val keyword and read/write fields are declared using the var keyword.

Scala also allows types to be declared in classes, as we saw in the section called “Abstract Types And Parameterized Types” in Chapter 2, Type Less, Do More.

We use the term member to refer to a field, method, or type in a generic way. Note that field and method members (but not type members) share the same namespace, unlike Java. We’ll discuss this more in the section called “When Accessor Methods and Fields Are Indistinguishable: The Uniform Access Principle” in Chapter 6, Advanced Object-Oriented Programming In Scala.

Finally, new instances of reference types are created from a class using the new keyword, as in languages like Java and C#. Note that you can drop the parentheses when using a default constructor (i.e., one that takes no arguments). In some cases, literal values can be used instead, e.g., val name = "Programming Scala" is equivalent to val name = new String("Programming Scala").

Instances of value types (e.g., Int, Double, etc.), which correspond to the primitives in languages like Java, are always created using literal values, e.g., 1, 3.14, etc. In fact, there are no public constructors for these types, so an expression like val i = new Int(1) won’t compile.

We’ll discuss the difference between reference and value types in the section called “The Scala Type Hierarchy”.

Parent Classes

Scala supports single inheritance, not multiple inheritance. A child (or derived) class can have one and only one parent (or base) class. The sole exception is the root of the Scala class hierarchy, Any, which has no parent.

We’ve seen several examples of parent and child classes already. Here are snippets of one of the first examples we saw, in the section called “Abstract Types And Parameterized Types” from Chapter 2, Type Less, Do More.

// code-examples/TypeLessDoMore/abstract-types-script.scala

import java.io._

abstract class BulkReader {
  // ...
}

class StringBulkReader(val source: String) extends BulkReader {
  // ...
}

class FileBulkReader(val source: File) extends BulkReader {
  // ...
}

As in Java, the keyword extends indicates the parent class, in this case BulkReader. In Scala, extends is also used when a class inherits a trait as its parent (even when it mixes in other traits using the with keyword). Also, extends is used when one trait is the child of another trait or class. Yes, traits can inherit classes.

If you don’t extend a parent class, the default parent is AnyRef, a direct child class of Any. (We discuss the difference between Any and AnyRef when we discuss the Scala type hierarchy in the section called “The Scala Type Hierarchy”.)

Constructors in Scala

Scala distinguishes between a primary constructor and zero or more auxiliary constructors. In Scala, the primary constructor is the entire body of the class. Any parameters that the constructor requires are listed after the class name. We’ve seen many examples of this already, as in the ButtonWithCallbacks example we used in Chapter 4, Traits.

// code-examples/Traits/ui/button-callbacks.scala

package ui

class ButtonWithCallbacks(val label: String,
    val clickedCallbacks: List[() => Unit]) extends Widget {

  require(clickedCallbacks != null, "Callback list can't be null!")

  def this(label: String, clickedCallback: () => Unit) =
    this(label, List(clickedCallback))

  def this(label: String) = {
    this(label, Nil)
    println("Warning: button has no click callbacks!")
  }

  def click() = {
    // ... logic to give the appearance of clicking a physical button ...
    clickedCallbacks.foreach(f => f())
  }
}

The ButtonWithCallbacks class represents a button on a graphical user interface. It has a label and a list of callback functions that are invoked if the button is clicked. Each callback function takes no arguments and returns Unit. The click method iterates through the list of callbacks and invokes each one.

ButtonWithCallbacks defines three constructors. The primary constructor, which is the body of the entire class, has a parameter list that takes a label string and a list of callback functions. Because each parameter is declared as a val, the compiler generates a private field corresponding to each parameter (a different internal name is used), along with a public reader method that has the same name as the parameter. “Private” and “public” have the same meaning here as in most object-oriented languages. We’ll discuss the various visibility rules and the keywords that control them in the section called “Visibility Rules” below.

If a parameter has the var keyword, a public writer method is also generated with the parameter’s name as a prefix, followed by _=. For example, if label were declared as a var, the writer method would be named label_= and it would take a single argument of type String.
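Here is a hedged sketch of the generated accessors for a var parameter (MutableLabel is a hypothetical class).

class MutableLabel(var label: String)

val ml = new MutableLabel("Okay")
ml.label             // calls the generated reader method, label
ml.label = "Cancel"  // calls the generated writer method, label_=("Cancel")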

There are times when you don’t want the accessor methods to be generated automatically. In other words, you want the field to be private. Add the private keyword before the val or var keyword and the accessor methods won’t be generated. (See the section called “Visibility Rules” below for more details.)

Note

For you Java programmers, Scala doesn’t follow the JavaBeans [JavaBeansSpec] convention that field reader and writer methods begin with get and set respectively, followed by the field name with the first character capitalized. We’ll see why when we discuss the Uniform Access Principle in the section called “When Accessor Methods and Fields Are Indistinguishable: The Uniform Access Principle” below. However, you can get JavaBeans-style getters and setters when you need them using the scala.reflect.BeanProperty annotation, as we’ll discuss in the section called “JavaBean Properties” in Chapter 14, Scala Tools, Libraries and IDE Support.

When an instance of the class is created, each field corresponding to a parameter in the parameter list will be initialized with the parameter automatically. No constructor logic is required to initialize these fields, in contrast to most other object-oriented languages.

The first statement in the ButtonWithCallbacks class (i.e., the constructor) body is a test to ensure that a non-null list has been passed to the constructor. (It does allow an empty Nil list, however.) It uses the convenient require function that is imported automatically into the current scope (as we’ll discuss in the section called “The Predef Object” in Chapter 7, The Scala Object System). If the list is null, require will throw an exception. The require function and its companion assume are very useful for design by contract programming, as discussed in the section called “Better Design with Design By Contract” in Chapter 13, Application Design.

Here is part of a full specification for ButtonWithCallbacks that demonstrates the require statement in use.

// code-examples/Traits/ui/button-callbacks-spec.scala
package ui
import org.specs._

object ButtonWithCallbacksSpec extends Specification {
  "A ButtonWithCallbacks" should {
    // ...
    "not be constructable with a null callback list" in {
      val nullList:List[() => Unit] = null
      val errorMessage =
        "requirement failed: Callback list can't be null!"
      (new ButtonWithCallbacks("button1", nullList)) must throwA(
        new IllegalArgumentException(errorMessage))
    }
  }
}

Scala even makes it difficult to pass null as the second parameter to the constructor; it won’t type check when you compile it. However, you can assign null to a value, as shown. If we didn’t have the must throwA(…) clause, we would see the following exception thrown.

java.lang.IllegalArgumentException: requirement failed: Callback list can't be null!
        at scala.Predef$.require(Predef.scala:112)
        at ui.ButtonWithCallbacks.<init>(button-callbacks.scala:7)
....

ButtonWithCallbacks defines two auxiliary constructors for the user’s convenience. The first auxiliary constructor accepts a label and a single callback. It calls the primary constructor, passing the label and a new List to wrap the single callback.

The second auxiliary constructor accepts just a label. It calls the primary constructor with Nil (which represents an empty List object). The constructor then prints a warning message that there are no callbacks, since lists are immutable and there is no way to replace the callback list val with a new one.

In order to avoid infinite recursion, Scala requires each auxiliary constructor to invoke another constructor defined before it [ScalaSpec2009]. The constructor invoked may be either another auxiliary constructor or the primary constructor, and it must be the first statement in the auxiliary constructor’s body. Additional processing can occur after this call, such as the warning message printed in our example.

Note

Because all auxiliary constructors eventually invoke the primary constructor, logic checks and other initializations done in the body will be performed consistently for all instances created.

There are a few advantages of Scala’s constraints on constructors.

Elimination of Duplication
Because auxiliary constructors invoke the primary constructor, potential duplication of construction logic is largely eliminated.

Code Size Reduction
As shown in the examples, when one or more of the primary constructor parameters is declared as a val or a var, Scala automatically generates a field, the appropriate accessor methods (unless they are declared private), and the initialization logic for when instances are created.

There is also at least one disadvantage of Scala’s constraints on constructors.

Less Flexibility
Sometimes it’s just not convenient to have one constructor body that all constructors are forced to use. However, we find these circumstances to be rare. In such cases, it may simply be that the class has too many responsibilities and it should be refactored into smaller classes.

Calling Parent Class Constructors

The primary constructor in a derived class must invoke one of the parent class constructors, either the primary constructor or an auxiliary constructor. In the following example, a class derived from ButtonWithCallbacks, called RadioButtonWithCallbacks, invokes the primary ButtonWithCallbacks constructor. “Radio” buttons can be either on or off.

// code-examples/BasicOOP/ui/radio-button-callbacks.scala

package ui

/**
 * Button with two states, on or off, like an old-style,
 * channel-selection button on a radio.
 */
class RadioButtonWithCallbacks(
  var on: Boolean, label: String, clickedCallbacks: List[() => Unit])
      extends ButtonWithCallbacks(label, clickedCallbacks) {

  def this(on: Boolean, label: String, clickedCallback: () => Unit) =
      this(on, label, List(clickedCallback))

  def this(on: Boolean, label: String) = this(on, label, Nil)
}

The primary constructor for RadioButtonWithCallbacks takes three parameters: an on state (true or false), a label, and a list of callbacks. It passes the label and list of callbacks to its parent class, ButtonWithCallbacks. The on parameter is declared as a var, so it is mutable. on is also the one constructor parameter unique to a radio button, so it is kept as an attribute of RadioButtonWithCallbacks.

For consistency with its parent class, RadioButtonWithCallbacks also declares two auxiliary constructors. Note that they must invoke a preceding constructor in RadioButtonWithCallbacks, as before. They can’t invoke a ButtonWithCallbacks constructor directly. Declaring all these constructors in each class could get tedious after a while, but we explored techniques in Chapter 4, Traits that can eliminate repetition.

Note

While super is used to invoke overridden methods, as in Java, it cannot be used to invoke a super class constructor.
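Here is a hypothetical sketch (not part of the book’s examples) that illustrates the restriction.

class SimpleRadioButton(on: Boolean, label: String)
    extends ButtonWithCallbacks(label, Nil) {

  // def this(label: String) = super(label, Nil) // ERROR: not legal Scala
  def this(label: String) = this(false, label)   // OK: calls the primary constructor
}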

Nested Classes

Scala lets you nest class declarations, like many object-oriented languages. Suppose we want all Widgets to have a map of properties. These properties could be size, color, whether or not the widget is visible, etc. We might use a simple map to hold the properties, but let’s assume that we also want to control access to the properties, and to perform other operations when they change.

Here is one way we might expand our original Widget example from the section called “Traits as Mixins” in Chapter 4, Traits to add this feature.

// code-examples/BasicOOP/ui/widget.scala

package ui

abstract class Widget {
  class Properties {
    import scala.collection.immutable.HashMap

    private var values: Map[String, Any] = new HashMap

    def size = values.size

    def get(key: String) = values.get(key)

    def update(key: String, value: Any) = {
      // Do some preprocessing, e.g., filtering.
      values = values.update(key, value)
      // Do some postprocessing.
    }
  }

  val properties = new Properties
}

We added a Properties class that has a private, mutable reference to an immutable HashMap. We also added three public methods that retrieve the size (i.e., the number of properties defined), retrieve a single element in the map, and update the map with a new element, respectively. We might need to do additional work in the update method, and we’ve indicated as much with comments.
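Here is a short, hypothetical usage sketch, assuming a concrete Widget subclass like the ui.Button we defined in Chapter 4.

val button = new Button("Okay")
button.properties.update("color", "blue")
button.properties.get("color")   // Some(blue)
button.properties.size           // 1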

Note

You can see from the above example that Scala allows classes to be declared inside one another, or “nested”. A nested class makes sense when you have enough related functionality to lump together in a class, but the functionality is only ever going to be used by its “outer” class.

So far, we’ve covered how to declare classes, how to instantiate them, and some of the basics of inheritance. In the next section, we’ll discuss visibility rules within classes and objects.

Visibility Rules

Note

For convenience, we’ll use the word “type” in this section to refer to classes and traits generically, as opposed to referring to member type declarations. We’ll include those when we use the term “member” generically, unless otherwise indicated.

Most object-oriented languages have constructs to constrain the visibility (or scope) of type and type-member declarations. These constructs support the object-oriented form of encapsulation, where only the essential public abstraction of a class or trait is exposed and implementation information is hidden from view.

You’ll want to use public visibility for anything that users of your classes and objects should see and use. Keep in mind that the set of publicly visible members form the abstraction exposed by the type, along with the type’s name itself.

The conventional wisdom in object-oriented design is that fields should be private or protected. If access is required, it should happen through methods, but not everything should be accessible by default. The virtue of the Uniform Access Principle (see the section called “When Accessor Methods and Fields Are Indistinguishable: The Uniform Access Principle”) is that we can give the user the semantics of public field access via either a method or direct access to a field, whichever is appropriate for the task.

Tip

The art of good object-oriented design includes defining minimal, clear, and cohesive public abstractions.

There are two kinds of “users” of a type: derived types and code that works with instances of the type. Derived types usually need more access to the members of their parent types than users of instances do.

Scala’s visibility rules are similar to Java’s, but tend to be both more consistently applied and more flexible. For example, in Java, if an inner class has a private member, the enclosing class can see it. In Scala, the enclosing class can’t see a private member, but Scala provides another way to declare it visible to the enclosing class.

As in Java and C#, the keywords that modify visibility, such as private and protected, appear at the beginning of declarations. You’ll find them before the class or trait keywords for types, before the val or var for fields, and before the def for methods.

Note

You can also use an access modifier keyword on the primary constructor of a class. Put it after the type name and type parameters, if any, and before the argument list, as in this example: class Restricted[+A] private (name: String) {…}
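Here is a hedged sketch of why you might restrict the primary constructor this way; the companion object factory is a hypothetical design choice, not something the language requires.

class Restricted private (val name: String)

object Restricted {
  // The companion object can still call the private constructor.
  def apply(name: String) = new Restricted(name.trim)
}

val r = Restricted(" sketch ")   // OK: goes through the companion's apply
// val r2 = new Restricted("x")  // ERROR: the constructor is private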

Table 5.1, “Visibility Scopes.” summarizes the visibility scopes.

Table 5.1. Visibility Scopes.

Name Keyword Description

public

none

Public members and types are visible everywhere, across all boundaries.

protected

protected

Protected members are visible to the defining type, to derived types, and to nested types. Protected types are visible only within the same package and subpackages.

private

private

Private members are visible only within the defining type and nested types. Private types are visible only within the same package.

scoped protected

protected[scope]

Visibility is limited to scope, which can be a package, type, or this (meaning the same instance, when applied to members, or the enclosing package, when applied to types). See the text below for details.

scoped private

private[scope]

Synonymous with scoped protected visibility, except under inheritance (discussed below).


Let’s explore these visibility options in more detail. To keep things simple, we’ll use fields for member examples. Method and type declarations behave the same way.

Note

Unfortunately, you can’t apply any of the visibility modifiers to packages. Therefore, a package is always public, even when it contains no publicly visible types.

Public Visibility

Any declaration without a visibility keyword is “public”, meaning it is visible everywhere. There is no public keyword in Scala. This is in contrast to Java, which defaults to public visibility only within the enclosing package (i.e., “package private”). Other object-oriented languages, like Ruby, also default to public visibility.

// code-examples/BasicOOP/scoping/public.scala

package scopeA {
  class PublicClass1 {
    val publicField = 1

    class Nested {
      val nestedField = 1
    }

    val nested = new Nested
  }

  class PublicClass2 extends PublicClass1 {
    val field2  = publicField + 1
    val nField2 = new Nested().nestedField
  }
}

package scopeB {
  class PublicClass1B extends scopeA.PublicClass1

  class UsingClass(val publicClass: scopeA.PublicClass1) {
    def method = "UsingClass:" +
      " field: " + publicClass.publicField +
      " nested field: " + publicClass.nested.nestedField
  }
}

You can compile this file with scalac. It should compile without error.

Everything is public in these packages and classes. Note that scopeB.UsingClass can access scopeA.PublicClass1 and its members, including the instance of Nested and its public field.

Protected Visibility

Protected visibility is for the benefit of implementers of derived types, who need a little more access to the details of their parent types. Any member declared with the protected keyword is visible only to the defining type, including other instances of the same type and any derived types. When applied to a type, protected limits visibility to the enclosing package.

Java, in contrast, makes protected members visible throughout the enclosing package. Scala handles this case with scoped private and protected access.

// code-examples/BasicOOP/scoping/protected-wont-compile.scala
// WON'T COMPILE

package scopeA {
  class ProtectedClass1(protected val protectedField1: Int) {
    protected val protectedField2 = 1

    def equalFields(other: ProtectedClass1) =
      (protectedField1 == other.protectedField1) &&
      (protectedField2 == other.protectedField2) &&
      (nested == other.nested)

    class Nested {
      protected val nestedField = 1
    }

    protected val nested = new Nested
  }

  class ProtectedClass2 extends ProtectedClass1(1) {
    val field1 = protectedField1
    val field2 = protectedField2
    val nField = new Nested().nestedField  // ERROR
  }

  class ProtectedClass3 {
    val protectedClass1 = new ProtectedClass1(1)
    val protectedField1 = protectedClass1.protectedField1 // ERROR
    val protectedField2 = protectedClass1.protectedField2 // ERROR
    val protectedNField = protectedClass1.nested.nestedField // ERROR
  }

  protected class ProtectedClass4

  class ProtectedClass5 extends ProtectedClass4
  protected class ProtectedClass6 extends ProtectedClass4
}

package scopeB {
  class ProtectedClass4B extends scopeA.ProtectedClass4 // ERROR
}

When you compile this file with scalac, you get the following output. (The file name before the N: line numbers have been removed from the output to better fit the space.)

16: error: value nestedField cannot be accessed in ProtectedClass2.this.Nested
        val nField = new Nested().nestedField
                                  ^
20: error: value protectedField1 cannot be accessed in scopeA.ProtectedClass1
        val protectedField1 = protectedClass1.protectedField1
                                              ^
21: error: value protectedField2 cannot be accessed in scopeA.ProtectedClass1
        val protectedField2 = protectedClass1.protectedField2
                                              ^
22: error: value nested cannot be accessed in scopeA.ProtectedClass1
        val protectedNField = protectedClass1.nested.nestedField
                                              ^
32: error: class ProtectedClass4 cannot be accessed in package scopeA
    class ProtectedClass4B extends scopeA.ProtectedClass4
                                          ^
5 errors found

The // ERROR comments in the listing mark the lines that fail to compile.

ProtectedClass2 can access protected members of ProtectedClass1 since it derives from it. However, it can’t access the protected nestedField in protectedClass1.nested. Also, ProtectedClass3 can’t access protected members of the ProtectedClass1 instance it uses.

Finally, because ProtectedClass4 is declared protected, it is not visible in the scopeB package.

Private Visibility

Private visibility completely hides implementation details, even from the implementers of derived classes. Any member declared with the private keyword is visible only to the defining type, including other instances of the same type. When applied to a type, private limits visibility to the enclosing package.

// code-examples/BasicOOP/scoping/private-wont-compile.scala
// WON'T COMPILE

package scopeA {
  class PrivateClass1(private val privateField1: Int) {
    private val privateField2 = 1

    def equalFields(other: PrivateClass1) =
      (privateField1 == other.privateField1) &&
      (privateField2 == other.privateField2) &&
      (nested == other.nested)

    class Nested {
      private val nestedField = 1
    }

    private val nested = new Nested
  }

  class PrivateClass2 extends PrivateClass1(1) {
    val field1 = privateField1  // ERROR
    val field2 = privateField2  // ERROR
    val nField = new Nested().nestedField // ERROR
  }

  class PrivateClass3 {
    val privateClass1 = new PrivateClass1(1)
    val privateField1 = privateClass1.privateField1 // ERROR
    val privateField2 = privateClass1.privateField2 // ERROR
    val privateNField = privateClass1.nested.nestedField // ERROR
  }

  private class PrivateClass4

  class PrivateClass5 extends PrivateClass4  // ERROR
  protected class PrivateClass6 extends PrivateClass4 // ERROR
  private class PrivateClass7 extends PrivateClass4
}

package scopeB {
  class PrivateClass4B extends scopeA.PrivateClass4  // ERROR
}

Compiling this file yields the following output.

14: error: not found: value privateField1
        val field1 = privateField1
                     ^
15: error: not found: value privateField2
        val field2 = privateField2
                     ^
16: error: value nestedField cannot be accessed in PrivateClass2.this.Nested
        val nField = new Nested().nestedField
                                  ^
20: error: value privateField1 cannot be accessed in scopeA.PrivateClass1
        val privateField1 = privateClass1.privateField1
                                          ^
21: error: value privateField2 cannot be accessed in scopeA.PrivateClass1
        val privateField2 = privateClass1.privateField2
                                          ^
22: error: value nested cannot be accessed in scopeA.PrivateClass1
        val privateNField = privateClass1.nested.nestedField
                                          ^
27: error: private class PrivateClass4 escapes its defining scope as part
of type scopeA.PrivateClass4
    class PrivateClass5 extends PrivateClass4
                                ^
28: error: private class PrivateClass4 escapes its defining scope as part
of type scopeA.PrivateClass4
    protected class PrivateClass6 extends PrivateClass4
                                          ^
33: error: class PrivateClass4 cannot be accessed in package scopeA
    class PrivateClass4B extends scopeA.PrivateClass4
                                        ^
9 errors found

Now, PrivateClass2 can’t access private members of its parent class PrivateClass1. They are completely invisible to the subclass, as indicated by the error messages. Nor can it access a private field in a Nested class.

Just as for the case of protected access, PrivateClass3 can’t access private members of the PrivateClass1 instance it is using. Note, however, that the equalFields method can access private members of the other instance.

The declarations of PrivateClass5 and PrivateClass6 fail, because if allowed, they would enable PrivateClass4 to “escape its defining scope”. However, the declaration of PrivateClass7 succeeds, because it is also declared to be private. Curiously, our previous example was able to declare a public class that subclassed a protected class without a similar error.

Finally, just as for protected type declarations, the private types can’t be subclassed outside the same package.

Scoped Private and Protected Visibility

Scala allows you to fine-tune the scope of visibility with the scoped private and protected visibility declarations. Note that private and protected are interchangeable in scoped declarations; they behave identically, except under inheritance when applied to members.

Tip

While either choice behaves the same in most scenarios, it is more common to see private[X] rather than protected[X] used in code. In the core libraries included with Scala, the ratio is roughly five to one.

Let’s begin with the only difference in behavior between scoped private and scoped protected: how they behave under inheritance when members have these scopes.

// code-examples/BasicOOP/scoping/scope-inheritance-wont-compile.scala
// WON'T COMPILE

package scopeA {
  class Class1 {
    private[scopeA]   val scopeA_privateField = 1
    protected[scopeA] val scopeA_protectedField = 2
    private[Class1]   val class1_privateField = 3
    protected[Class1] val class1_protectedField = 4
    private[this]     val this_privateField = 5
    protected[this]   val this_protectedField = 6
  }

  class Class2 extends Class1 {
    val field1 = scopeA_privateField
    val field2 = scopeA_protectedField
    val field3 = class1_privateField     // ERROR
    val field4 = class1_protectedField
    val field5 = this_privateField       // ERROR
    val field6 = this_protectedField
  }
}

package scopeB {
  class Class2B extends scopeA.Class1 {
    val field1 = scopeA_privateField     // ERROR
    val field2 = scopeA_protectedField
    val field3 = class1_privateField     // ERROR
    val field4 = class1_protectedField
    val field5 = this_privateField       // ERROR
    val field6 = this_protectedField
  }
}

Compiling this file yields the following output.

17: error: not found: value class1_privateField
    val field3 = class1_privateField     // ERROR
                 ^
19: error: not found: value this_privateField
    val field5 = this_privateField       // ERROR
                 ^
26: error: not found: value scopeA_privateField
    val field1 = scopeA_privateField     // ERROR
                 ^
28: error: not found: value class1_privateField
    val field3 = class1_privateField     // ERROR
                 ^
30: error: not found: value this_privateField
    val field5 = this_privateField       // ERROR
                 ^
5 errors found

The first two errors, inside Class2, show us that a derived class inside the same package can’t reference a member that is scoped private to the parent class or this, but it can reference a private member scoped to the package (or type) that encloses both Class1 and Class2.

In contrast, a derived class outside the same package has no access to any of the scoped private members of Class1.

However, all the scoped protected members are visible in both derived classes.

We’ll use scoped private declarations for the rest of our examples and discussion, since use of scoped private is a little more common in the Scala library than scoped protected, when the previous inheritance scenarios aren’t a factor.

First, let’s start with the most restrictive visibility, private[this], as it affects type members.

// code-examples/BasicOOP/scoping/private-this-wont-compile.scala
// WON'T COMPILE

package scopeA {
  class PrivateClass1(private[this] val privateField1: Int) {
    private[this] val privateField2 = 1

    def equalFields(other: PrivateClass1) =
      (privateField1 == other.privateField1) && // ERROR
      (privateField2 == other.privateField2) &&
      (nested == other.nested)

    class Nested {
      private[this] val nestedField = 1
    }

    private[this] val nested = new Nested
  }

  class PrivateClass2 extends PrivateClass1(1) {
    val field1 = privateField1  // ERROR
    val field2 = privateField2  // ERROR
    val nField = new Nested().nestedField  // ERROR
  }

  class PrivateClass3 {
    val privateClass1 = new PrivateClass1(1)
    val privateField1 = privateClass1.privateField1  // ERROR
    val privateField2 = privateClass1.privateField2  // ERROR
    val privateNField = privateClass1.nested.nestedField // ERROR
  }
}

Compiling this file yields the following output.

5: error: value privateField1 is not a member of scopeA.PrivateClass1
            (privateField1 == other.privateField1) &&
                                    ^
14: error: not found: value privateField1
        val field1 = privateField1
                     ^
15: error: not found: value privateField2
        val field2 = privateField2
                     ^
16: error: value nestedField is not a member of PrivateClass2.this.Nested
        val nField = new Nested().nestedField
                                  ^
20: error: value privateField1 is not a member of scopeA.PrivateClass1
        val privateField1 = privateClass1.privateField1
                                          ^
21: error: value privateField2 is not a member of scopeA.PrivateClass1
        val privateField2 = privateClass1.privateField2
                                          ^
22: error: value nested is not a member of scopeA.PrivateClass1
        val privateNField = privateClass1.nested.nestedField
                                          ^
7 errors found

Note

Lines 6-8 also fail to compile. Since they are part of the expression that started on line 5, the compiler stopped after the first error.

The private[this] members are only visible to the same instance. An instance of the same class can’t see private[this] members of another instance, so the equalFields method won’t compile.

Otherwise, the visibility of class members is the same as private without a scope specifier.

When declaring a type with private[this], use of this effectively binds to the enclosing package, as shown here.

// code-examples/BasicOOP/scoping/private-this-pkg-wont-compile.scala
// WON'T COMPILE

package scopeA {
  private[this] class PrivateClass1

  package scopeA2 {
    private[this] class PrivateClass2
  }

  class PrivateClass3 extends PrivateClass1  // ERROR
  protected class PrivateClass4 extends PrivateClass1 // ERROR
  private class PrivateClass5 extends PrivateClass1
  private[this] class PrivateClass6 extends PrivateClass1

  private[this] class PrivateClass7 extends scopeA2.PrivateClass2 // ERROR
}

package scopeB {
  class PrivateClass1B extends scopeA.PrivateClass1 // ERROR
}

Compiling this file yields the following output.

8: error: private class PrivateClass1 escapes its defining scope as part
of type scopeA.PrivateClass1
    class PrivateClass3 extends PrivateClass1
                                ^
9: error: private class PrivateClass1 escapes its defining scope as part
of type scopeA.PrivateClass1
    protected class PrivateClass4 extends PrivateClass1
                                          ^
13: error: type PrivateClass2 is not a member of package scopeA.scopeA2
    private[this] class PrivateClass7 extends scopeA2.PrivateClass2
                                                      ^
17: error: type PrivateClass1 is not a member of package scopeA
    class PrivateClass1B extends scopeA.PrivateClass1
                                        ^
four errors found

In the same package, attempting to declare a public or protected subclass fails. Only private and private[this] subclasses are allowed. Also, PrivateClass2 is scoped to scopeA2, so it can’t be subclassed outside scopeA2. Similarly, an attempt to declare a class in the unrelated scopeB using PrivateClass1 also fails.

Hence, when applied to types, private[this] is equivalent to Java’s package private visibility.

Next, let’s examine type-level visibility, private[T], where T is a type.

// code-examples/BasicOOP/scoping/private-type-wont-compile.scala
// WON'T COMPILE

package scopeA {
  class PrivateClass1(private[PrivateClass1] val privateField1: Int) {
    private[PrivateClass1] val privateField2 = 1

    def equalFields(other: PrivateClass1) =
      (privateField1 == other.privateField1) &&
      (privateField2 == other.privateField2) &&
      (nested  == other.nested)

    class Nested {
      private[Nested] val nestedField = 1
    }

    private[PrivateClass1] val nested = new Nested
    val nestedNested = nested.nestedField   // ERROR
  }

  class PrivateClass2 extends PrivateClass1(1) {
    val field1 = privateField1  // ERROR
    val field2 = privateField2  // ERROR
    val nField = new Nested().nestedField  // ERROR
  }

  class PrivateClass3 {
    val privateClass1 = new PrivateClass1(1)
    val privateField1 = privateClass1.privateField1  // ERROR
    val privateField2 = privateClass1.privateField2  // ERROR
    val privateNField = privateClass1.nested.nestedField // ERROR
  }
}

Compiling this file yields the following output.

12: error: value nestedField cannot be accessed in PrivateClass1.this.Nested
        val nestedNested = nested.nestedField
                                  ^
15: error: not found: value privateField1
        val field1 = privateField1
                     ^
16: error: not found: value privateField2
        val field2 = privateField2
                     ^
17: error: value nestedField cannot be accessed in PrivateClass2.this.Nested
        val nField = new Nested().nestedField
                                  ^
21: error: value privateField1 cannot be accessed in scopeA.PrivateClass1
        val privateField1 = privateClass1.privateField1
                                          ^
22: error: value privateField2 cannot be accessed in scopeA.PrivateClass1
        val privateField2 = privateClass1.privateField2
                                          ^
23: error: value nested cannot be accessed in scopeA.PrivateClass1
        val privateNField = privateClass1.nested.nestedField
                                          ^
7 errors found

A private[PrivateClass1] member is visible to other instances, so the equalFields method now parses. Hence, private[T] is not as restrictive as private[this]. Note that PrivateClass1 can’t see Nested.nestedField, because that field is declared private[Nested].

Tip

When members of T are declared private[T] the behavior is equivalent to private. It is not equivalent to private[this], which is more restrictive.

What if we change the scope of Nested.nestedField to be private[PrivateClass1]? Let’s see how private[T] affects nested types.

// code-examples/BasicOOP/scoping/private-type-nested-wont-compile.scala
// WON'T COMPILE

package scopeA {
  class PrivateClass1 {
    class Nested {
      private[PrivateClass1] val nestedField = 1
    }

    private[PrivateClass1] val nested = new Nested
    val nestedNested = nested.nestedField
  }

  class PrivateClass2 extends PrivateClass1 {
    val nField = new Nested().nestedField   // ERROR
  }

  class PrivateClass3 {
    val privateClass1 = new PrivateClass1
    val privateNField = privateClass1.nested.nestedField // ERROR
  }
}

Compiling this file yields the following output.

10: error: value nestedField cannot be accessed in PrivateClass2.this.Nested
        val nField = new Nested().nestedField
                                  ^
14: error: value nested cannot be accessed in scopeA.PrivateClass1
        val privateNField = privateClass1.nested.nestedField
                                          ^
two errors found

Now nestedField is visible to PrivateClass1, but it is still invisible outside of PrivateClass1. This is how private works in Java.

Let’s examine scoping using a package name.

// code-examples/BasicOOP/scoping/private-pkg-type-wont-compile.scala
// WON'T COMPILE

package scopeA {
  private[scopeA] class PrivateClass1

  package scopeA2 {
    private [scopeA2] class PrivateClass2
    private [scopeA]  class PrivateClass3
  }

  class PrivateClass4 extends PrivateClass1
  protected class PrivateClass5 extends PrivateClass1
  private class PrivateClass6 extends PrivateClass1
  private[this] class PrivateClass7 extends PrivateClass1

  private[this] class PrivateClass8 extends scopeA2.PrivateClass2 // ERROR
  private[this] class PrivateClass9 extends scopeA2.PrivateClass3
}

package scopeB {
  class PrivateClass1B extends scopeA.PrivateClass1 // ERROR
}

Compiling this file yields the following output.

14: error: class PrivateClass2 cannot be accessed in package scopeA.scopeA2
    private[this] class PrivateClass8 extends scopeA2.PrivateClass2
                                                      ^
19: error: class PrivateClass1 cannot be accessed in package scopeA
    class PrivateClass1B extends scopeA.PrivateClass1
                                        ^
two errors found

Note that PrivateClass2 can’t be subclassed outside of scopeA2, but PrivateClass3 can be subclassed in scopeA, because it is declared private[scopeA].

Finally, let’s look at the effect of package-level scoping of type members.

// code-examples/BasicOOP/scoping/private-pkg-wont-compile.scala
// WON'T COMPILE

package scopeA {
  class PrivateClass1 {
    private[scopeA] val privateField = 1

    class Nested {
      private[scopeA] val nestedField = 1
    }

    private[scopeA] val nested = new Nested
  }

  class PrivateClass2 extends PrivateClass1 {
    val field  = privateField
    val nField = new Nested().nestedField
  }

  class PrivateClass3 {
    val privateClass1 = new PrivateClass1
    val privateField  = privateClass1.privateField
    val privateNField = privateClass1.nested.nestedField
  }

  package scopeA2 {
    class PrivateClass4 {
      private[scopeA2] val field1 = 1
      private[scopeA]  val field2 = 2
    }
  }

  class PrivateClass5 {
    val privateClass4 = new scopeA2.PrivateClass4
    val field1 = privateClass4.field1  // ERROR
    val field2 = privateClass4.field2
  }
}

package scopeB {
  class PrivateClass1B extends scopeA.PrivateClass1 {
    val field1 = privateField   // ERROR
    val privateClass1 = new scopeA.PrivateClass1
    val field2 = privateClass1.privateField  // ERROR
  }
}

Compiling this file yields the following output.

28: error: value field1 cannot be accessed in scopeA.scopeA2.PrivateClass4
        val field1 = privateClass4.field1
                                   ^
35: error: not found: value privateField
        val field1 = privateField
                     ^
37: error: value privateField cannot be accessed in scopeA.PrivateClass1
        val field2 = privateClass1.privateField
                                   ^
three errors found

The only errors are when we attempt to access members scoped to scopeA from the unrelated package scopeB and when we attempt to access a member from a nested package scopeA2 that is scoped to that package.

Tip

When a type or member is declared private[P], where P is the enclosing package, then it is equivalent to Java’s package private visibility.
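
For instance, here is a minimal sketch of that equivalence (using a hypothetical package scopeC, not one of the book’s examples); the member is visible to any code in the same package, just as a Java package-private member would be.

package scopeC {
  class PkgPrivate {
    private[scopeC] val field = 1   // visible throughout package scopeC
  }

  class Neighbor {
    val ok = (new PkgPrivate).field // accessible: same package
  }
}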

Final Thoughts on Visibility

Scala visibility declarations are very flexible and they behave consistently. They provide fine-grained control over visibility at all possible scopes, from the instance level (private[this]) up to package-level visibility (private[P], for a package P). For example, they make it easier to create “components” with types exposed outside of the component’s top-level package, while hiding implementation types and type members within the “component’s” packages.

Finally, we have observed a potential “gotcha” with hidden members of traits.

Tip

Be careful when choosing the names of members of traits. If two traits have a member of the same name and the traits are used in the same instance, a name collision will occur even if both members are private.

Fortunately, the compiler catches this problem.

Recap and What’s Next

We introduced the basics of Scala’s object model, including constructors, inheritance, nesting of classes, and rules for visibility.

In the next chapter we’ll explore Scala’s more advanced OOP features, including overriding, companion objects, case classes, and rules for equality between objects.


Chapter 6. Advanced Object-Oriented Programming In Scala

We’ve got the basics of OOP in Scala under our belt, but there’s plenty more to learn.

Overriding Members of Classes and Traits

Classes and traits can declare abstract members: fields, methods, and types. These members must be defined by a derived class or trait before an instance can be created. Most object-oriented languages support abstract methods and some also support abstract fields and types.

Note

When overriding a concrete member, Scala requires the override keyword. It is optional when a subtype defines (“overrides”) an abstract member. Conversely, don’t use override unless you are actually overriding a member.

Requiring the override keyword has several benefits.

  1. It catches misspelled members that were intended to be overrides. The compiler will throw an error that the member doesn’t override anything.
  2. It catches a potentially subtle bug that can occur if a new member is added to a base class where the member’s name collides with an older derived class member that is unknown to the base class developer. That is, the derived-class member was never intended to override a base-class member. Because the derived class member won’t have the override keyword, the compiler will throw an error when the new base-class member is introduced.
  3. Having to add the keyword reminds you to consider what members should or should not be overridden.

Java has an optional @Override annotation for methods. It helps catch errors of the first type (misspellings), but it can’t help with errors of the second type, since using the annotation is optional.
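
As a quick illustration of the first benefit, here is a hedged sketch (hypothetical classes, not from the book’s example code) that won’t compile, because the misspelled method doesn’t override anything.

// WON'T COMPILE (sketch)
class Account {
  def balance = 100
}

class SavingsAccount extends Account {
  override def ballance = 200   // ERROR: method ballance overrides nothing
}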

Attempting to Override final Declarations

If a declaration includes the final keyword, then overriding it is prohibited. In the following example, fixedMethod is declared final in the parent class NotFixed. Attempting to compile the example results in a compilation error.

// code-examples/AdvOOP/overrides/final-member-wont-compile.scala
// WON'T COMPILE.

class NotFixed {
  final def fixedMethod = "fixed"
}

class Changeable2 extends NotFixed {
  override def fixedMethod = "not fixed"   // ERROR
}

This constraint applies to classes and traits as well as members. In this example, the class Fixed is declared final, so an attempt to derive a new type from it will also fail to compile.

// code-examples/AdvOOP/overrides/final-class-wont-compile.scala
// WON'T COMPILE.

final class Fixed {
  def doSomething = "Fixed did something!"
}

class Changeable1 extends Fixed     // ERROR

Note

Some of the types in the Scala library are final, including JDK classes like String and all the “value” types derived from AnyVal (see the section called “The Scala Type Hierarchy” in Chapter 7, The Scala Object System).

For declarations that aren’t final, let’s examine the rules and behaviors for overriding, starting with methods.

Overriding Abstract and Concrete Methods

Let’s extend our familiar Widget base class with an abstract method draw, to support “rendering” the widget to a display, web page, etc. We’ll also override a concrete method familiar to any Java programmer, toString(), using an ad hoc format. As before, we will use a new package, ui3.

Note

Drawing is actually a cross-cutting concern. The state of a Widget is one thing; how it is rendered on different platforms, thick clients, web pages, mobile devices, etc., is a separate issue. So, drawing is a very good candidate for a trait, especially if you want your GUI abstractions to be portable. However, to keep things simple, we will handle drawing in the Widget hierarchy itself.

Here is the revised Widget class, with draw and toString methods.

// code-examples/AdvOOP/ui3/widget.scala

package ui3

abstract class Widget {
  def draw(): Unit
  override def toString() = "(widget)"
}

The draw method is abstract because it has no body; that is, the method isn’t followed by an equals sign (=), nor any text after it. Therefore, Widget has to be declared abstract (it was optional before). Each concrete subclass of Widget will have to implement draw or rely on a parent class that implements it. We don’t need to return anything from draw, so its return value is Unit.

The toString() method is straightforward. Since AnyRef defines toString, the override keyword is required for Widget.toString.

Here is the revised Button class, with draw and toString methods.

// code-examples/AdvOOP/ui3/button.scala

package ui3

class Button(val label: String) extends Widget with Clickable {

  def click() = {
    // Logic to give the appearance of clicking a button...
  }

  def draw() = {
    // Logic to draw the button on the display, web page, etc.
  }

  override def toString() =
    "(button: label=" + label + ", " + super.toString() + ")"
}

Button implements the abstract method draw. No override keyword is required. Button also overrides toString, so the override keyword is required. Note that super.toString is called.

Tip

Overriding a concrete method should be done rarely, because it is error prone. Should you invoke the parent method? If so, when? Do you call it before doing anything else or afterwards? While the writer of the parent method might document the overriding constraints for the method, it’s difficult to ensure that the writer of a derived class will honor those constraints. A much more robust approach is the Template Method Pattern [GOF1995].
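
Here is a hedged sketch of the Template Method Pattern in Scala (hypothetical ReportBuilder class, an assumption for illustration): the parent fixes the skeleton of the algorithm in a final method and exposes abstract “hook” methods for subclasses to fill in, so no concrete method ever needs to be overridden.

abstract class ReportBuilder {
  // The template method: final, so the skeleton itself can't be overridden.
  final def build(): String = header() + body() + footer()

  // Hook methods supplied (or optionally overridden) by subclasses.
  protected def header(): String
  protected def body(): String
  protected def footer(): String = "-- end --"
}

class PlainReport extends ReportBuilder {
  protected def header() = "REPORT\n"
  protected def body()   = "...contents...\n"
}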

Overriding Abstract and Concrete Fields

Most OO languages allow you to override mutable fields (var). Fewer object-oriented languages allow you to define abstract fields or override concrete immutable fields (val). For example, it’s common for a base class constructor to initialize a mutable field and for a derived class constructor to change its value.

We’ll discuss overriding fields in traits and classes separately, as traits have some particular issues.

Overriding Abstract and Concrete Fields in Traits

Recall our VetoableClicks trait in the section called “Stackable Traits”. It defines a val named maxAllowed and initializes it to 1. We would like the ability to override the value in a class that mixes in this trait.

Unfortunately, in Scala version 2.7.X, it is not possible to override a val defined in a trait. However, it is possible to override a val defined in a parent class. Version 2.8 of Scala does support overriding a val declared in a trait.

Tip

Because the override behavior for a val in a trait is changing, you should avoid relying on the ability to override it, if you are currently using Scala version 2.7.X. Use another approach instead.

Unfortunately, the version 2.7 compiler accepts code that attempts to override a trait-defined val, but the override does not actually happen, as illustrated by this example.

// code-examples/AdvOOP/overrides/trait-val-script.scala
// DANGER! Silent failure to override a trait's "name" (V2.7.5 only).
// Works as expected in V2.8.0.

trait T1 {
  val name="htmlcat_ch06.html_T1"
}

class Base

class ClassWithT1 extends Base with T1 {
  override val name="htmlcat_ch06.html_ClassWithT1"
}

val c = new ClassWithT1()
println(c.name)

class ClassExtendsT1 extends T1 {
  override val name="htmlcat_ch06.html_ClassExtendsT1"
}

val c2 = new ClassExtendsT1()
println(c2.name)

If you run this script with scala version 2.7.5, the output is the following.

T1
T1

Reading the script, we would have expected the two T1 strings to be ClassWithT1 and ClassExtendsT1, respectively.

However, if you run this script with scala version 2.8.0, you get this output.

ClassWithT1
ClassExtendsT1

Caution

Attempts to override a trait-defined val will be accepted by the compiler, but have no effect in Scala version 2.7.X.

There are three workarounds you can use with Scala version 2.7. The first is to use some advanced options for scala and scalac. The -Xfuture option will enable the override behavior that is supported in version 2.8. The -Xcheckinit option will analyze your code and report if the behavior change will break it. The option -Xexperimental, which enables many experimental changes, will also warn you that the val override behavior is different.

The second workaround is to make the val abstract in the trait, which forces an instance using the trait to assign a value. Declaring a val abstract in a trait is a perfectly useful design approach for both versions of Scala. In fact, it is the best design choice when there is no appropriate default value to assign to the val in the trait.

// code-examples/AdvOOP/overrides/trait-abs-val-script.scala

trait AbstractT1 {
  val name: String
}

class Base

class ClassWithAbstractT1 extends Base with AbstractT1 {
  val name="htmlcat_ch06.html_ClassWithAbstractT1"
}

val c = new ClassWithAbstractT1()
println(c.name)

class ClassExtendsAbstractT1 extends AbstractT1 {
  val name="htmlcat_ch06.html_ClassExtendsAbstractT1"
}

val c2 = new ClassExtendsAbstractT1()
println(c2.name)

This script produces the output that we would expect.

ClassWithAbstractT1
ClassExtendsAbstractT1

So, an abstract val works fine, unless the field is used in the trait body in a way that will fail until the field is properly initialized. Unfortunately, the proper initialization won’t occur until after the trait’s body has executed. Consider the following example.

// code-examples/AdvOOP/overrides/trait-invalid-init-val-script.scala
// ERROR: "value" read before initialized.

trait AbstractT2 {
  println("In AbstractT2:")
  val value: Int
  val inverse = 1.0/value      // ???
  println("AbstractT2: value = "+value+", inverse = "+inverse)
}

val c2b = new AbstractT2 {
  println("In c2b:")
  val value = 10
}
println("c2b.value = "+c2b.value+", inverse = "+c2b.inverse)

While it appears that we are creating an instance of the trait with new AbstractT2 …, we are actually using an anonymous class that implicitly extends the trait. This script shows what happens when inverse is calculated.

In AbstractT2:
AbstractT2: value = 0, inverse = Infinity
In c2b:
c2b.value = 10, inverse = Infinity

As you might expect, the inverse is calculated too early. Note that a divide-by-zero exception isn’t thrown; value still holds its default of 0 when inverse is computed, and dividing a Double by zero simply yields Infinity under floating-point semantics.

The behavior of this script is actually quite subtle. As an exercise, try selectively removing (or commenting-out) the different println statements, one at a time. Observe what happens to the results. Sometimes inverse is initialized properly! (Hint: remove the println("In c2b:") statement. Then try putting it back, but after the val value = 10 line.)

What this experiment really shows is that side effects (i.e., from the println statements) can be unexpected and subtle, especially during initialization. It’s best to avoid them.

Scala provides two solutions to this problem: lazy values, which we discuss in the section called “Lazy Vals” in Chapter 8, Functional Programming in Scala, and pre-initialized fields, which are demonstrated in the following refinement to the previous example.

// code-examples/AdvOOP/overrides/trait-pre-init-val-script.scala

trait AbstractT2 {
  println("In AbstractT2:")
  val value: Int
  val inverse = 1.0/value
  println("AbstractT2: value = "+value+", inverse = "+inverse)
}

val c2c = new {
  // Only initializations are allowed in pre-init. blocks.
  // println("In c2c:")
  val value = 10
} with AbstractT2

println("c2c.value = "+c2c.value+", inverse = "+c2c.inverse)

We instantiate an anonymous inner class, initializing the value field in the block, before the with AbstractT2 clause. This guarantees that value is initialized before the body of AbstractT2 is executed, as shown when you run the script.

In AbstractT2:
AbstractT2: value = 10, inverse = 0.1
c2c.value = 10, inverse = 0.1

Also, if you selectively remove any of the println statements, you get the same expected and now predictable results.
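
The lazy val alternative mentioned above is covered in Chapter 8, but for comparison, here is a hedged sketch of how it might look for this example: marking inverse lazy defers its evaluation until first access, by which time value has been initialized.

trait AbstractT2 {
  val value: Int
  lazy val inverse = 1.0/value  // not evaluated until first accessed
}

val c2d = new AbstractT2 {
  val value = 10
}
println("c2d.value = "+c2d.value+", inverse = "+c2d.inverse)  // inverse = 0.1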

Now let’s consider the third workaround, changing the declaration to a var. This solution is more suitable if a good default value exists and you don’t want to require instances that use the trait to always set the value. In this case, change the val to a var, either a public var or a private var hidden behind reader and writer methods. Either way, we can simply reassign the value in a derived trait or class.

Returning to our VetoableClicks example, here is the modified VetoableClicks trait that uses a public var for maxAllowed.

// code-examples/AdvOOP/ui3/vetoable-clicks.scala

package ui3
import observer._

trait VetoableClicks extends Clickable {
  var maxAllowed = 1       // default
  private var count = 0
  abstract override def click() = {
    count += 1
    if (count <= maxAllowed)
      super.click()
  }
}

Here is a new “specs” object, ButtonClickableObserverVetoableSpec2, that demonstrates changing the value of maxAllowed.

// code-examples/AdvOOP/ui3/button-clickable-observer-vetoable2-spec.scala
package ui3

import org.specs._
import observer._
import ui.ButtonCountObserver

object ButtonClickableObserverVetoableSpec2 extends Specification {
  "A Button Observer with Vetoable Clicks" should {
    "observe only the first 'maxAllowed' clicks" in {
      val observableButton =
        new Button("Okay") with ObservableClicks with VetoableClicks {
          maxAllowed = 2
      }
      observableButton.maxAllowed mustEqual 2
      val buttonClickCountObserver = new ButtonCountObserver
      observableButton.addObserver(buttonClickCountObserver)
      for (i <- 1 to 3) observableButton.click()
      buttonClickCountObserver.count mustEqual 2
    }
  }
}

No override keyword is required for the var; we just assign a new value. Since the body of the trait is executed before the body of the class using it, reassigning the field value happens after the initial assignment in the trait’s body. However, as we saw before, that reassignment could happen too late if the field is used in the trait’s body in a calculation that a later reassignment would invalidate! You can avoid this problem if you make the field private and define a public writer method that redoes any dependent calculations.
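
Here is a hedged sketch of that idea (hypothetical names, independent of the VetoableClicks example): the var is private, and the public writer recomputes any state that depends on it.

class ClickBudget {
  private var maxAllowedCount = 1           // hidden mutable state
  private var remaining = maxAllowedCount   // state that depends on it

  def maxAllowed = maxAllowedCount
  def maxAllowed_=(newMax: Int) = {
    maxAllowedCount = newMax
    remaining = newMax                      // redo the dependent calculation
  }

  def tryClick(): Boolean =
    if (remaining > 0) { remaining -= 1; true } else false
}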

Another disadvantage of using a var declaration is that maxAllowed was not intended to be writable. As we will see in Chapter 8, Functional Programming in Scala, read-only values have important benefits. We would prefer for maxAllowed to be read-only, at least after the construction process completes.

We can see that the simple act of changing the val to a var causes potential problems for the maintainer of VetoableClicks. Control over that field is now lost. The maintainer must carefully consider whether or not the value will change and if a change will invalidate the state of the instance. This issue is especially pernicious in multithreaded systems (see the section called “The Problems of Shared, Synchronized State” in Chapter 9, Robust, Scalable Concurrency with Actors).

Tip

Avoid var fields when possible (in classes as well as traits). Consider public var fields especially risky.

Overriding Abstract and Concrete Fields in Classes

In contrast to traits, overriding a val declared in a class works as expected. Here is an example with both a val override and a var reassignment in a derived class.

// code-examples/AdvOOP/overrides/class-field-script.scala

class C1 {
  val name="htmlcat_ch06.html_C1"
  var count = 0
}

class ClassWithC1 extends C1 {
  override val name="htmlcat_ch06.html_ClassWithC1"
  count = 1
}

val c = new ClassWithC1()
println(c.name)
println(c.count)

The override keyword is required for the concrete val field name, but not for the var field count. This is because we are changing the initialization of a constant (val), which is a “special” operation.

If you run this script, the output is the following.

ClassWithC1
1

Both fields are overridden in the derived class, as expected. Here is the same example modified so that both the val and the var are abstract in the base class.

// code-examples/AdvOOP/overrides/class-abs-field-script.scala

abstract class AbstractC1 {
  val name: String
  var count: Int
}

class ClassWithAbstractC1 extends AbstractC1 {
  val name="htmlcat_ch06.html_ClassWithAbstractC1"
  var count = 1
}

val c = new ClassWithAbstractC1()
println(c.name)
println(c.count)

The override keyword is not required for name in ClassWithAbstractC1, since the original declaration is abstract. The output of this script is the following.

ClassWithAbstractC1
1

It’s important to emphasize that name and count are abstract fields, not concrete fields with default values. A similar-looking declaration in a Java class, String name;, would declare a concrete field with the default value (null in this case). Java doesn’t support abstract fields or types (as we’ll discuss next), only abstract methods.

Overriding Abstract Types

We introduced abstract type declarations in the section called “Abstract Types And Parameterized Types” in Chapter 2, Type Less, Do More. Recall the BulkReader example from that section.

// code-examples/TypeLessDoMore/abstract-types-script.scala

import java.io._

abstract class BulkReader {
  type In
  val source: In
  def read: String
}

class StringBulkReader(val source: String) extends BulkReader {
  type In = String
  def read = source
}

class FileBulkReader(val source: File) extends BulkReader {
  type In = File
  def read = {
    val in = new BufferedInputStream(new FileInputStream(source))
    val numBytes = in.available()
    val bytes = new Array[Byte](numBytes)
    in.read(bytes, 0, numBytes)
    new String(bytes)
  }
}

println( new StringBulkReader("Hello Scala!").read )
println( new FileBulkReader(new File("abstract-types-script.scala")).read )

Abstract types are an alternative to parameterized types, which we’ll explore in the section called “Understanding Parameterized Types” in Chapter 12, The Scala Type System. Like parameterized types, they provide an abstraction mechanism at the type level.

The example shows how to declare an abstract type and how to define a concrete value in derived classes. BulkReader declares type In without initializing it. The concrete derived class StringBulkReader provides a concrete value using type In = String.

Unlike fields and methods, it is not possible to override a concrete type definition. However, the abstract declaration can constrain the allowed concrete type values. We’ll learn how in Chapter 12, The Scala Type System.

Finally, you probably noticed that this example also demonstrates defining an abstract field, using a constructor parameter, and an abstract method.
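
Looking ahead briefly, here is a minimal sketch of the kind of constraint mentioned a moment ago (an assumed bound for illustration, not part of the BulkReader example): an upper bound restricts the concrete types that derived classes may choose for the abstract type.

abstract class BoundedBulkReader {
  type In <: java.io.Closeable  // concrete definitions must be Closeable subtypes
  val source: In
  def read: String
}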

For another example, let’s revisit our Subject trait from the section called “Traits as Mixins” in Chapter 4, Traits. The definition of the Observer type is a structural type with a method named receiveUpdate. Observers must have this “structure”. Let’s generalize the implementation now, using an abstract type.

// code-examples/AdvOOP/observer/observer2.scala

package observer

trait AbstractSubject {
  type Observer

  private var observers = List[Observer]()
  def addObserver(observer:Observer) = observers ::= observer
  def notifyObservers = observers foreach (notify(_))

  def notify(observer: Observer): Unit
}

trait SubjectForReceiveUpdateObservers extends AbstractSubject {
  type Observer = { def receiveUpdate(subject: Any) }

  def notify(observer: Observer): Unit = observer.receiveUpdate(this)
}

trait SubjectForFunctionalObservers extends AbstractSubject {
  type Observer = (AbstractSubject) => Unit

  def notify(observer: Observer): Unit = observer(this)
}

Now, AbstractSubject declares type Observer as abstract (implicitly, because there is no definition). Since the original structural type is gone, we don’t know exactly how to notify an observer. So, we also added an abstract method notify, which a concrete class or trait will define as appropriate.

The SubjectForReceiveUpdateObservers derived trait defines Observer with the same structural type we used in the original example and notify simply calls receiveUpdate, as before.

The SubjectForFunctionalObservers derived trait defines Observer to be a function taking an instance of AbstractSubject and returning Unit. All notify has to do is call the observer function, passing the subject as the sole argument. Note that this implementation is similar to the approach we used in our original button implementation, ButtonWithCallbacks, where the “callbacks” were user-supplied functions. (See the section called “Introducing Traits” in Chapter 4, Traits and a revisited version in the section called “Constructors in Scala” in Chapter 5, Basic Object-Oriented Programming in Scala.)

Here is a specification that exercises these two variations, observing button clicks as before.

// code-examples/AdvOOP/observer/button-observer2-spec.scala

package ui
import org.specs._
import observer._

object ButtonObserver2Spec extends Specification {
  "An Observer watching a SubjectForReceiveUpdateObservers button" should {
    "observe button clicks" in {
      val observableButton =
        new Button(name) with SubjectForReceiveUpdateObservers {
        override def click() = {
          super.click()
          notifyObservers
        }
      }
      val buttonObserver = new ButtonCountObserver
      observableButton.addObserver(buttonObserver)
      for (i <- 1 to 3) observableButton.click()
      buttonObserver.count mustEqual 3
    }
  }
  "An Observer watching a SubjectForFunctionalObservers button" should {
    "observe button clicks" in {
      val observableButton =
        new Button(name) with SubjectForFunctionalObservers {
        override def click() = {
          super.click()
          notifyObservers
        }
      }
      var count = 0
      observableButton.addObserver((button) => count += 1)
      for (i <- 1 to 3) observableButton.click()
      count mustEqual 3
    }
  }
}

First we exercise SubjectForReceiveUpdateObservers, which looks very similar to our earlier examples. Next we exercise SubjectForFunctionalObservers. In this case, we don’t need another “observer” instance at all. We just maintain a count variable and pass a function literal to addObserver to increment the count (and ignore the button).

The main virtue of SubjectForFunctionalObservers is its minimalism. It requires no special instances, no traits defining abstractions, etc. For many cases, it is an ideal approach.

AbstractSubject is more reusable than the original definition of Subject, because it imposes fewer constraints on potential observers.

Note

AbstractSubject illustrates that an abstraction with fewer concrete details is usually more reusable.

But wait, there’s more! We’ll revisit the use of abstract types and the observer pattern in the section called “Scalable Abstractions” in Chapter 13, Application Design.

When Accessor Methods and Fields Are Indistinguishable: The Uniform Access Principle

Suppose a user of ButtonCountObserver from the section called “Traits as Mixins” in Chapter 4, Traits accesses the count member.

// code-examples/Traits/ui/button-count-observer-script.scala

val bco = new ui.ButtonCountObserver
val oldCount = bco.count
bco.count = 5
val newCount = bco.count
println(newCount + " == 5 and " + oldCount + " == 0?")

When the count field is read or written, as in this example, are methods called or is the field accessed directly? As originally declared in ButtonCountObserver, the field is accessed directly. However, the user doesn’t really care. In fact, the following two definitions are functionally equivalent, from the perspective of the user.

class ButtonCountObserver {
  var count = 0  // public field access (original definition)
  // ...
}
class ButtonCountObserver {
  private var cnt = 0  // private field
  def count = cnt      // reader method
  def count_=(newCount: Int) = cnt = newCount  // writer method
  // ...
}

This equivalence is an example of the Uniform Access Principle. Clients read and write field values as if they are publicly accessible, even though in some cases they are actually calling methods. The maintainer of ButtonCountObserver has the freedom to change the implementation without forcing users to make code changes.

The reader method in the second version does not have parentheses. Recall that consistency in the use of parentheses is required if a method definition omits parentheses. This is only possible if the method takes no arguments. For the uniform access principle to work, we want to define field reader methods without parentheses. (Contrast with Ruby where method parentheses are always optional, as long as the parse is unambiguous.)

The writer method has the format count_=(…). As a bit of syntactic sugar, the compiler allows invocations of methods with this format to be written in either of the following ways.

obj.field_=(newValue)
// or
obj.field = newValue

We named the private variable cnt in the alternative definition. Scala keeps field and method names in the same namespace, which means we can’t name the field count if a method is named count. Many languages, like Java, don’t have this restriction, because they keep field and method names in separate namespaces. However, these languages can’t support the uniform access principle as a result, unless they build in ad hoc support in their grammars or compilers.

Since member object definitions behave similarly to fields from the caller’s perspective, they are also in the same namespace as methods and fields. Hence, the following class would not compile.

// code-examples/AdvOOP/overrides/member-namespace-wont-compile.scala
// WON'T COMPILE

class IllegalMemberNameUse {
  def member(i: Int) = 2 * i
  val member = 2         // ERROR
  object member {        // ERROR
    def apply() = 2
  }
}

There is one other benefit of this namespace “unification”. If a parent class declares a parameterless method, then a subclass can override that method with a val. If the parent’s method is concrete, then the override keyword is required.

// code-examples/AdvOOP/overrides/method-field-class-script.scala

class Parent {
  def name="htmlcat_ch06.html_Parent"
}

class Child extends Parent {
  override val name="htmlcat_ch06.html_Child"
}

println(new Child().name)   // => "Child"

If the parent’s method is abstract, then the override keyword is optional.

// code-examples/AdvOOP/overrides/abs-method-field-class-script.scala

abstract class AbstractParent {
  def name: String
}

class ConcreteChild extends AbstractParent {
  val name="htmlcat_ch06.html_Child"
}

println(new ConcreteChild().name)   // => "Child"

This also works for traits. If the trait’s method is concrete, we have the following.

// code-examples/AdvOOP/overrides/method-field-trait-script.scala

trait NameTrait {
  def name="htmlcat_ch06.html_NameTrait"
}

class ConcreteNameClass extends NameTrait {
  override val name="htmlcat_ch06.html_ConcreteNameClass"
}

println(new ConcreteNameClass().name)   // => "ConcreteNameClass"

If the trait’s method is abstract, then we have the following.

// code-examples/AdvOOP/overrides/abs-method-field-trait-script.scala

trait AbstractNameTrait {
  def name: String
}

class ConcreteNameClass extends AbstractNameTrait {
  val name="htmlcat_ch06.html_ConcreteNameClass"
}

println(new ConcreteNameClass().name)   // => "ConcreteNameClass"

Why is this feature useful? It allows derived classes and traits to use a simple field access, when that is sufficient, or a method call when more processing is required, such as lazy initialization. The same argument holds for the uniform access principle, in general.

Overriding a def with a val in a subclass can also be handy when interoperating with Java code. You can turn a getter into a val simply by declaring it as a constructor parameter. You’ll see this in action in the following example, in which our Scala class Person implements a hypothetical PersonInterface from some legacy Java code.

class Person(val getName: String) extends PersonInterface

If you only have a few accessors in the Java code you’re integrating with, this technique makes quick work of them.

What about overriding a parameterless method with a var or overriding a val or var with a method? These are not permitted, because they can’t match the behaviors of the things they are overriding.

If you attempt to use a var to override a parameterless method, you get an error that the generated writer method (name_=, for a var named name) overrides nothing. This would also be inconsistent with a philosophical goal of functional programming, that a method that takes no parameters should always return the same result. To do otherwise would require side effects in the implementation, which functional programming tries to avoid, for reasons we will examine in Chapter 8, Functional Programming in Scala. Because a var is changeable, the no-parameter “method” defined in the parent type would no longer return the same result consistently.
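
Here is a hedged sketch (hypothetical classes) of the kind of code this rule rejects; the generated writer count_= has nothing to override in the parent.

// WON'T COMPILE (sketch)
class CounterParent {
  def count = 0
}

class CounterChild extends CounterParent {
  override var count = 0   // ERROR: the writer count_= overrides nothing
}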

If you could override a val with a method, there is no way for Scala to guarantee that the method will always return the same value, consistent with val semantics. That issue doesn’t exist with a var, of course, but you would have to override the var with two methods, a reader and a writer. The Scala compiler doesn’t support that substitution.

Companion Objects

Recall that fields and methods defined in objects serve the role that class “static” fields and methods serve in languages like Java. When object-based fields and methods are closely associated with a particular class, they are normally defined in a companion object.

We mentioned companion objects briefly in Chapter 1 and we discussed the Pair example from the Scala library in Chapter 2 and the last chapter. Let’s fill in the remaining details now.

First, recall that if a class (or a type referring to a class) and an object are declared in the same file, in the same package, and with the same name, they are called a companion class (or companion type) and a companion object, respectively.

There is no namespace collision when the name is reused in this way, because Scala stores the class name in the type namespace, while it stores the object name in the term namespace [ScalaSpec2009].

The two most interesting methods frequently defined in a companion object are apply and unapply.

Apply

Scala provides some syntactic sugar in the form of the apply method. When an instance of a class is followed by parentheses with a list of zero or more parameters, the compiler invokes the apply method for that instance. This is true for an object with a defined apply method (such as a companion object), as well as an instance of a class that defines an apply method.

In the case of an object, apply is conventionally used as a factory method, returning a new instance. This is what Pair.apply does. Here is the definition of Pair in the standard library.

type Pair[+A, +B] = Tuple2[A, B]
object Pair {
  def apply[A, B](x: A, y: B) = Tuple2(x, y)
  def unapply[A, B](x: Tuple2[A, B]): Option[Tuple2[A, B]] = Some(x)
}

So, you can create a new Pair as follows.

val p = Pair(1, "one")

It looks like we are somehow creating a Pair instance without new. Rather than calling a Pair constructor directly, we are actually calling Pair.apply (i.e., apply on the companion object Pair), which then calls Tuple2.apply on the Tuple2 companion object!

Tip

If there are several alternative constructors for a class and it also has a companion object, consider defining fewer constructors on the class and defining several overloaded apply methods on the companion object to handle the variations.
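
For example, here is a minimal sketch (hypothetical Complex class, an assumption for illustration) in which the variations are handled by overloaded apply methods on the companion object rather than by extra class constructors.

class Complex(val real: Double, val imaginary: Double)

object Complex {
  def apply(real: Double, imaginary: Double) = new Complex(real, imaginary)
  def apply(real: Double) = new Complex(real, 0.0)   // purely real number
}

val c1 = Complex(1.5, 2.0)
val c2 = Complex(3.0)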

However, apply is not limited to instantiating the companion class. It could instead return an instance of a subclass of the companion class. Here is an example where we define a companion object Widget that uses regular expressions to parse a string representing a Widget subclass. When a match occurs, the subclass is instantiated and the new instance is returned.

// code-examples/AdvOOP/objects/widget.scala

package objects

abstract class Widget {
  def draw(): Unit
  override def toString() = "(widget)"
}

object Widget {
  val ButtonExtractorRE = """\(button: label=([^,]+),\s+\(Widget\)\)""".r
  val TextFieldExtractorRE = """\(textfield: text=([^,]+),\s+\(Widget\)\)""".r

  def apply(specification: String): Option[Widget] = specification match {
    case ButtonExtractorRE(label)   => new Some(new Button(label))
    case TextFieldExtractorRE(text) => new Some(new TextField(text))
    case _ => None
  }
}

Widget.apply receives a string “specification” that defines which class to instantiate. The string might come from a configuration file with widgets to create at startup, for example. The string format is the same format produced by toString(). Regular expressions are defined for each type. (Parser combinators are an alternative; they are discussed in the section called “External DSLs with Parser Combinators” in Chapter 11, Domain-Specific Languages in Scala.)

The match expression applies each regular expression to the string. A case expression like

case ButtonExtractorRE(label) => new Some(new Button(label))

means that the string is matched against the ButtonExtractorRE regular expression. If successful, it extracts the substring in the first capture group in the regular expression and assigns it to the variable label. Finally, a new Button with this label is created, wrapped in a Some. We’ll learn how this extraction process works in the next section, the section called “Unapply”.

A similar case handles TextField creation. (TextField is not shown; see the online code examples.) Finally, if apply can’t match the string, it returns None.

Here is a “specs” object that exercises Widget.apply.

// code-examples/AdvOOP/objects/widget-apply-spec.scala

package objects
import org.specs._

object WidgetApplySpec extends Specification {
  "Widget.apply with a valid widget specification string" should {
    "return a widget instance with the correct fields set" in {
      Widget("(button: label=click me, (Widget))") match {
        case Some(w) => w match {
          case b:Button => b.label mustEqual "click me"
          case x => fail(x.toString())
        }
        case None => fail("None returned.")
      }
      Widget("(textfield: text=This is text, (Widget))") match {
        case Some(w) => w match {
          case tf:TextField => tf.text mustEqual "This is text"
          case x => fail(x.toString())
        }
        case None => fail("None returned.")
      }
    }
  }
  "Widget.apply with an invalid specification string" should {
    "return None" in {
      Widget("(button: , (Widget)") mustEqual None
    }
  }
}

The first match statement implicitly invokes Widget.apply with the string “(button: label=click me, (Widget))”. If a button wrapped in a Some is not returned with the label “click me”, this test will fail. Next, a similar test for a TextField widget is done. The final test uses an invalid string and confirms that None is returned.

A drawback of this particular implementation is that we have hard-coded a dependency on each derived class of Widget in Widget itself, which breaks the Open-Closed Principle (see [Meyer1997] and [Martin2003]). A better implementation would use a factory design pattern from [GOF1995]. Nevertheless, the example illustrates how an apply method can be used as a real factory.

There is no requirement for apply in an object to be used as a factory, nor is there any restriction on the argument list or return type. However, because it is so common to use apply in an object as a factory, use caution when using apply for other purposes, as it could confuse users. That said, there are good counterexamples, such as the use of apply in Domain-Specific Languages (see Chapter 11, Domain-Specific Languages in Scala).

The factory convention is less commonly used for apply defined in classes. For example, in the Scala standard library, Array.apply(i: int) returns the element at index i in the array. Many of the other collections use apply in a similar way. So, users can write code like the following.

val a = Array(1,2,3,4)
println(a(2))  // => 3

Finally, as a reminder, although apply is handled specially by the compiler, it is otherwise no different than any other method. You can overload it, you can invoke it directly, etc.
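
For instance, using the Pair companion shown earlier, the following two lines are equivalent; the first invokes apply implicitly, the second explicitly.

val p1 = Pair(1, "one")          // implicit invocation of Pair.apply
val p2 = Pair.apply(2, "two")    // explicit invocation of the same method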

Unapply

The name unapply suggests that it performs the “opposite” operation from apply. Indeed, it is used to extract the constituent parts of an instance. Pattern matching uses this feature extensively. Hence, unapply is often defined in companion objects and used to extract the field values from instances of the corresponding companion types. For this reason, unapply methods are called extractors.

Here is an expanded button.scala with a Button object that defines an unapply extractor method.

// code-examples/AdvOOP/objects/button.scala

package objects
import ui3.Clickable

class Button(val label: String) extends Widget with Clickable {

  def click() = {
    // Logic to give the appearance of clicking a button...
  }

  def draw() = {
    // Logic to draw the button on the display, web page, etc.
  }

  override def toString() = "(button: label="+label+", "+super.toString()+")"
}

object Button {
  def unapply(button: Button) = Some(button.label)
}

Button.unapply takes a single Button argument and returns a Some wrapping the label value. This demonstrates the protocol for unapply methods. They return a Some wrapping the extracted fields. (We’ll see how to handle more than one field in a moment.)

Here is a “specs” object that exercises Button.unapply.

// code-examples/AdvOOP/objects/button-unapply-spec.scala

package objects
import org.specs._

object ButtonUnapplySpec extends Specification {
  "Button.unapply" should {
    "match a Button object" in {
      val b = new Button("click me")
      b match {
        case Button(label) => label mustEqual "click me"
        case _ => fail()
      }
    }
    "match a RadioButton object" in {
      val b = new RadioButton(false, "click me")
      b match {
        case Button(label) => label mustEqual "click me"
        case _ => fail()
      }
    }
    "not match a non-Button object" in {
      val tf = new TextField("hello world!")
      tf match {
        case Button(label) => fail()
        case x => x must notBeNull // hack to make Specs not ignore this test.
      }
    }
    "extract the Button's label" in {
      val b = new Button("click me")
      b match {
        case Button(label) => label mustEqual "click me"
        case _ => fail()
      }
    }
    "extract the RadioButton's label" in {
      val rb = new RadioButton(false, "click me, too")
      rb match {
        case Button(label) => label mustEqual "click me, too"
        case _ => fail()
      }
    }
  }
}

The first three examples (in clauses) confirm that Button.unapply is only called for actual Button instances or instances of derived classes, like RadioButton.

Since unapply takes a Button argument (in this case), the Scala runtime type checks the instance being matched. It then looks for a companion object with an unapply method and invokes that method, passing the instance. The default case clause case _ is invoked for the instances that don’t type check as compatible. The pattern matching process is fully type safe.
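
Roughly speaking, and glossing over some details, a case clause like case Button(label) behaves like an explicit call to the extractor, as in this sketch (using the Button class and companion just shown).

val b = new Button("click me")

// The pattern  case Button(label)  behaves approximately like this:
val extracted: Option[String] = Button.unapply(b)
extracted match {
  case Some(label) => println("matched, label = " + label)
  case None        => println("no match")
}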

The remaining examples (in clauses) confirm that the correct values for the label are extracted. The Scala runtime automatically extracts the item in the Some.

What about extracting multiple fields? For a fixed set of known fields, a Some wrapping a Tuple is returned, as shown in this updated version of RadioButton.

// code-examples/AdvOOP/objects/radio-button.scala

package objects

/**
 * Button with two states, on or off, like an old-style,
 * channel-selection button on a radio.
 */
class RadioButton(val on: Boolean, label: String) extends Button(label)

object RadioButton {
  def unapply(button: RadioButton) = Some((button.on, button.label))
                 // equivalent to: = Some(Pair(button.on, button.label))
}

A Some wrapping a Pair(button.on, button.label) is returned. As we discuss in the section called “The Predef Object” in Chapter 7, The Scala Object System, Pair is a type defined to be equal to Tuple2. Here is the corresponding “specs” object that tests it.

// code-examples/AdvOOP/objects/radio-button-unapply-spec.scala

package objects
import org.specs._

object RadioButtonUnapplySpec extends Specification {
  "RadioButton.unapply" should {
    "should match a RadioButton object" in {
      val b = new RadioButton(true, "click me")
      b match {
        case RadioButton(on, label) => label mustEqual "click me"
        case _ => fail()
      }
    }
    "not match a Button (parent class) object" in {
      val b = new Button("click me")
      b match {
        case RadioButton(on, label) => fail()
        case x => x must notBeNull
      }
    }
    "not match a non-RadioButton object" in {
      val tf = new TextField("hello world!")
      tf match {
        case RadioButton(on, label) => fail()
        case x => x must notBeNull
      }
    }
    "extract the RadioButton's on/off state and label" in {
      val b = new RadioButton(true, "click me")
      b match {
        case RadioButton(on, label) => {
          label mustEqual "click me"
          on    mustEqual true
        }
        case _ => fail()
      }
    }
  }
}

Apply and UnapplySeq for Collections

What if you want to build a collection from a variable argument list passed to apply? What if you want to extract the first few elements from a collection and you don’t care about the rest of it?

In this case, you define apply and unapplySeq (“unapply sequence”) methods. Here are those methods from Scala’s own List class.

def apply[A](xs: A*): List[A] = xs.toList

def unapplySeq[A](x: List[A]): Some[List[A]] = Some(x)

The [A] type parameterization on these methods allows the List object, which is not itself parameterized, to construct a new List[A]. (See the section called “Understanding Parameterized Types” in Chapter 12, The Scala Type System for more details.) Most of the time, the type parameter will be inferred from the context.

The parameter list xs: A* is a variable argument list. Callers of apply can pass as many A instances as they want, including none. Internally, the variable arguments are stored in an Array[A], which inherits from Iterable the toList method used here.

Tip

This is a handy idiom for API writers. Accepting variable arguments to a function can be convenient for users and converting the arguments to a List is often ideal for internal management.
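
Here is a minimal sketch of the idiom (hypothetical Playlist object, not from the book’s examples): callers pass as many arguments as they like, and the implementation immediately converts them to a List.

object Playlist {
  def apply(songs: String*): List[String] = songs.toList
}

val favorites = Playlist("Song A", "Song B", "Song C")
println(favorites)   // => List(Song A, Song B, Song C)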

Here is an example script that uses List.apply implicitly.

// code-examples/AdvOOP/objects/list-apply-example-script.scala

val list1 = List()
val list2 = List(1, 2.2, "three", 'four)
val list3 = List("1", "2.2", "three", "four")
println("1: "+list1)
println("2: "+list2)
println("3: "+list3)

The 'four is a symbol, essentially an interned string. Symbols are more commonly used in Ruby, for example, where the same symbol would be written as :four. Symbols are useful for representing identities consistently.

This script yields the following output.

1: List()
2: List(1, 2.2, three, 'four)
3: List(1, 2.2, three, four)

The unapplySeq method is trivial; it returns the input list wrapped in a Some. However, this is sufficient for pattern matching as shown in this example.

// code-examples/AdvOOP/objects/list-unapply-example-script.scala

val list = List(1, 2.2, "three", 'four)
list match {
  case List(x, y, _*) => println("x = "+x+", y = "+y)
  case _ => throw new Exception("No match! "+list)
}

The List(x, y, _*) syntax means we will only match on a list with at least two elements and the first two elements will be assigned to x and y. We don’t care about the rest of the list. The _* matches zero or more remaining elements.

The output is the following.

x = 1, y = 2.2

We’ll have much more to say about List and pattern matching in the section called “Lists in Functional Programming” in Chapter 8, Functional Programming in Scala.

Companion Objects and Java Static Methods

There is one more thing to know about companion objects. Whenever you define a main method to use as the entry point for an application, Scala requires you to put it in an object. However, at the time of this writing, main methods cannot be defined in a companion object. Because of implementation details in the generated code, the JVM won’t find the main method. This issue may be resolved in a future release. For now, you must define any main method in a singleton object (i.e., a “non-companion” object) [ScalaTips]. Consider the following example of a simple Person class and companion object that attempts to define main.

// code-examples/AdvOOP/objects/person.scala

package objects

class Person(val name: String, val age: Int) {
  override def toString = "name: " + name + ", age: " + age
}

object Person {
  def apply(name: String, age: Int) = new Person(name, age)
  def unapply(person: Person) = Some((person.name, person.age))

  def main(args: Array[String]) = {
    // Test the constructor...
    val person = new Person("Buck Trends", 18)
    assert(person.name == "Buck Trends")
    assert(person.age  == 21)
  }
}

object PersonTest {
  def main(args: Array[String]) = Person.main(args)
}

This code compiles fine, but if you attempt to invoke Person.main, using scala -cp ... objects.Person, you get the following error.

java.lang.NoSuchMethodException: objects.Person.main([Ljava.lang.String;)

The objects/Person.class file exists. If you decompile it with javap -classpath ... objects.Person (see the section called “The scalap, javap, and jad Command Line Tools” in Chapter 14, Scala Tools, Libraries and IDE Support), you can see that it doesn’t contain a main method. If you decompile objects/Person$.class, the file for the companion object’s byte code, it has a main method, but notice that it isn’t declared static. So, attempting to invoke scala -cp ... objects.Person$ also fails to find the “static” main.

java.lang.NoSuchMethodException: objects.Person$.main is not static

The separate singleton object PersonTest defined in this example has to be used. Decompiling it with javap -classpath ... objects.PersonTest shows that it has a static main method. If you invoke it using scala -cp ... objects.PersonTest, the PersonTest.main method is invoked, which in turn invokes Person.main. You get an assertion error from the second call to assert, which is intentional.

java.lang.AssertionError: assertion failed
    at scala.Predef$.assert(Predef.scala:87)
    at objects.Person$.main(person.scala:15)
    at objects.PersonTest$.main(person.scala:20)
    at objects.PersonTest.main(person.scala)
    ....

In fact, this is a general issue with methods defined in companion objects that need to be visible to Java code as static methods. They aren’t static in the byte code. You have to put these methods in singleton objects instead. Consider the following Java class that attempts to create a user with Person.apply.

// code-examples/AdvOOP/objects/PersonUserWontCompile.java
// WON'T COMPILE

package objects;

public class PersonUserWontCompile {
  public static void main(String[] args) {
    Person buck = Person.apply("Buck Trends", 100);  // ERROR
    System.out.println(buck);
  }
}

If we compile it (after compiling Person.scala), we get the following error.

$ javac -classpath ... objects/PersonUserWontCompile.java
objects/PersonUserWontCompile.java:5: cannot find symbol
symbol  : method apply(java.lang.String,int)
location: class objects.Person
        Person buck = Person.apply("Buck Trends", 100);
                            ^
1 error

However, we can use the following singleton object.

// code-examples/AdvOOP/objects/person-factory.scala

package objects

object PersonFactory {
  def make(name: String, age: Int) = new Person(name, age)
}

Now the following Java class will compile.

// code-examples/AdvOOP/objects/PersonUser.java

package objects;

public class PersonUser {
  public static void main(String[] args) {
    // The following line won't compile.
    // Person buck = Person.apply("Buck Trends", 100);
    Person buck = PersonFactory.make("Buck Trends", 100);
    System.out.println(buck);
  }
}

Warning

Do not define main or any other method in a companion object that needs to be visible to Java code as a static method. Define it in a singleton object, instead.

If you have no other choice but to call a method in a companion object from Java, you can explicitly create an instance of the object with new, since the object is a “regular” Java class in the byte code, and call the method on the instance.

Case Classes

In the section called “Matching on Case Classes” in Chapter 3, Rounding Out the Essentials, we briefly introduced you to case classes. Case classes have several useful features, but also some drawbacks.

Let’s rewrite the Shape example we used in the section called “A Taste of Concurrency” in Chapter 1, Zero to Sixty: Introducing Scala to use case classes. Here is the original implementation.

// code-examples/IntroducingScala/shapes.scala

package shapes {
  class Point(val x: Double, val y: Double) {
    override def toString() = "Point(" + x + "," + y + ")"
  }

  abstract class Shape() {
    def draw(): Unit
  }

  class Circle(val center: Point, val radius: Double) extends Shape {
    def draw() = println("Circle.draw: " + this)
    override def toString() = "Circle(" + center + "," + radius + ")"
  }

  class Rectangle(val lowerLeft: Point, val height: Double, val width: Double)
        extends Shape {
    def draw() = println("Rectangle.draw: " + this)
    override def toString() =
      "Rectangle(" + lowerLeft + "," + height + "," + width + ")"
  }

  class Triangle(val point1: Point, val point2: Point, val point3: Point)
        extends Shape {
    def draw() = println("Triangle.draw: " + this)
    override def toString() =
      "Triangle(" + point1 + "," + point2 + "," + point3 + ")"
  }
}

Here is the example rewritten using the case keyword.

// code-examples/AdvOOP/shapes/shapes-case.scala

package shapes {
  case class Point(x: Double, y: Double)

  abstract class Shape() {
    def draw(): Unit
  }

  case class Circle(center: Point, radius: Double) extends Shape() {
    def draw() = println("Circle.draw: " + this)
  }

  case class Rectangle(lowerLeft: Point, height: Double, width: Double)
      extends Shape() {
    def draw() = println("Rectangle.draw: " + this)
  }

  case class Triangle(point1: Point, point2: Point, point3: Point)
      extends Shape() {
    def draw() = println("Triangle.draw: " + this)
  }
}

Adding the case keyword causes the compiler to add a number of useful features automatically. The keyword suggests an association with case expressions in pattern matching. Indeed, they are particularly well suited for that application, as we will see.

First, the compiler automatically converts the constructor arguments into immutable fields (vals). The val keyword is optional. If you want mutable fields, use the var keyword. So, our constructor argument lists are now shorter.

Second, the compiler automatically adds equals, hashCode, and toString methods to the class, implemented in terms of the fields specified as constructor arguments. So, we no longer need our own toString methods. In fact, the generated toString methods produce the same outputs as the ones we implemented ourselves. Also, the body of Point is gone because there are no methods that we need to define!

The following script uses these methods that are now in the shapes.

// code-examples/AdvOOP/shapes/shapes-usage-example1-script.scala

import shapes._

val shapesList = List(
  Circle(Point(0.0, 0.0), 1.0),
  Circle(Point(5.0, 2.0), 3.0),
  Rectangle(Point(0.0, 0.0), 2, 5),
  Rectangle(Point(-2.0, -1.0), 4, 3),
  Triangle(Point(0.0, 0.0), Point(1.0, 0.0), Point(0.0, 1.0)))

val shape1 = shapesList.head  // grab the first one.
println("shape1: "+shape1+". hash = "+shape1.hashCode)
for (shape2 <- shapesList) {
  println("shape2: "+shape2+". 1 == 2 ? "+(shape1 == shape2))
}

This script outputs the following.

shape1: Circle(Point(0.0,0.0),1.0). hash = 2061963534
shape2: Circle(Point(0.0,0.0),1.0). 1 == 2 ? true
shape2: Circle(Point(5.0,2.0),3.0). 1 == 2 ? false
shape2: Rectangle(Point(0.0,0.0),2.0,5.0). 1 == 2 ? false
shape2: Rectangle(Point(-2.0,-1.0),4.0,3.0). 1 == 2 ? false
shape2: Triangle(Point(0.0,0.0),Point(1.0,0.0),Point(0.0,1.0)). 1 == 2 ? false

As we’ll see in the section called “Equality of Objects” below, the == method actually invokes the equals method.

Even outside of case expressions, automatic generation of these three methods is very convenient for simple, “structural” classes, i.e., classes that contain relatively simple fields and behaviors.

Third, when the case keyword is used, the compiler automatically creates a companion object with an apply factory method that takes the same arguments as the primary constructor. The previous example used the appropriate apply methods to create the Points, the different Shapes, and also the List itself. That’s why we don’t need new; we’re actually calling apply(x,y) in the Point companion object, for example.

Note

You can have secondary constructors in case classes, but there will be no overloaded apply method generated that has the same argument list. You’ll have to use new to create instances with those constructors.
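For example, consider this hypothetical sketch (not one of the book's code examples) of a case class with an auxiliary constructor.

// Hypothetical sketch: a case class with an auxiliary constructor.
case class Point3D(x: Double, y: Double, z: Double) {
  def this(x: Double, y: Double) = this(x, y, 0.0)  // auxiliary constructor
}

val p1 = Point3D(1.0, 2.0, 3.0)  // the generated apply matches the primary constructor
val p2 = new Point3D(1.0, 2.0)   // no matching apply is generated; new is required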

The companion object also gets an unapply extractor method, which extracts all the fields of an instance in an elegant fashion. The following script demonstrates the extractors in pattern matching case statements.

// code-examples/AdvOOP/shapes/shapes-usage-example2-script.scala

import shapes._

val shapesList = List(
  Circle(Point(0.0, 0.0), 1.0),
  Circle(Point(5.0, 2.0), 3.0),
  Rectangle(Point(0.0, 0.0), 2, 5),
  Rectangle(Point(-2.0, -1.0), 4, 3),
  Triangle(Point(0.0, 0.0), Point(1.0, 0.0), Point(0.0, 1.0)))

def matchOn(shape: Shape) = shape match {
  case Circle(center, radius) =>
    println("Circle: center = "+center+", radius = "+radius)
  case Rectangle(ll, h, w) =>
    println("Rectangle: lower-left = "+ll+", height = "+h+", width = "+w)
  case Triangle(p1, p2, p3) =>
    println("Triangle: point1 = "+p1+", point2 = "+p2+", point3 = "+p3)
  case _ =>
    println("Unknown shape!"+shape)
}

shapesList.foreach { shape => matchOn(shape) }

This script outputs the following.

Circle: center = Point(0.0,0.0), radius = 1.0
Circle: center = Point(5.0,2.0), radius = 3.0
Rectangle: lower-left = Point(0.0,0.0), height = 2.0, width = 5.0
Rectangle: lower-left = Point(-2.0,-1.0), height = 4.0, width = 3.0
Triangle: point1 = Point(0.0,0.0), point2 = Point(1.0,0.0), point3 = Point(0.0,1.0)

Syntactic Sugar for Binary Operations

By the way, remember in the section called “Matching on Sequences” in Chapter 3, Rounding Out the Essentials when we discussed matching on lists? We wrote this case expression.

def processList(l: List[Any]): Unit = l match {
  case head :: tail => ...
  ...
}

It turns out that the following expressions are identical.

  case head :: tail => ...
  case ::(head, tail) => ...

We are using the companion object for the case class named ::, which is used for non-empty lists. When used in case expressions, the compiler supports this special infix operator notation for invocations of unapply.

It works not only for unapply methods with two arguments, but for unapply methods with any number of arguments (one or more). We could rewrite our matchOn method above this way.

def matchOn(shape: Shape) = shape match {
  case center Circle radius => ...
  case ll Rectangle (h, w) => ...
  case p1 Triangle (p2, p3) => ...
  case _ => ...
}

For an unapply that takes one argument, you would have to insert an empty set of parentheses to avoid a parsing ambiguity.

  case arg Foo () => ...

From the point of view of clarity, this syntax is elegant for some cases when there are two arguments. For lists, head :: tail matches the expressions for building up lists, so there is a beautiful symmetry when the extraction process uses the same syntax. However, the merits of this syntax are less clear for other examples, especially when there are N != 2 arguments.

The copy Method in Scala Version 2.8

In Scala version 2.8, another instance method is automatically generated, called copy. This method is useful when you want to make a new instance of a case class that is identical to another instance with a few fields changed. Consider the following example script.

// code-examples/AdvOOP/shapes/shapes-usage-example3-v28-script.scala
// Scala version 2.8 only.

import shapes._

val circle1 = Circle(Point(0.0, 0.0), 2.0)
val circle2 = circle1 copy (radius = 4.0)

println(circle1)
println(circle2)

The second circle is created by copying the first and specifying a new radius. The copy method implementation that is generated by the compiler exploits the new named and default parameters in Scala version 2.8, which we discussed in the section called “Method Default and Named Arguments (Scala Version 2.8)” in Chapter 2, Type Less, Do More. The generated implementation of Circle.copy looks roughly like the following.

case class Circle(center: Point, radius: Double) extends Shape() {
  ...
  def copy(center: Point = this.center, radius: Double = this.radius) =
    new Circle(center, radius)
}

So, default values are provided for all the arguments to the method (only two in this case). When using the copy method, the user only specifies by name the fields that are changing. The values for the rest of the fields are used without having to reference them explicitly.
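Any subset of the fields can be overridden by name in a single call. Continuing the example (the additional values here are just for illustration):

val circle3 = circle1 copy (center = Point(1.0, 1.0), radius = 8.0)
val circle4 = circle1.copy()   // an identical copy; all fields take their default values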

Case Class Inheritance

Did you notice that the new Shapes code in the section called “Case Classes” did not put the case keyword on the abstract Shape class? This is allowed by the compiler, but there are reasons for not having one case class inherit another. First, it can complicate field initialization. Suppose we make Shape a case class. Suppose we want to add a string field to all shapes representing an "id" that the user wants to set. It makes sense to define this field in Shape. Let’s make these two changes to Shape.

abstract case class Shape(id: String) {
  def draw(): Unit
}

Now the derived shapes need to pass the id to the Shape constructor. For example, Circle would become the following.

case class Circle(id: String, center: Point, radius: Double) extends Shape(id){
  def draw() = println("Circle.draw: " + this)
}

However, if you compile this code, you’ll get errors like the following.

... error: error overriding value id in class Shape of type String;
 value id needs `override' modifier
  case class Circle(id: String, center: Point, radius: Double) extends Shape(id){
                    ^

Remember that both definitions of id, the one in Shape and the one in Circle, are considered val field definitions! The error message tells us the answer: use the override keyword, as we discussed in the section called “Overriding Members of Classes and Traits”. So, the complete set of required modifications is as follows.

// code-examples/AdvOOP/shapes/shapes-case-id.scala

package shapesid {
  case class Point(x: Double, y: Double)

  abstract case class Shape(id: String) {
    def draw(): Unit
  }

  case class Circle(override val id: String, center: Point, radius: Double)
        extends Shape(id) {
    def draw() = println("Circle.draw: " + this)
  }

  case class Rectangle(override val id: String, lowerLeft: Point,
        height: Double, width: Double) extends Shape(id) {
    def draw() = println("Rectangle.draw: " + this)
  }

  case class Triangle(override val id: String, point1: Point,
        point2: Point, point3: Point) extends Shape(id) {
    def draw() = println("Triangle.draw: " + this)
  }
}

Note that we also have to add the val keywords. This works, but it is somewhat ugly.

A more ominous problem involves the generated equals methods. Under inheritance, the equals methods don’t obey all the standard rules for robust object equality. We’ll discuss those rules below in the section called “Equality of Objects”. For now, consider the following example.

// code-examples/AdvOOP/shapes/shapes-case-equals-ambiguity-script.scala

import shapesid._

case class FancyCircle(name: String, override val id: String,
    override val center: Point, override val radius: Double)
      extends Circle(id, center, radius) {
  override def draw() = println("FancyCircle.draw: " + this)
}

val fc = FancyCircle("me", "circle", Point(0.0,0.0), 10.0)
val c  = Circle("circle", Point(0.0,0.0), 10.0)
format("FancyCircle == Circle? %b\n", (fc == c))
format("Circle == FancyCircle? %b\n", (c  == fc))

If you run this script, you get the following output.

FancyCircle == Circle? false
Circle == FancyCircle? true

So, Circle.equals evaluates to true when given a FancyCircle with the same values for the Circle fields. The reverse case isn’t true. While you might argue that, as far as Circle is concerned, they really are equal, most people would argue that this is a risky, “relaxed” interpretation of equality. It’s true that a future version of Scala could generate equals methods for case classes that do exact type-equality checking.

So, the conveniences provided by case classes sometimes lead to problems. It is best to avoid inheritance of one case class by another. Note that it’s fine for a case class to inherit from a non-case class or trait. It’s also fine for a non-case class or trait to inherit from a case class.

Because of these issues, it is possible that case class inheritance will be deprecated and removed in future versions of Scala.

Warning

Avoid inheriting a case class from another case class.
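In contrast, mixing an ordinary trait into a case class raises none of these issues. Here is a small hypothetical sketch (the HasArea trait and Square class are not part of the shapes examples).

trait HasArea {
  def area: Double
}

case class Square(side: Double) extends HasArea {
  def area = side * side
}

println(Square(3.0).area)   // 9.0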

Equality of Objects

Implementing a reliable equality test for instances is difficult to do correctly. Effective Java [Bloch2008] and the Scaladoc page for AnyRef.equals describe the requirements for a good equality test. A very good description of the techniques for writing correct equals and hashCode methods can be found in [Odersky2009], which uses Java syntax, but is adapted from chapter 28 of Programming in Scala [Odersky2008]. Consult these references when you need to implement your own equals and hashCode methods. Recall that these methods are created automatically for case classes.

Here we focus on the different equality methods available in Scala and their meanings. There are some slight inconsistencies between the Scala specification [ScalaSpec2009] and the Scaladoc pages for the equality-related methods for Any and AnyRef, but the general behavior is clear.

Caution

Some of the equality methods have the same names as equality methods in other languages, but the semantics are sometimes different!

The equals Method

The equals method tests for value equality. That is, obj1 equals obj2 is true if both obj1 and obj2 have the same value. They do not need to refer to the same instance.

Hence, equals behaves like the equals method in Java and the eql? method in Ruby.

The == and != Methods

While == is an operator in many languages, it is a method in Scala, defined as final in Any. It tests for value equality, like equals. That is, obj1 == obj2 is true if both obj1 and obj2 have the same value. In fact, == delegates to equals. Here is part of the Scaladoc entry for Any.==.

o == arg0 is the same as o.equals(arg0).

Here is the corresponding part of the Scaladoc entry for AnyRef.==.

o == arg0 is the same as if (o eq null) arg0 eq null else o.equals(arg0).

As you would expect != is the negation, i.e., it is equivalent to !(obj1 == obj2).

Since == and != are declared final in Any, you can’t override them, but you don’t need to, since they delegate to equals.

Note

In Java, C++, and C# the == operator tests for reference, not value equality. In contrast, Ruby’s == operator tests for value equality. Whatever language you’re used to, make sure to remember that in Scala, == is testing for value equality.

The ne and eq Methods

The eq method tests for reference equality. That is, obj1 eq obj2 is true if both obj1 and obj2 point to the same location in memory. These methods are defined only for AnyRef.

Hence, eq behaves like the == operator in Java, C++, and C#, but not == in Ruby.

The ne method is the negation of eq, i.e., it is equivalent to !(obj1 eq obj2).
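Here is a small sketch contrasting the value-equality and reference-equality methods, using the Point case class from the earlier shapes examples (assuming import shapes._ is in scope).

val p1 = Point(1.0, 2.0)
val p2 = Point(1.0, 2.0)

println(p1 == p2)       // true:  same value
println(p1 equals p2)   // true:  == delegates to equals
println(p1 eq p2)       // false: distinct instances
println(p1 ne p2)       // true:  negation of eq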

Array Equality and the sameElements Method

Comparing the contents of two Arrays doesn’t have an obvious result in Scala.

scala> Array(1, 2) == Array(1, 2)
res0: Boolean = false

That’s a surprise! Thankfully, there’s a simple solution in the form of the sameElements method.

scala> Array(1, 2).sameElements(Array(1, 2))
res1: Boolean = true

Much better. Remember to use sameElements when you want to test if two Arrays contain the same elements.

While this may seem like an inconsistency, encouraging an explicit test of the equality of two mutable data structures is a conservative approach on the part of the language designers. In the long run, it should save you from unexpected results in your conditionals.
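For comparison, the immutable collections, such as List, do compare by value (continuing the same interpreter session).

scala> List(1, 2) == List(1, 2)
res2: Boolean = true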

Recap and What’s Next

We explored the fine points of overriding members in derived classes. We learned about object equality, case classes, and companion classes and objects.

In the next chapter, we’ll learn about the Scala type hierarchy, in particular, the Predef object that includes many useful definitions. We’ll also learn about Scala’s alternative to Java’s static class members and the linearization rules for method lookup.


Chapter 7. The Scala Object System

The Predef Object

For your convenience, whenever you compile code, the Scala compiler automatically imports the definitions in the java.lang package (javac does this, too). On the .NET platform, it imports the system package. The compiler also imports the definitions in the analogous Scala package, scala. Hence, common Java or .NET types can be used without explicitly importing them or fully qualifying them with the java.lang. prefix, in the Java case. Similarly, a number of common, Scala-specific types are made available without qualification, such as String. Where there are Java and Scala type names that overlap, like List, the Scala version is imported last, so it “wins”.

The compiler also automatically imports the Predef object, which defines or imports several useful types, objects, and functions.

Tip

You can learn a lot of Scala by viewing the source for Predef. It is available by clicking the “source” link in the Predef Scaladoc page or you can download the full source code for Scala at [Scala].

Here is a partial list of the items imported or defined by Predef on the Java platform.

Table 7.1. Items Imported or Defined by Predef.

Types: Character, Class, Error, Function, Integer, Map, Pair, Runnable, Set, String, Throwable, Triple.

Exceptions: Exception, ArrayIndexOutOfBoundsException, ClassCastException, IllegalArgumentException, IndexOutOfBoundsException, NoSuchElementException, NullPointerException, NumberFormatException, RuntimeException, StringIndexOutOfBoundsException, UnsupportedOperationException.

Values: Map, Set.

Objects: Pair, Triple.

Classes: Ensuring, ArrowAssoc.

Methods: Factory methods to create tuples; overloaded versions of exit, error, assert, assume, and require; implicit type conversion methods; I/O methods like readLine, println, and format; and a method currentThread, which calls java.lang.Thread.currentThread.


Predef declares the types and exceptions listed in the table using the type keyword. They are definitions that equal the corresponding scala.<Type> or java.lang.<Type> classes, so they behave like “aliases” or imports for the corresponding classes. For example, String is declared as follows.

type String = java.lang.String

In this case, the declaration has the same net effect as an import java.lang.String statement would have.

But didn’t we just say that definitions in java.lang are imported automatically, like String? The reason there is a type definition is to enable support for a uniform string type across all runtime environments. The definition is only redundant on the JVM.

The type Pair is an “alias” for Tuple2.

type Pair[+A, +B] = Tuple2[A, B]

There are two type parameters, A and B, one for each item in the pair. Recall from the section called “Abstract Types And Parameterized Types” in Chapter 2, Type Less, Do More that we explained the meaning of the ‘+’ in front of each type parameter.

Briefly, a Pair[A2,B2], for some A2 and B2, is a subclass of Pair[A1,B1], for some A1 and B1, if A2 is a subtype of A1 and B2 is a subtype of B1. In the section called “Understanding Parameterized Types” in Chapter 12, The Scala Type System, we’ll discuss ‘+’ and other type qualifiers in more detail.

The Pair class also has a companion object Pair with an apply factory method, as discussed in the section called “Companion Objects” previously. Hence, we can create Pair instances as in this example:

val p = Pair(1, "one")

Pair.apply is called with the two arguments. The types A and B, shown in the definition of Pair, are inferred. A new Tuple2 instance is returned.

Map and Set appear in both the types and values lists. In the values list, they are assigned the companion objects scala.collection.immutable.Map and scala.collection.immutable.Set, respectively. Hence, Map and Set in Predef are values, not object definitions, because they refer to objects defined elsewhere, whereas Pair and Triple are defined in Predef itself. The types Map and Set are assigned the corresponding immutable classes.

The ArrowAssoc class defines two methods, -> and the Unicode equivalent →. The utility of these methods was demonstrated previously in the section called “Option, Some, and None: Avoiding nulls”, where we created a map of U.S. state capitals.

val stateCapitals = Map(
  "Alabama" -> "Montgomery",
  "Alaska"  -> "Juneau",
  // ...
  "Wyoming" -> "Cheyenne")
// ...

The definition of the ArrowAssoc class and the Map and Set values in Predef make the convenient Map initialization syntax possible. First, when Scala sees Map(…) it calls the apply method on the Map companion object, just as we discussed for Pair.

Map.apply expects zero or more Pairs (e.g., (a1, b1), (a2, b2), …), where each tuple holds a name and value. In the example, the tuple types are all inferred to be of type Pair[String,String]. The declaration of Map.apply is as follows.

object Map {
  ...
  def apply[A, B](elems : (A, B)*) : Map[A, B] = ...
}

Recall that there can be no type parameters on the Map companion object, because there can be only one instance. However, apply can have type parameters.

The apply method takes a variable-length argument list. Internally, an argument list declared with a type of the form X* is a subtype of Array[X]. So, for Map.apply, elems is of type Array[(A,B)], or Array[Tuple2[A,B]] if you prefer.

So, now that we know what Map.apply expects, how do we get from a -> b to (a, b)?

Predef also defines an implicit type conversion method called any2ArrowAssoc. The compiler knows that String does not define a -> method, so it looks for an implicit conversion in scope to a type that defines such a method, such as ArrowAssoc. The any2ArrowAssoc method performs that conversion. It has the following implementation.

implicit def any2ArrowAssoc[A](x: A): ArrowAssoc[A] = new ArrowAssoc(x)

It is applied to each item to the left of an arrow ->, e.g., the "Alabama" string. These strings are wrapped in ArrowAssoc instances, upon which the -> method is then invoked. This method has the following implementation.

class ArrowAssoc[A](x: A) {
    ...
    def -> [B](y: B): Tuple2[A, B] = Tuple2(x, y)
}

When it is invoked, it is passed the string on the right-hand side of the ->. The method returns a tuple with the value, ("Alabama", "Montgomery"), for example. In this way, each key -> value is converted into a tuple and the resulting comma-separated list of tuples is passed to the Map.apply factory method.
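In fact, a -> b is useful outside of Map initialization, too; it simply produces a tuple. A quick sketch:

val pair = "Alabama" -> "Montgomery"   // expands to any2ArrowAssoc("Alabama") -> "Montgomery"
println(pair)                          // (Alabama,Montgomery)
println(pair._1 + ", " + pair._2)      // Alabama, Montgomery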

The description may sound complicated at first, but the beauty of Scala is that this map initialization syntax is not an ad hoc language feature, such as a special-purpose operator -> defined in the language grammar. Instead, this syntax is defined with normal definitions of types and methods, combined with a few general-purpose parsing conventions, such as support for implicits. Furthermore, it is all type-safe. You can use the same techniques to write your own convenient “operators” for mini domain-specific languages (see Chapter 11, Domain-Specific Languages in Scala).

Implicit type conversions are discussed in more detail in the section called “Implicit Conversions” in Chapter 8, Functional Programming in Scala.

Next, recall from Chapter 1 that we were able to replace calls to Console.println(…) with println(…). This “bare” println method is defined in Predef, then imported automatically by the compiler. The definition calls the corresponding method in Console. Similarly, all the other I/O methods defined by Predef, e.g., readLine and format, call the corresponding Console methods.

Finally, the assert, assume, and require methods are each overloaded with various argument list options. They are used for runtime testing of boolean conditions. If a condition is false, an exception is thrown. The Ensuring class serves a similar purpose. You can use these features for Design by Contract programming, as discussed in the section called “Better Design with Design By Contract” in Chapter 13, Application Design.
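As a minimal, hypothetical sketch of these features (the addPositive method is invented for illustration), require checks a precondition and ensuring, from the Ensuring class, checks a postcondition on the result.

def addPositive(x: Int, y: Int): Int = {
  require(x > 0 && y > 0, "both arguments must be positive")  // precondition
  (x + y) ensuring (_ > x)                                    // postcondition on the result
}

println(addPositive(3, 4))   // 7
// addPositive(-1, 4)        // would throw an exception at runtime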

For the full list of features defined by Predef, see the corresponding Scaladoc entry in [ScalaAPI2008].

Classes and Objects: Where Are the Statics?

Many object-oriented languages allow classes to have class-level constants, fields, and methods, called “static” members in Java, C# and C++. These constants, fields, and methods are not associated with any instances of the class.

An example of a class-level field is a shared logging instance used by all instances of a class for logging messages. An example of a class-level constant is the default logging “threshold” level.

An example of a class-level method is a “finder” method that locates all instances of the class in some repository that match some user-specified criteria. Another example is a factory method, as used in one of the factory-related design patterns [GOF1995].

In order to remain consistent with the goal that “everything is an object” in Scala, class-level fields and methods are not supported. Instead, Scala supports declarations of classes that are singletons, using the object keyword instead of the class keyword. The objects provide an object-oriented approach to “static” data and methods. Hence, Scala does not even have a static keyword.

Objects are instantiated automatically and lazily by the runtime system (see section 5.4 of [ScalaSpec2009]). Just as for classes and traits, the body of the object is the constructor, but since the system instantiates the object, there is no way for the user to specify a parameter list for the constructor, so they aren’t supported. Any data defined in the object has to be initialized with default values. For the same reasons, auxiliary constructors can’t be used and are not supported.

We’ve already seen some examples of objects, such as the “specs” objects used previously for tests, and the Pair type and its companion object, which we explored in the section called “The Predef Object” earlier in this chapter.

type Pair[+A, +B] = Tuple2[A, B]
object Pair {
  def apply[A, B](x: A, y: B) = Tuple2(x, y)
  def unapply[A, B](x: Tuple2[A, B]): Option[Tuple2[A, B]] = Some(x)
}

To reference an object field or method, you use the syntax object_name.field or object_name.method(…), respectively. For example, Pair.apply(…). Note that this is the same syntax that is commonly used in languages with static fields and methods.

Tip

When an object named MyObject is compiled to a class file, the class file name will be MyObject$.class.

In Java and C#, the convention for defining constants is to use final static fields. (C# also has a constant keyword for simple fields, like ints and strings.) In Scala, the convention is to use val fields in objects.
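Here is a small hypothetical sketch that pulls these conventions together: a “constant” declared as a val and a class-level method, both defined in an object.

object StringUtil {
  val Empty = ""                                            // a "constant", by convention a val
  def isBlank(s: String) = (s == null) || (s.trim == Empty) // a "static"-style method
}

println(StringUtil.isBlank("  "))   // true
println(StringUtil.isBlank("x"))    // false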

Finally, recall from the section called “Nested Classes” that class definitions can be nested within other class definitions. This property generalizes for objects. You can define nested objects, traits, and classes inside other objects, traits, and classes.

Package Objects

Scala version 2.8 introduces a new scoping construct called package objects. They are used to define types, variables, and methods that are visible at the level of the corresponding package. To understand their usefulness, let’s see an example from Scala version 2.8 itself. The collection library is being reorganized to refine the package structure and to use it more consistently (among other changes). The Scala team faced a dilemma. They wanted to move types to new packages, but avoid breaking backwards compatibility. The package object construct provided a solution, along with other benefits.

For example, the immutable List is defined in the scala package in version 2.7, but it is moved to the scala.collection.immutable package in version 2.8. Despite the change, List is made visible in the scala package using package object scala, found in the src/library/scala/package.scala file in the version 2.8 source code distribution. Note the file name. It’s not required, but it’s a useful convention for package objects. Here is the full package object definition (at the time of this writing; it could change before the 2.8.0 final version).

package object scala {
  type Iterable[+A] = scala.collection.Iterable[A]
  val Iterable = scala.collection.Iterable

  @deprecated("use Iterable instead") type Collection[+A] = Iterable[A]
  @deprecated("use Iterable instead") val Collection = Iterable

  type Seq[+A] = scala.collection.Sequence[A]
  val Seq = scala.collection.Sequence

  type RandomAccessSeq[+A] = scala.collection.Vector[A]
  val RandomAccessSeq = scala.collection.Vector

  type Iterator[+A] = scala.collection.Iterator[A]
  val Iterator = scala.collection.Iterator

  type BufferedIterator[+A] = scala.collection.BufferedIterator[A]

  type List[+A] = scala.collection.immutable.List[A]
  val List = scala.collection.immutable.List

  val Nil = scala.collection.immutable.Nil

  type ::[A] = scala.collection.immutable.::[A]
  val :: = scala.collection.immutable.::

  type Stream[+A] = scala.collection.immutable.Stream[A]
  val Stream = scala.collection.immutable.Stream

  type StringBuilder = scala.collection.mutable.StringBuilder
  val StringBuilder = scala.collection.mutable.StringBuilder
}

Note that pairs of declarations like type List[+A] = … and val List = … are effectively “aliases” for the companion class and object, respectively. Because the contents of the scala package are automatically imported by the compiler, you can still reference all the definitions in this object in any scope, without an explicit import statement or fully-qualified names.

Other than the way the members in package objects are scoped, they behave just like other object declarations. While this example contains only vals and types, you can also define methods and you can subclass another class or trait and mix in other traits.

Another benefit of package objects is that they provide a more succinct implementation of what was previously an awkward idiom. Without package objects, you would have to put definitions in an ad hoc object inside the desired package, then import from the object. For example, here is how List would have to be handled without a package object:

package scala {
  object toplevel {
    ...
    type List[+A] = scala.collection.immutable.List[A]
    val List = scala.collection.immutable.List
    ...
  }
}

...
import scala.toplevel._
...

Finally, another benefit of package objects is the way they provide a clear separation between the abstractions exposed by a package and the implementations that should be hidden inside it. In a larger application, a package object could be used to expose all the public types, values, and operations (methods) for a “component”, while everything else in the package and nested packages could be treated as internal implementation details.
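For example, here is a hypothetical sketch (Scala version 2.8 only) of a package object for a “finance” component, exposing its public names at the package level. Following the convention discussed above, it would go in a compiled file such as src/com/example/finance/package.scala.

package com.example

package object finance {
  type Money = BigDecimal                     // the public "alias" for the amount type
  val DefaultCurrency = "USD"                 // a package-level constant
  def display(amount: Money): String =
    amount.toString + " " + DefaultCurrency   // a package-level method
}

// Clients simply import the package contents:
//   import com.example.finance._
//   println(display(BigDecimal("19.99")))    // 19.99 USD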

Sealed Class Hierarchies

Recall from the section called “Case Classes” in Chapter 6, Advanced Object-Oriented Programming In Scala that we demonstrated pattern matching with our Shapes hierarchy, which uses case classes. We had a default case _ => … expression. It’s usually wise to have one. Otherwise, if someone defines a new subtype of Shape and passes it to this match statement, a runtime scala.MatchError will be thrown, because the new shape won’t match the shapes covered in the match statement. However, it’s not always possible to define reasonable behavior for the default case.

There is an alternative solution if you know that the case class hierarchy is unlikely to change and you can define the whole hierarchy in one file. In this situation, you can add the sealed keyword to the declaration of the common base class. When sealed, the compiler knows all the possible classes that could appear in the match expression, because all of them must be defined in the same source file. So, if you cover all those classes in the case expressions (either explicitly or through shared parent classes), then you can safely eliminate the default case expression.

Here is an example using the HTTP 1.1 methods [HTTP1.1], which are not likely to change very often, so we declare a “sealed” set of case classes for them.

// code-examples/ObjectSystem/sealed/http-script.scala

sealed abstract class HttpMethod()
case class Connect(body: String) extends HttpMethod
case class Delete (body: String) extends HttpMethod
case class Get    (body: String) extends HttpMethod
case class Head   (body: String) extends HttpMethod
case class Options(body: String) extends HttpMethod
case class Post   (body: String) extends HttpMethod
case class Put    (body: String) extends HttpMethod
case class Trace  (body: String) extends HttpMethod

def handle (method: HttpMethod) = method match {
  case Connect (body) => println("connect: " + body)
  case Delete  (body) => println("delete: "  + body)
  case Get     (body) => println("get: "     + body)
  case Head    (body) => println("head: "    + body)
  case Options (body) => println("options: " + body)
  case Post    (body) => println("post: "    + body)
  case Put     (body) => println("put: "     + body)
  case Trace   (body) => println("trace: "   + body)
}

val methods = List(
  Connect("connect body..."),
  Delete ("delete body..."),
  Get    ("get body..."),
  Head   ("head body..."),
  Options("options body..."),
  Post   ("post body..."),
  Put    ("put body..."),
  Trace  ("trace body..."))

methods.foreach { method => handle(method) }

This script outputs the following.

connect: connect body...
delete: delete body...
get: get body...
head: head body...
options: options body...
post: post body...
put: put body...
trace: trace body...

No default case is necessary, since we cover all the possibilities. Conversely, if you omit one of the classes and you don’t provide a default case or a case for a shared parent class, the compiler warns you that the “match is not exhaustive”. For example, if you comment out the case for Put, you get this warning.

warning: match is not exhaustive!
missing combination            Put

def handle (method: HttpMethod) = method match {
...

You also get a MatchError exception if a Put instance is passed to the match.

Using sealed has one drawback. Every time you add or remove a class from the hierarchy, you have to modify the file, since the entire hierarchy has to be declared in the same file. This breaks the Open-Closed Principle ([Meyer1997] and [Martin2003]), which is a solution to the practical problem that it can be costly to modify existing code, retest it (and other code that uses it), and redeploy it. It’s much less “costly” if you can extend the system by adding new derived types in separate source files. This is why we picked the HTTP method hierarchy for the example. The list of methods is very stable.

Tip

Avoid sealed case class hierarchies if the hierarchy changes frequently (for an appropriate definition of “frequently”).

Finally, you may have noticed some duplication in the example. All the concrete classes have a body field. Why didn’t we put that field in the parent HttpMethod class? Because we decided to use case classes for the concrete classes, we’ll run into the same problem with case-class inheritance that we discussed previously in the section called “Case Class Inheritance” in Chapter 6, Advanced Object-Oriented Programming In Scala, where we added a shared id field in the Shape hierarchy. We need the body argument for each HTTP method’s constructor, yet it will be made a field of each method type automatically. So, we would have to use the override val technique we demonstrated previously.

We could remove the case keywords and implement the methods and companion objects that we need. However, in this case, the duplication is minimal and tolerable.

What if we want to use case classes, yet also reference the body field in HttpMethod? Fortunately, we know that Scala will generate a body reader method in every concrete subclass (as long as we use the name body consistently!). So, we can declare that method abstract in HttpMethod, then use it as we see fit. The following example demonstrates this technique.

// code-examples/ObjectSystem/sealed/http-body-script.scala

sealed abstract class HttpMethod() {
    def body: String
    def bodyLength = body.length
}

case class Connect(body: String) extends HttpMethod
case class Delete (body: String) extends HttpMethod
case class Get    (body: String) extends HttpMethod
case class Head   (body: String) extends HttpMethod
case class Options(body: String) extends HttpMethod
case class Post   (body: String) extends HttpMethod
case class Put    (body: String) extends HttpMethod
case class Trace  (body: String) extends HttpMethod

def handle (method: HttpMethod) = method match {
  case Connect (body) => println("connect: " + body)
  case Delete  (body) => println("delete: "  + body)
  case Get     (body) => println("get: "     + body)
  case Head    (body) => println("head: "    + body)
  case Options (body) => println("options: " + body)
  case Post    (body) => println("post: "    + body)
  case Put     (body) => println("put: "     + body)
  case Trace   (body) => println("trace: "   + body)
}

val methods = List(
  Connect("connect body..."),
  Delete ("delete body..."),
  Get    ("get body..."),
  Head   ("head body..."),
  Options("options body..."),
  Post   ("post body..."),
  Put    ("put body..."),
  Trace  ("trace body..."))

methods.foreach { method =>
  handle(method)
  println("body length? " + method.bodyLength)
}

We declared body abstract in HttpMethod. We added a simple bodyLength method that calls body. The loop at the end of the script calls bodyLength. Running this script produces the following output.

connect: connect body...
body length? 15
delete: delete body...
body length? 14
get: get body...
body length? 11
head: head body...
body length? 12
options: options body...
body length? 15
post: post body...
body length? 12
put: put body...
body length? 11
trace: trace body...
body length? 13

As always, every feature has pluses and minuses. Case classes and sealed class hierarchies have very useful properties, but they aren’t suitable for all situations.

The Scala Type Hierarchy

We have mentioned a number of types in Scala’s type hierarchy already. Let’s look at the general structure of the hierarchy, as illustrated in Figure 7.1, “Scala’s type hierarchy.”.

Figure 7.1. Scala’s type hierarchy.

images/TypeHierarchy.png

The following tables discuss the types shown in Figure 7.1, “Scala’s type hierarchy.”, as well as some other important types that aren’t shown. Some details are omitted for clarity. When the underlying “runtime” is discussed, the points made apply equally to the JVM and the .NET CLR, except where noted.

Table 7.2. Any, AnyVal, and AnyRef.

Any (parent: none): The root of the hierarchy. Defines a few final methods like ==, !=, isInstanceOf[T] (for type checking), and asInstanceOf[T] (for type casting), as well as default versions of equals, hashCode, and toString, which are designed to be overridden by subclasses.

AnyVal (parent: Any): The parent of all value types, which correspond to the primitive types on the runtime platform, plus Unit. All the AnyVal instances are immutable value instances and all the AnyVal types are abstract final. Hence, none of them can be instantiated with new. Rather, new instances are created with literal values (e.g., 3.14 for a Double) or by calling methods on instances that return new values.

AnyRef (parent: Any): The parent of all reference types, including all java.* and scala.* types. It is equivalent to java.lang.Object for the JVM and object (System.Object) for the .NET runtime. Instances of reference types are created with new.


The value types are children of AnyVal.

Table 7.3. Direct subtypes of AnyVal, the value types (Scala type: runtime primitive type).

Boolean: boolean (true and false).
Byte: byte.
Char: char.
Short: short.
Int: int.
Long: long.
Float: float.
Double: double.
Unit: Serves the same role as void in most imperative languages. Used primarily as a function return value. There is only one instance of Unit, named (). Think of it as a tuple with zero items.


All other types, the reference types, are children of AnyRef. Here are some of the more commonly-used reference types. Note that there are some significant differences between the version 2.7.X and 2.8 collections.

Table 7.4. Direct and indirect subtypes of AnyRef, the reference types.

Collection[+T] (parent: Iterable[T]): Trait for collections of known size.

Either[+T1, +T2] (parent: AnyRef): Used most often as a return type when a method could return an instance of one of two unrelated types, for example, an exception or a “successful” result. The Either can be pattern matched for its Left or Right subtypes. (It is analogous to Option, with its Some and None subtypes.) For the exception-handling idiom, it is conventional to use Left for the exception.

FunctionN[-T1, -T2, …, -TN, +R] (parent: AnyRef): Trait representing a function that takes N arguments, each of which can have its own type, and returns a value of type R. (Traits are defined for N = 0 to 22.) The variance annotations (‘+’ and ‘-’) in front of the types will be explained in the section called “Variance Under Inheritance” in Chapter 12, The Scala Type System.

Iterable[+T] (parent: AnyRef): Trait with methods for operating on collections of instances. Users implement the abstract elements method to return an Iterator over the collection.

List[+T] (parent: Seq[T]): A sealed abstract class for ordered collections with functional-style list semantics. It is the most widely-used collection in Scala, so it is defined in the scala package, rather than one of the collection packages. (In Scala version 2.8, it is actually defined in scala.collection.immutable and “aliased” in package object scala.) It has two subclasses: case object Nil, which extends List[Nothing] and represents an empty list, and final case class ::[T], which represents a non-empty list, characterized by a head element and a tail list, which would be Nil for a one-element list.

Nothing (parent: all other types): The subtype of all other types. It has no instances. It is used primarily for defining other types in a type-safe way, such as the special List subtype Nil. See also the section called “Nothing and Null” in Chapter 12, The Scala Type System.

Null (parent: all reference types): Has one instance, null, corresponding to the runtime’s concept of null.

Option[T] (parent: Product): Wraps an optional item. It is a sealed abstract type and the only allowed instances are an instance of its derived case class Some[T], wrapping an instance of T, or its derived case object None, which extends Option[Nothing].

Predef (parent: AnyRef): An object that defines and imports many commonly-used types and methods. See the section called “The Predef Object” earlier in this chapter for details.

Product (parent: AnyRef): Trait with methods for determining the arity and getting the nth item of a “cartesian product”. Subtraits of Product, called ProductN, are defined for dimension N from 1 through 22.

ScalaObject (parent: AnyRef): Mixin trait added to all Scala reference type instances.

Seq[+T] (parent: Collection[T]): Trait for ordered collections.

TupleN (parent: ProductN): Separate case classes for arity N = 1 through 22. Tuples support the literal syntax (x1, x2, …, xN).


Besides List, some of the other library collections include Map, Set, Queue, and Stack. These other collections come in two varieties, mutable and immutable. The immutable collections are in the package scala.collection.immutable, while the mutable collections are in scala.collection.mutable. Only an immutable version of List is provided; for a mutable list, use a ListBuffer, which can return a List via the toList method. For Scala version 2.8, the collections implementations reuse code from scala.collection.generic. Users of the collections would normally not use any types defined in this package. We’ll explore some of these collections in greater detail in the section called “Functional Data Structures” in Chapter 8, Functional Programming in Scala.

Consistent with its emphasis on functional programming (see Chapter 8, Functional Programming in Scala), Scala encourages you to use the immutable collections, since List is automatically imported and Predef defines types Map and Set that refer to the immutable versions of these collections. All other collections have to be imported explicitly.

Predef defines a number of implicit conversion methods for the value types (excluding Unit). There are implicit conversions to the corresponding scala.runtime.RichX types. For example, the byteWrapper method converts a Byte to a scala.runtime.RichByte. There are also implicit conversions from the “numeric” types, Byte, Short, Int, Long, and Float, to the other types that are “wider” than the original, for example, Byte to Int, Int to Long, Int to Double, etc. Finally, there are conversions to the corresponding Java wrapper types, e.g., Int to java.lang.Integer. We discuss implicit conversions in more detail in the section called “Implicit Conversions” in Chapter 8, Functional Programming in Scala.
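Here is a brief sketch of what these conversions enable; each line relies on one of the implicit conversions just described.

val range = 1 to 5                 // the Int is wrapped as a RichInt, which supplies the to method
val bigger = 3 max 7               // max also comes from the rich wrapper; evaluates to 7
val d: Double = 3                  // the Int is implicitly widened to a Double
val boxed: java.lang.Integer = 42  // converted to the corresponding Java wrapper type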

There are several examples of Option elsewhere, e.g., the section called “Option, Some, and None: Avoiding nulls” in Chapter 2, Type Less, Do More. Here is a script that illustrates using an Either return value to handle a thrown exception or successful result (adapted from http://dcsobral.blogspot.com/2009/06/catching-exceptions.html).

// code-examples/ObjectSystem/typehierarchy/either-script.scala

def exceptionToLeft[T](f: => T): Either[java.lang.Throwable, T] = try {
  Right(f)
} catch {
  case ex => Left(ex)
}

def throwsOnOddInt(i: Int) = i % 2 match {
  case 0 => i
  case 1 => throw new RuntimeException(i + " is odd!")
}

for(i <- 0 to 3)
  exceptionToLeft(throwsOnOddInt(i)) match {
    case Left(ex) => println("Oops, got exception " + ex.toString)
    case Right(x) => println(x)
  }

The exceptionToLeft method evaluates f. It catches a Throwable and returns it as the Left value or returns the normal result as the Right value. The for loop uses this method to invoke throwsOnOddInt. It pattern matches on the result and prints an appropriate message. The output of the script is the following.

0
Oops, got exception java.lang.RuntimeException: 1 is odd!
2
Oops, got exception java.lang.RuntimeException: 3 is odd!

A FunctionN trait, where N is 0 to 22, is instantiated for an anonymous function with N arguments. So, consider the following anonymous function.

(t1: T1, ..., tN: TN) => new R(...)

It is syntactic sugar for the following creation of an anonymous class.

new FunctionN {
  def apply(t1: T1, ..., tN: TN): R = new R(...)

  // other methods
}

We’ll revisit FunctionN in the section called “Variance Under Inheritance” and the section called “Function Types” in Chapter 12, The Scala Type System.
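To make the equivalence concrete, here is a small sketch for N = 2; the function literal and the explicit Function2 instance are interchangeable.

val multiply1 = (x: Int, y: Int) => x * y

val multiply2 = new Function2[Int, Int, Int] {
  def apply(x: Int, y: Int): Int = x * y
}

println(multiply1(3, 4))   // 12
println(multiply2(3, 4))   // 12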

Linearization of an Object’s Hierarchy

Because of single inheritance, the inheritance hierarchy would be linear, if we ignored mixed-in traits. When traits are considered, each of which may be derived from other traits and classes, the inheritance hierarchy forms a directed, acyclic graph [ScalaSpec2009]. The term linearization refers to the algorithm used to “flatten” this graph for the purposes of resolving method lookup priorities, constructor invocation order, binding of super, etc.

Informally, we saw in the section called “Stackable Traits” in Chapter 4, Traits that when an instance has more than one trait, they bind right to left, as declared. Consider the following example of linearization.

// code-examples/ObjectSystem/linearization/linearization1-script.scala

class C1 {
  def m = List("C1")
}

trait T1 extends C1 {
  override def m = { "T1" :: super.m }
}

trait T2 extends C1 {
  override def m = { "T2" :: super.m }
}

trait T3 extends C1 {
  override def m = { "T3" :: super.m }
}

class C2 extends T1 with T2 with T3 {
  override def m = { "C2" :: super.m }
}

val c2 = new C2
println(c2.m)

Running this script yields the following output.

List(C2, T3, T2, T1, C1)

This list of strings built up by the m methods reflects the linearization of the inheritance hierarchy, with a few missing pieces we’ll discuss shortly. We’ll also see why C1 is at the end of the list. First, let’s see what the invocation sequence of the constructors looks like.

// code-examples/ObjectSystem/linearization/linearization2-script.scala

var clist = List[String]()

class C1 {
  clist ::= "C1"
}

trait T1 extends C1 {
  clist ::= "T1"
}

trait T2 extends C1 {
  clist ::= "T2"
}

trait T3 extends C1 {
  clist ::= "T3"
}

class C2 extends T1 with T2 with T3 {
  clist ::= "C2"
}

val c2 = new C2
println(clist.reverse)

Running this script yields the following output.

List(C1, T1, T2, T3, C2)

So, the construction sequence is the reverse. (We had to reverse the list on the last line, because the way it was constructed put the elements in the reverse order.) This invocation order makes sense. For proper construction to occur, the parent types need to be constructed before the derived types, since a derived type often uses fields and methods in the parent types during its construction process.

The output of the first linearization script is actually missing three types at the end. The full linearization for any reference type ends with ScalaObject, AnyRef, and Any. So, the complete linearization for C2 is the following.

List(C2, T3, T2, T1, C1, ScalaObject, AnyRef, Any)

Scala inserts the ScalaObject trait as the last mixin, just before AnyRef and Any that are the penultimate and ultimate parent classes of any reference type. Of course, these three types do not show up in the output of the scripts, because we used an ad hoc m method to figure out the behavior by building up an output string.

The “value types”, subclasses of AnyVal, are all declared abstract final. The compiler manages instantiation of them. Since we can’t subclass them, their linearizations are simple and straightforward.

The linearization defines the order in which method look-up occurs. Let’s examine it more closely.

All our classes and traits define the method m. The one in C2 is called first, since the instance is of that type. C2.m calls super.m, which resolves to T3.m. The search appears to be “breadth-first”, rather than “depth-first”. If it were depth-first, it would invoke C1.m after T3.m. After T3.m, the methods T2.m, then T1.m, and finally C1.m are invoked. C1 is the parent of the three traits. From which of the traits did we traverse to C1? Actually, the search is breadth-first, with “delayed” evaluation, as we will see. Let’s modify our first example and see how we got to C1.

// code-examples/ObjectSystem/linearization/linearization3-script.scala

class C1 {
  def m(previous: String) = List("C1("+previous+")")
}

trait T1 extends C1 {
  override def m(p: String) = { "T1" :: super.m("T1") }
}

trait T2 extends C1 {
  override def m(p: String) = { "T2" :: super.m("T2") }
}

trait T3 extends C1 {
  override def m(p: String) = { "T3" :: super.m("T3") }
}

class C2 extends T1 with T2 with T3 {
  override def m(p: String) = { "C2" :: super.m("C2") }
}

val c2 = new C2
println(c2.m(""))

Now we pass the name of the caller of super.m as a parameter, then C1 prints out who called it. Running this script yields the following output.

List(C2, T3, T2, T1, C1(T1))

It’s the last one, T1. We might have expected T3 from a “naïve” application of breadth-first traversal.

Here is the actual algorithm for calculating the linearization. A more formal definition is given in [ScalaSpec2009].

1. Put the actual type of the instance as the first element.
2. Starting with the right-most parent type and working left, compute the linearization of each type, appending its linearization to the cumulative linearization. (Ignore ScalaObject, AnyRef, and Any for now.)
3. Working from left to right, remove any type that appears again to the right of the current position; only the last occurrence is kept.
4. Append ScalaObject, AnyRef, and Any.

This explains how we got to C1 from T1 in the previous example. T3 and T2 also have C1 in their linearizations, but they come before T1, so the C1 terms they contributed were removed in step 3.

Let’s work through the algorithm using a slightly more involved example.

// code-examples/ObjectSystem/linearization/linearization4-script.scala

class C1 {
  def m = List("C1")
}

trait T1 extends C1 {
  override def m = { "T1" :: super.m }
}

trait T2 extends C1 {
  override def m = { "T2" :: super.m }
}

trait T3 extends C1 {
  override def m = { "T3" :: super.m }
}

class C2A extends T2 {
  override def m = { "C2A" :: super.m }
}

class C2 extends C2A with T1 with T2 with T3 {
  override def m = { "C2" :: super.m }
}

def calcLinearization(obj: C1, name: String) = {
  val lin = obj.m ::: List("ScalaObject", "AnyRef", "Any")
  println(name + ":  " + lin)
}

calcLinearization(new C2, "C2 ")
println("")
calcLinearization(new T3 {}, "T3 ")
calcLinearization(new T2 {}, "T2 ")
calcLinearization(new T1 {}, "T1 ")
calcLinearization(new C2A, "C2A")
calcLinearization(new C1, "C1 ")

The output is the following.

C2 :  List(C2, T3, T1, C2A, T2, C1, ScalaObject, AnyRef, Any)

T3 :  List(T3, C1, ScalaObject, AnyRef, Any)
T2 :  List(T2, C1, ScalaObject, AnyRef, Any)
T1 :  List(T1, C1, ScalaObject, AnyRef, Any)
C2A:  List(C2A, T2, C1, ScalaObject, AnyRef, Any)
C1 :  List(C1, ScalaObject, AnyRef, Any)

To help us along, we calculated the linearizations for the other types, and we also appended ScalaObject, AnyRef, and Any to remind ourselves that they should also be there. We also removed the logic that passed the caller’s name to m; the caller of C1 will always be the element immediately to its left in the list.

So, let’s work through the algorithm for C2 and confirm our results. We’ll suppress the ScalaObject, AnyRef, and Any for clarity, until the end.

Table 7.5. Hand Calculation of C2 linearization: C2 extends C2A with T1 with T2 with T3 {…}.

1. C2 (add the type of the instance)
2. C2, T3, C1 (add the linearization for T3, the farthest type on the right)
3. C2, T3, C1, T2, C1 (add the linearization for T2)
4. C2, T3, C1, T2, C1, T1, C1 (add the linearization for T1)
5. C2, T3, C1, T2, C1, T1, C1, C2A, T2, C1 (add the linearization for C2A)
6. C2, T3, T2, T1, C2A, T2, C1 (remove duplicates of C1: all but the last C1)
7. C2, T3, T1, C2A, T2, C1 (remove the duplicate T2: all but the last T2)
8. C2, T3, T1, C2A, T2, C1, ScalaObject, AnyRef, Any (finish!)


What the algorithm does is push any shared types to the right until they come after all the types that derive from them.

Try modifying the last script with different hierarchies and see if you can reproduce the results using the algorithm.

Tip

Overly complex type hierarchies can result in method lookup “surprises”. If you have to work through this algorithm to figure out what’s going on, try to simplify your code.

Recap and What’s Next

We have finished our survey of Scala’s object model. If you come from an object-oriented language background, you now know enough about Scala to replace your existing object-oriented language with object-oriented Scala.

However, there is much more to come. Scala supports functional programming, which offers powerful mechanisms for addressing a number of design problems, such as concurrency. We’ll see that functional programming appears to contradict object-oriented programming, at least on the surface. That said, a guiding principle behind Scala is that these two paradigms complement each other more than they conflict. Combined, they give you more options for building robust, scalable software. Scala lets you choose the techniques that work best for your needs.


Chapter 8. Functional Programming in Scala

Every decade or two, a major computing idea goes mainstream. These ideas may have lurked in the background of academic computer science research, or possibly in some lesser-known field of industry. The transition to mainstream acceptance comes in response to a perceived problem for which the idea is well suited. Object-oriented programming, which was invented in the 1960s, went mainstream in the 1980s, arguably in response to the emergence of graphical user interfaces, for which the OOP paradigm is a natural fit.

Functional programming appears to be experiencing a similar break-out. Long the topic of computer science research and even older than object-oriented programming, functional programming offers effective techniques for concurrent programming, which is growing in importance.

Because functional programming is less widely understood than object-oriented programming, we won’t assume that you have prior experience with it. We’ll start this chapter with plenty of background information. As you’ll see, functional programming is not only a very effective way to approach concurrent programming, which we’ll explore in depth in Chapter 9, Robust, Scalable Concurrency with Actors, but functional programming can also improve your objects, as well.

Of course, we can’t provide an exhaustive introduction to functional programming. To learn more about it, [O'Sullivan2009] has a more detailed introduction in the context of the Haskell language. [Abelson1996], [VanRoy2004], and [Turbak2008] offer thorough introductions to general programming approaches, including functional programming. Finally, [Okasaki1998] and [Rabhi1999] discuss functional data structures and algorithms in detail.

What Is Functional Programming?

Don’t all programming languages have functions of some sort? Whether they are called methods, procedures, or GOTOs, programmers are always dealing in functions. Functional programming is based on the behavior of functions in the mathematical sense, with all the implications that starting point implies.

Functions in Mathematics

In mathematics, functions have no side effects. Consider the classic function sin(x).

y = sin(x)

No matter how much work sin(x) does, all the results are returned and assigned to y. No global state of any kind is modified internally by sin(x). Hence, we say that such a function is free of side-effects or pure.

This property simplifies enormously the challenge of analyzing, testing, and debugging a function. You can do these things without having to know anything about the context in which the function is invoked, except for any other functions it might call. However, you can analyze them in the same way, working bottom up to verify the whole “stack”.

This obliviousness to the surrounding context is known as Referential Transparency. You can call such a function anywhere and be confident that it will always behave the same way. If no global state is modified, concurrent invocation of the function is straightforward and reliable.

In functional programming, you can compose functions from other functions. For example, tan(x) = sin(x)/cos(x). An implication of composability is that functions can be treated as values. In other words, functions are first class, just like data. You can assign functions to variables. You can pass functions to other functions. You can return functions as values from functions. In the functional paradigm, functions become a primitive type, a building block that’s just as essential to the work of programming as integers or strings.

When a function takes other functions as arguments or returns a function, it is called a higher-order function. In mathematics, two examples of higher-order functions from calculus are derivation and integration.
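Here is a small Scala sketch of both ideas: a pure function whose result depends only on its input, and a higher-order function that takes another function as an argument (both methods are invented for illustration).

def square(x: Double) = x * x                        // pure: no side effects

def twice(f: Double => Double, x: Double) = f(f(x))  // higher-order: takes a function

println(square(3.0))          // 9.0
println(twice(square, 3.0))   // 81.0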

Variables that Aren’t

The word “variable” takes on a new meaning in functional programming. If you come from a procedural or object-oriented programming background, you are accustomed to variables that are mutable. In functional programming variables are immutable.

This is another consequence of the mathematical orientation. In the expression y = sin(x), once you pick x, then y is fixed. As another example, if you increment the integer 3 by 1, you don’t “modify the 3 object”, you create a new value to represent 4.

To be more precise, it is the values that are immutable. Functional programming languages prevent you from assigning a new value to a variable that already has a value.
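In Scala terms, this is the distinction between val and var. A quick sketch:

val x = 3
// x = 4                   // does not compile: reassignment to val
val y = x + 1              // a new value; x is untouched

val list1 = List(1, 2, 3)
val list2 = 0 :: list1     // a new list; list1 is unchanged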

Immutability is difficult when you’re not used to it. If you can’t change a variable, then you can’t have loop counters, for example. We’re accustomed to objects that change their state when we call methods on them. Learning to think in immutable terms takes some effort.

However, immutability has enormous benefits for concurrency. Almost all the difficulty of multithreaded programming lies in synchronizing access to shared, mutable state. If you remove mutability, then the problems essentially go away. It is the combination of referentially-transparent functions and immutable values that make functional programming compelling as a “better way” to write concurrent software.

These qualities benefit programs in other ways. Almost all the constructs we have invented in sixty-odd years of computer programming have been attempts to manage complexity. Higher-order functions and referential transparency provide very flexible building blocks for composing programs.

Immutability greatly reduces regression bugs, many of which are caused by unintended state changes in one part of a program due to intended changes in another part. There are other contributors to such non-local effects, but mutability is one of the most important.

It’s common in object-oriented designs to encapsulate access to data structures in objects. If these structures are mutable, we can’t simply share them with clients. We have to add special accessor methods to control access, so clients can’t modify them outside our control. These additions increase code size, which increases the testing and maintenance burden, and they increase the effort required by clients to understand the ad hoc features of our APIs.

In contrast, when we have immutable data structures, many of these problems simply go away. We can provide access to collections without fear of data loss or corruption. Of course, the general principles of minimal coupling still apply; should clients care whether a Set or a List is used, as long as foreach is available?

Immutable data also implies that lots of copies will be made, which can be expensive. Functional data structures optimize for this problem [Okasaki1998] and many of the built-in Scala types are efficient at creating new copies from existing copies.

It’s time to dive into the practicalities of functional programming in Scala. We’ll discuss other aspects and benefits of the approach as we proceed.

Functional Programming in Scala

As a hybrid object-functional language, Scala does not require functions to be pure, nor does it require variables to be immutable. It does, however, encourage you to write your code this way whenever possible. You have the freedom to use procedural or object-oriented techniques when and where they seem most appropriate.

Though functional languages are all about eliminating side effects, a language that never allowed for side effects would be useless. Input and output (IO) is inherently about side effects, and IO is essential to all programming tasks. For this reason, all functional languages provide mechanisms for performing side effects in a controlled way.

Scala doesn’t restrict what you can do, but we encourage you to use immutable values and pure functions and methods whenever possible. When mutability and side effects are necessary, pursue them in a “principled” way, isolated in well-defined modules and focused on individual tasks.

If you’re new to functional programming, keep in mind that it’s easy to fall back to old habits. We encourage you to master the functional side of Scala and to learn to use it effectively.

Tip

A function that returns Unit implies that the function works purely through side effects: if it does any useful work, that work must be done entirely through side effects, since the function doesn’t return anything.

We’ve seen many examples of higher-order functions and composability in Scala. For example, List.map takes a function to transform each element of the list to something else.

// code-examples/FP/basics/list-map-example-script.scala

List(1, 2, 3, 4, 5) map { _ * 2 }

Recall that _ * 2 is a function literal that is shorthand for i => i * 2. For each argument to the function, you can use _ if the argument is used only once. We also used the infix operator notation to invoke map. Here’s an example that “reduces” the same list by multiplying all the elements together.

// code-examples/FP/basics/list-reduceLeft-example-script.scala

List(1, 2, 3, 4, 5) reduceLeft { _ * _ }

The first _ represents the argument that is accumulating the value of the reduction and the second _ represents the current element of the list.

Both examples successfully “looped” through the list without the use of a mutable counter to track iterations. Most containers in the Scala library provide functionally-pure iteration methods. In other cases, recursion is the preferred way to traverse a data structure or perform an algorithm. We’ll return to this topic in the section called “Recursion”.

Function Literals and Closures

Let’s expand our previous map example a bit.

// code-examples/FP/basics/list-map-closure-example-script.scala

var factor = 3
val multiplier = (i: Int) => i * factor

val l1 = List(1, 2, 3, 4, 5) map multiplier

factor = 5
val l2 = List(1, 2, 3, 4, 5) map multiplier

println(l1)
println(l2)

We defined a variable, factor, to use as the multiplication factor and we pulled out the previous anonymous function into a value called multiplier that now uses factor. Then we map over a list of integers, as we did before. After the first call to map, we change factor and map again. Here is the output.

List(3, 6, 9, 12, 15)
List(5, 10, 15, 20, 25)

Even though multiplier was an immutable function value, its behavior changed when factor changed.

There are two variables used in multiplier, i and factor. One of them, i, is a formal parameter to the function. Hence, it is bound to a new value each time multiplier is called.

However, factor is not a formal parameter, but a free variable, a reference to a variable in the enclosing scope. Hence, the compiler creates a closure that encompasses (or “closes over”) multiplier and the external context of the free variables it references, thereby binding those variables as well.

This is why the behavior of multiplier changed after changing factor. It references factor and reads its current value each time. If a function has no external references, then it is trivially closed over itself. No external context is required.

Purity Inside vs. Outside

If we called sin(x) thousands of times with the same value of x, it would be wasteful if it calculated the same value every single time. Even in “pure” functional libraries, it is common to perform internal optimizations like caching previously-computed values (sometimes called memoization). Caching introduces side effects, as the state of the cache is modified.

However, this lack of purity should be opaque to the user (except perhaps in terms of the performance impact). If you are designing functional libraries, ensure that they preserve the purity of their abstractions, including the behavior of referential transparency and its implications for concurrency.

You can see examples of functional libraries with mutable internals in the Scala library. The methods in List often use mutable local variables for efficient traversal. The local variables are thread safe, as are the traversals, since Lists themselves are immutable.
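
Here is a hedged sketch of what such an internal optimization might look like: a memoizing wrapper that presents a pure interface while caching results in a private, mutable map. The name memoize is ours, and this is not how the Scala library implements its own optimizations; note also that this simple cache is not synchronized, so it is only appropriate for single-threaded use.

import scala.collection.mutable.{Map => MutableMap}

// Wrap a function in a caching version. Callers still see an Int => BigInt
// function that always returns the same result for the same argument.
def memoize(f: Int => BigInt): Int => BigInt = {
  val cache = MutableMap[Int, BigInt]()
  (i: Int) => cache.get(i) match {
    case Some(result) => result
    case None =>
      val result = f(i)
      cache(i) = result   // the side effect is hidden inside the wrapper
      result
  }
}

val slowSquare = (i: Int) => { Thread.sleep(100); BigInt(i) * i }
val fastSquare = memoize(slowSquare)

println(fastSquare(12))  // computed (slowly) the first time
println(fastSquare(12))  // answered from the cache thereafter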

Recursion

Recursion plays a larger role in pure functional programming than in imperative programming, in part because of the restriction that variables are immutable. For example, you can’t have loop counters, which would change on each pass through a loop. One way to implement looping in a purely functional way is with recursion.

Calculating factorials provides a good example. Here is an imperative loop implementation.

// code-examples/FP/recursion/factorial-loop-script.scala

def factorial_loop(i: BigInt): BigInt = {
  var result = BigInt(1)
  for (j <- 2 to i.intValue)
    result *= j
  result
}

for (i <- 1 to 10)
  format("%s: %s\n", i, factorial_loop(i))

Both the loop counter j and the result are mutable variables. (For simplicity, we’re ignoring input numbers that are less than or equal to zero.) The output of the script is the following.

1: 1
2: 2
3: 6
4: 24
5: 120
6: 720
7: 5040
8: 40320
9: 362880
10: 3628800

Here’s a first pass at a recursive implementation.

// code-examples/FP/recursion/factorial-recur1-script.scala

def factorial(i: BigInt): BigInt = i match {
  case _ if i == 1 => i
  case _ => i * factorial(i - 1)
}

for (i <- 1 to 10)
  format("%s: %s\n", i, factorial(i))

The output is the same, but now there are no mutable variables. Recursion not only helps us avoid mutable variables, it is also the most natural way to express some functions, particularly mathematical functions. The recursive definition in our second factorial is structurally similar to a definition for factorials that you might see in a mathematics book.

However, there are two potential problems with recursion: the performance overhead of repeated function invocations and the risk of stack overflow.

Performance problems in a recursive scenario can sometimes be addressed with memoization, but care should be taken that the space requirements of caching don’t outweigh the performance benefits.

Stack overflow can be avoided by converting the recursive invocation into a loop of some kind. In fact, the Scala compiler can do this conversion for you for some kinds of recursive invocations, which we describe next.

Tail Calls and Tail-Call Optimization

A particular kind of recursion is called tail call recursion, which occurs when a function calls itself as its final operation. Tail call recursion is very important, because it is the easiest kind of recursion to optimize by conversion into a loop. Loops eliminate the potential of a stack overflow, and they improve performance by eliminating the recursive function call overhead. While tail recursion optimizations are not yet supported natively on the JVM, scalac can do them.

However, our factorial example is not a tail recursion, because factorial calls itself and then does a multiplication with the results. There is a way to implement factorial in a tail recursive way. We actually saw an implementation in the section called “Nesting Method Definitions” in Chapter 2, Type Less, Do More. However, that example didn’t use some constructs we’ve learned about since, such as for comprehensions and pattern matching. So, here’s a new implementation of factorial, calculated with tail call recursion.

// code-examples/FP/recursion/factorial-recur2-script.scala

def factorial(i: BigInt): BigInt = {
  def fact(i: BigInt, accumulator: BigInt): BigInt = i match {
    case _ if i == 1 => accumulator
    case _ => fact(i - 1, i * accumulator)
  }
  fact(i, 1)
}

for (i <- 1 to 10)
  format("%s: %s\n", i, factorial(i))

This script produces the same output as before. Now, factorial does all the work with a nested method, fact, that is tail recursive because it passes an accumulator argument to hold the computation in progress. This argument is computed with a multiplication before the recursive call to fact, which is now the very last thing that is done. In our previous implementation, this multiplication was done after the call to fact. When fact is called with i equal to 1, it simply returns the accumulated value.

If you call our original non-tail recursive implementation of factorial with a large number, say 10000, you’ll cause a stack overflow on a typical desktop computer. The tail-recursive implementation works successfully, returning a very large number.

This idiom of nesting a tail-recursive function that uses an accumulator is a very useful technique for converting many recursive algorithms into tail recursions that can be optimized into loops by scalac.

Note

The tail-call optimization won’t be applied when a method that calls itself might be overridden in a derived type. The method must be private or final, defined in an object, or nested in another method (like fact above). The new @tailrec annotation in version 2.8 will trigger an error if the compiler can’t optimize the annotated method. (See the section called “Annotations” in Chapter 13, Application Design)
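
For example, assuming Scala version 2.8, annotating the nested fact method documents the expectation and asks the compiler to enforce it.

import scala.annotation.tailrec

def factorial(i: BigInt): BigInt = {
  @tailrec
  def fact(i: BigInt, accumulator: BigInt): BigInt =
    if (i == 1) accumulator
    else fact(i - 1, i * accumulator)   // tail position: compiled into a loop

  fact(i, 1)
}

// If fact were written as "i * fact(i - 1, accumulator)", the recursive call
// would no longer be in tail position and @tailrec would trigger a
// compilation error instead of silently producing unoptimized code.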

Trampoline for Tail Calls

A trampoline is a loop that works through a list of functions, calling each one in turn. The metaphor of bouncing the functions off a trampoline is the source of the name.

Consider a kind of recursion where a function A doesn’t call itself recursively, but instead it calls another function B, which calls A, which calls B, etc. This kind of back-and-forth recursion can also be converted into a loop using a trampoline. Note that trampolines impose a performance overhead, but they are ideal for pure functional recursions (vs. an imperative equivalent) that would otherwise exhaust the stack.

Support for this optimization is planned for Scala version 2.8, although it has not yet been implemented at the time of this writing.
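
To illustrate the idea, here is our own minimal, hand-rolled sketch of a trampoline (the names Bounce, Done, and Call are ours; the planned library support may look quite different). Each function returns either a final result or a thunk describing the next call, and a driver loop “bounces” until a result appears.

// Each step either finishes with Done or asks the driver to make another call.
sealed abstract class Bounce[A]
case class Done[A](result: A) extends Bounce[A]
case class Call[A](next: () => Bounce[A]) extends Bounce[A]

def trampoline[A](bounce: Bounce[A]): A = bounce match {
  case Done(result) => result
  case Call(next)   => trampoline(next())  // self tail call: optimized to a loop
}

// Mutually recursive functions, expressed as bounces instead of direct calls.
def isEven(n: Int): Bounce[Boolean] =
  if (n == 0) Done(true) else Call(() => isOdd(n - 1))
def isOdd(n: Int): Bounce[Boolean] =
  if (n == 0) Done(false) else Call(() => isEven(n - 1))

println(trampoline(isEven(1000000)))  // true, with no risk of stack overflow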

Functional Data Structures

There are several data structures that are common in functional programming, most of which are containers, like collections. Languages like Erlang rely on very few types, while other functional languages provide a richer type system.

The common data structures support the same subset of higher-order functions for read-only traversal and access to the elements in the data structures. These features make them suitable as “protocols” for minimizing the coupling between components, while supporting data exchange.

In fact, these data structures and their operations are so useful that many languages support them, including those that are not considered functional languages, like Java and Ruby. Java doesn’t support higher-order functions directly. Instead, function values have to be wrapped in objects. Ruby uses procs and lambdas as function values.

Lists in Functional Programming

Lists are the most common data structure in functional programming. They are the core of the first functional programming language, Lisp.

In the interest of immutability, a new list is created when you add an element to a list. It is conventional to prepend the new element to the list, as we’ve seen before.

// code-examples/FP/datastructs/list-script.scala

val list1 = List("Programming", "Scala")
val list2 = "People" :: "should" :: "read" :: list1
println(list2)

Because the :: operator binds to the right, the definition of list2 is equivalent to both of the following variations.

val list2 = ("People" :: ("should" :: ("read" :: list1)))
val list2 = list1.::("read").::("should").::("People")

In terms of performance, prepending is O(1). We’ll see why when we dive into Scala’s implementation of List in the section called “A Closer Look at Lists” in Chapter 12, The Scala Type System, after we have learned more about parameterized types in Scala.

Unlike some of the other collections, Scala defines only an immutable List. However, it also defines some mutable list types, such as ListBuffer and LinkedList.

Maps in Functional Programming

Perhaps the second most common data structure is the map, referred to as a hash or dictionary in other languages, and not to be confused with the map function we saw above. Maps are used to hold pairs of keys and values.

In the interest of minimalism, maps could be implemented with lists. Every even element in the list (counting from zero) could be a key, followed by the value in the next odd position. In practice, maps are usually implemented in other ways for efficiency.
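
For illustration only, here is what a lookup might look like if a map were naively represented as a flat list of alternating keys and values. Note that each lookup is O(N), which is one reason real implementations prefer other structures.

// A toy "map": keys at even positions, each followed by its value.
val listMap = List("Alabama", "Montgomery", "Alaska", "Juneau")

def lookup(key: String, kvs: List[String]): Option[String] = kvs match {
  case k :: v :: rest => if (k == key) Some(v) else lookup(key, rest)
  case _ => None
}

println(lookup("Alaska", listMap))   // Some(Juneau)
println(lookup("Ohio", listMap))     // None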

Scala supports the special initialization syntax we saw previously.

val stateCapitals = Map(
  "Alabama" -> "Montgomery",
  "Alaska"  -> "Juneau",
  // ...
  "Wyoming" -> "Cheyenne")

The scala.collection.Map[A,+B] trait only defines methods for reading the Map. There are derived traits for immutable and mutable maps, scala.collection.immutable.Map[A,+B] and scala.collection.mutable.Map[A,B], respectively. They define + and - operators for adding and removing elements, and ++ and -- operators for adding and removing elements defined in Iterators of Pairs, where each Pair is a key-value pair.
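
For example, adding or removing an entry on an immutable Map returns a new map and leaves the original untouched (a small sketch; the printed ordering of entries may vary).

val m1 = Map("Alabama" -> "Montgomery")
val m2 = m1 + ("Alaska" -> "Juneau")   // a new map with an added pair
val m3 = m2 - "Alabama"                // a new map with a key removed

println(m1)   // Map(Alabama -> Montgomery): unchanged
println(m2)   // Map(Alabama -> Montgomery, Alaska -> Juneau)
println(m3)   // Map(Alaska -> Juneau)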

Note

You might have noticed that the + does not appear in front of the B type parameters for scala.collection.mutable.Map. You’ll see why in the section called “Variance of Mutable Types” in Chapter 12, The Scala Type System.

Sets in Functional Programming

Sets are like lists, but they require each element to be unique. Sets could also be implemented using lists, as long as the equivalent of the list “cons” operator (::) first checks that the element doesn’t already exist in the storage list. This property means that element insertion would be O(N) if a storage list were used, and the order of the elements in the Set wouldn’t necessarily match the order of “insertion” operations. In practice, sets are usually implemented with more efficient data structures.

Just as for Map, the scala.collection.Set[A] trait only defines methods for reading the Set. There are derived traits for immutable and mutable sets, scala.collection.immutable.Set[A] and scala.collection.mutable.Set[A], respectively. They define + and - operators for adding and removing elements, and ++ and -- operators for adding and removing elements defined in Iterators (which could be other sets, lists, etc.).

Other Data Structures in Functional Programming

Other familiar data structures, like Tuples and Arrays, also appear in functional languages. Typically, they’re used to provide some convenient feature not supported by a more common functional type. In most cases they could be replaced with lists.

Traversing, Mapping, Filtering, Folding, and Reducing

The functional collections we just discussed, lists, maps, sets, as well as tuples and arrays, all support several common operations based on read-only traversal. In fact, this uniformity can be exploited if any “container” type also supports these operations. For example, an Option contains zero or one elements, if it is a None or Some, respectively.

Traversal

The standard traversal method for Scala containers is foreach, which is defined by the Iterable trait that the containers mix in. It is O(N) in the number of elements. Here is an example of its use for lists and maps.

// code-examples/FP/datastructs/foreach-script.scala

List(1, 2, 3, 4, 5) foreach { i => println("Int: " + i) }

val stateCapitals = Map(
  "Alabama" -> "Montgomery",
  "Alaska"  -> "Juneau",
  "Wyoming" -> "Cheyenne")

stateCapitals foreach { kv => println(kv._1 + ": " + kv._2) }

The signature of foreach is the following.

trait Iterable[+A] {
  ...
  def foreach(f : (A) => Unit) : Unit = ...
  ...
}

foreach is a higher-order function that takes a function argument: the operation to perform on each element. Note that for a map, A is actually a tuple, as shown in the example. Also, foreach returns Unit. foreach is not intended to create new collections; we’ll see examples of operations that create collections shortly.

Once you have foreach, you can implement all the other traversal operations we’ll discuss next, and more. A look at Iterable will show that it supports methods for filtering collections, finding elements that match specified criteria, calculating the number of elements, and so forth.
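
As a hypothetical illustration of that point (not how the library actually implements it), here is an exists-style check built on nothing but foreach. A local var is used for brevity; it never escapes the function.

// Determine whether any element satisfies the predicate p, using only foreach.
def existsViaForeach[A](items: Iterable[A])(p: A => Boolean): Boolean = {
  var found = false
  items foreach { a => if (p(a)) found = true }
  found
}

println(existsViaForeach(List(1, 2, 3, 4)) { _ > 3 })   // true
println(existsViaForeach(List(1, 2, 3, 4)) { _ > 10 })  // false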

The methods we’ll discuss next are hallmarks of functional programming: mapping, filtering, folding, and reducing.

Mapping

We’ve encountered the map method before. It returns a new collection of the same size as the original collection. It is also a member of Iterable and its signature is

trait Iterable[+A] {
  ...
  def map[B](f : (A) => B) : Iterable[B] = ...
  ...
}

The passed-in function (f) can transform an original element of type A to a new type B. Here is an example.

// code-examples/FP/datastructs/map-script.scala

val stateCapitals = Map(
  "Alabama" -> "Montgomery",
  "Alaska"  -> "Juneau",
  "Wyoming" -> "Cheyenne")

val lengths = stateCapitals map { kv => (kv._1, kv._2.length) }
println(lengths)

This script produces the following output:

ArrayBuffer((Alabama,10), (Alaska,6), (Wyoming,8))

That is, we convert the Pair[String,String] elements to an ArrayBuffer of Pair[String,Int] elements. Where did the ArrayBuffer come from? It turns out that Iterable.map creates and returns an ArrayBuffer as the new Iterable collection.

This brings up a general conflict between immutable types and object-oriented type hierarchies. If a base type creates a new instance on modification, how does it know what kind of type to create?

You could solve this problem in two ways. First, you could have each type in the hierarchy override methods like map to return an instance of its own type. This approach is error prone, though, as it would be easy to forget to override all such methods when a new type is added.

Even if you always remember to override each method, you have the dilemma of how to implement the override. Do you call the super method to reuse the algorithm, then iterate through the returned instance to create a new instance of the correct type? That would be inefficient. You could copy and paste the algorithm into each override, but that creates issues of code bloat, maintainability, and skew.

There’s an alternative approach: don’t even try. How is the new instance that is returned actually used? Do we really care if it has the “wrong” type? Keep in mind that all we usually care about are the low-level abstractions like lists, maps, and sets. In the case of functional data structures, the derived types we might implement using object-oriented inheritance are most often implementation optimizations. The Scala type hierarchy for containers does have a few levels of abstractions at the bottom, e.g., Collection extends Iterable extends AnyRef, but above Collection are Seq (parent of List), Map, Set, etc.

That said, if you really need a Map, you can create one easily enough.

// code-examples/FP/datastructs/map2-script.scala

val stateCapitals = Map(
  "Alabama" -> "Montgomery",
  "Alaska"  -> "Juneau",
  "Wyoming" -> "Cheyenne")

val map2 = stateCapitals map { kv => (kv._1, kv._2.length) }

// val lengths = Map(map2)  // ERROR: won't work
val lengths = Map[String,Int]() ++ map2

println(lengths)

The commented-out line suggests that it would be nice if you could simply pass the new Iterable to Map.apply, but this doesn’t work. Here is the signature of Map.apply.

object Map {
  ...
  def apply[A, B](elems : (A, B)*) : Map[A, B] = ...
  ...
}

It expects a variable argument list, not an Iterable. However, we can create an empty map of the right type and then add the new Iterable to it, using the ++ method, which returns a new Map.

So, we can get the Map we want when we must have one. While it would be nice if methods like map returned the same collection type, we saw that there is no easy way to do this. Instead, we accept that map and similar methods return an abstraction like Iterable and then rely on the specific subtypes to take Iterables as input arguments for populating the collection. A close relative of map is flatMap, which flattens the collections returned by the passed-in function into a single collection. Here is an example that uses flatMap to flatten a nested list structure.

// code-examples/FP/datastructs/flatmap-script.scala

val graph = List(
  "a", List("b1", "b2", "b3"), List("c1", List("c21", Nil, "c22"), Nil, "e")
)

def flatten(list: List[_]): List[_] = list flatMap {
  case head :: tail => head :: flatten(tail)
  case Nil => Nil
  case x => List(x)
}

println(flatten(graph))

This script reduces the hierarchical graph to List(a, b1, b2, b3, c1, c21, c22, e). Notice that the Nil elements have been removed. We used List[_] because we won’t know what the type parameters are for any embedded lists when we’re traversing the outer list, due to type erasure.

Here is the signature for flatMap, along with map, for comparison.

trait Iterable[+A] {
  ...
  def map[B]    (f : (A) => B) : Iterable[B] = ...
  def flatMap[B](f : (A) => Iterable[B]) : Iterable[B]
  ...
}

Each pass must return an Iterable[B], not a B. After going through the collection, flatMap will “flatten” all those Iterables into one collection. Note that flatMap won’t flatten elements beyond one level. If our function literal leaves nested lists intact, they won’t be flattened for us.
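
A quick illustration of that last point:

val nested = List(List(1, List(2, 3)), List(4))

// flatMap flattens only the outer level; the inner List(2, 3) survives intact.
println(nested flatMap { x => x })   // List(1, List(2, 3), 4)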

Filtering

It is common to traverse a collection and extract a new collection from it with elements that match certain criteria.

// code-examples/FP/datastructs/filter-script.scala

val stateCapitals = Map(
  "Alabama" -> "Montgomery",
  "Alaska"  -> "Juneau",
  "Wyoming" -> "Cheyenne")

val map2 = stateCapitals filter { kv => kv._1 startsWith "A" }

println( map2 )

There are several different kinds of methods defined in Iterable for filtering or otherwise returning part of the original collection. (Comments adapted from the Scaladocs.)

trait Iterable[+A] {
  ...
  // Returns this iterable without its n first elements. If this iterable
  // has less than n elements, the empty iterable is returned.
  def drop (n : Int) : Collection[A] = ...

  // Returns the longest suffix of this iterable whose first element does
  // not satisfy the predicate p.
  def dropWhile (p : (A) => Boolean) : Collection[A] = ...

  // Apply a predicate p to all elements of this iterable object and
  // return true, iff there is at least one element for which p yields true.
  def exists (p : (A) => Boolean) : Boolean = ...

  // Returns all the elements of this iterable that satisfy the predicate p.
  // The order of the elements is preserved.
  def filter (p : (A) => Boolean) : Iterable[A] = ...

  // Find and return the first element of the iterable object satisfying a
  // predicate, if any.
  def find (p : (A) => Boolean) : Option[A] = ...

  // Returns the index of the first element satisfying a predicate, or -1.
  def findIndexOf (p : (A) => Boolean) : Int = ...

  // Apply a predicate p to all elements of this iterable object and return
  // true, iff the predicate yields true for all elements.
  def forall (p : (A) => Boolean) : Boolean = ...

  // Returns the index of the first occurrence of the specified object in
  // this iterable object.
  def indexOf [B >: A](elem : B) : Int = ...

  // Partitions this iterable in two iterables according to a predicate.
  def partition (p : (A) => Boolean) : (Iterable[A], Iterable[A]) = ...

  // Checks if the other iterable object contains the same elements.
  def sameElements [B >: A](that : Iterable[B]) : Boolean = ...

  // Returns an iterable consisting only of the first n elements of this
  // iterable, or else the whole iterable, if it has fewer than n elements.
  def take (n : Int) : Collection[A] = ...

  // Returns the longest prefix of this iterable whose elements satisfy the
  // predicate p.
  def takeWhile (p : (A) => Boolean) : Iterable[A] = ...
}

Types like Map and Set have additional methods.

Folding and Reducing

We’ll discuss folding and reducing in the same section, as they’re similar. Both are operations for “shrinking” a collection down to a smaller collection or a single value.

Folding starts with an initial “seed” value and processes each element in the context of that value. In contrast, reducing doesn’t start with a user-supplied initial value. Rather, it uses the first element as the initial value.

// code-examples/FP/datastructs/foldreduce-script.scala

List(1,2,3,4,5,6) reduceLeft(_ + _)

List(1,2,3,4,5,6).foldLeft(10)(_ * _)

This script reduces the list of integers by adding them together, returning 21. It then folds the same list using multiplication with a seed of 10, returning 7200.

Reducing can’t work on an empty collection, since there would be nothing to return. In this case, an exception is thrown. Folding on an empty collection will simply return the seed value.
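
A small sketch of both behaviors:

val empty = List[Int]()

println(empty.foldLeft(10)(_ * _))   // 10: the seed is returned unchanged

// empty.reduceLeft(_ + _)           // would throw an exception: nothing to reduce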

Folding also offers more options for the final result. Here is a “fold” operation that is really a map operation.

// code-examples/FP/datastructs/foldleft-map-script.scala

List(1, 2, 3, 4, 5, 6).foldLeft(List[String]()) {
  (list, x) => ("<" + x + ">") :: list
}.reverse

It returns List(<1>, <2>, <3>, <4>, <5>, <6>). Note that we had to call reverse on the result to get back a list in the same order as the input list.

Here are the signatures for the various fold and reduce operations in Iterable.

trait Iterable[+A] {
  ...
  // Combines the elements of this iterable object together using the
  // binary function op, from left to right, and starting with the value z.
  def foldLeft [B](z : B)(op : (B, A) => B) : B

  // Combines the elements of this list together using the binary function
  // op, from right to left, and starting with the value z.
  def foldRight [B](z : B)(op : (A, B) => B) : B

  // Similar to foldLeft but can be used as an operator with the order of
  // list and zero arguments reversed. That is, z /: xs is the same as
  // xs foldLeft z
  def /: [B](z : B)(op : (B, A) => B) : B

  // An alias for foldRight. That is, xs :\ z is the same as xs foldRight z
  def :\ [B](z : B)(op : (A, B) => B) : B

  // Combines the elements of this iterable object together using the
  // binary operator op, from left to right
  def reduceLeft [B >: A](op : (B, A) => B) : B

  // Combines the elements of this iterable object together using the
  // binary operator op, from right to left
  def reduceRight [B >: A](op : (A, B) => B) : B
  ...
}

Many people consider the operator forms, :\ for foldRight and /: for foldLeft, to be a little too obscure and hard to remember. Don’t forget the importance of communicating with your readers when writing code.

Why are there left and right forms of fold and reduce? For the first examples we showed, adding and multiplying a list of integers, they would return the same result. Consider a foldRight version of our last example that used fold to map the integers to strings.

// code-examples/FP/datastructs/foldright-map-script.scala

List(1, 2, 3, 4, 5, 6).foldRight(List[String]()) {
  (x, list) => ("<" + x + ">") :: list
}

This script produces List(<1>, <2>, <3>, <4>, <5>, <6>), without having to call reverse, as we did before. Note also that the arguments to the function literal are reversed compared to the arguments for foldLeft, as required by the definition of foldRight.

Both foldLeft and reduceLeft process the elements from left to right. Here is the foldLeft sequence for List(1,2,3,4,5,6).foldLeft(10)(_ * _).

((((((10 * 1) * 2) * 3) * 4) * 5) * 6)
((((((10) * 2) * 3) * 4) * 5) * 6)
(((((20) * 3) * 4) * 5) * 6)
((((60) * 4) * 5) * 6)
(((240) * 5) * 6)
((1200) * 6)
(7200)

Here is the foldRight sequence.

(1 * (2 * (3 * (4 * (5 * (6 * 10))))))
(1 * (2 * (3 * (4 * (5 * (60))))))
(1 * (2 * (3 * (4 * (300)))))
(1 * (2 * (3 * (1200))))
(1 * (2 * (3600)))
(1 * (7200))
(7200)

It turns out that foldLeft and reduceLeft have one very important advantage over their “right-handed” brethren: they are tail-call recursive, and as such they can benefit from tail-call optimization.

If you stare at the previous breakdowns for multiplying the integers, you can probably see why they are tail-call recursive. Recall that a tail call must be the last operation in an iteration. For each line in the foldRight sequence, the outer-most multiplication can’t be done until the inner-most multiplications all complete, so the operation isn’t tail recursive.

In the following script, the first line prints 1784293664, while the second line causes a stack overflow.

// code-examples/FP/datastructs/reduceleftright-script.scala

println((1 to 1000000) reduceLeft(_ + _))
println((1 to 1000000) reduceRight(_ + _))

So why have both kinds of recursion? If you’re not worried about overflow, a right recursion might be the most natural fit for the operation you are doing. Recall that when we used foldLeft to map integers to strings, we had to reverse the result. That was easy enough to do in that case, but in general, the result of a left recursion might not always be easy to convert to the right form.

Functional Options

You’ll find the functional operations we’ve explored throughout the Scala library, and not exclusively on collection classes. The always handy Option container supports filter, map, flatMap, and other functionally-oriented methods that are only applied if the Option isn’t empty (that is, if it’s a Some and not a None).

Let’s see this in practice.

// code-examples/FP/datastructs/option-script.scala

val someNumber = Some(5)
val noneNumber = None

for (option <- List(noneNumber, someNumber)) {
  option.map(n => println(n * 5))
}

In this example, we attempt to multiply the contents of two Options by five. Normally, trying to multiply a null value would result in an error. But because the implementation of map on Option only applies the passed-in function when it’s non-empty, we don’t have to worry about testing for the presence of a value or handling an exception when we map over the None.
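
These methods can also be chained, just as they can on collections. Here is a small sketch (findCapital is our own hypothetical helper) that avoids any explicit conditional logic.

def findCapital(state: String): Option[String] =
  Map("Alaska" -> "Juneau", "Wyoming" -> "Cheyenne") get state

// map transforms the value if present; getOrElse supplies a fallback if not.
println(findCapital("Alaska") map { _.toUpperCase } getOrElse "UNKNOWN")  // JUNEAU
println(findCapital("Ohio")   map { _.toUpperCase } getOrElse "UNKNOWN")  // UNKNOWN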

Functional operations on Options save us from extra conditional expressions or pattern matching. Pattern matching, though, is a powerful tool within the context of functional programming, as we’ll explore in the next section.

Pattern Matching

We’ve seen many examples of pattern matching throughout the book. We got our first taste in the section called “A Taste of Concurrency” in Chapter 1, Zero to Sixty: Introducing Scala, where we used pattern matching in our Actor that drew geometric shapes. We discussed pattern matching in depth in the section called “Pattern Matching” in Chapter 3, Rounding Out the Essentials.

Pattern matching is a fundamental tool in functional programming. It’s just as important as polymorphism is in object-oriented programming, although the goals of the two techniques are very different.

Pattern matching is an elegant way to decompose objects into their constituent parts for processing. On the face of it, pattern matching for this purpose seems to violate the goal of encapsulation that objects provide. Immutability, though, largely rectifies this conflict. The risk that the parts of an object might be changed outside of the control of the enclosing object is avoided.

For example, if we have a Person class that contains a list of addresses, we don’t mind exposing that list to clients if the list is immutable. They can’t unexpectedly change the list.

However, exposing constituent parts potentially couples clients to the types of those parts. We can’t change how the parts are implemented without breaking the clients. A way to minimize this risk is to expose the lowest-level abstractions possible. When clients access a person’s addresses, do they really need to know that they are stored in a List or is it sufficient to know that they are stored in an Iterable or Seq? If so, then we can change the implementation of the addresses as long as they still support those abstractions. Of course, we’ve known for a long time in object-oriented programming that you should only couple to abstractions, not concrete details (see e.g., [Martin2003]).
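
Here is a hypothetical sketch of that idea (the Person and Address types are ours). The addresses are stored in a List internally but exposed only as a Seq, so the representation can change without breaking clients.

case class Address(street: String, city: String)

class Person(val name: String, addressList: List[Address]) {
  // Clients see the Seq abstraction; the concrete List is an implementation detail.
  def addresses: Seq[Address] = addressList
}

val dean = new Person("Dean", List(Address("1 Main St.", "Anytown")))
dean.addresses foreach { a => println(a.city) }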

Functional pattern matching and object-oriented polymorphism are powerful complements to each other. We saw this in the Actor example in the section called “A Taste of Concurrency”, where we matched on the Shape abstraction, but called the polymorphic draw operation.

Partial Functions

You’ve seen partially applied functions throughout this book. When you’ve seen an underscore passed to a method, you’ve probably seen partial application at work.

A partially applied function is an expression in which not all of the arguments declared by a function are supplied; the result is a new function value that accepts the remaining arguments. In Scala, partial application is used to bundle up a function, including its parameters and return type, so that it can be assigned to a variable or passed as an argument to another function.

This is a bit confusing until we see it in practice.

// code-examples/FP/partial/partial-script.scala

def concatUpper(s1: String, s2: String): String = (s1 + " " + s2).toUpperCase

val c = concatUpper _
println(c("short", "pants"))

val c2 = concatUpper("short", _: String)
println(c2("pants"))

Calling concatUpper with an underscore (_) turns the method into a function value. In the first part of the above example, we’ve assigned a partially applied version of concatUpper to the value c. We then apply it, implicitly calling the apply method on c by passing parameters to it directly. The returned value is then printed.

In the second part, we’ve specified the first parameter to concatUpper but not the second, although we have specified the type of the second parameter. We’ve assigned this variant to a second value, c2. To produce the same output as we saw above, we need only pass in a single value when we apply c2. We’ve applied part of the function in the assignment to c2, and we “fill in the blanks” when we call c2 on the next line.

We’ve seen partially applied functions without the underscore syntax, as well.

List("short", "pants").map(println)

In this example, println is the partially applied function. It’s applied when invoked by mapping over each element in the list. Because the map operation expects a function as an argument, we don’t need to write map(println _). The trailing underscore that turns println into a function value is implied in this context.

A related, but distinct, concept is the partial function: a function that is defined only for a subset of the possible input values (its domain) and that can tell you whether a given argument lies within that domain. Every such function has the type PartialFunction. This trait defines a method orElse that takes another PartialFunction. Should the first partial function not apply, the second will be invoked.

Again, this is easier understood in practice.

// code-examples/FP/partial/orelse-script.scala

val truthier: PartialFunction[Boolean, String] = { case true => "truthful" }
val fallback: PartialFunction[Boolean, String] = { case x => "sketchy" }
val tester = truthier orElse fallback

println(tester(1 == 1))
println(tester(2 + 2 == 5))

In this example, tester is a partial function composed of two other partial functions, truthier and fallback. In the first println statement, truthier is executed because the partial function’s internal case matches. In the second, fallback is executed because the value of the expression is outside of the domain of truthier.

Blocks of case clauses, which we’ve seen throughout our exploration of Scala, are expanded internally to partial functions. These functions implement the abstract method isDefinedAt, a feature of the PartialFunction trait used to specify the boundaries of a partial function’s domain.

// code-examples/FP/partial/isdefinedat-script.scala

val pantsTest: PartialFunction[String, String] = {
  case "pants" => "yes, we have pants!"
}

println(pantsTest.isDefinedAt("pants"))
println(pantsTest.isDefinedAt("skort"))

Here, our partial function is a test for the string "pants". When we inquire as to whether the string "pants" is defined for this function, the result is true. But for the string "skort", the result is false. Were we defining our own partial function, we could provide an implementation of isDefinedAt that performs any arbitrary test for the boundaries of our function.
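
For example, here is a hedged sketch of defining such a partial function explicitly, supplying both apply and isDefinedAt ourselves rather than using case clauses.

// An explicit PartialFunction: defined only for non-negative numbers.
val squareRoot = new PartialFunction[Double, Double] {
  def isDefinedAt(x: Double) = x >= 0.0
  def apply(x: Double) = Math.sqrt(x)
}

println(squareRoot.isDefinedAt(4.0))    // true
println(squareRoot(4.0))                // 2.0
println(squareRoot.isDefinedAt(-1.0))   // false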

Currying

Just as you encountered partially applied functions before we defined them, you’ve also seen curried functions. Named after the mathematician Haskell Curry (from whom the Haskell language also gets its name), currying transforms a function that takes multiple parameters into a chain of functions, each taking a single parameter.

In Scala, curried functions are defined with multiple parameter lists, as below.

def cat(s1: String)(s2: String) = s1 + s2

Of course, we could define more than two parameters on a curried function, if we like.

We can also use the following syntax to define a curried function:

def cat(s1: String) = (s2: String) => s1 + s2

While the previous syntax is more readable, in our estimation, this alternative syntax eliminates the need for a trailing underscore when treating the curried function as a partially applied function.

Calling our curried string concatenation function looks like this in the Scala REPL.

scala> cat("foo")("bar")
res1: java.lang.String = foobar

We can also convert methods that take multiple parameters into a curried form with the Function.curried method.

scala> def cat(s1: String, s2: String) = s1 + s2
cat: (String,String)java.lang.String

scala> val curryCat = Function.curried(cat _)
curryCat: (String) => (String) => java.lang.String = <function>

scala> cat("foo", "bar") == curryCat("foo")("bar")
res2: Boolean = true

In this example, we transform a function that takes two arguments, cat, into its curried equivalent that takes multiple parameter lists. If cat had taken three parameters, its curried equivalent would take three lists of arguments, and so on. The two forms are functionally equivalent, as demonstrated by the equality test, but curryCat can now be used as the basis of a partially applied function as well.

scala> val partialCurryCat = curryCat("foo")(_)
partialCurryCat: (String) => java.lang.String = <function>

scala> partialCurryCat("bar")
res3: java.lang.String = foobar

In practice, the primary use for currying is to specialize functions for particular types of data. You can start with an extremely general case, and use the curried form of a function to narrow down to particular cases.

As a simple example of this approach, the following code provides specialized forms of a base function that handles multiplication.

def multiplier(i: Int)(factor: Int) = i * factor
val byFive = multiplier(5) _
val byTen = multiplier(10) _

We start with multiplier, which takes two parameters: an integer, and another integer to multiply the first one by. We then partially apply multiplier to create two specialized function values. Note the trailing underscores, which tell the compiler to treat the preceding expression as a function value rather than invoke it. The wildcard underscores indicate that the remaining arguments (in this example, one argument) are left unspecified.

In the Scala console, we get predictable output when calling our curried functions.

scala> byFive(2)
res4: Int = 10

scala> byTen(2)
res5: Int = 20

We’ll revisit the curry method in the section called “Function Types” in Chapter 12, The Scala Type System.

As you can see, currying and partially applied functions are closely related concepts. You may see them referred to almost interchangeably, but what’s important is their application (no pun intended).

Implicits

There are times when you have an instance of one type and you need to use it in a context where a different, but perhaps similar type is required. For the “one-off” case, you might create an instance of the required type using the state of the instance you already have. However, for the general case, if there are many such occurrences in the code, you would rather have an automated conversion mechanism.

A similar problem occurs when you call one or more functions repeatedly and have to pass the same value to all the invocations. You might like a way of specifying a default value for that parameter, so it is not necessary to specify it explicitly all the time.

The Scala keyword implicit can be used to support both needs.

Implicit Conversions

Consider the following code fragment.

val name: String = "scala"
println(name.capitalize.reverse)

It prints the following.

alacS

We saw in the section called “The Predef Object” in Chapter 7, The Scala Object System that Predef defines the String type to be java.lang.String, yet the methods capitalize and reverse aren’t defined on java.lang.String. How did this code work?

The Scala library defines a “wrapper” class called scala.runtime.RichString that has these methods, and the compiler converts the name string to it implicitly, using a special method defined in Predef called stringWrapper.

implicit def stringWrapper(x: String) = new runtime.RichString(x)

The implicit keyword tells the compiler it can use this method for an “implicit” conversion from a String to a RichString, whenever the latter is required. The compiler detected an attempt to call a capitalize method and it determined that RichString has such a method. Then it looked within the current scope for an implicit method that converts String to RichString, finding stringWrapper.

As we’ll see in the section called “Views and View Bounds” in Chapter 12, The Scala Type System, these conversion methods are sometimes called views, in the sense that our stringWrapper conversion provides a view from String to RichString.

Predef defines many other implicit conversion methods, most of which follow the naming convention old2New, where old is the type of object available and New is the desired type. However, there is no restriction on the names of conversion methods. There are also a number of other “Rich” wrapper classes defined in the scala.runtime package.

Here is a summary of the lookup rules used by the compiler to find and apply conversion methods. For more details, see [ScalaSpec2009].

  1. No conversion will be attempted if the object and method combination type check successfully.
  2. Only methods with the implicit keyword are considered.
  3. Only implicit methods in the current scope are considered, as well as implicit methods defined in the companion object of the target type.
  4. Implicit methods aren’t chained to get from the available type, through intermediate types, to the target type. Only a method that takes a single available type instance and returns a target type instance will be considered.
  5. No conversion is attempted if more than one possible conversion method could be applied. There must be one and only one possibility.

What if you can’t define a conversion method in a companion object, to satisfy the third rule, perhaps because you can’t modify or create the companion object? In this case, define the method somewhere else and import it. Normally, you will define an object with just the conversion method(s) needed. Here is an example.

// code-examples/FP/implicits/implicit-conversion-script.scala
import scala.runtime.RichString

class FancyString(val str: String)

object FancyString2RichString {
    implicit def fancyString2RichString(fs: FancyString) =
        new RichString(fs.str)
}

import FancyString2RichString._

val fs = new FancyString("scala")
println(fs.capitalize.reverse)

We can’t modify RichString or Predef to add an implicit conversion method for our custom FancyString class. Instead, we define an object named FancyString2RichString and define the conversion method in it. We then import the contents of this object and the converter gets invoked implicitly in the last line. The output of this script is the following.

alacS

This pattern for effectively adding new methods to classes has been called pimp my library [Odersky2006].

Implicit Function Parameters

We saw in Chapter 2, Type Less, Do More that Scala version 2.8 adds support for default argument values, like you find in other languages like Ruby and C++. There are two other ways to achieve the same effect in all versions of Scala. The first is to use function currying, as we have seen. The second way is to define implicit values, using the implicit keyword.

Let’s examine how implicit values work.

// code-examples/FP/implicits/implicit-parameter-script.scala

def multiplier(i: Int)(implicit factor: Int) {
  println(i * factor)
}

implicit val factor = 2

multiplier(2)
multiplier(2)(3)

Our multiplier takes two parameter lists. The latter includes an integer value, factor, marked implicit. This keyword tells the compiler to use any argument explicitly supplied for factor or, failing that, to look for an appropriate implicit value in the surrounding scope.

We’ve defined our own factor value in scope, and that value is used in the first call to multiplier. In the second call, we’re explicitly passing in a value for factor, and it overrides the value in the surrounding scope.

Essentially, implicit function parameters behave as parameters with a default value, with the key difference being that the value comes from the surrounding scope. Had our factor value resided in a class or object, we would have had to import it into the local scope. If the compiler can’t determine the value to use for an implicit parameter, an error of "no implicit argument matching parameter" will occur.
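
For example, here is a small sketch (the Defaults object is our own) showing an implicit value defined elsewhere and imported into the local scope before use.

object Defaults {
  implicit val factor = 2
}

def multiplier(i: Int)(implicit factor: Int) = i * factor

import Defaults.factor       // bring the implicit value into scope

println(multiplier(2))       // 4: uses the imported implicit factor
println(multiplier(2)(3))    // 6: an explicit argument overrides the implicit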

Final Thoughts on Implicits

Implicits can be perilously close to “magic”. When used excessively, they obfuscate the code’s behavior for the reader. Also, be careful about the implementation of a conversion method, especially if the return type is not explicitly declared. If a future change to the method also changes the return type in some subtle way, the conversion may suddenly fail to work. In general, implicits can cause mysterious behavior that is hard to debug!

When deciding how to implement “default” values for method arguments, a major advantage of using default argument values (in Scala version 2.8) is that the method maintainer decides what to use as the default value. The implementation is more straightforward and you avoid the “magic” of implicit methods. However, a disadvantage of using default argument values is that it might be desirable to use a different “default” value based on the context in which the method is being called. Scala version 2.8 provides some flexibility, as you can use an expression for an argument, not just a constant value. However, that flexibility might not be enough, in which case implicits are a very flexible and powerful alternative.

Tip

Use implicits sparingly and cautiously. Also, consider adding an explicit return type to “nontrivial” conversion methods.

Call by Name, Call by Value

Typically, parameters to functions are by-value parameters; that is, the value of the parameter is determined before it is passed to the function. In most circumstances, this is the behavior we want and expect.

But what if we need to write a function that accepts as a parameter an expression that we don’t want evaluated until it’s called within our function? For this circumstance, Scala offers by-name parameters.

A by-name parameter is specified by omitting the parentheses that normally accompany a function parameter, as below.

def myCallByNameFunction(callByNameParameter: => ReturnType)

Without this syntactic shortcut, this method definition would look like the following:

def myCallByNameFunction(callByNameParameter: () => ReturnType)

And what’s more, we would have to include those unsightly, empty parentheses in every call to that method. Use of by-name parameters removes that requirement.

We can use by-name parameters to implement powerful looping constructs, amongst other things. Let’s go crazy and implement our own while loop, throwing currying into the mix.

// code-examples/FP/overrides/call-by-name-script.scala

def whileAwesome(conditional: => Boolean)(f: => Unit) {
  if (conditional) {
    f
    whileAwesome(conditional)(f)
  }
}

var count = 0

whileAwesome(count < 5) {
  println("still awesome")
  count += 1
}

What would happen if we removed the arrow between conditional: and Boolean? The expression count < 5 would be evaluated to true before being passed into our custom while loop, and the message "still awesome" would be printed to the console indefinitely. By delaying evaluation until conditional is called inside our function with a by-name parameter, we get the behavior we expect.

Lazy Vals

In the section called “Overriding Abstract and Concrete Fields in Traits” in Chapter 6, Advanced Object-Oriented Programming In Scala, we showed several scenarios where the order of initialization for fields in override scenarios can be problematic. We discussed one solution, pre-initialized fields. Now we discuss the other solution we mentioned previously, lazy vals.

Here is that example rewritten with a lazy val.

// code-examples/FP/overrides/trait-lazy-init-val-script.scala

trait AbstractT2 {
  println("In AbstractT2:")
  val value: Int
  lazy val inverse = { println("initializing inverse:"); 1.0/value }
  //println("AbstractT2: value = "+value+", inverse = "+inverse)
}

val c2d = new AbstractT2 {
  println("In c2d:")
  val value = 10
}

println("Using c2d:")
println("c2d.value = "+c2d.value+", inverse = "+c2d.inverse)

This is the output of the script.

In AbstractT2:
In c2d:
Using c2d:
initializing inverse:
c2d.value = 10, inverse = 0.1

As before, we are using an anonymous inner class that implicitly extends the trait. The body of the class, which initializes value, is evaluated after the trait’s body. However, note that inverse is declared lazy, which means that the right-hand side will be evaluated only when inverse is actually used. In this case, that happens in the last println statement. Only then is inverse initialized, using value, which is properly initialized at this point.

Try uncommenting the println statement at the end of the AbstractT2 body. What happens now?

In AbstractT2:
initializing inverse:
AbstractT2: value = 0, inverse = Infinity
In c2d:
Using c2d:
c2d.value = 10, inverse = Infinity

This println forces inverse to be evaluated inside the body of AbstractT2, before value is initialized by the class body, thereby reproducing the problem we had before.

This example raises an important point: if other val’s use the lazy val in the same class or trait body, they should be declared lazy, too. Also, watch out for function calls in the body that use the lazy val.

Tip

If a val is lazy, make sure all uses of the val are also lazy!

So, how is a lazy val different from a method call? In a method call, the body is executed every time the method is invoked. For a lazy val, the initialization “body” is evaluated only once, when the variable is used for the first time. This one-time evaluation makes little sense for a mutable field. Therefore, the lazy keyword is not allowed on var's. (They can’t really make use of it anyway.)

You can also use lazy val's to avoid costly initializations that you may not actually need and to defer initializations that slow down application startup. They work well in constructors, where it’s clear to other programmers that all the one-time heavy lifting for initializing an instance is done in one place.
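
A small sketch of that use case: the hypothetical “connection” below isn’t initialized until (and unless) it is actually used, and then only once.

class DataService {
  // Expensive setup, deferred until the first use of connection.
  lazy val connection = {
    println("Opening connection (expensive)...")
    "connection-handle"   // a stand-in for a real resource
  }

  def ping = println("Using " + connection)
}

val service = new DataService   // nothing is printed yet
service.ping                    // "Opening connection..." appears here, once
service.ping                    // reuses the already-initialized value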

Another use for “laziness” is to manage potentially infinite data structures where only a manageable subset of the data will actually be used. In fact, mathematical notation is inherently lazy. When we write the Fibonacci sequence, for example, we might write it as an infinite sequence, something like this.

Fib = 1, 1, 2, 3, 5, 8, ...

Some pure functional languages are lazy by default, so they mimic this behavior as closely as possible. This can work without exhausting resources if the user never tries to use more than a finite subset of these values. Scala is not lazy by default, but it does offer support for working with infinite data structures. We’ll address this topic in the section called “Infinite Data Structures and Laziness” in Chapter 12, The Scala Type System.
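
Scala’s Stream type already offers a taste of this. Here is a sketch of an “infinite” Fibonacci sequence in which elements are computed only when a finite prefix is actually demanded (fibFrom is our own helper name).

// Each cell of the Stream is computed lazily, so the recursion never runs away.
def fibFrom(a: BigInt, b: BigInt): Stream[BigInt] =
  Stream.cons(a, fibFrom(b, a + b))

val fibs = fibFrom(BigInt(1), BigInt(1))
println(fibs.take(10).toList)   // List(1, 1, 2, 3, 5, 8, 13, 21, 34, 55)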

Recap: Functional Component Abstractions

When object-oriented programming went mainstream in the late 80’s and early 90’s, there was great hope that it would usher in an era of reusable software components. It didn’t really work out that way, except in some rare cases, like the windowing API’s of various platforms.

Why did this not happen? There are certainly many reasons, but a likely culprit is that no simple source-level or binary interoperability protocol ever materialized to glue these components together. The richness of object APIs was the very factor that undermined componentization.

Component models that have succeeded are all based on very simple foundations. Integrated circuits (ICs) in electronics plug into buses with 2^n signaling wires that are boolean, either on or off. From that very simple protocol, the most explosive growth of any industry in human history was born.

HTTP is another good example. With a handful of message types and a very simple standard for message content, it set the stage for the Internet revolution. RESTful web services built on top of HTTP are also proving successful as components, but they are just complex enough that care is required to ensure that they work successfully.

So, is there hope for a binary or source-level component model? It probably won’t be object-oriented, as we’ve seen. Rather, it could be more functional.

Components should interoperate by exchanging a few immutable data structures, e.g., lists and maps, that carry both data and “commands”. Such a component model would have the simplicity necessary for success, but the richness required to perform real work. Notice how that sounds a lot like HTTP and REST.

In fact, the Actor model has many of these qualities, as we’ll explore in the next chapter.

Chapter 9. Robust, Scalable Concurrency with Actors

The Problems of Shared, Synchronized State

Concurrency isn’t easy. Getting a program to do more than one thing at a time has traditionally meant hassling with mutexes, race conditions, lock contention, and the rest of the unpleasant baggage that comes along with multithreading. Event-based concurrency models alleviate some of these concerns, but can turn large programs into a rat’s nest of callback functions. No wonder, then, that concurrent programming is a task most programmers dread, or avoid altogether by retreating to multiple independent processes that share data externally (for example, through a database or message queue).

A large part of the difficulty of concurrent programming comes down to state: how do you know what your multithreaded program is doing, and when? What value does a particular variable hold when you have two threads running, or five, or fifty? How can you guarantee that your program’s many tendrils aren’t clobbering one another in a race to take action? A thread-based concurrency paradigm poses more questions than it answers.

Thankfully, Scala offers a reasonable, flexible approach to concurrency that we’ll explore in this chapter.

Actors

Though you may have heard of Scala and Actors in the same breath, Actors aren’t a concept unique to Scala. Actors, originally intended for use in Artificial Intelligence research, were first put forth in 1973 (see [Hewitt1973] and [Agha1987]). Since then, variations on the idea of Actors have appeared in a number of programming languages, most notably in Erlang and Io. As an abstraction, Actors are general enough that they can be implemented as a library (as in Scala), or as the fundamental unit of a computational system.

Actors in Abstract

Fundamentally, an Actor is an object that receives messages and takes action on those messages. The order in which messages arrive is unimportant to an Actor, though some Actor implementations (such as Scala’s) queue messages in order. An Actor might handle a message internally, or it might send a message to another Actor, or it might create another Actor to take action based on the message. Actors are a very high-level abstraction.

Unlike traditional object systems (which, you might be thinking to yourself, have many of the same properties we’ve described), Actors don’t enforce a sequence or ordering to their actions. This inherent eschewing of sequentiality, coupled with independence from shared global state, allows Actors to do their work in parallel. As we’ll see later on, the judicious use of immutable data fits the Actor model ideally, and further aids in safe, comprehensible concurrent programming.

Enough theory. Let’s see Actors in action.

Actors in Scala

At their most basic, Actors in Scala are objects that inherit from scala.actors.Actor.

// code-examples/Concurrency/simple-actor-script.scala

import scala.actors.Actor

class Redford extends Actor {
  def act() {
    println("A lot of what acting is, is paying attention.")
  }
}

val robert = new Redford
robert.start

As we can see, an Actor defined in this way must be both instantiated and started, similar to how threads are handled in Java. It must also implement the abstract method act, which returns Unit. Once we’ve started this simple Actor, the following sage advice for thespians is printed to the console.

A lot of what acting is, is paying attention.

The Actor companion object provides a factory method, actor, for creating Actors that avoids much of the setup in the above example. We can import this method and other convenience methods with import scala.actors.Actor._. Here is a factory-made Actor.

// code-examples/Concurrency/factory-actor-script.scala

import scala.actors.Actor
import scala.actors.Actor._

val paulNewman = actor {
  println("To be an actor, you have to be a child.")
}

While a subclass that extends the Actor class must define act in order to be concrete, a factory-produced Actor has no such limitation. In this shorter example, the code block passed to actor effectively becomes the act method from our first example. Predictably, this Actor also prints a message when run. Illuminating, but we still haven’t shown the essential piece of the Actors puzzle: sending messages.

Sending Messages to Actors

Actors can receive any sort of object as a message, from strings of text to numeric types to whatever classes you’ve cooked up in your programs. For this reason, Actors and pattern matching go hand in hand. An Actor should only act on messages of familiar types; a pattern match on the class and/or contents of a message is good defensive programming, and increases the readability of Actor code.

// code-examples/Concurrency/pattern-match-actor-script.scala

import scala.actors.Actor
import scala.actors.Actor._

val fussyActor = actor {
  loop {
    receive {
      case s: String => println("I got a String: " + s)
      case i: Int => println("I got an Int: " + i.toString)
      case _ => println("I have no idea what I just got.")
    }
  }
}

fussyActor ! "hi there"
fussyActor ! 23
fussyActor ! 3.33

This example prints the following when run.

I got a String: hi there
I got an Int: 23
I have no idea what I just got.

The body of fussyActor is a receive method wrapped in a loop. loop is essentially a nice shortcut for while(true); it does whatever is inside its block repeatedly. receive blocks until it gets a message of a type that will satisfy one of its internal pattern matching cases.

The final lines of this example demonstrate use of the ! (exclamation point, or bang) method to send messages to our Actor. If you’ve ever seen Actors in Erlang, you’ll find this syntax familiar. The Actor is always on the left-hand side of the bang, and the message being sent to said Actor is always on the right. If you need a mnemonic for this granule of syntactic sugar, imagine that you’re an irate director shouting commands at your Actors.

The Mailbox

Every Actor has a mailbox in which messages sent to that Actor are queued. Let’s see an example where we inspect the size of an Actor’s mailbox.

// code-examples/Concurrency/actor-mailbox-script.scala

import scala.actors.Actor
import scala.actors.Actor._

val countActor = actor {
  loop {
    react {
      case "how many?" => {
        println("I've got " + mailboxSize.toString + " messages in my mailbox.")
      }
    }
  }
}

countActor ! 1
countActor ! 2
countActor ! 3
countActor ! "how many?"
countActor ! "how many?"
countActor ! 4
countActor ! "how many?"

This example produces the following output.

I've got 3 messages in my mailbox.
I've got 3 messages in my mailbox.
I've got 4 messages in my mailbox.

Note that the first and second lines of output are identical. Because our Actor was set up solely to process messages of the string “how many?”, those messages didn’t remain in its mailbox. Only the messages of types we didn’t know about - in this case, Int - remained unprocessed.

Tip

If you see an Actor’s mailbox size ballooning unexpectedly, you’re probably sending messages of a type that the Actor doesn’t know about. Include a catchall case (_) when pattern matching received messages to find out what’s harassing your Actors.
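
As a hedged illustration of that tip (not one of the book’s code examples), the following catchall reports both the unrecognized message and the current mailbox size, making a runaway mailbox easy to diagnose.

import scala.actors.Actor._

val diagnosedActor = actor {
  loop {
    react {
      case "expected" => println("got an expected message")
      case unexpected =>
        // Report what we received and how many messages are still queued.
        println("unexpected message: " + unexpected +
          " (mailbox size is now " + mailboxSize + ")")
    }
  }
}

diagnosedActor ! 42
diagnosedActor ! "expected"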

Actors in Depth

Now that we’ve got a basic sense of what Actors are and how they’re used in Scala, let’s put them to work. Specifically, let’s put them to work cutting hair. The sleeping barber problem ([SleepingBarberProblem]) is one of a popular set of computer science hypotheticals designed to demonstrate issues of concurrency and synchronization.

The problem is this: a hypothetical barber shop has just one barber with one barber chair, and three chairs in which customers may wait for a haircut. Without customers around, the barber sleeps. When a customer arrives, the barber wakes up to cut his hair. If the barber is busy cutting hair when a customer arrives, the customer sits down in an available chair. If a chair isn’t available, the customer leaves.

The sleeping barber problem is usually solved with semaphores and mutexes, but we’ve got better tools at our disposal. Straightaway, we see several things to model as Actors: the barber is clearly one, as are the customers. The barbershop itself could be modeled as an Actor, too; there need not be a real-world parallel to verbal communication in an Actor system, even though we’re sending messages.

Let’s start with the sleeping barber’s customers, as they have the simplest responsibilities.

// code-examples/Concurrency/sleepingbarber/customer.scala

package sleepingbarber

import scala.actors.Actor
import scala.actors.Actor._

case object Haircut

class Customer(val id: Int) extends Actor {
  var shorn = false

  def act() = {
    loop {
      react {
        case Haircut => {
          shorn = true
          println("[c] customer " + id + " got a haircut")
        }
      }
    }
  }
}

For the most part, this should look pretty familiar: we declare the package in which this code lives, we import code from the scala.actors package, and we define a class that extends Actor. There are a few details worth noting, however.

First of all, there’s our declaration of case object Haircut. A common pattern when working with Actors in Scala is to use a case object to represent a message without internal data. If we wanted to include, say, the time at which the haircut was completed, we’d use a case class instead. We declare Haircut here because it’s a message type that will be sent solely to customers.
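
For instance, a message that carries data might look like the following hypothetical variant, which is not used in the book’s example.

// A case class lets the message carry data, such as when the haircut finished.
case class HaircutCompleted(at: Long)

// A customer could then extract the timestamp in its pattern match:
//   case HaircutCompleted(time) => println("finished at " + time)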

Note as well that we’re storing one bit of mutable state in each Customer: whether or not they’ve gotten a haircut. In its internal loop, each Customer waits for a Haircut message and, upon receiving one, sets its shorn flag to true. Customer uses the event-based react method to respond to incoming messages. If we needed the result of processing a message, we would use receive instead; since we don’t, react saves us some memory and thread use under the hood.

Let’s move on to the barber himself. Because there’s only one barber, we could have used the actor factory method technique mentioned above to create him. For testing purposes, we’ve instead defined our own Barber class.

// code-examples/Concurrency/sleepingbarber/barber.scala

package sleepingbarber

import scala.actors.Actor
import scala.actors.Actor._
import scala.util.Random

class Barber extends Actor {
  private val random = new Random()

  def helpCustomer(customer: Customer) {
    if (self.mailboxSize >= 3) {
      println("[b] not enough seats, turning customer " + customer.id + " away")
    } else {
      println("[b] cutting hair of customer " + customer.id)
      Thread.sleep(100 + random.nextInt(400))
      customer ! Haircut
    }
  }

  def act() {
    loop {
      react {
        case customer: Customer => helpCustomer(customer)
      }
    }
  }
}

The core of the Barber class looks very much like the Customer. We loop around react, waiting for a particular type of object. To keep that loop tight and readable, we call a method, helpCustomer, when a new Customer is sent to the barber. Within that method we employ a check on the mailbox size to serve as our “chairs” that customers may occupy; we could have the Barber or Shop classes maintain an internal Queue, but why bother when each actor’s mailbox already is one?

If three or more customers are already waiting in the barber’s mailbox, we simply turn the current customer away; react has already removed his message from the mailbox, so nothing more needs to be done. Otherwise, we simulate a semi-random delay (always at least 100 milliseconds) for the time it takes to cut a customer’s hair, then send off a Haircut message to that customer. (Were we not trying to simulate a real-world scenario, we would of course remove the call to Thread.sleep() and allow our barber to run full tilt.)

Next up, we have a simple class to represent the barbershop itself.

// code-examples/Concurrency/sleepingbarber/shop.scala

package sleepingbarber

import scala.actors.Actor
import scala.actors.Actor._

class Shop extends Actor {
  val barber = new Barber()
  barber.start

  def act() {
    println("[s] the shop is open")

    loop {
      react {
        case customer: Customer => barber ! customer
      }
    }
  }
}

By now, this should all look very familiar. Each Shop creates and starts a new Barber, prints a message telling the world that the shop is open, and sits in a loop waiting for customers. When a Customer comes in, he’s sent to the barber. We now see an unexpected benefit of Actors: they allow us to describe concurrent business logic in easily understood terms. “Send the customer to the barber” makes perfect sense, much more so than “notify the barber, unlock the mutex around the customer seats, increment the number of free seats,” and so forth. Actors get us closer to our domain.

Finally, we have a driver for our simulation.

// code-examples/Concurrency/sleepingbarber/barbershop-simulator.scala

package sleepingbarber

import scala.actors.Actor._
import scala.collection.{immutable, mutable}
import scala.util.Random

object BarbershopSimulator {
  private val random = new Random()
  private val customers = new mutable.ArrayBuffer[Customer]()
  private val shop = new Shop()

  def generateCustomers {
    for (i <- 1 to 20) {
      val customer = new Customer(i)
      customer.start()
      customers += customer
    }

    println("[!] generated " + customers.size + " customers")
  }

  // customers arrive at random intervals
  def trickleCustomers {
    for (customer <- customers) {
      shop ! customer
      Thread.sleep(random.nextInt(450))
    }
  }

  def tallyCuts {
    // wait for any remaining concurrent actions to complete
    Thread.sleep(2000)

    val shornCount = customers.filter(c => c.shorn).size
    println("[!] " + shornCount + " customers got haircuts today")
  }

  def main(args: Array[String]) {
    println("[!] starting barbershop simulation")
    shop.start()

    generateCustomers
    trickleCustomers
    tallyCuts

    System.exit(0)
  }
}

After “opening the shop”, we generate a number of Customer objects, assigning a numeric ID to each and storing the lot in an ArrayBuffer. Next, we “trickle” the customers in by sending them as messages to the shop and sleeping for a semi-random amount of time between loops. At the end of our simulated day, we tally up the number of customers who got haircuts by selecting the customers whose internal shorn flag was set to true and asking for the size of the resulting sequence.

Compile and run the code within the sleepingbarber directory as follows:

fsc *.scala
scala -classpath . sleepingbarber.BarbershopSimulator

Throughout our code, we’ve prefixed console messages with abbreviations for the classes from which the messages were printed. When we look at an example run of our simulator, it’s easy to see where each message came from.

[!] starting barbershop simulation
[s] the shop is open
[!] generated 20 customers
[b] cutting hair of customer 1
[b] cutting hair of customer 2
[c] customer 1 got a haircut
[c] customer 2 got a haircut
[b] cutting hair of customer 3
[c] customer 3 got a haircut
[b] cutting hair of customer 4
[b] cutting hair of customer 5
[c] customer 4 got a haircut
[b] cutting hair of customer 6
[c] customer 5 got a haircut
[b] cutting hair of customer 7
[c] customer 6 got a haircut
[b] not enough seats, turning customer 8 away
[b] cutting hair of customer 9
[c] customer 7 got a haircut
[b] not enough seats, turning customer 10 away
[c] customer 9 got a haircut
[b] cutting hair of customer 11
[b] cutting hair of customer 12
[c] customer 11 got a haircut
[b] cutting hair of customer 13
[c] customer 12 got a haircut
[b] cutting hair of customer 14
[c] customer 13 got a haircut
[b] not enough seats, turning customer 15 away
[b] not enough seats, turning customer 16 away
[b] not enough seats, turning customer 17 away
[b] cutting hair of customer 18
[c] customer 14 got a haircut
[b] cutting hair of customer 19
[c] customer 18 got a haircut
[b] cutting hair of customer 20
[c] customer 19 got a haircut
[c] customer 20 got a haircut
[!] 15 customers got haircuts today

You’ll find that each run’s output is, predictably, slightly different. Every time the barber takes a bit longer to cut hair than it does for several customers to enter, the “chairs” (the barber’s mailbox queue) fill up, and new customers simply leave.

Of course, we have to include the standard caveats that come with simple examples. For one, it’s possible that our example may not be suitably random, particularly if random values are retrieved within a millisecond of one another. This is a byproduct of the way the JVM generates random numbers, and a good reminder to be careful about randomness in concurrent programs. You’d also want to replace the sleep inside tallyCuts with a clearer signal that the various Actors in the system are done doing their work, perhaps by making BarbershopSimulator an Actor and sending it messages that indicate completion.
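
As a hedged sketch of that last idea (not part of the book’s code), a hypothetical coordinator Actor could count completion messages and report when every customer has been accounted for, removing the need for an arbitrary sleep.

import scala.actors.Actor
import scala.actors.Actor._

// Hypothetical message sent when a customer has been fully handled.
case class Done(id: Int)

class Coordinator(expected: Int) extends Actor {
  private var finished = 0

  def act() {
    loop {
      react {
        case Done(id) =>
          finished += 1
          if (finished == expected) {
            println("[!] all " + expected + " customers accounted for")
            exit() // stop looping once everyone has reported in
          }
      }
    }
  }
}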

Try modifying the code to introduce more customers, additional message types, different delays, or to remove the randomness altogether. If you’re an experienced multithreaded programmer, you might try writing your own sleeping barber implementation just to compare and contrast. We’re willing to bet that an implementation in Scala with Actors will be terser and easier to maintain.

Effective Actors

In order to get the most out of Actors, there are a few things to remember. First off, note that there are several methods you can use to get different types of behavior out of your Actors. The table below should help clarify when to use each method.

Table 9.1. Actor Methods

Method          Returns                        Description
act             Unit                           Abstract, top-level method for an Actor. Typically contains one of the following methods inside it.
receive         Result of processing message   Blocks until a message of matched type is received.
receiveWithin   Result of processing message   Like receive but unblocks after specified number of milliseconds.
react           Nothing                        Requires less overhead (threads) than receive.
reactWithin     Nothing                        Like react but unblocks after specified number of milliseconds.


Typically, you’ll want to use react wherever possible. If you need the results of processing a message (that is, you need a synchronous response from sending a message to an Actor), use the receiveWithin variant to reduce your chances of blocking indefinitely on an Actor that’s gotten wedged.
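
As a hedged example of that advice (not from the book’s code examples), receiveWithin delivers the special scala.actors.TIMEOUT message when the time limit expires, so you can match on it and move on instead of blocking forever.

import scala.actors.Actor._
import scala.actors.TIMEOUT

// Wait at most one second for a message; give up gracefully if none arrives.
val impatientActor = actor {
  receiveWithin(1000) {
    case TIMEOUT => println("timed out; giving up")
    case msg     => println("got a message: " + msg)
  }
}

impatientActor ! "just in time" // normally arrives well within the window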

Another strategy to keep your Actor-based code asynchronous is the use of futures. A future is a placeholder object for a value that hasn’t yet been returned from an asynchronous process. You can send a message to an Actor with the !! method, which immediately returns a future for the reply; a variant of this method lets you pass along a partial function that is applied to the future value. As you can see from the example below, retrieving a value from a Future is as straightforward as invoking its apply method. Note that retrieving a value from a Future is a blocking operation.

// code-examples/Concurrency/future-script.scala

import scala.actors.Futures._

val eventually = future(5 * 42)
println(eventually())
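
For completeness, here is a hedged sketch (not one of the book’s code examples) of the !! method itself. Sending a message with !! immediately returns a Future for the Actor’s eventual reply; applying the Future blocks until that reply arrives. The doubler Actor below is made up for illustration.

import scala.actors.Actor._

// A made-up actor that replies with twice whatever Int it receives.
val doubler = actor {
  loop {
    react {
      case n: Int => reply(n * 2)
    }
  }
}

val answer = doubler !! 21 // a Future for the eventual reply
println(answer())          // blocks until the reply arrives, then prints 42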

Each Actor in your system should have clear responsibilities. Don’t use Actors for general-purpose, highly stateful tasks. Instead, think like a director: what are the distinct roles in the “script” of your application, and what’s the least amount of information each Actor needs to do its job? Give each Actor just a couple of responsibilities, and use messages (usually in the form of a case class or case object) to delegate those responsibilities to other Actors.

Don’t be hesitant to copy data when writing Actor-centric code. The more immutable your design, the less likely you are to end up with unexpected state. The more you communicate via messages, the less you have to worry about synchronization. All that message passing and copying of data might appear overly costly, but with today’s plentiful hardware, trading memory overhead for clarity and predictability seems more than fair for most applications.

Lastly, know when Actors aren’t appropriate. Just because Actors are a great way to handle concurrency in Scala doesn’t mean they’re the only way, as we’ll see below. Traditional threading and locking may better suit write-heavy critical paths for which a messaging approach would incur too much overhead. In our experience, you can use a purely Actor-based design to prototype a concurrent solution, then use profiling tools to suss out parts of your application that might benefit from a different approach.

Traditional Concurrency in Scala: Threading and Events

While Actors are a great way to handle concurrent operations, they’re not the only way to do so in Scala. As Scala is interoperable with Java, the concurrency concepts that you may be familiar with on the JVM still apply.

One-Off Threads

For starters, you can run a block of code in a new thread by creating an anonymous subclass of java.lang.Thread, overriding its run method, and starting it.

// code-examples/Concurrency/threads/by-block-script.scala

// Override run and call start so the block actually executes on a new thread.
new Thread { override def run() { println("this will run in a new thread") } }.start()

A similar construct is available in the scala.concurrent package: the spawn method on the ops object runs a block asynchronously in its own thread.

// code-examples/Concurrency/threads/spawn-script.scala

import scala.concurrent.ops._

object SpawnExample {
  def main(args: Array[String]) {
    println("this will run synchronously")

    spawn {
      println("this will run asychronously")
    }
  }
}

Using java.util.concurrent

If you’re familiar with the venerable java.util.concurrent package, you’ll find it just as easy to use from Scala (or just as hard, depending on your point of view). Let’s use Executors to create a pool of threads. We’ll use the thread pool to run a simple class that implements Java’s Runnable interface, the standard way to mark a class as thread-friendly, and identifies which thread it’s running on.

// code-examples/Concurrency/threads/util-concurrent-script.scala

import java.util.concurrent._

class ThreadIdentifier extends Runnable {
  def run {
    println("hello from Thread " + currentThread.getId)
  }
}

val pool = Executors.newFixedThreadPool(5)

for (i <- 1 to 10) {
  pool.execute(new ThreadI