Posts

In “Why you should know Monix” I took a brief look at some of Monix’s abstractions and utilities, but I didn’t dive into implementing reactive streams elements. This time I’m going to build a Consumer and an Observer for the RabbitMQ message broker.

If you have been a Scala developer for some time, you are probably familiar with the concept of Lenses. It has gained a lot of traction in the community because it solves the very common problem of modifying deeply nested case classes. What is less widely known is that there are more abstractions of this kind. They are usually referred to as Optics.

In this post I will try to present some of them and give some intuition about their possible applications. This article focuses on applications rather than on mathematical foundations. Moreover, it attempts to highlight that the idea of Optics goes much, much further than the manipulation of nested records.

Why Monix

In this short blog post – in just 10 minutes or less – I’m going to attempt to present what the Monix library is and try to convince you why you really need to get to know it.

Formerly known as Monifu, Monix is a library for asynchronous programming in Scala and Scala.js.

It contains several useful abstractions and sometimes can be found superior to its vanilla Scala or Akka counterparts. But this post is definitely not going to be about knocking the use of Akka actors or streams. Rather, it’s about another tool in the Scala programmer’s box. I am going to be presenting some of the abstractions that Monix gives and conclude why they are invaluable.

In this post, we will look at how primitive Scala types such as Int and Long are represented down to the bytecode level. This will help us understand what the performance effects of using them in generic classes are. We will also explore the functionalities that the Scala compiler provides us for mitigating such performance penalties.

Furthermore, we will take a look at concrete benchmark results and convince ourselves that boxing/unboxing can have a significant effect on the latency of an application.

In this post I will try to present what GraphStage in Akka Streams is. My goal is to describe when it’s useful and how to use it correctly. I will start by outlining the key terminology, then proceed with a simple example, and after that cover the main use case. For the latter, the most upvoted issue of akka-http will serve.

At the end, I will show how to properly test a GraphStage. Besides learning the API, you’ll gain a deeper understanding of how backpressure works.

Type classes in Scala

Type classes are a powerful and flexible concept that adds ad-hoc polymorphism to Scala. They are not a first-class citizen in the language, but other built-in mechanisms allow us to write them in Scala. This is the reason why they are not so obvious to spot in code, and why there can be some confusion over what the ‘correct’ way of writing them is.

This blog post summarizes the idea behind type classes, how they work and the way of coding them in Scala.

Idea

Type classes were introduced first in Haskell as a new approach to ad-hoc polymorphism. Philip Wadler and Stephen Blott described it in How to make ad-hoc polymorphism less ad hoc. Type classes in Haskell are an extension to the Hindley–Milner type system, implemented by that language.

A type class is a class (group) of types that satisfy some contract defined by a trait. Additionally, such functionality (the trait and its implementation) can be added without any changes to the original code. One could say that the same could be achieved by extending a simple trait, but with type classes, there is no need to predict such a demand beforehand.

There is no special syntax in Scala to express a type class, but the same functionality can be achieved using constructs that already exist in the language. That’s what makes it a little difficult for newcomers to spot a type class in code. A typical implementation of a type class uses some syntactic sugar as well, which also doesn’t make it clear right away what we are dealing with.

So let’s take some baby steps to implement a type class and understand it.
Implementation

Let’s write a type class that adds a function for getting the string representation of a given type. We make it possible for a given value to show itself. This is a .toString equivalent. We can start by defining a trait:
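A minimal sketch of such a trait, assuming it is called Show (the name used throughout the text) and has a single method named show; the exact signature is an assumption:

```scala
// The contract of the type class: any type A that has an instance
// of Show[A] can be turned into a String.
trait Show[A] {
  def show(a: A): String
}
```

Note that the trait is parameterised by the type it describes and is not meant to be extended by that type.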

We want to have show functionality, but defined outside of each specific type definition. Let’s start by implementing show for an Int.

We have defined a companion object for Show to add functionality there. intCanShow holds an implementation of the Show trait for Int. This is just the first step. Of course, usage is still very cumbersome; to use this function we have to:
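As a sketch of this first step (the "int " prefix in the output and the ExplicitUsage object are assumptions made for illustration), the instance and its explicit use could look like this:

```scala
trait Show[A] {
  def show(a: A): String
}

object Show {
  // Instance of the type class for Int; the "int " prefix is only illustrative.
  val intCanShow: Show[Int] = new Show[Int] {
    def show(int: Int): String = s"int $int"
  }
}

object ExplicitUsage {
  // The instance has to be named explicitly at every call site.
  val shown: String = Show.intCanShow.show(20) // "int 20"
}
```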

The full implementation containing all needed imports can be found in the repo.

The next step is to write the show function, in Show’s companion object, to avoid calling intCanShow explicitly.

The show function takes some parameter of type A and an implementation of the Show trait for that type A. Marking the intCanShow value as implicit allows the compiler to find this implementation of Show[A] when there is a call to:
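A sketch of this step, assuming the same illustrative Int instance (the ImplicitUsage object is hypothetical):

```scala
trait Show[A] {
  def show(a: A): String
}

object Show {
  // Takes the value and, implicitly, the instance of Show for its type.
  def show[A](a: A)(implicit sh: Show[A]): String = sh.show(a)

  // Marked implicit so the compiler can supply it to show automatically.
  implicit val intCanShow: Show[Int] = new Show[Int] {
    def show(int: Int): String = s"int $int"
  }
}

object ImplicitUsage {
  import Show.show

  // The compiler finds intCanShow in the companion object of Show.
  val shown: String = show(20) // "int 20"
}
```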

That is basically a type class. We’re going to transform it a little bit to make it look more like real code (all required parts are there). We have a trait that describes the functionality and implementations for each type we care about. There is also a function which applies an implicit instance’s function to the given parameter.

There is a more common way of writing the show function than declaring an explicit implicit parameter list. Instead of writing:

we can use implicitly and rewrite it to:

We also used the context bound syntax, A: Show, which is syntactic sugar in Scala, introduced mainly to support type classes. It basically does the rewrite we have done above (without the use of implicitly); more information can be found here.
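The two forms can be put side by side in a short sketch. The name showWithParam is hypothetical and exists only so that both variants fit in one object:

```scala
trait Show[A] {
  def show(a: A): String
}

object Show {
  // Variant with an explicit implicit parameter list.
  def showWithParam[A](a: A)(implicit sh: Show[A]): String = sh.show(a)

  // Variant with a context bound; implicitly fetches the unnamed instance.
  def show[A: Show](a: A): String = implicitly[Show[A]].show(a)

  implicit val intCanShow: Show[Int] = new Show[Int] {
    def show(int: Int): String = s"int $int"
  }
}

object ContextBoundUsage {
  val viaParam: String = Show.showWithParam(20) // "int 20"
  val viaBound: String = Show.show(20)          // "int 20"
}
```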

There is one more trick (convention) often used in type classes. Instead of using implicitly we can add an apply function (to the Show companion object) with only an implicit parameter list:

and use it in show function:

This, of course, can be shortened even more:
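A sketch of the apply convention, under the same assumptions as before; showShort is a hypothetical name used only to show the longer and the shortened call next to each other:

```scala
trait Show[A] {
  def show(a: A): String
}

object Show {
  // Summoner: returns whatever instance the compiler finds for A.
  def apply[A](implicit sh: Show[A]): Show[A] = sh

  // Calling apply explicitly ...
  def show[A: Show](a: A): String = Show.apply[A].show(a)

  // ... and the shortened form, where Show[A] expands to Show.apply[A].
  def showShort[A: Show](a: A): String = Show[A].show(a)

  implicit val intCanShow: Show[Int] = new Show[Int] {
    def show(int: Int): String = s"int $int"
  }
}

object ApplyUsage {
  val long: String  = Show.show(20)      // "int 20"
  val short: String = Show.showShort(20) // "int 20"
}
```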

We can improve our type class with the possibility of calling the show function as if it were a method on the given object – with a simple .show notation. By convention it is very often called an Ops class.

The Ops class allows us to write our clients’ code like this:
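One way such an Ops class might be written, as a sketch (the OpsUsage object is hypothetical):

```scala
trait Show[A] {
  def show(a: A): String
}

object Show {
  def apply[A](implicit sh: Show[A]): Show[A] = sh

  // Wraps any value whose type has a Show instance and adds a .show method.
  implicit class ShowOps[A: Show](a: A) {
    def show: String = Show[A].show(a)
  }

  implicit val intCanShow: Show[Int] = new Show[Int] {
    def show(int: Int): String = s"int $int"
  }
}

object OpsUsage {
  import Show.ShowOps

  // Int has no show method, so the compiler wraps 20 in ShowOps.
  val shown: String = 20.show // "int 20"
}
```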

To avoid a runtime overhead it is possible to make the ShowOps a value class and move the type class’ constraint to the show function, like this:
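A sketch of the value class variant, assuming the same illustrative Int instance. The constraint moves from the class to the method because a value class can have only one constructor parameter:

```scala
trait Show[A] {
  def show(a: A): String
}

object Show {
  // Value class: extends AnyVal and wraps exactly one val,
  // so the wrapper allocation can usually be avoided.
  implicit class ShowOps[A](val a: A) extends AnyVal {
    // The type class constraint now lives on the method.
    def show(implicit sh: Show[A]): String = sh.show(a)
  }

  implicit val intCanShow: Show[Int] = new Show[Int] {
    def show(int: Int): String = s"int $int"
  }
}

object ValueClassUsage {
  import Show.ShowOps

  val shown: String = 20.show // "int 20"
}
```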

After some of the rewrites placed above, the companion object of Show looks like this:

Now we can add one more instance of our type class, the one responsible for showing strings. It’s similar to the one showing ints.

In fact, this is so similar that we want to abstract it, which can be done with a function that creates instances for different types. We can think of it as a “constructor” for type class instances.

The snippet above presents a helper function instance that abstracts the common code and its usage for Int and String instances. With Scala 2.12 we can use Single Abstract Methods; as a result, the code is even more concise.
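Both approaches can be sketched together. The output prefixes and the InstanceUsage object are assumptions, and the SAM form requires Scala 2.12 or later:

```scala
trait Show[A] {
  def show(a: A): String
}

object Show {
  def apply[A](implicit sh: Show[A]): Show[A] = sh

  // "Constructor" for instances: builds a Show[A] from a plain function.
  def instance[A](func: A => String): Show[A] = new Show[A] {
    def show(a: A): String = func(a)
  }

  // Instance created with the helper.
  implicit val intCanShow: Show[Int] =
    instance((int: Int) => s"int $int")

  // Instance created with a Single Abstract Method conversion:
  // the lambda is converted directly to a Show[String].
  implicit val stringCanShow: Show[String] =
    str => s"string $str"
}

object InstanceUsage {
  val int: String    = Show[Int].show(20)       // "int 20"
  val string: String = Show[String].show("baz") // "string baz"
}
```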

This is a simple type class that defines two ways of calling the show function (show() and .show). It also defines instances for two types: Int and String.

We may encounter a need to redefine some default type class instances. With the implementation above, if all default instances are imported into scope, we cannot achieve that: the compiler will find ambiguous implicits in scope and report an error.

We may decide to move the show function and the ShowOps implicit class to another object (let’s say ops) to allow users of this type class to redefine the default instance behaviour (with Category 1 implicits; see more on categories of implicits). After such a modification, the Show object looks like this:

Usage does not change, but now the user of this type class may import only:

Default implicit instances are not brought in as Category 1 implicits (although they are available as Category 2 implicits), so it’s possible to define our own implicit instance wherever we use such a type class.
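A sketch of this layout, assuming the nested object is named ops as suggested in the text (the OpsImportUsage object is hypothetical):

```scala
trait Show[A] {
  def show(a: A): String
}

object Show {
  def apply[A](implicit sh: Show[A]): Show[A] = sh

  // Syntax lives in a separate object, so importing it
  // does not bring the default instances into local scope.
  object ops {
    def show[A: Show](a: A): String = Show[A].show(a)

    implicit class ShowOps[A](val a: A) extends AnyVal {
      def show(implicit sh: Show[A]): String = sh.show(a)
    }
  }

  // Default instance: found through the companion object (Category 2).
  implicit val intCanShow: Show[Int] = new Show[Int] {
    def show(int: Int): String = s"int $int"
  }
}

object OpsImportUsage {
  // Only the syntax is imported, not the instances.
  import Show.ops._

  val viaFunction: String = show(20) // "int 20"
  val viaMethod: String   = 20.show  // "int 20"
}
```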

This is a basic type-class that we have been coding from the very beginning.

Own types

The creator of a type class often provides its instances for popular types, but our own types are not always supported (it depends on the library provider, whether some kind of products/coproducts derivation is implemented). Nothing stops us from writing our implementation for the type class.

That, of course, looks exactly the same as redefining the default instance that was provided by the implementer of the type class.

When implementing our own instance, the code follows the same pattern but can live in a different location than the trait and the ops classes of the type class. Moreover, if the type class is in our code base, we may add this instance next to the default instances defined in the type class trait’s companion object. As an example, let’s define a way to show a Foo case class and its instance outside of the type class companion object:
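A sketch of such an instance. The field of Foo, the output format and the FooApp object are assumptions made for illustration:

```scala
trait Show[A] {
  def show(a: A): String
}

object Show {
  def instance[A](func: A => String): Show[A] = new Show[A] {
    def show(a: A): String = func(a)
  }

  object ops {
    implicit class ShowOps[A](val a: A) extends AnyVal {
      def show(implicit sh: Show[A]): String = sh.show(a)
    }
  }
}

// Our own type, unknown to the author of the type class.
case class Foo(foo: Int)

object FooApp {
  import Show.ops._

  // The instance is defined in user code, outside the Show companion object.
  implicit val fooShow: Show[Foo] =
    Show.instance((foo: Foo) => s"case class Foo(foo: ${foo.foo})")

  val shown: String = Foo(42).show // "case class Foo(foo: 42)"
}
```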

Shapeless

This paragraph is a little off topic, but worth mentioning.

The way type classes are implemented in Scala (with implicits) makes it possible to automatically derive type-class instances for our own created types using Shapeless.

For example, we could derive the show function (from previous paragraphs) for every case-class (actually for every product type) defined in our code.

We would need to define instances for basic types and define show for product types, but it would remove a lot of boilerplate from our code!

Similar derivation can be achieved with runtime reflection or compile-time macros.

Simulacrum

Simulacrum is a project that adds syntax for type classes using macros. Whether to use it or not depends on your preferences. If it is used, it’s trivial to find all type classes in our code and reduce some boilerplate. Moreover, a project that uses @typeclass has to depend on the macro paradise compiler plugin.

The equivalent of our Show example with an instance only for Int would look like this:

As you can see, the definition of a type class is very concise. On the usage side nothing changes – we would use it like this:

There is an additional annotation, @op, that may change the name of the generated function and/or add an alias to the generated method (e.g. the |+| notation for summing).

Proper imports can be found in the repo.

Implicits

Type classes use implicits as a mechanism for matching instances with code that uses them. Type classes come with benefits and costs related to implicits.

It is possible to define multiple instances of a type class for the same type. The compiler uses implicit resolution to find the instance that is closest in scope. In comparison, a type class in Haskell can only have one instance. In Scala, we can define an instance and pass it as a parameter explicitly (not relying on implicit resolution), which makes the usage less convenient, but may be useful.

Our Show example needs a little modification to allow usage in a scenario, where we would like to pass instances explicitly. Let’s add a showExp function to the ShowOps class:

Now, it’s possible either to call .showExp without arguments or to define an instance of Show and pass it to showExp explicitly:

The first invocation uses the implicit found in scope; to the second invocation we pass the hipsterString instance of Show.
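Both invocations can be sketched as follows. The output strings and the ExplicitInstance object are assumptions; hipsterString is deliberately not implicit here:

```scala
trait Show[A] {
  def show(a: A): String
}

object Show {
  def instance[A](func: A => String): Show[A] = new Show[A] {
    def show(a: A): String = func(a)
  }

  object ops {
    implicit class ShowOps[A](val a: A) extends AnyVal {
      def show(implicit sh: Show[A]): String = sh.show(a)

      // The instance is a regular (implicit) parameter,
      // so it can also be passed by hand.
      def showExp(implicit sh: Show[A]): String = sh.show(a)
    }
  }

  implicit val stringCanShow: Show[String] =
    instance((str: String) => s"string $str")
}

object ExplicitInstance {
  import Show.ops._

  // A plain value: it never takes part in implicit resolution.
  val hipsterString: Show[String] =
    Show.instance((str: String) => s"hipster string $str")

  val viaImplicit: String = "baz".showExp                // "string baz"
  val viaExplicit: String = "baz".showExp(hipsterString) // "hipster string baz"
}
```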

The other way (more common) to achieve the same result – without adding an extra function, but fully relying on implicits – is to create a Category 1 implicit that would take precedence over the default instance (a Category 2 implicit). This would look like this:

"baz" would use the default instance defined in Show, but "bazbaz" would use the hipsterString instance.
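A sketch of this override, with assumed output strings and a hypothetical HipsterUsage object. The local implicit is scoped to a block so that only the second call sees it:

```scala
trait Show[A] {
  def show(a: A): String
}

object Show {
  def instance[A](func: A => String): Show[A] = new Show[A] {
    def show(a: A): String = func(a)
  }

  object ops {
    implicit class ShowOps[A](val a: A) extends AnyVal {
      def show(implicit sh: Show[A]): String = sh.show(a)
    }
  }

  // Default instance, found via the companion object (Category 2).
  implicit val stringCanShow: Show[String] =
    instance((str: String) => s"string $str")
}

object HipsterUsage {
  import Show.ops._

  // No local instance in scope: the default one is used.
  val default: String = "baz".show // "string baz"

  val hipster: String = {
    // Local implicit (Category 1) takes precedence over the default.
    implicit val hipsterString: Show[String] =
      Show.instance((str: String) => s"hipster string $str")

    "bazbaz".show // "hipster string bazbaz"
  }
}
```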

The Scala way of implementing a type class (using implicits) could also cause some problems, which are described in the next paragraph.

Problems

With the power of implicits comes a cost.

We can’t have two type class instances for some type T with the same precedence. This doesn’t sound like a terrible problem, but it does cause some real issues. It’s quite easy to get a compiler error (about ambiguous implicits) while using libraries like Cats or Scalaz, which rely heavily on type classes and build their types as a hierarchy (by subtyping). This is described in detail here.

The problem is mainly related to the way type classes are implemented. Very often both ambiguous implicits implement exactly the same behavior, but the compiler can’t know about it. There are ongoing discussions on how to fix this.

Errors may also be misleading, because the compiler doesn’t know what a type class is, e.g. for our Show type class used in such a way:

the compiler can only say that value show is not a member of Boolean.

A similar error message is even reported when ambiguous implicit definitions are found, but the .show notation was used.

Open source examples

Open source is a perfect place to look for examples of type classes. I would like to name two projects:

  • Cats uses type classes as its primary way to model things, and it uses simulacrum. Instances are implemented in separate traits, and Ops are grouped in syntax traits.
  • Shapeless relies heavily on type classes. The power of Shapeless is the ability to work on HLists and derive type classes to add new functionality.

Future of type classes

There are different attempts and discussions on how to add syntax for type classes:

There are also some ongoing discussions on the coherence of type classes:

Summary

Type classes as a concept are quite easy, but there are various corner cases when it comes to their implementation in Scala. The concept is used more often in libraries than in business applications, but it’s good to know type classes and the potential risks of using them.

A very common scenario in many kinds of software is when the input data is potentially unlimited and can appear at arbitrary intervals. The common way of handling such cases is using the Observer pattern in its imperative form – callbacks.

But this approach creates what’s commonly called “Callback Hell”. It’s a concept basically identical to the more commonly known “GOTO Hell”, as they both mean erratic jumps in the flow of control that can be very hard to reason about and work with. When writing an application we need to analyze all the callbacks to be sure that, for example, we’re not using a value that can be changed by a callback at a random point in time.

But there exists a declarative approach to solving this problem that lets us reason about it in a much more predictable and less chaotic fashion – Streams.

For some time now Spark has been offering a Pipeline API (available in the MLlib module) which facilitates building sequences of transformers and estimators in order to process the data and build a model. Moreover, the Spark MLlib module ships with a plethora of custom transformers that make the process of data transformation easy and painless. But what happens if there is no transformer that supports a particular use case?

Part of the success of a modern application lies in targeting it globally – all over the world. It isn’t possible to run such an application on a single machine, even with the most powerful hardware.

Terms like distributed computing or reactive applications were born in the process of IT globalization. Nowadays, applications run on multiple virtual machines distributed over multiple physical machines, which are often spread around the world. Such applications aren’t easy to maintain.

Every service has different hardware requirements and dependencies, and it has to be deployed and upgraded continuously. In addition, each machine has to be configured in a way that allows communication within the cluster and with external services. Although DevOps engineers have helpful deployment tools like Chef, Puppet or Ansible, these tasks still aren’t easy, trust me.

Introduction

Any application will fail sooner or later. Imperative-style programming usually handles this with side effects, by propagating exceptions and handling them later on. This approach introduces statefulness and defers the error to the outer bounds of the application. It creates hidden control-flow paths that are difficult to reason about and debug properly once the code grows too much.