Duolingo Case Study: Why They Chose Scala?
One of the biggest language learning apps in the world, Duolingo is serving hundreds of millions of users every single day. Most of the app’s backend was written in Python, yet the tech team has decided to rewrite one of Duolingo’s core features in Scala.
With this article, you’re going to find out:
- why Duolingo chose Scala
- how Scala helped Duolingo create a better product
- why Scala can be the right choice for your company
Want to learn more? Let’s dive in:
Duolingo before Scala
If you’ve never learned a language with Duolingo, here’s a quick summary of how it works:
As you can see, the app’s users learn through short lessons consisting of different types of exercises.
Here comes the question: which exercises, and in what order, should the users see?
Duolingo’s Session Generator is the module that makes these decisions for the users creating adequate educational sequences.
This core feature has been around since the very beginning of Duolingo. The startup, as we already know, was gaining popularity pretty fast, thus growing rapidly.The hectic pace was both a blessing and a curse. Duolingo’s tech team had to act fast and didn’t always have the time and resources to optimize all the processes accordingly. And this led to significant technical debt.
For instance, the Session Generator’s architecture featured many hard dependencies. This means that if any of those failed, the whole module would stop working. In other words, many features of the module affected the performance of other features. As you can imagine, this makes potential bugs much more likely.
Because of that, the Duolingo team decided to redesign the module’s architecture and eventually rewrite the Session Generator. Here’s how they decided on the language:
Moving to Scala
As in many companies, Duolingo’s backend was originally written in Python, one of the world’s most popular programming languages. It’s a common choice, as it’s easy to understand for developers with different kinds of experience.
There are, however, some challenges that come with Python, one of them being the speed.
This language tends to be visibly slower than Java or C, while Java is also considered rather slow and verbose. What’s more, Python’s dynamic typing can be the cause of many runtime bugs.
To avoid these issues, the Duolingo team decided to move their product to Scala. Contrary to Python, Scala is a statically-typed language. It’s widely used in other complex big data apps like for example Kafka that was developed by LinkedIn and is used to handle real-time data feeds. Not surprisingly, it turned out to be a good fit for Duolingo’s Session Generator as well. The team claims that they were looking for a more modern programming language. They also think that Scala is very mature in the back end so it fits the bill.
Implementing Scala in Duolingo
JVM and Functional Programming
There was no need to reinvent the wheel. As Scala is built on the Java Virtual Machine, developers could use existing Java libraries. One of the biggest changes was the shift towards functional programming. Andre Kenji Horie, Duolingo’s senior software engineer, is a big fan of this solution. According to him, around 99% of Duolingo’s codebase is functional. He claims that it makes things much easier to debug, much easier to maintain.
Many developers have the impression that Scala seems to have learned from the mistakes of other languages. Here’s how the Duolingo team perceives it:
- More concise – While Java is known for being elaborate and rather slow, Scala does a lot to make the language less verbose. Thanks to it being a functional language, the developers can use simple one-liners in many cases too.
- Less prone to bugs – As we’ve already mentioned, Scala’s static typic makes it easier to catch bugs with a compiler. Less verbosity makes the work smoother, too. Contrary to Python, Scala is much less prone to runtime bugs.
- Saves a lot of time – Andre Kenji Horie has stressed that writing code in Scala is significantly faster than programming in Java. In the same amount of time he’d use for coding alone, he can also write unit tests for Scala. This makes the developers more confident about the quality of their work.
- Easier maintenance – When compared to Python, the development speed is similar, yet Scala’s maintenance is much faster. Andre mentioned that after moving to Scala he suspected that refactoring would take him around an hour, like in Python. To his surprise, it only took around one minute.
- Efficient performance – As you already know, Scala is running on a Java Virtual Machine and is doing very well in a multi-threaded environment. In other words, it means that it can do a lot of things at once. Scala’s efficiency also means that Duolingo can use 10 times fewer servers to handle the same amount of traffic, which again is a huge saving.
Of course, there were some challenges along the way. The team has described the two main pain points:
- Java integrations – Some of the integrations were not compatible with certain Scala-specific features.
- Documentation – As Scala is a rather new language, finding some of the libraries required extra effort.
Scala learning curve
What’s more, many people claim that Scala’s learning curve is quite steep. In this case, improving readability and optimizing the product’s maintenance process was far more important, and here’s where Scala is doing really well. When asked about the learning curve, software engineers from Duolingo say that it was actually a lot easier than anticipated. It proves that even though the language is new, learning it is not an obstacle.
To sum up, the team is happy with Scala despite the tiny downsides. Andre Kenji Horie has said in a Software Engineer Daily podcast that There are some small things here and there in the language but […] it is not a big deal. He claims that pain points didn’t come with Scala itself but were more related to changing from one language to another. For instance, when they had a function that was used in Python but not in Scala, they needed to find a workaround.
Note that this is only related to the onboarding phases. In general, the team appreciates how Scala offers a lot of useful features that are not present in Python or Java.
Duolingo and Scala: the results
The service turned out to be much more robust after the rewrite. It’s easiest to see in the speed test results. The latency before the rewrite was 750 milliseconds, while after implementing Scala it went down to as little as 14 milliseconds. Quite impressive, isn’t it?
To top it all off, the rewrite proved to be significantly more stable. With the old infrastructure, the average downtime was around 2 hours every quarter. After switching to Scala, the first few months boasted zero downtime. Obviously, it probably won’t last forever, yet it still looks very promising.
The team claims that Scala meets their needs as it’s very mature in the back end and, as we’ve mentioned before, makes a great fit for big data applications.
Last but not least, the shift to Scala has improved morale among developers. They feel more confident deploying the code, as they don’t have to fear they’ll break the entire app. The improving testing suite that is used in Scala makes their work much more pleasant. Considering unit tests are prepared by developers who test individual code before integration, issues can be found at an early stage. Thanks to that, they can be resolved without impacting the other parts of the code.
Scala as the right choice for your business
Does it sound like Scala could be the right fit for your product? If you’d like to learn more about implementing Scala in other global companies, you’re sure to enjoy this Zalando case study. Of course, if you have any questions about implementing Scala,we’re always happy to help, as we’re of the biggest Scala development companies. Feel free to get in touch via live chat on the website!
The article is based on the following sources: