The OpenTelemetry + Mesmer duo: state of the Mesmer project

The Mesmer project is an open-source initiative at Scalac that aims to provide OpenTelemetry Metrics auto-instrumentation for Scala libraries. We have recently started investing more time in this project and decided to introduce some changes in its design that will hopefully make it even more useful.

If you’re not familiar with the terms “OpenTelemetry” and “auto-instrumentation”, that’s ok – I will briefly go through them below. If you are, feel free to skip to the section where I describe where we currently are with the project. You can also take a look at this previous blogpost about Mesmer, where Piotr and Jakub go way deeper into the details. 

OpenTelemetry

The first and most important thing I need to spend a few moments on is OpenTelemetry. It is a set of tools that you can use to instrument your application and generate telemetry data for various signals (metrics, traces, and logs). But that’s not all: OpenTelemetry provides a standardized approach to telemetry. It aims to become a common tongue for collecting telemetry signals. It already provides APIs and SDKs for several languages, all of which can send data to collectors in a unified format. Thanks to this, you can instrument all your apps in the same standardized way, without having to learn a new approach for each of them.

Here’s a link to the OpenTelemetry project in case you want to dive in: https://opentelemetry.io/.

Ok. I get what OpenTelemetry is. But what does “auto-instrumentation” mean exactly?

Typically, when you want to report telemetry signals such as logs, traces or metrics, you need to use an API to precisely define “where”, “what” and “how” you’d like to collect them. A boring but omnipresent example would be sending logs to a console: 

logger.error(s"There's a fly in my soup! Fly count: ${flyCount}")

The act of defining “where”, “what” and “how” I want to collect the signals is called “Instrumentation”. For examples like the one above, the OpenTelemetry community even adds the word “manual” before “instrumentation” to emphasize the act of doing it, well… manually. 

In contrast to the above example, there are sometimes ways to avoid involving already busy developers in writing the instrumentation code, and this is what we call “automatic instrumentation”. On the JVM, with OpenTelemetry, this is done with a Java agent, which is loaded along with your application, modifies the bytecode of the libraries you use and injects the instrumentation code. So no manual coding is involved. The only thing you as a developer have to do is add the Java agent; the rest happens automatically with sensible default settings (which, of course, you can modify).

So when you want to collect telemetry signals from third-party libraries, there’s a chance somebody has already created an auto-instrumentation solution for that. Make sure to check if you can leverage it. For your code, you can use manual instrumentation to get your signals collected.
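To make the idea of "instrumenting without touching the code" more concrete, here is a minimal sketch in plain Java. It deliberately swaps in a different technique: a JDK dynamic proxy instead of the bytecode rewriting a Java agent performs, so it only illustrates the principle and is not how the OpenTelemetry agent or Mesmer work internally. The EventStore interface, the class name and the counter map are all hypothetical names made up for this example.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

final class AutoInstrumentationSketch {
    // Hypothetical library interface; its implementations contain no telemetry code.
    interface EventStore {
        void persist(String event);
    }

    // Wrap an implementation so that every call is counted before being delegated.
    // The counters map stands in for a real metrics backend (an assumption of this sketch).
    static EventStore instrument(EventStore target, Map<String, LongAdder> counters) {
        InvocationHandler handler = (proxy, method, methodArgs) -> {
            counters.computeIfAbsent(method.getName(), name -> new LongAdder()).increment();
            return method.invoke(target, methodArgs);
        };
        return (EventStore) Proxy.newProxyInstance(
                EventStore.class.getClassLoader(),
                new Class<?>[] {EventStore.class},
                handler);
    }

    // Read how many times a method was called through the instrumented wrapper.
    static long callCount(Map<String, LongAdder> counters, String methodName) {
        LongAdder adder = counters.get(methodName);
        return adder == null ? 0L : adder.sum();
    }

    public static void main(String[] args) {
        Map<String, LongAdder> counters = new ConcurrentHashMap<>();
        // The "business" implementation knows nothing about metrics.
        EventStore store = instrument(event -> { /* pretend to write to a database */ }, counters);
        store.persist("order-created");
        store.persist("order-paid");
        System.out.println("persist calls: " + callCount(counters, "persist")); // prints "persist calls: 2"
    }
}
```

The implementation passed to instrument stays free of telemetry code, which is the property auto-instrumentation gives you. A Java agent goes one step further: it needs no wrapper call at all, because it changes the classes while they are being loaded.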

Mesmer before the latest release

Before I dive into the details of what exactly we changed in the project in the latest release, I’d like to briefly remind you what Mesmer was previously made of. You could distinguish the following main pieces:

  • A custom Mesmer Java agent, responsible for collecting all of Mesmer’s instrumentation code and injecting it into the instrumented application. It was configured using HOCON files, similarly to how one would configure an Akka application, so that you could turn features on or off depending on your preferences.
  • ByteBuddy Advices + instrumentation code. This is the code that’s injected by the agent. The instrumentation code focused only on metrics.
  • Akka Extension (for Akka metrics). This piece holds the information needed to calculate the metrics from the modified classes and passes the data to the OpenTelemetry Exporter.
  • OpenTelemetry Exporter and Collector. These are the pieces you need to be able to collect telemetry signals with OpenTelemetry. The Exporter translates signals to the right protocol (OTLP, Prometheus, or others) and the Collector receives the data for further processing.

This is how the whole system could be depicted:

[Diagram: Mesmer architecture with the custom Mesmer Java agent]

These pieces allowed us to set up the project with metrics with very little effort and no coding involved on the client’s side. But the downside was that we were using a custom Java agent that was far from perfect, was maintained by a small team, and had very limited functionality.

The Mesmer OpenTelemetry Javaagent Extension

Therefore, we decided to drop the agent and use the OpenTelemetry Java Agent instead:

[Diagram: Mesmer architecture with the OpenTelemetry Java agent]

In the whole OpenTelemetry initiative there’s a great group of experts devoted to developing high-quality, performant observability solutions. There’s just no point in us competing with them. So instead, we decided to focus on using their solutions as much as we could and transform Mesmer into an OpenTelemetry Javaagent Extension. But what does that mean for us?

First of all, we reduced our codebase while keeping the same functionality. Both our old agent and the OpenTelemetry Agent use ByteBuddy for bytecode manipulation, so it was relatively easy to adapt our instrumentations. Now it’s the OpenTelemetry Agent that loads and uses our code to auto-instrument the user’s code with Mesmer’s metrics. For us, that means less code to maintain; for you, it means that if you are already using OpenTelemetry in your Scala application, you can extend your metric collection by adding just one parameter:

-Dotel.javaagent.extensions=/path/to/the/mesmer-otel-extension.jar

Secondly, the OpenTelemetry initiative is not just about metrics. The OpenTelemetry Agent already supports tracing for Akka, among other libraries (listed here). So when you’re using the OpenTelemetry Agent with the Mesmer extension, you get the best of both worlds: OpenTelemetry’s traces and Mesmer’s metrics. This was previously not the case with our custom agent. Of course, the same goes for OpenTelemetry logging, but please keep in mind that at the time of writing the OpenTelemetry Logging API is marked as “experimental”.

We have also changed the way Mesmer is configured. Since we are now so close to the OpenTelemetry agent, it makes sense to use the same configuration mechanism that the Agent does. So we said goodbye to our HOCON configuration files and followed the same convention as the OpenTelemetry project. Now you can use system properties, environment variables or property configuration files that the OpenTelemetry Agent will read and apply for you. So, for example, if you want to turn on or off a metric or some particular library instrumentation, just do this:

-Dotel.instrumentation.mesmer-akka-persistence.enabled=true
-Dio.scalac.mesmer.module.akkapersistence.persistent.event.total=true
-Dio.scalac.mesmer.module.akkapersistence.recovery.time=false

or if you want to use a configuration file, set the OTEL_JAVAAGENT_CONFIGURATION_FILE variable to point to the following config:  

otel.instrumentation.mesmer-akka-persistence.enabled=true
io.scalac.mesmer.module.akkapersistence.persistent.event.total=false
io.scalac.mesmer.module.akkapersistence.recovery.time=false

For more info about configuring, please see the OpenTelemetry docs. For more technical details about the Mesmer extension in general, see the Proposal doc we have created. 
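As a rough illustration of how the three configuration sources relate, the sketch below models two conventions described in the OpenTelemetry Java agent documentation: an otel.* property maps to an environment variable by upper-casing it and replacing dots and dashes with underscores, and system properties take priority over environment variables, which in turn take priority over the configuration file. The class and method names are hypothetical, the sources are passed in as plain maps, and this is not the agent’s actual implementation. Whether Mesmer’s own io.scalac.mesmer.* keys can also be set through environment variables is an assumption you should verify against the Mesmer docs.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

final class OtelConfigSketch {
    // Convert a property key to its environment-variable spelling:
    // upper-case, with '.' and '-' replaced by '_'.
    static String toEnvName(String propertyKey) {
        return propertyKey.toUpperCase(Locale.ROOT).replace('.', '_').replace('-', '_');
    }

    // Look a key up in three sources, highest priority first:
    // system properties, then environment variables, then the configuration file.
    static Optional<String> resolve(String key,
                                    Map<String, String> systemProperties,
                                    Map<String, String> environment,
                                    Map<String, String> configFile) {
        String fromSystemProperties = systemProperties.get(key);
        if (fromSystemProperties != null) {
            return Optional.of(fromSystemProperties);
        }
        String fromEnvironment = environment.get(toEnvName(key));
        if (fromEnvironment != null) {
            return Optional.of(fromEnvironment);
        }
        return Optional.ofNullable(configFile.get(key));
    }

    public static void main(String[] args) {
        String key = "otel.instrumentation.mesmer-akka-persistence.enabled";
        // prints "OTEL_INSTRUMENTATION_MESMER_AKKA_PERSISTENCE_ENABLED"
        System.out.println(toEnvName(key));

        // The environment variable overrides the value from the configuration file.
        Optional<String> value = resolve(
                key,
                Map.of(),
                Map.of(toEnvName(key), "false"),
                Map.of(key, "true"));
        System.out.println(value.orElse("<not set>")); // prints "false"
    }
}
```

In practice this means a value in the configuration file acts as a default that a -D flag or an environment variable can override for a single run.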

Demo time!

Now let me show you what it all looks like with our demo application:

  1. Run the example/docker/docker-compose.yaml. It will: 
    1. Start a Postgres DB (for the example app)
    2. Start an OpenTelemetry Collector
    3. Start Prometheus and Grafana (this is for metrics presentation)
    4. Start Jaeger (this is for showing traces)
  2. Start the demo application:
sbt "project example" runExampleWithOtelAgent

The above task is equivalent to running something along these lines:

# otel.metric.export.interval: some custom agent config, just for convenience
# io.scalac.mesmer.module.akkapersistence.recovery.time=false: I turned this metric off
# otel.javaagent.extensions: this is us, the extension!
java \
  -javaagent:opentelemetry-javaagent110.jar \
  -Dotel.metric.export.interval=10000 \
  -Dotel.service.name=mesmer-example \
  -Dotel.metrics.exporter=otlp \
  -Dio.scalac.mesmer.module.akkapersistence.recovery.time=false \
  -Dotel.javaagent.extensions=mesmer-otel-extension.jar \
  -jar mesmer-akka-example.jar

Once you do this and collect the metrics for some time, you can observe, for example, how many persistence events there were and how that number changed over time. This is, of course, only one of the metrics we provide:

[Screenshot: Grafana dashboard showing persistence events over time]

At the same time, looking at the Jaeger UI, you can get even more insights thanks to the tracing provided by the OpenTelemetry Agent:

[Screenshot: Jaeger UI showing traces collected by the OpenTelemetry Agent]

Plans for the future

Of course, we don’t want to stop there. I can even say this is just the beginning. We already have some plans for reducing our codebase even further. This will allow us to focus solely on the real problem: developing more and more insightful metrics, which, after all, is what we should have been working on in the first place. We are also already working on expanding our extension to support other Scala libraries and metrics, not only Akka. More on that in the following posts, hopefully really soon.

Conclusion

So to sum up: we decided to bring our Mesmer project even closer to OpenTelemetry, and it seems to work pretty well for us. Thanks to using the OpenTelemetry Agent, we (the authors) and you (the users) get a lot more out of the box with much less effort. We will also continue to expand the project. If you wish to participate in this endeavor, feel free to reach out to us!

GitHub link: https://github.com/ScalaConsultants/mesmer

Authors

Łukasz Gajowy

I’m a (currently Scala) Software Developer, Apache Committer (Beam), and a good software design enthusiast. I like statically-typed functional programming, working with cutting-edge technologies and doing technical research for new projects. You can follow me on Twitter (@lgajowy)!
