The OpenTelemetry + Mesmer duo: state of the Mesmer project

14.04.2022 / By Łukasz Gajowy

The Mesmer project is an open-source initiative at Scalac that aims to provide OpenTelemetry Metrics auto-instrumentation for Scala libraries. We have recently started investing more time in this project and decided to introduce some changes in its design that will hopefully make it even more useful.

If you’re not familiar with the terms “OpenTelemetry” and “auto-instrumentation”, that’s OK: I will briefly go through them below. If you are, feel free to skip to the section where I describe where we currently are with the project. You can also take a look at the previous blog post about Mesmer, where Piotr and Jakub go much deeper into the details.

OpenTelemetry

The first and most important thing I need to spend at least a few moments on is OpenTelemetry. It is a set of tools that you can use to instrument your application so that it generates telemetry data for various signals (metrics, traces, and logs). But that’s not all: OpenTelemetry provides a standardized approach to telemetry. It aims to become a common tongue for collecting telemetry signals. It already provides APIs and SDKs for several languages, all of which can send data to collectors in a unified format. Thanks to this, you can instrument all your apps in one standardized way, without having to learn a new approach for each of them.

Here’s a link to the OpenTelemetry project in case you want to dive in: https://opentelemetry.io/.

Ok. I get what OpenTelemetry is. But what does “auto-instrumentation” mean exactly?

Typically, when you want to report telemetry signals such as logs, traces or metrics, you need to use an API to precisely define “where”, “what” and “how” you’d like to collect them. A boring but omnipresent example would be sending logs to a console:

logger.error(s"There's a fly in my soup! Fly Count ${flyCount}")

The act of defining “where”, “what” and “how” I want to collect the signals is called “Instrumentation”. For examples like the one above, the OpenTelemetry community even adds the word “manual” before “instrumentation” to emphasize the act of doing it, well… manually.
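
To make the “where”, “what” and “how” a bit more concrete, here is a minimal sketch of manual instrumentation for a metric rather than a log line. It uses only the JDK, not the OpenTelemetry API, and the names SoupKitchen, serveSoup and fliesFound are made up for this illustration.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.logging.Logger;

// Hypothetical example class: every instrumentation decision is written by hand.
final class SoupKitchen {
    private static final Logger logger = Logger.getLogger("SoupKitchen");

    // "What": a hand-maintained counter of flies found in soup.
    static final AtomicLong fliesFound = new AtomicLong();

    static String serveSoup(int flyCount) {
        if (flyCount > 0) {
            // "Where" and "how": the developer picks this exact spot and this exact update.
            fliesFound.addAndGet(flyCount);
            logger.severe("There's a fly in my soup! Fly Count " + flyCount);
            return "rejected";
        }
        return "served";
    }
}
```

Every place you want a signal from needs code like this, which is exactly the effort that the automatic approach tries to take away.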

In contrast to the above example, there are sometimes ways to avoid involving already busy developers in writing the instrumentation code, and this is what we call “automatic instrumentation”. On the JVM, with OpenTelemetry, this is done with a Java agent, which is loaded along with your application to modify the bytecode of the libraries you use and inject the instrumentation code. So no manual coding is involved. The only thing you as a developer have to do is add the Java agent, and the rest will happen automatically with sensible default settings (which of course you can modify).
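
If you have never seen a Java agent from the inside, the sketch below shows the bare mechanism using the JDK's java.lang.instrument package. It is not Mesmer's or OpenTelemetry's actual code: the class names and the "akka/" prefix are assumptions made for illustration, and the transformer only counts matching classes instead of rewriting them.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical agent used only to illustrate the mechanism.
final class SketchAgent {
    // Number of loaded classes whose name matched the target prefix.
    static final AtomicInteger matched = new AtomicInteger();

    // Called by the JVM before the application's main method when the JVM is
    // started with -javaagent:<agent jar>; the jar manifest must list this
    // class under Premain-Class.
    public static void premain(String agentArgs, Instrumentation inst) {
        inst.addTransformer(new PrefixTransformer("akka/"));
    }

    static final class PrefixTransformer implements ClassFileTransformer {
        private final String prefix;

        PrefixTransformer(String prefix) {
            this.prefix = prefix;
        }

        @Override
        public byte[] transform(ClassLoader loader, String className,
                                Class<?> classBeingRedefined,
                                ProtectionDomain protectionDomain,
                                byte[] classfileBuffer) {
            // Class names arrive in internal form, e.g. "akka/actor/ActorCell".
            if (className != null && className.startsWith(prefix)) {
                matched.incrementAndGet();
                // A real agent would return rewritten bytecode here,
                // typically generated with a library such as ByteBuddy.
            }
            return null; // null means "leave the class bytes unchanged"
        }
    }
}
```

The JVM hands every class to the transformer as it is loaded, which is why the application code itself does not need to change.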

So when you want to collect telemetry signals from third-party libraries, there’s a chance somebody has already created an auto-instrumentation solution for that. Make sure to check if you can leverage it. For your code, you can use manual instrumentation to get your signals collected.

Mesmer before the latest release

Before I dive into the details of what exactly we changed in the project in the latest release, I’d like to briefly remind you what Mesmer was previously made of. You could distinguish the following main pieces:

A custom Mesmer Java agent. It was responsible for collecting all of Mesmer’s instrumentation code and injecting it into an instrumented application. It was configured using HOCON files, similarly to how one would configure an Akka application, so that you could turn some features on or off depending on your preferences.
ByteBuddy Advices + Instrumentation Code. This is the code that’s injected by the agent. The instrumentation code focused only on instrumenting metrics.
Akka Extension (for Akka Metrics). This piece holds the information needed to calculate the metrics from modified classes and passes the data to the OpenTelemetry Exporter.
OpenTelemetry Exporter and Collector. These are the pieces that you need to be able to collect telemetry signals with OpenTelemetry. The Exporter translates signals to the right protocol (OTLP, Prometheus, others) and the Collector receives the data for further processing.

This is how the whole system could be depicted:

These pieces allowed us to set up the project with metrics with very little effort and no coding involved on the client’s side. But the downside was that we were using a custom Java agent that was far from perfect, was maintained by a small team, and had very limited functionality.

The Mesmer OpenTelemetry Javaagent Extension

Therefore, we decided to drop the agent and use the OpenTelemetry Java Agent instead:

In the whole OpenTelemetry initiative there’s a great group of experts devoted to developing high-quality, performant observability solutions. There’s just no point in us competing with them. So instead, we decided to focus on using their solutions as much as we could and transform Mesmer into an OpenTelemetry Javaagent Extension. But what does that mean for us?

First of all, we reduced our codebase while keeping the same functionality. Both our old agent and the OpenTelemetry Agent use ByteBuddy for bytecode manipulation, so it was relatively easy to adapt our instrumentations. Now it’s the OpenTelemetry Agent that loads and uses our code to auto-instrument the user’s code with Mesmer’s metrics. For us, that means less code to maintain. For you, it means that if you are already using the OpenTelemetry Agent in your Scala application, you can extend your metric collection by adding just one parameter:

-Dotel.javaagent.extensions=/path/to/the/mesmer-otel-extension.jar

Secondly, the OpenTelemetry initiative is not just about metrics. The OpenTelemetry Agent already supports tracing for Akka, among other libraries (listed here). So when you’re using the OpenTelemetry Agent with the Mesmer extension, you get the best of both worlds: OpenTelemetry’s traces and Mesmer’s metrics. This was previously not the case with our custom agent. Of course, the same goes for OpenTelemetry logging, but please keep in mind that at the time of writing, the OpenTelemetry Logging API is marked as “experimental”.

We have also changed the way Mesmer is configured. Since we are now so close to the OpenTelemetry agent, it makes sense to use the same configuration mechanism that the Agent does. So we said goodbye to our HOCON configuration files and followed the same convention as the OpenTelemetry project. Now you can use system properties, environment variables or property configuration files that the OpenTelemetry Agent will read and apply for you. So, for example, if you want to turn on or off a metric or some particular library instrumentation, just do this:

-Dotel.instrumentation.mesmer-akka-persistence.enabled=true
-Dio.scalac.mesmer.module.akkapersistence.persistent.event.total=true
-Dio.scalac.mesmer.module.akkapersistence.recovery.time=false

or if you want to use a configuration file, set the OTEL_JAVAAGENT_CONFIGURATION_FILE variable to point to the following config:

otel.instrumentation.mesmer-akka-persistence.enabled=true
io.scalac.mesmer.module.akkapersistence.persistent.event.total=false
io.scalac.mesmer.module.akkapersistence.recovery.time=false
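
As for environment variables, the OpenTelemetry Agent derives their names from the property names by upper-casing them and replacing dots and dashes with underscores. The helper below is a small sketch of that naming rule, written only for illustration: OtelEnvNames and toEnvVarName are hypothetical names and are not part of any OpenTelemetry or Mesmer API.

```java
import java.util.Locale;

// Hypothetical helper that mirrors the agent's property-to-environment naming rule.
final class OtelEnvNames {
    static String toEnvVarName(String propertyName) {
        return propertyName
                .toUpperCase(Locale.ROOT) // otel.foo-bar -> OTEL.FOO-BAR
                .replace('.', '_')        // dots become underscores
                .replace('-', '_');       // dashes become underscores
    }
}
```

For example, toEnvVarName("otel.javaagent.configuration-file") gives OTEL_JAVAAGENT_CONFIGURATION_FILE, which is the variable used above to point the agent at a properties file.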

For more info about configuring, please see the OpenTelemetry docs. For more technical details about the Mesmer extension in general, see the Proposal doc we have created.

Demo time!

Now let me show you what it all looks like with our demo application:

First, run example/docker/docker-compose.yaml with Docker Compose. It will:
1. Start a Postgres DB (for the example app)
2. Start an OpenTelemetry Collector
3. Start Prometheus and Grafana (for presenting metrics)
4. Start Jaeger (for showing traces)

Then start the demo application:

sbt "project example" runExampleWithOtelAgent

The above task is equivalent to running something along these lines:

# otel.metric.export.interval: some custom agent config, just for convenience
# io.scalac.mesmer.module.akkapersistence.recovery.time=false: I turned this metric off
# otel.javaagent.extensions: this is us, the extension!
java \
  -javaagent:opentelemetry-javaagent110.jar \
  -Dotel.metric.export.interval=10000 \
  -Dotel.service.name=mesmer-example \
  -Dotel.metrics.exporter=otlp \
  -Dio.scalac.mesmer.module.akkapersistence.recovery.time=false \
  -Dotel.javaagent.extensions=mesmer-otel-extension.jar \
  -jar mesmer-akka-example.jar

Once you do this and collect the metrics over some time, you can observe for example how many persistence events there were and how it all changed over time. This is of course only one of the metrics we provide:

At the same time, looking at Jaeger UI, you can get even more insights thanks to Tracing provided by the OpenTelemetry Agent:

Plans for the future

Of course, we don’t want to stop there. I can even say this is just the beginning. We already have some plans for reducing our codebase even further. This will allow us to focus solely on the real problem of developing more and more insightful metrics, which, after all, is the thing we should be trying to solve in the first place. We are also already working on expanding our extension to support other Scala libraries and metrics, not only Akka. More on that in the following posts, hopefully really soon.

Conclusion

So to sum up: we decided to bring our Mesmer project even closer to OpenTelemetry, and it seems to work pretty well for us. Thanks to using the OpenTelemetry Agent, we (the authors) and you (the users) get a lot more out of the box with much less effort. We’re not stopping there and will continue to expand the project. If you wish to participate in this endeavor, feel free to reach out to us!

GitHub link: https://github.com/ScalaConsultants/mesmer

Authors

Łukasz Gajowy

I’m a (currently Scala) Software Developer, Apache Committer (Beam), and a good software design enthusiast. I like statically-typed functional programming, working with cutting-edge technologies and doing technical research for new projects. You can follow me on Twitter (@lgajowy)!
