
Monitoring Spring Boot Application With Micrometer, Prometheus And Grafana Using Custom Metrics


Monitoring an application's metrics and health helps us improve performance, manage the application more effectively, and notice unoptimized behavior. In a system that consists of many microservices, monitoring each individual service is essential to keep the system maintainable.

In this blog post, I will demonstrate how a Spring Boot web application can be monitored using Micrometer, which exposes metrics from our application, Prometheus, which stores the metric data, and Grafana, which visualizes the data in graphs.

Implementing these tools takes only a few configuration steps. In addition to the default JVM metrics, I will show how you can expose custom metrics such as a user counter.

As always, the code for the demo used in this article can be found on GitHub.

Spring Boot

The base for our demo is a Spring Boot application which we initialize using Spring Initializr:

Spring Initializr

We initialized the project with the spring-boot-starter-actuator dependency, which already exposes production-ready endpoints.
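
For reference, the generated project boils down to a plain Spring Boot application class with the actuator starter on the classpath. This is a minimal sketch; the class name MonitoringDemoApplication is illustrative and not taken from the demo repository:

java
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

// Standard Spring Boot entry point; no extra code is needed for the default
// actuator endpoints, they come with spring-boot-starter-actuator.
@SpringBootApplication
public class MonitoringDemoApplication {

    public static void main(String[] args) {
        SpringApplication.run(MonitoringDemoApplication.class, args);
    }
}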

If we start the application, we can see that some endpoints like health and info are already exposed under the /actuator endpoint by default.

Calling the /actuator/health endpoint tells us whether the service is up and running:

bash
http GET "http://localhost:8080/actuator/health"
HTTP/1.1 200
Connection: keep-alive
Content-Type: application/vnd.spring-boot.actuator.v3+json
Date: Wed, 21 Oct 2020 18:11:35 GMT
Keep-Alive: timeout=60
Transfer-Encoding: chunked

{
    "status": "UP"
}

Spring Boot Actuator can be integrated with Spring Boot Admin, which provides a visual admin interface for your application. However, this approach is not very widespread and has some limitations. Therefore, we keep Actuator for exposing the metrics but use Prometheus to store them and Grafana instead of Spring Boot Admin to visualize them, which gives us a more popular and framework/language-independent solution.

This setup requires the metrics to be exposed in a vendor-neutral format, and Micrometer is a popular tool for exactly this use case.

Micrometer

Micrometer provides a simple facade over the instrumentation clients for the most popular monitoring systems, allowing you to instrument your JVM-based application code without vendor lock-in. Think SLF4J, but for metrics.

Micrometer is an open-source project and provides a metric facade that exposes metric data in a vendor-neutral format that a monitoring system can understand. These monitoring systems are supported:

  • AppOptics
  • Azure Monitor
  • Netflix Atlas
  • CloudWatch
  • Datadog
  • Dynatrace
  • Elastic
  • Ganglia
  • Graphite
  • Humio
  • Influx/Telegraf
  • JMX
  • KairosDB
  • New Relic
  • Prometheus
  • SignalFx
  • Google Stackdriver
  • StatsD
  • Wavefront

Micrometer is not part of the Spring ecosystem and needs to be added as a dependency. In our demo application, this was already done in the Spring Initializr configuration.
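
To illustrate what the facade looks like in code, here is a minimal sketch of a custom counter registered through the auto-configured MeterRegistry, previewing the user counter covered later in this post. The UserService class and the users.created metric name are illustrative and not part of the demo code:

java
import io.micrometer.core.instrument.Counter;
import io.micrometer.core.instrument.MeterRegistry;
import org.springframework.stereotype.Service;

// Illustrative service that counts created users with a custom Micrometer counter.
@Service
public class UserService {

    private final Counter userCounter;

    public UserService(MeterRegistry meterRegistry) {
        // Spring Boot auto-configures a MeterRegistry (here backed by Prometheus);
        // the counter is registered against it under the name "users.created".
        this.userCounter = Counter.builder("users.created")
                .description("Number of users created")
                .register(meterRegistry);
    }

    public void createUser() {
        // Business logic would go here; afterwards the counter is incremented.
        userCounter.increment();
    }
}

Whatever monitoring system is plugged in, the application code only talks to the Micrometer API; the configured registry decides how the counter ends up in, for example, the Prometheus exposition format.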

The next step is to expose the Prometheus endpoint in application.properties:

management.endpoints.web.exposure.include=prometheus,health,info,metrics

Now we can call this endpoint and see the metrics in the Prometheus format:

bash
http GET "http://localhost:8080/actuator/prometheus"
HTTP/1.1 200
Connection: keep-alive
Content-Length: 8187
Content-Type: text/plain; version=0.0.4;charset=utf-8
Date: Thu, 22 Oct 2020 09:19:36 GMT
Keep-Alive: timeout=60

# HELP tomcat_sessions_rejected_sessions_total
# TYPE tomcat_sessions_rejected_sessions_total counter
tomcat_sessions_rejected_sessions_total 0.0
# HELP system_cpu_usage The "recent cpu usage" for the whole system
# TYPE system_cpu_usage gauge
system_cpu_usage 0.0
# HELP jvm_buffer_count_buffers An estimate of the number of buffers in the pool
# TYPE jvm_buffer_count_buffers gauge
jvm_buffer_count_buffers{id="mapped",} 0.0
jvm_buffer_count_buffers{id="direct",} 3.0
# HELP jvm_memory_used_bytes The amount of used memory
# TYPE jvm_memory_used_bytes gauge
jvm_memory_used_bytes{area="heap",id="G1 Survivor Space",} 1.048576E7
jvm_memory_used_bytes{area="heap",id="G1 Old Gen",} 3099824.0
jvm_memory_used_bytes{area="nonheap",id="Metaspace",} 3.9556144E7
jvm_memory_used_bytes{area="nonheap",id="CodeHeap 'non-nmethods'",} 1206016.0
jvm_memory_used_bytes{area="heap",id="G1 Eden Space",} 3.3554432E7
jvm_memory_used_bytes{area="nonheap",id="Compressed Class Space",} 5010096.0
jvm_memory_used_bytes{area="nonheap",id="CodeHeap 'non-profiled nmethods'",} 6964992.0
# HELP jvm_gc_pause_seconds Time spent in GC pause
# TYPE jvm_gc_pause_seconds summary
jvm_gc_pause_seconds_count{action="end of minor GC",cause="Metadata GC Threshold",} 1.0
jvm_gc_pause_seconds_sum{action="end of minor GC",cause="Metadata GC Threshold",} 0.009
# HELP jvm_gc_pause_seconds_max Time spent in GC pause
# TYPE jvm_gc_pause_seconds_max gauge
jvm_gc_pause_seconds_max{action="end of minor GC",cause="Metadata GC Threshold",} 0.009
# HELP jvm_gc_live_data_size_bytes Size of old generation memory pool after a full GC
# TYPE jvm_gc_live_data_size_bytes gauge
jvm_gc_live_data_size_bytes 4148400.0
# HELP jvm_gc_max_data_size_bytes Max size of old generation memory pool
# TYPE jvm_gc_max_data_size_bytes gauge
jvm_gc_max_data_size_bytes 4.294967296E9
# HELP tomcat_sessions_active_current_sessions
# TYPE tomcat_sessions_active_current_sessions gauge
tomcat_sessions_active_current_sessions 0.0
# HELP process_files_open_files The open file descriptor count
# TYPE process_files_open_files gauge
process_files_open_files 69.0
# HELP http_server_requests_seconds
# TYPE http_server_requests_seconds summary
http_server_requests_seconds_count{exception="None",method="GET",outcome="SUCCESS",status="200",uri="/actuator/health",} 1.0
http_server_requests_seconds_sum{exception="None",method="GET",outcome="SUCCESS",status="200",uri="/actuator/health",} 0.041047824
# HELP http_server_requests_seconds_max
# TYPE http_server_requests_seconds_max gauge
http_server_requests_seconds_max{exception="None",method="GET",outcome="SUCCESS",status="200",uri="/actuator/health",} 0.041047824
# HELP jvm_threads_peak_threads The peak live thread count since the Java virtual machine started or peak was reset
# TYPE jvm_threads_peak_threads gauge
jvm_threads_peak_threads 32.0
# HELP process_uptime_seconds The uptime of the Java virtual machine
# TYPE process_uptime_seconds gauge
process_uptime_seconds 13.385
# HELP process_cpu_usage The "recent cpu usage" for the Java Virtual Machine process
# TYPE process_cpu_usage gauge
process_cpu_usage 0.0
# HELP jvm_memory_max_bytes The maximum amount of memory in bytes that can be used for memory management
# TYPE jvm_memory_max_bytes gauge
jvm_memory_max_bytes{area="heap",id="G1 Survivor Space",} -1.0
jvm_memory_max_bytes{area="heap",id="G1 Old Gen",} 4.294967296E9
jvm_memory_max_bytes{area="nonheap",id="Metaspace",} -1.0
jvm_memory_max_bytes{area="nonheap",id="CodeHeap 'non-nmethods'",} 7553024.0
jvm_memory_max_bytes{area="heap",id="G1 Eden Space",} -1.0
jvm_memory_max_bytes{area="nonheap",id="Compressed Class Space",} 1.073741824E9
jvm_memory_max_bytes{area="nonheap",id="CodeHeap 'non-profiled nmethods'",} 2.44105216E8
# HELP logback_events_total Number of error level events that made it to the logs
# TYPE logback_events_total counter
logback_events_total{level="warn",} 0.0
logback_events_total{level="debug",} 0.0
logback_events_total{level="error",} 0.0
logback_events_total{level="trace",} 0.0
logback_events_total{level="info",} 8.0
# HELP system_load_average_1m The sum of the number of runnable entities queued to available processors and the number of runnable entities running on the available processors averaged over a period of time
# TYPE system_load_average_1m gauge
system_load_average_1m 3.18994140625
# HELP jvm_gc_memory_promoted_bytes_total Count of positive increases in the size of the old generation memory pool before GC to after GC
# TYPE jvm_gc_memory_promoted_bytes_total counter
jvm_gc_memory_promoted_bytes_total 0.0
# HELP jvm_threads_states_threads The current number of threads having NEW state
# TYPE jvm_threads_states_threads gauge