How to collect Prometheus metrics from Node.js or PM2 cluster mode

The problem

If you’re here, you probably have a Node.js application running in cluster mode, either through the native Node.js APIs or through a process manager like PM2. In this mode there is usually a load balancer that distributes incoming work across several child processes. Each of these child processes has its own resource usage statistics, and if you collect custom metrics with something like Prometheus, those are stored per process as well. Because each scrape is answered by whichever process happens to receive it, the data looks jagged or simply wrong when you display and analyze it in a tool such as Grafana. The question is: how do we collect all these metrics and aggregate them in one place for easy consumption?
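
To make the problem concrete, here is a minimal sketch in plain JavaScript with no Prometheus involved. The four workers, the round-robin balancer, and the function names are assumptions made up for illustration. The point is only that a scrape routed to a single worker sees that worker's counter, while the sum over all workers is the figure you actually want.

```javascript
'use strict';

// Hypothetical stand-in for four clustered workers, each with its own counter.
const workers = [0, 0, 0, 0];
let next = 0;

// Round-robin "load balancer": returns the index of the worker that handles the call.
function pickWorker() {
  const index = next;
  next = (next + 1) % workers.length;
  return index;
}

// Ten application requests, each counted only in the worker that served it.
for (let i = 0; i < 10; i++) {
  workers[pickWorker()] += 1;
}

// Four scrapes through the same balancer: each one reads a single worker's counter.
const scrapes = [];
for (let i = 0; i < 4; i++) {
  scrapes.push(workers[pickWorker()]);
}

// What we actually want to report: the sum across all workers.
const total = workers.reduce((sum, value) => sum + value, 0);

console.log('per-scrape values:', scrapes);
console.log('aggregated total:', total);
```

No single scrape ever reports the real total, and the value moves up and down depending on which worker answers. That is the jagged graph described above.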

These methods use the npm package prom-client to work with Prometheus.
I will not go into detail on how to set up the package; instead, I will mostly show the code that does the aggregation.
In addition, we will use Express.js to serve the results over HTTP.

Node.js cluster mode

First, we need to set up the Prometheus client and an aggregator registry in our Node.js entry script (main.js).

'use strict';

const cluster = require('cluster');
const express = require('express');
const metricsServer = express();
const AggregatorRegistry = require('prom-client').AggregatorRegistry;
const aggregatorRegistry = new AggregatorRegistry();

// Note: since Node.js 16, cluster.isMaster is also available as cluster.isPrimary
if (cluster.isMaster) {
  // Start four worker processes
  for (let i = 0; i < 4; i++) {
    cluster.fork();
  }

  // The master process collects and aggregates the metrics of all workers
  metricsServer.get('/cluster_metrics', async (req, res) => {
    try {
      const metrics = await aggregatorRegistry.clusterMetrics();
      res.set('Content-Type', aggregatorRegistry.contentType);
      res.send(metrics);
    } catch (ex) {
      res.statusCode = 500;
      res.send(ex.message);
    }
  });

  metricsServer.listen(3001);
  console.log(
    'Cluster metrics server listening to 3001, metrics exposed on /cluster_metrics',
  );
} else {
  // Every worker runs the actual application
  require('./server.js');
}

Then, in our application script (server.js), we need to start metrics collection for every child process that is being run.

'use strict';

const express = require('express');
const client = require('prom-client');

const server = express();
const register = client.register;

// Enable collection of default metrics
client.collectDefaultMetrics({
  gcDurationBuckets: [0.001, 0.01, 0.1, 1, 2, 5], // These are the default buckets.
});

// All metrics of this worker process only
server.get('/metrics', async (req, res) => {
  try {
    res.set('Content-Type', register.contentType);
    res.end(await register.metrics());
  } catch (ex) {
    // res.end() accepts a string or a Buffer, not an Error object
    res.status(500).end(ex.message);
  }
});

// A single metric of this worker; assumes a metric named 'test_counter' is registered elsewhere
server.get('/metrics/counter', async (req, res) => {
  try {
    res.set('Content-Type', register.contentType);
    res.end(await register.getSingleMetricAsString('test_counter'));
  } catch (ex) {
    res.status(500).end(ex.message);
  }
});

const port = process.env.PORT || 3000;

console.log(
  `Server listening to ${port}, metrics exposed on /metrics endpoint`,
);

server.listen(port);

Run the application with node main.js.

Note: this may not work with third-party packages, which sometimes use custom registries.
If you want to use a custom registry, call the following from the child processes:

client.AggregatorRegistry.setRegistries(registryOrArrayOfRegistries)

PM2 Process Manager

For PM2, the idea is the same, but the implementation is a bit different. Since PM2 occupies the master process, we cannot set up the metrics collection server inside it. Instead, we will spawn a separate application to act as the metrics server.

I have the following ecosystem configuration file:

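const pm2config = {
  apps: [
    {
      // A single process that collects and serves the aggregated metrics
      name: 'Metrics Collector',
      instances: 1,
      exec_mode: 'fork',
      script: 'main.js',
      env: {
        METRICS_COLLECTOR: true,
      },
    },
    {
      // The actual application, running as four clustered workers
      name: 'My application',
      instances: 4,
      exec_mode: 'cluster',
      script: 'main.js',
    },
  ],
};

// PM2 reads the ecosystem configuration from the module's exports
module.exports = pm2config;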
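
One detail about the env block is worth keeping in mind: environment variables are always strings by the time a script reads them, so METRICS_COLLECTOR: true arrives as the string 'true'. A plain truthiness check on process.env.METRICS_COLLECTOR works for that, but it would also treat 'false' or '0' as enabled. The sketch below shows the coercion using Node.js's own process.env. The helper isMetricsCollector is hypothetical and not part of PM2.

```javascript
'use strict';

// Assigning to process.env converts the value to a string, which mirrors
// what a child process sees for any variable set by its parent.
process.env.METRICS_COLLECTOR = true;
console.log(typeof process.env.METRICS_COLLECTOR); // string

// Hypothetical helper: treat only the exact string 'true' as enabled.
function isMetricsCollector(env) {
  return env.METRICS_COLLECTOR === 'true';
}

const enabled = isMetricsCollector({ METRICS_COLLECTOR: 'true' });
const disabled = isMetricsCollector({ METRICS_COLLECTOR: 'false' });
const unset = isMetricsCollector({});

console.log({ enabled, disabled, unset });
```

Whether the stricter comparison is worth it depends on how the variable is set. With the configuration above, where it is only ever defined for the collector, the simple check is enough.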

Note that I am using the same script for the metrics server and for the application.
Here is the code of the main.js script:

const pm2 = require('pm2');
const promClient = require('prom-client');
const metricsServer = require('express')();
const runMyApp = require('./my-app');

if (process.env.METRICS_COLLECTOR) {
  // Latest metrics snapshot from each worker, keyed by its PM2 process id
  const metrics = {};

  const metricsServerPort = 9000;

  // The handler must be async, because registry.metrics() is awaited
  metricsServer.get('/', async (req, res) => {
    try {
      // This is how to get all metrics
      const registry = promClient.AggregatorRegistry.aggregate(
        Object.values(metrics),
      );
      // Serve the Prometheus text format rather than JSON
      res.set('Content-Type', registry.contentType);
      res.send(await registry.metrics());
    } catch (ex) {
      res.status(500).send(ex.message);
    }
  });

  metricsServer.listen(metricsServerPort, '0.0.0.0', () => {
    // process.send only exists when an IPC channel is available, as it is under PM2
    if (process.send) {
      process.send('ready');
    }

    console.log(`Metrics server listening on 0.0.0.0:${metricsServerPort}`);
  });

  // Ask the worker processes for metrics every 10 seconds
  setInterval(() => {
    pm2.connect((connectErr) => {
      if (connectErr) {
        console.error(connectErr);
        return;
      }

      // The name must match the worker app name in the ecosystem file
      pm2.describe('My application', (describeErr, processInfo) => {
        if (describeErr) {
          console.error(describeErr);
          return;
        }

        processInfo.forEach((processData) => {
          console.log(`Asking process ${processData.pm_id} for metrics.`);

          pm2.sendDataToProcessId(
            processData.pm_id,
            {
              // Data and topic are required
              data: null,
              topic: 'getMetrics',
              from: process.env.pm_id,
            },
            (err, res) => {},
          );
        });
      });
    });
  }, 10e3);

  // Store the replies of the workers
  process.on('message', (msg) => {
    if (msg.from !== process.env.pm_id && msg.topic === 'returnMetrics') {
      metrics[msg.from] = msg.data;
    }
  });
} else {
  // Reply to the collector with this worker's metrics as JSON
  process.on('message', (msg) => {
    if (msg.from !== process.env.pm_id && msg.topic === 'getMetrics') {
      pm2.connect(() => {
        promClient.register.getMetricsAsJSON().then((data) => {
          pm2.sendDataToProcessId(
            msg.from,
            {
              // Data and topic are required
              from: process.env.pm_id,
              data: data,
              topic: 'returnMetrics',
            },
            (err, res) => {},
          );
        });
      });
    }
  });

  // runMyApp() is expected to return the HTTP server, so it can be closed on shutdown
  const server = runMyApp();

  process.on('SIGINT', () => {
    console.info('SIGINT signal received.');

    // Close connections to PM2
    pm2.disconnect();

    // Stops the server from accepting new connections and finishes existing connections.
    server.close((err) => {
      if (err) {
        console.error(err);
        process.exit(1);
      }
      process.exit(0);
    });
  });
}

In the above example, you can see how to set up a communication pipeline to send messages between the different applications managed by PM2.
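
Stripped of the PM2 plumbing, the pipeline is a simple request and reply exchange keyed by topic. The sketch below replays one polling round inside a single process so that the message shapes are easy to see. The send function, the handler table, and the fake snapshots are stand-ins invented for illustration; the real code uses pm2.sendDataToProcessId and process.on('message') instead, and real snapshots come from prom-client.

```javascript
'use strict';

// Stand-in for PM2's message routing: process id -> message handler.
const handlers = new Map();
function send(toId, packet) {
  const handler = handlers.get(String(toId));
  if (handler) handler(packet);
}

const COLLECTOR_ID = '0';
const snapshots = {}; // latest reply from each worker, keyed by worker id

// Collector side: keep whatever the workers send back.
handlers.set(COLLECTOR_ID, (msg) => {
  if (msg.from !== COLLECTOR_ID && msg.topic === 'returnMetrics') {
    snapshots[msg.from] = msg.data;
  }
});

// Worker side: answer a 'getMetrics' request with a (fake) metrics snapshot.
function registerWorker(id, requestCount) {
  handlers.set(id, (msg) => {
    if (msg.from !== id && msg.topic === 'getMetrics') {
      send(msg.from, {
        from: id,
        topic: 'returnMetrics',
        data: [{ name: 'test_counter', value: requestCount }],
      });
    }
  });
}

registerWorker('1', 3);
registerWorker('2', 5);

// One polling round: the collector asks every worker for its metrics.
for (const workerId of ['1', '2']) {
  send(workerId, { from: COLLECTOR_ID, topic: 'getMetrics', data: null });
}

console.log(snapshots);
```

Because the collector only keeps the latest snapshot per worker, a scrape between two polling rounds returns data that can be up to one polling interval old.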
Now the only thing left is to implement your application’s Prometheus metric collection.

my-app.js

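const express = require('express');
const promBundle = require('express-prom-bundle');

// This middleware defines the configuration to use with Prometheus monitoring
const metricsMiddleware = promBundle({
  buckets: [0.1, 0.4, 0.7],
  includeMethod: true,
  includeStatusCode: true,
  includePath: true,
  promClient: {
    collectDefaultMetrics: {},
  },
  // Do not expose /metrics in each worker; the metrics collector serves the aggregate
  autoregister: false,
});

// Exported as a function, so that main.js starts the app in the worker processes only
function runMyApp() {
  const app = express();
  app.use(metricsMiddleware);

  // Return the HTTP server, so that the caller can close it on shutdown
  return app.listen(process.env.PORT, process.env.HOST, () => {
    console.log(
      `Running on http://${process.env.HOST}:${process.env.PORT}`,
    );
  });
}

// CommonJS exports, to match the require('./my-app') call in main.js
module.exports = runMyApp;
module.exports.metricsMiddleware = metricsMiddleware;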

Inspiration:

Some parts of the aggregation method were inspired by the following projects. I am adding links, as well as an explanation of why I didn’t use these projects directly, where that applies.

  • Prom-client - NPM Package - This package is what most other Prometheus packages use as a base. Its README describes how to set up metrics collection for a Node.js cluster application. I am using this package both directly and as a dependency of other packages (Express prom-bundle). However, the method described in its README doesn’t work directly with PM2 applications, since PM2 occupies the Node.js master process. From here, I’ve borrowed the general idea of collecting metrics and aggregating them in a single process.

  • PM2 Cluster Prometheus - PM2 Package - This PM2 package might be a “one-click” solution for you, as it implements everything that you will need. I will leave learning how to use it to you. I didn’t use this package because it’s outdated, and it has some odd dependencies (the npm packages consul and address) which are outdated too. Note: this package uses pmx, which appears to be deprecated.

  • PM2 Cluster Prometheus - NPM Package - This package has a totally different author than the previous one, so don’t be confused by the name. This is actually the package that I used to figure out how to aggregate the Prometheus statistics into one registry. It is another “one-click” solution that you might want to explore. I didn’t use this package because, even though it’s significantly newer, it’s still quite old and, according to Snyk.io, has some security issues.

  • PM2 Messages - A package for sending inter-process communication (IPC) messages in PM2. I didn’t use it because I don’t need it for my simple use case, and I want to keep my project’s dependencies to a minimum.
