6 Commits

Author SHA1 Message Date
George Joseph 53c702e1cc res_prometheus: Clone containers before iterating
The channels, bridges and endpoints scrape functions were
grabbing their respective global containers, getting the
count of entries, allocating metric arrays based on
that count, then iterating over the container.  If the
global container had new objects added after the count
was taken and the metric arrays were allocated, we'd run
out of metric entries and attempt to write past the end
of the arrays.

Now each of the scrape functions clones its respective
global container and performs all operations on the
clone.  Since the clone is stable between getting the
count and iterating over it, we can't run past the end
of the metrics array.

ASTERISK-29130
Reported-By: Francisco Correia
Reported-By: BJ Weschke
Reported-By: Sébastien Duthil

Change-Id: If0c8e40853bc0e9429f2ba9c7f5f358d90c311af
2021-04-02 07:37:41 -05:00
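
The fix described in the commit above (53c702e1cc) can be sketched in plain C. This is a minimal illustration of the snapshot-then-iterate idea, not the module's actual code: `struct container`, `container_clone()` and `scrape()` are hypothetical stand-ins, and the real module works on Asterisk's own container types rather than a bare array.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a global container that other threads may grow. */
struct container {
    int *items;
    size_t count;
};

/*
 * Copy the container so that its count and contents can no longer diverge.
 * In real code the copy would have to be taken while holding the
 * container's lock. Returns 0 on success, -1 on allocation failure.
 */
static int container_clone(const struct container *src, struct container *dst)
{
    dst->count = src->count;
    dst->items = NULL;
    if (src->count == 0) {
        return 0;
    }
    dst->items = malloc(src->count * sizeof(*dst->items));
    if (!dst->items) {
        dst->count = 0;
        return -1;
    }
    memcpy(dst->items, src->items, src->count * sizeof(*dst->items));
    return 0;
}

/*
 * Build one metric value per entry of the snapshot. The metric array is
 * sized from the same snapshot that is iterated, so the loop cannot run
 * past its end even if the global container grows in the meantime.
 * The caller frees *metrics. Returns the number of metrics, or -1 on error.
 */
static long scrape(const struct container *global, int **metrics)
{
    struct container snap;
    size_t i;

    *metrics = NULL;
    if (container_clone(global, &snap) != 0) {
        return -1;
    }
    if (snap.count > 0) {
        *metrics = calloc(snap.count, sizeof(**metrics));
        if (!*metrics) {
            free(snap.items);
            return -1;
        }
    }
    for (i = 0; i < snap.count; i++) {
        (*metrics)[i] = snap.items[i] * 2; /* placeholder "metric" computation */
    }
    free(snap.items);
    return (long) snap.count;
}
```

The point of the design is that the count used for allocation and the count used for iteration come from the same immutable copy, at the cost of one extra allocation per scrape.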
Matt Jordan 0bb38796b7 res_prometheus: Add metrics for PJSIP outbound registrations
When monitoring Asterisk instances, it's often useful to know when an
outbound registration fails, as this often maps to the notion of a trunk
and having a trunk fail is usually a "bad thing". As such, this patch
adds monitoring metrics that track the state of PJSIP outbound registrations.
It does this by looking for the Registry events coming across the Stasis
system topic, and publishing those as metrics to Prometheus. Note that
while this may support other outbound registration types (IAX2, SIP, etc.),
those haven't been tested. Your mileage may vary.

(And why are you still using IAX2 and SIP? It's 2019, folks. Get with the
program.)

This patch also adds Sorcery observers to handle modifications to the
underlying PJSIP outbound registration objects. This is useful when a
reload is triggered that modifies the properties of an outbound registration,
or when ARI push configuration is used and an object is updated or
deleted. Because we rely on properties of the registration object to
define the metric (label key/value pairs), we delete the relevant metric when
we notice that something has changed and wait for a new Stasis message to
arrive to re-create the metric.

ASTERISK-28403

Change-Id: If01420e38530fc20b6dd4aa15cd281d94cd2b87e
2019-05-22 08:25:19 -05:00
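
The delete-then-recreate lifecycle described in the commit above (0bb38796b7) can be sketched as follows. This is a rough illustration under stated assumptions: the fixed-size table, the `reg_metric` struct and the `on_registry_event()` / `on_object_changed()` callbacks are hypothetical names, and they do not reproduce the Stasis or Sorcery APIs the module actually uses.

```c
#include <stdio.h>
#include <string.h>

#define MAX_METRICS 8

/* Hypothetical metric keyed by registration name; server_uri acts as a label. */
struct reg_metric {
    char name[32];
    char server_uri[64];
    int registered;
    int in_use;
};

static struct reg_metric table[MAX_METRICS];

/* Look up a live metric by registration name, or return NULL. */
static struct reg_metric *find_metric(const char *name)
{
    size_t i;

    for (i = 0; i < MAX_METRICS; i++) {
        if (table[i].in_use && strcmp(table[i].name, name) == 0) {
            return &table[i];
        }
    }
    return NULL;
}

/*
 * Registration event handler: update the metric if it exists, otherwise
 * create it from the properties carried by the event.
 * Returns 0 on success, -1 if the table is full.
 */
static int on_registry_event(const char *name, const char *uri, int registered)
{
    struct reg_metric *m = find_metric(name);
    size_t i;

    for (i = 0; !m && i < MAX_METRICS; i++) {
        if (!table[i].in_use) {
            m = &table[i];
            m->in_use = 1;
            snprintf(m->name, sizeof(m->name), "%s", name);
            snprintf(m->server_uri, sizeof(m->server_uri), "%s", uri);
        }
    }
    if (!m) {
        return -1;
    }
    m->registered = registered;
    return 0;
}

/*
 * Observer callback for an updated or deleted configuration object: the
 * labels may now be stale, so drop the metric and let the next event
 * recreate it with fresh values.
 */
static void on_object_changed(const char *name)
{
    struct reg_metric *m = find_metric(name);

    if (m) {
        memset(m, 0, sizeof(*m));
    }
}
```

Dropping the metric rather than patching its labels in place keeps the observer simple; the trade-off is a short gap in the series until the next registration event arrives.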
Matt Jordan a2648b22eb res_prometheus: Add CLI commands
This patch adds a few CLI commands to the res_prometheus module to aid
system administrators in setting up and configuring the module. This includes:

* prometheus show status: Display basic statistics about the Prometheus
  module, including its essential configuration, when it was last scraped,
  and how long the scrape took. The last two bits of information are useful
  when Prometheus isn't generating metrics appropriately, as it will at
  least tell you if Asterisk has had its HTTP route hit by the remote
  server.

* prometheus show metrics: Dump the current metrics to the CLI. Useful for
  system administrators to see what metrics are currently available without
  having to cURL or go to Prometheus itself.

ASTERISK-28403

Change-Id: Ic09813e5e14b901571c5c96ebeae2a02566c5172
2019-05-22 08:24:39 -05:00
Matt Jordan 066280f0cc res_prometheus: Add Asterisk bridge metrics
This patch adds basic Asterisk bridge statistics to the res_prometheus
module. This includes:

* asterisk_bridges_count: The current number of bridges active on the
  system.

* asterisk_bridges_channels_count: The number of channels active in a
  bridge.

In all cases, enough information is provided with each bridge metric
to determine a unique instance of Asterisk that provided the data, along
with the technology, subclass, and creator of the bridge.

ASTERISK-28403

Change-Id: Ie27417dd72c5bc7624eb2a7a6a8829d7551788dc
2019-05-21 21:43:02 -05:00
Matt Jordan ed6cd13b5b res_prometheus: Add Asterisk endpoint metrics
This patch adds basic Asterisk endpoint statistics to the res_prometheus
module. This includes:

* asterisk_endpoints_state: The current state (unknown, online, offline)
  for each defined endpoint.

* asterisk_endpoints_channels_count: The current number of channels
  associated with a given endpoint.

* asterisk_endpoints_count: The current number of defined endpoints.

In all cases, enough information is provided with each endpoint metric
to determine a unique instance of Asterisk that provided the data, as well
as the underlying technology and resource definition.

ASTERISK-28403

Change-Id: I46443963330c206a7d12722d08dcaabef672310e
2019-05-21 20:47:50 -05:00
Matt Jordan 0760af71ad res_prometheus: Add Asterisk channel metrics
This patch adds basic Asterisk channel statistics to the res_prometheus
module. This includes:

* asterisk_calls_sum: A running sum of the total number of
  processed calls

* asterisk_calls_count: The current number of calls

* asterisk_channels_count: The current number of channels

* asterisk_channels_state: The state of any particular channel

* asterisk_channels_duration_seconds: How long a channel has existed,
  in seconds

In all cases, enough information is provided with each channel metric
to determine a unique instance of Asterisk that provided the data, as
well as the name, type, unique ID, and - if present - linked ID of each
channel.

ASTERISK-28403

Change-Id: I0db306ec94205d4f58d1e7fbabfe04b185869f59
2019-05-21 11:03:13 -05:00
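
The metrics listed in the commits above are each published with identifying labels. A minimal sketch of how one such sample could be rendered in the Prometheus text exposition format (`name{key="value",...} value`) follows. The `format_metric()` helper and the label keys used in the usage note are assumptions made for illustration, not the module's actual function or label set.

```c
#include <stdio.h>

/* One key/value label pair attached to a metric sample. */
struct label {
    const char *key;
    const char *value;
};

/*
 * Render a single sample as: name{k1="v1",k2="v2"} value\n
 * Label values are written as-is; a full implementation would also need
 * to escape backslashes, double quotes and newlines in them.
 * Returns the number of bytes written, or -1 if buf is too small.
 */
static int format_metric(char *buf, size_t len, const char *name,
                         const struct label *labels, size_t nlabels,
                         double value)
{
    size_t used;
    size_t i;
    int n;

    n = snprintf(buf, len, "%s", name);
    if (n < 0 || (size_t) n >= len) {
        return -1;
    }
    used = (size_t) n;

    for (i = 0; i < nlabels; i++) {
        /* The first label opens the brace, later ones are comma separated. */
        n = snprintf(buf + used, len - used, "%s%s=\"%s\"",
                     i == 0 ? "{" : ",", labels[i].key, labels[i].value);
        if (n < 0 || (size_t) n >= len - used) {
            return -1;
        }
        used += (size_t) n;
    }
    if (nlabels > 0) {
        n = snprintf(buf + used, len - used, "}");
        if (n < 0 || (size_t) n >= len - used) {
            return -1;
        }
        used += (size_t) n;
    }

    n = snprintf(buf + used, len - used, " %g\n", value);
    if (n < 0 || (size_t) n >= len - used) {
        return -1;
    }
    used += (size_t) n;
    return (int) used;
}
```

Called with the name `asterisk_channels_count`, a single hypothetical label `eid` and the value 3, the helper writes one line consisting of the metric name, the label in braces and the value, terminated by a newline.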