The channels, bridges and endpoints scrape functions were
grabbing their respective global containers, getting the
count of entries, allocating metric arrays based on
that count, then iterating over the container. If the
global container had new objects added after the count
was taken and the metric arrays were allocated, we'd run
out of metric entries and attempt to write past the end
of the arrays.
Now each of the scape functions clone their respective
global containers and all operations are done on the
clone. Since the clone is stable between getting the
count and iterating over it, we can't run past the end
of the metrics array.
ASTERISK-29130
Reported-By: Francisco Correia
Reported-By: BJ Weschke
Reported-By: Sébastien Duthil
Change-Id: If0c8e40853bc0e9429f2ba9c7f5f358d90c311af
When monitoring Asterisk instances, it's often useful to know when an
outbound registration fails, as this often maps to the notion of a trunk
and having a trunk fail is usually a "bad thing". As such, this patch
adds monitoring metrics that track the state of PJSIP outbound registrations.
It does this by looking for the Registry events coming across the Stasis
system topic, and publishing those as metrics to Prometheus. Note that
while this may support other outbound registration types (IAX2, SIP, etc.)
those haven't been tested. Your mileage may vary.
(And why are you still using IAX2 and SIP? It's 2019 folks. Get with the
program.)
This patch also adds Sorcery observers to handle modifications to the
underlying PJSIP outbound registration objects. This is useful when a
reload is triggered that modifies the properties of an outbound registration,
or when ARI push configuration is used and an object is updated or
deleted. Because we rely on properties of the registration object to
define the metric (label key/value pairs), we delete the relevant metric when
we notice that something has changed and wait for a new Stasis message to
arrive to re-create the metric.
ASTERISK-28403
Change-Id: If01420e38530fc20b6dd4aa15cd281d94cd2b87e
This patch adds a few CLI commands to the res_prometheus module to aid
system administrators setting up and configuring the module. This includes:
* prometheus show status: Display basic statistics about the Prometheus
module, including its essential configuration, when it was last scraped,
and how long the scrape took. The last two bits of information are useful
when Prometheus isn't generating metrics appropriately, as it will at
least tell you if Asterisk has had its HTTP route hit by the remote
server.
* prometheus show metrics: Dump the current metrics to the CLI. Useful for
system administrators to see what metrics are currently available without
having to cURL or go to Prometheus itself.
ASTERISK-28403
Change-Id: Ic09813e5e14b901571c5c96ebeae2a02566c5172
This patch adds basic Asterisk bridge statistics to the res_prometheus
module. This includes:
* asterisk_bridges_count: The current number of bridges active on the
system.
* asterisk_bridges_channels_count: The number of channels active in a
bridge.
In all cases, enough information is provided with each bridge metric
to determine a unique instance of Asterisk that provided the data, along
with the technology, subclass, and creator of the bridge.
ASTERISK-28403
Change-Id: Ie27417dd72c5bc7624eb2a7a6a8829d7551788dc
This patch adds basic Asterisk endpoint statistics to the res_prometheus
module. This includes:
* asterisk_endpoints_state: The current state (unknown, online, offline)
for each defined endpoint.
* asterisk_endpoints_channels_count: The current number of channels
associated with a given endpoint.
* asterisk_endpoints_count: The current number of defined endpoints.
In all cases, enough information is provided with each endpoint metric
to determine a unique instance of Asterisk that provided the data, as well
as the underlying technology and resource definition.
ASTERISK-28403
Change-Id: I46443963330c206a7d12722d08dcaabef672310e
This patch adds basic Asterisk channel statistics to the res_prometheus
module. This includes:
* asterisk_calls_sum: A running sum of the total number of
processed calls
* asterisk_calls_count: The current number of calls
* asterisk_channels_count: The current number of channels
* asterisk_channels_state: The state of any particular channel
* asterisk_channels_duration_seconds: How long a channel has existed,
in seconds
In all cases, enough information is provided with each channel metric
to determine a unique instance of Asterisk that provided the data, as
well as the name, type, unique ID, and - if present - linked ID of each
channel.
ASTERISK-28403
Change-Id: I0db306ec94205d4f58d1e7fbabfe04b185869f59