asterisk

Commit Graph

Author	SHA1	Message	Date
Matt Jordan	0bb38796b7	res_prometheus: Add metrics for PJSIP outbound registrations When monitoring Asterisk instances, it's often useful to know when an outbound registration fails, as this often maps to the notion of a trunk and having a trunk fail is usually a "bad thing". As such, this patch adds monitoring metrics that track the state of PJSIP outbound registrations. It does this by looking for the Registry events coming across the Stasis system topic, and publishing those as metrics to Prometheus. Note that while this may support other outbound registration types (IAX2, SIP, etc.) those haven't been tested. Your mileage may vary. (And why are you still using IAX2 and SIP? It's 2019 folks. Get with the program.) This patch also adds Sorcery observers to handle modifications to the underlying PJSIP outbound registration objects. This is useful when a reload is triggered that modifies the properties of an outbound registration, or when ARI push configuration is used and an object is updated or deleted. Because we rely on properties of the registration object to define the metric (label key/value pairs), we delete the relevant metric when we notice that something has changed and wait for a new Stasis message to arrive to re-create the metric. ASTERISK-28403 Change-Id: If01420e38530fc20b6dd4aa15cd281d94cd2b87e	2019-05-22 08:25:19 -05:00
Matt Jordan	a2648b22eb	res_prometheus: Add CLI commands This patch adds a few CLI commands to the res_prometheus module to aid system administrators setting up and configuring the module. This includes: * prometheus show status: Display basic statistics about the Prometheus module, including its essential configuration, when it was last scraped, and how long the scrape took. The last two bits of information are useful when Prometheus isn't generating metrics appropriately, as it will at least tell you if Asterisk has had its HTTP route hit by the remote server. * prometheus show metrics: Dump the current metrics to the CLI. Useful for system administrators to see what metrics are currently available without having to cURL or go to Prometheus itself. ASTERISK-28403 Change-Id: Ic09813e5e14b901571c5c96ebeae2a02566c5172	2019-05-22 08:24:39 -05:00
Matt Jordan	066280f0cc	res_prometheus: Add Asterisk bridge metrics This patch adds basic Asterisk bridge statistics to the res_prometheus module. This includes: * asterisk_bridges_count: The current number of bridges active on the system. * asterisk_bridges_channels_count: The number of channels active in a bridge. In all cases, enough information is provided with each bridge metric to determine a unique instance of Asterisk that provided the data, along with the technology, subclass, and creator of the bridge. ASTERISK-28403 Change-Id: Ie27417dd72c5bc7624eb2a7a6a8829d7551788dc	2019-05-21 21:43:02 -05:00
Matt Jordan	ed6cd13b5b	res_prometheus: Add Asterisk endpoint metrics This patch adds basic Asterisk endpoint statistics to the res_prometheus module. This includes: * asterisk_endpoints_state: The current state (unknown, online, offline) for each defined endpoint. * asterisk_endpoints_channels_count: The current number of channels associated with a given endpoint. * asterisk_endpoints_count: The current number of defined endpoints. In all cases, enough information is provided with each endpoint metric to determine a unique instance of Asterisk that provided the data, as well as the underlying technology and resource definition. ASTERISK-28403 Change-Id: I46443963330c206a7d12722d08dcaabef672310e	2019-05-21 20:47:50 -05:00
Matt Jordan	0760af71ad	res_prometheus: Add Asterisk channel metrics This patch adds basic Asterisk channel statistics to the res_prometheus module. This includes: * asterisk_calls_sum: A running sum of the total number of processed calls * asterisk_calls_count: The current number of calls * asterisk_channels_count: The current number of channels * asterisk_channels_state: The state of any particular channel * asterisk_channels_duration_seconds: How long a channel has existed, in seconds In all cases, enough information is provided with each channel metric to determine a unique instance of Asterisk that provided the data, as well as the name, type, unique ID, and - if present - linked ID of each channel. ASTERISK-28403 Change-Id: I0db306ec94205d4f58d1e7fbabfe04b185869f59	2019-05-21 11:03:13 -05:00
Matt Jordan	c50f29dfad	Add core Prometheus support to Asterisk Prometheus is the de facto monitoring tool for containerized applications. This patch adds native support to Asterisk for serving up Prometheus-compatible metrics, such that a Prometheus server can scrape an Asterisk instance in the same fashion as it does other HTTP services. The core module in this patch provides an API that future work can build on top of. The API manages metrics in one of two ways: (1) Registered metrics. In this particular case, the API assumes that the metric (either allocated on the stack or on the heap) will have its value updated by the module registering it at will, and not just when Prometheus scrapes Asterisk. When a scrape does occur, the metrics are locked so that the current value can be retrieved. (2) Scrape callbacks. In this case, the API allows consumers to be called via a callback function when a Prometheus-initiated scrape occurs. The consumers of the API are responsible for populating the response to Prometheus themselves, typically using stack-allocated metrics that are then formatted properly into strings via this module's convenience functions. These two mechanisms balance the different ways in which information is generated within Asterisk: some information is generated in a fashion that makes it appropriate to update the relevant metrics immediately; some information is better to defer until a Prometheus server asks for it. Note that some care has been taken in how metrics are defined to minimize the impact on performance. Prometheus's metric definition and its support for nesting metrics based on labels - which are effectively key/value pairs - can make storage and managing of metrics somewhat tricky. While a naive approach, where we allow for any number of labels and perform a lot of heap allocations to manage the information, would absolutely have worked, this patch instead opts to try to place as much information in length-limited arrays, stack allocations, and vectors to minimize the performance impacts of scrapes. The author of this patch has worked on enough systems that were driven to their knees by poor monitoring implementations to be a bit cautious. Additionally, this patch only adds support for gauges and counters. Additional work to add summaries, histograms, and other Prometheus metric types may add value in the future. This would be of particular interest if someone wanted to track SIP response types. Finally, this patch includes unit tests for the core APIs. ASTERISK-28403 Change-Id: I891433a272c92fd11c705a2c36d65479a415ec42	2019-05-20 20:33:58 -05:00
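
The two mechanisms described in the core Prometheus commit (registered metrics and scrape callbacks) can be illustrated with a small sketch. This is not the res_prometheus C API: the `Metric` and `Registry` classes, their method names, and the `eid` label are all hypothetical and exist only to show the idea. The output follows the standard Prometheus text exposition format (`# HELP`, `# TYPE`, then a sample line).

```python
import threading
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Metric:
    """Hypothetical gauge or counter with label key/value pairs."""
    name: str
    kind: str  # "gauge" or "counter", the only types the commit mentions
    help_text: str
    labels: Dict[str, str] = field(default_factory=dict)
    value: int = 0
    lock: threading.Lock = field(default_factory=threading.Lock, repr=False)

    def to_text(self) -> str:
        """Format the metric in the Prometheus text exposition format."""
        label_text = ",".join(f'{k}="{v}"' for k, v in self.labels.items())
        sample = f"{self.name}{{{label_text}}}" if label_text else self.name
        return (f"# HELP {self.name} {self.help_text}\n"
                f"# TYPE {self.name} {self.kind}\n"
                f"{sample} {self.value}\n")


class Registry:
    """Hypothetical stand-in for the module's metric bookkeeping."""

    def __init__(self) -> None:
        self._registered: List[Metric] = []
        self._callbacks: List[Callable[[List[str]], None]] = []

    def register_metric(self, metric: Metric) -> None:
        # (1) Registered metric: the owner updates metric.value at will.
        self._registered.append(metric)

    def register_callback(self, callback: Callable[[List[str]], None]) -> None:
        # (2) Scrape callback: invoked only when a scrape happens.
        self._callbacks.append(callback)

    def scrape(self) -> str:
        out: List[str] = []
        for metric in self._registered:
            # Lock so the current value is read consistently during the scrape.
            with metric.lock:
                out.append(metric.to_text())
        for callback in self._callbacks:
            # The callback populates the response itself.
            callback(out)
        return "".join(out)


registry = Registry()

# Mechanism 1: a long-lived metric whose value changes between scrapes.
calls = Metric("asterisk_calls_count", "gauge",
               "The current number of calls", {"eid": "example"})
registry.register_metric(calls)
with calls.lock:
    calls.value = 3


# Mechanism 2: a short-lived metric built only when a scrape is requested.
def bridges_callback(out: List[str]) -> None:
    bridges = Metric("asterisk_bridges_count", "gauge",
                     "The current number of bridges", {"eid": "example"})
    bridges.value = 1
    out.append(bridges.to_text())


registry.register_callback(bridges_callback)
print(registry.scrape(), end="")
```

The sketch keeps labels in a plain dictionary for brevity; the commit message itself describes length-limited arrays and stack allocations, which a Python example cannot meaningfully reproduce.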

6 Commits