A regression was introduced where searching for realtime PJSIP objects
by regex by starting the regex with a leading "^" would cause no items
to be returned.
This was due to a change which attempted to drop the requirement for a
leading "^" to be present due to how some CLI commands formulate their
regexes. However, the change, rather than simply eliminating the
requirement, caused any regexes that did begin with "^" to end up not
returning the expected results.
This change fixes the problem by inspecting the regex and formulating
the realtime query differently depending on if it begins with "^".
ASTERISK-25702 #close
Reported by Nic Colledge
Patches:
realtime_retrieve_regex.patch submitted by Alexei Gradinari License #5691
Change-Id: I055df608a6e6a10732044fa737a9fe8dca602693
A recent change to queue channel variable setting to the Stasis control
queue caused a regression. When setting channel variables, it is
possible to give a NULL channel variable value in order to unset the
variable (i.e. remove it from the channel variable list). The change
introduced a call to ast_variable_new(), which is not tolerant of NULL
channel variable values.
This new change switches from using ast_variable to using a custom
channel variable struct that is lighter weight and NULL value-tolerant.
Change-Id: I784d7beaaa3c036ea936d103e7caf0bb1562162d
A test recently uncovered that running an ill-timed AMI command to show
inbound subscriptions could cause a crash since Asterisk will try to
operate on a freed subscription.
The fix for this is to remove the subscription tree from the list of
subscriptions at the time that we are sending our final NOTIFY request
out. This way, as the subscription is in the process of dying, it is
inaccessible from AMI.
Change-Id: Ic0239003d8d73e04c47c12dd2a7e23867e5b5b23
When queuing tasks onto the Stasis control queue, you can pass an
arbitrary data pointer and a function to free that data. All ARI
commands that use the Stasis control queue made the assumption that the
destructor function would be called in all paths, whether the task was
queued successfully or not. However, this was not correct. If a task was
queued onto a control structure that was already completed, the
allocated data would not be freed properly.
This patch corrects this by making sure that all return paths call the
data destructor.
Change-Id: Ibf06522094f8e5c4cce652537dc5d7222b1c4fcb
A crash occurred when attempting to set a channel variable on a channel
that had already been hung up. This is because there is a small window
between when a control is grabbed and when the channel variable is set
that the channel can be hung up.
The fix here is to queue the setting of the channel variable onto the
control queue. This way, the manipulation of the channel happens in a
thread where it is safe to be done.
In this change, I also noticed that the setting of bridge roles on
channels was being done outside of the control queue, so I also changed
those operations to be done in the control queue.
ASTERISK-25709 #close
Reported by Mark Michelson
Change-Id: I2a0a4d51bce6fba6f1d9954e40935e42f366ea78
Asterisk by default will create a single database connection and share
it among all threads that attempt to access the database. In previous
versions of Asterisk, this was tolerable, because the most used channel
driver, chan_sip, mostly accessed the database from a single thread.
With PJSIP, however, many threads may be attempting to perform database
operations, and there is the potential for many more database accesses,
meaning the concurrency is a horrible bottleneck if only one connection
is shared.
Asterisk has a connection pooling facility built into it, but the
implementation has flaws. For one, there is a strict limit on the number
of simultaneous connections that could be made to the database. Anything
beyond the maximum would result in a failed operation. Attempting to
predict what the maximum should be is nearly impossible even for someone
intimately familiar with Asterisk's threading model. In addition, use of
transactions in the dialplan can cause some severe bugs if connection
pooling is enabled.
This commit seeks to fix the concurrency problem by removing all
connection management code from Asterisk and leaving that to the
underlying unixODBC code instead. Now, Asterisk does not share a single
connection, nor does it try to maintain a connection pool. Instead, all
Asterisk ever does is request a connection from unixODBC and allow
unixODBC to either allocate those connections or retrieve them from a
pool.
Doing this has a bit of a ripple effect. For one, since connections are
not long-lived objects, several of the safeguards that previously
existed have been removed. We don't have to worry about trying to use a
connection that has gone stale. In every case, when we request a
connection, it has just been made and we don't need to perform any
sanity checks to be sure it's still active.
Another major player affected by this change is transactions.
Transactions and their respective connections were so tightly coupled
that it was almost pornographic. This code change moves
transaction-related code to its own file separate from the core ODBC
functionality. This way, the core of ODBC does not even have to know
that transactions exist.
In making this large change, I had to look at a lot of code and
understand it. When making this change, I discovered several places
where the behavior is definitely not ideal, but it seemed outside the
scope of this change to be fixing it. Instead, any place where I saw
some sort of room for improvement has had a XXX comment added explaining
what could be altered to improve it.
Change-Id: I37a84def5ea4ddf93868ce8105f39de078297fbf
Dump the res_pjsip endpt internals.
In non-developer mode we will not document or make easily accessible the
"details" option even though it is still available. The user has to know
it exists to use it. Presumably they would also be aware of the potential
crash warning below.
Warning: PJPROJECT documents that the function used by this CLI command
may cause a crash when asking for details because it tries to access all
active memory pools.
Change-Id: If2d98a3641c9873364d1daaad971376311aef3cb
res_pjsip_log_forwarder has been renamed to res_pjproject
and enhanced as follows:
As a follow-on to the recent 'Add CLI "pjsip show buildopts"' patch,
a new ast_pjproject_get_buildopt function has been added. It
allows the caller to get the value of one of the buildopts.
The initial use case is retrieving the runtime value of
PJ_MAX_HOSTNAME to insure we don't send a hostname greater
than pjproject can handle. Since it can differ between
the version of pjproject that Asterisk was compiled against
and the version of pjproject that Asterisk is running against,
we can't use the PJ_MAX_HOSTNAME macro directly in Asterisk
source code.
Change-Id: Iab6e82fec3d7cf00c1cf6185c42be3e7569dee1e
Added new global option (regcontext) to pjsip. When set, Asterisk will
dynamically create and destroy a NoOp priority 1 extension
for a given endpoint who registers or unregisters with us.
ASTERISK-25670 #close
Reported-by: Daniel Journo
Change-Id: Ib1530c5b45340625805c057f8ff1fb240a43ea62
There are two ways in which the reload() function in res_musiconhold can be
called from the CLI:
* module reload res_musiconhold.so
* moh reload
In the former case, the module loader holds a lock that prevents multiple
concurrent calls, but in the latter there is no such protection.
This patch changes the 'moh reload' CLI command to invoke the module loader
directly, rather than call reload() explicitly.
ASTERISK-25687 #close
Change-Id: I408968b4c8932864411b7f9ad88cfdc7b9ba711c
PJPROJECT has a function available to dump the compile time
options used when building the library.
* Add CLI "pjsip show buildopts" command.
* Update contrib/scripts/autosupport to get pjproject information.
Change-Id: Id93a6a916d765b2a2e5a1aeb54caaf83206be748
res_sorcery_realtime's search-by-regex callback performed a check to
ensure that the passed-in regex began with a caret (^). If it did not,
then no results would be returned.
This callback only started to become used when "like" support was added
to PJSIP CLI commands. The CLI command for listing objects would pass an
empty regex ("") to the sorcery backend if no "like" statement was
present. For most sorcery backends, this resulted in returning all
objects. However, for realtime, this resulted in returning no objects.
This commit seeks to fix the regression by removing the requirement from
res_sorcery_realtime for the passed-in-regex to begin with a caret.
ASTERISK-25689 #close
Reported by Marcelo Terres
Change-Id: I22b4dc5d7f3f11bb29ac2e42ef94682e9bab3b20
On a system with multiple ip addresses in the same subnet, if a
transport is bound to a specific ip address and endpoint/media_address
is set, the SIP/SDP will have the correct address in all fields but
the rtp stream MAY still originate from one of the other ip addresses,
most probably the "primary" ip address. This happens because
res_pjsip_sdp_rtp/create_rtp always calls ast_instance_new with
the "all" ip address (0.0.0.0 or ::).
The new option causes res_pjsip_sdp_rtp/create_rtp to call
ast_rtp_instance_new with the endpoint's media_address (if specified)
instead of the "all" address. This causes the packets to originate from
the specified address.
ASTERISK-25632
ASTERISK-25637
Reported-by: Olivier Krief
Reported-by: Dan Journo
Change-Id: I3dfaa079e54ba7fb7c4fd1f5f7bd9509bbf8bd88
Due to locking issues within pjnath these changes are being
reverted until pjnath can be changed.
ASTERISK-25645
Revert "res_rtp_asterisk.c: Fix DTLS negotiation delays."
This reverts commit 24ae124e4f.
Change-Id: I2986cfb2c43dc14455c1bcaf92c3804f9da49705
Revert "res_rtp_asterisk: Resolve further timing issues with DTLS negotiation"
This reverts commit 965a0eee46.
Change-Id: Ie68fafde27dad4b03cb7a1e27ce2a8502c3f7bbe
This reverts commit 0a9941de9d.
Matt,
This patch causes another problem and should not have been needed.
Before this patch, persistent_endpoint_contact_deleted_observer WAS
deleting the contact_status when ast_sip_location_delete_contact was
called. By deleting it yourself in ast_sip_location_delete_contact
it was gone before the observer could run and the observer therefore
was throwing an error and not sending stasis/AMI/statsd messages.
So, I don't think this was the cause of your original issue. I also
had verified the contact AMI and statsd lifecycle and it was working.
I'll double check now though.
ASTERISK-25675
Reported-by: Daniel Journo
Change-Id: Ib586a6b7f90acb641b0c410f659743ab90e84f1a
This change causes res_crypto to unregister CLI at shutdown while still
preventing the module from being unloaded.
ASTERISK-25673 #close
Change-Id: Ie5d57338dc2752abfc0dd05d0eec86413f2304fc
A deadlock was observed where the monitor thread was stuck, therefore
resulting in no incoming SIP traffic being processed.
The problem occurred when two 200 OK responses arrived in response to a
terminating NOTIFY request sent from Asterisk. The first 200 OK was
dispatched to a threadpool worker, who locked the corresponding
transaction. The second 200 OK arrived, resulting in the monitor thread
locking the dialog. At this point, the two threads are at odds, because
the monitor thread attempts to lock the transaction, and the threadpool
thread loops attempting to try to lock the dialog.
In this case, the fix is to not have the monitor thread attempt to hold
both the dialog and transaction locks at the same time. Instead, we
release the dialog lock before attempting to lock the transaction.
There have also been some debug messages added to the process in an
attempt to make it more clear what is going on in the process.
ASTERISK-25668 #close
Reported by Mark Michelson
Change-Id: I4db0705f1403737b4360e33a8e6276805d086d4a
The menuselect conflict between app_voicemail and res_mwi_external
makes it hard to package 1 version of Asterisk. There no actual
build dependencies between the 2 so moving this check to runtime
seems like a better solution.
The ast_vm_register and ast_vm_greeter_register functions in app.c
were modified to return AST_MODULE_LOAD_DECLINE instead of -1 if there
is already a voicemail module registered. The modules' load_module
functions were then modified to return DECLINE instead of -1 to the
loader. Since -1 is interpreted by the loader as AST_MODULE_LOAD_FAILURE,
the modules were incorrectly causing Asterisk to stop so this needed
to be cleaned up anyway.
Now you can build both and use modules.conf to decide which voicemail
implementation to load.
The default menuselect options still build app_voicemail and not
res_mwi_external but if both ARE built, res_mwi_external will load
first and become the voicemail provider unless modules.conf rules
prevent it. This is noted in CHANGES.
Change-Id: I7d98d4e8a3b87b8df9e51c2608f0da6ddfb89247
This patch adds a new module, res_pjsip_history, that provides a slightly
better way of debugging SIP message traffic on a busy Asterisk system. The
existing mechanisms all rely on passively dumping a SIP message to the CLI.
While this is perfectly fine for logging purposes and well controlled
environments, on many installations, the amount of SIP messages Asterisk
receives will quickly swamp the CLI. This makes it difficult to view/capture
those messages that you want to diagnose in real time.
This patch provides another way of handling this. When enabled, the module
will store SIP message traffic in memory. This traffic can then be queried
at leisure.
In order to make the querying useful, a CLI command has been implemented,
'pjsip show history', that supports a basic expression syntax similar to
SQL or other query languages. A small number of useful fields have been
added in this initial patch; additional fields can easily be added in
later improvements. Those fields are:
- number: The entry index in the history
- timestamp: The time the message was recieved
- addr: The source/destination address of the message
- sip.msg.request.method: The request method
- sip.msg.call-id: The Call-ID header
Note - this is a resurrection of the module initially proposed on Review Board
here: https://reviewboard.asterisk.org/r/4053/
Change-Id: I39bd74ce998e99ad5ebc0aab3e84df3a150f8e36
Updated ast_websocket_write to encode the entire frame in to one
write operation, to ensure that we don't end up with a situation
where the websocket header has been sent, while the body can not
be written.
Previous to August's patch in commit b9bd3c14, certain network
conditions could cause the header to be written, and then the
sub-sequent body to fail - which would cause the next successful
write to contain a new header, and a new body (resulting in
the peer receiving two headers - the second of which would be
read as part of the body for the first header).
This was patched to have both write operations individually fail
by closing the websocket.
In a case available to the submitter of this patch, the same
body which would consistently fail to write, would succeed
if written at the same time as the header.
This update merges the two operations in to one, adds debug messages
indicating the reason for a websocket connection being closed during
a write operation, and clarifies some variable names for code legibility.
Change-Id: I4db7a586af1c7a57184c31d3d55bf146f1a40598
In 450579e908, a change was made that removed the deletion of the
'contact_status' object when a 'contact' object is deleted in sorcery.
This unfortunately means that the 'contact_status' object persists, even when
something has explicitly removed a contact. The result is that the state of
the contact will not be regenerated if that contact is re-created, and the
stale state will be reported/used for that contact. It also results in
no ContactStatusChanged events being generated for either ARI or AMI.
This patch restores the deletion logic that was removed. Doing so now
results in the expected events being generated again.
Change-Id: I28789a112e845072308b5b34522690e3faf58f07
Resolves an edge case dtls negotiation delay for certain networks which
somehow manage to drop the rtcp side's packet when these are both sent
ast_rtp_remote_address_set, causing it to have to time-out and restart
the handshake.
Move dtls pending bio flush in to it's own function, and call it from
ast_rtp_on_ice_complete, when we're rtp->ice, rather than when
ast_rtp_remote_address_set.
Keep the existing flush from the recent change to res_rtp_remote_address_set
if ice is not being used.
ASTERISK-25614 #close
Reported-by: XenCALL
Tested by: XenCALL
Change-Id: Ie2caedbdee1783159f375589b6fd3845c8577ba5
This change introduces the configuration option 'full_backend_cache'
which changes the cache to be a full mirror of the backend instead
of a per-object cache. This allows all sorcery retrieval operations
to be carried out against it and is useful for object types which
are used in a "retrieve all" or "retrieve some" pattern.
ASTERISK-25625 #close
Change-Id: Ie2993487e9c19de563413ad5561c7403b48caab5
The JSON library Asterisk uses, jansson, is not thread
safe for us in a few ways. To help with this wrappers for JSON
object reference count increasing and decreasing were added
which use a global lock to ensure they don't clobber over
each other. This does not extend to reference count manipulation
within the jansson library itself. This means you can't safely
use the object borrowing specifier (O) in ast_json_pack and
you can't share JSON instances between objects.
This change removes uses of the O specifier and replaces them
with the o specifier and an explicit ast_json_ref. Some cases
of instance sharing have also been removed.
ASTERISK-25601 #close
Change-Id: I06550d8b0cc1bfeb56cab580a4e608ae4f1ec7d1
- Trigger pending DTLS packets to send out, once the RTP instance's remote
address is set.
- Avoids locking the DTLS structure unnecessarily by only doing this if
DTLS is passive.
- Add DTLS locks around the structurally sensitive calls in the SSL
portion of __rtp_recvfrom, since dtls_srtp_check_pending does not lock
inside of itself, and we're dealing with the SSL BIO in at least two
threads.
WebRTC channels may receive a DTLS handshake before
ast_rtp_remote_address_set is called, which causes there to be a pending
response to send out. Previous to 1ad827, this was handled by calling
dtls_srtp_check_pending on receipt of any RTP packet - a STUN or RTP
packet could trigger the pending handshake response. Since that was
rightfully removed, whenever the DTLS handshake is received before the
remote address is set, we would have to wait until another SSL packet
arrives.
As of Chrome M47's optimizations to their handshake process, WebRTC
conversations between Chrome M47+ and Asterisk, where Asterisk is passive,
experience a 1 second delay without this patch, because the SSL handshake
is received before ICE negotation stores the remote_address, and the next
SSL packet isn't received until after a 1 second timeout in Chrome, which
causes a new handshake request.
ASTERISK-25614 #close
Change-Id: I547f1be7e302dbf71f6553dd8cbc0657b1d0b908
pjproject < 2.5.0 will segfault on a tls transport if async_operations
is greater than 1. A runtime version check has been added to throw
an error if the version is < 2.5.0 and async_operations > 1.
To assist in the check, a new api "ast_compare_versions" was added
to utils which compares 2 major.minor.patch.extra version strings.
ASTERISK-25615 #close
Change-Id: I8e88bb49cbcfbca88d9de705496d6f6a8c938a98
Reported-by: George Joseph
Tested-by: George Joseph
Fixed a bug that originally would show a negative number of
active calls occuring in Asterisk. A gauge is persistent so
incrementing and decrementing it results in a more consistent
performance. Also changed to the call to StatsD to use
ast_statsd_log_string() so that a "+" could be sent to StatsD.
ASTERISK-25619 #close
Change-Id: Iaaeff5c4c6a46535366b4d16ea0ed0ee75ab2ee7
Both transport and endpoint now check for the existence and readability
of tls certificate and key files before passing them on to pjproject.
This will cause the object to not load rather than waiting for pjproject
to discover that there's a problem when a session is attempted.
NOTE: chan_sip also uses ast_rtp_dtls_cfg_parse but it's located
in build_peer which is gigantic and I didn't want to disturb it.
Error messages will emit but it won't interrupt chan_sip loading.
ASTERISK-25618 #close
Change-Id: Ie43f2c1d653ac1fda6a6f6faecb7c2ebadaf47c9
Reported-by: George Joseph
Tested-by: George Joseph
See ASTERISK-25615.
If the transport protocol is tls and async_operations > 1, pjproject
will segfault if more than one operation is attempted on the same socket.
Until this is fixed upstream, a check has been added to throw an error
if a tls transport config has async_operations set to > 1.
ASTERISK-25615
Change-Id: I76b9a5b2a5a0054fe71ca5851e635f2dca7685a6
Reported-by: George Joseph
Tested-by: George Joseph
It will never be perfect or even pretty, mostly because of the differences
between static and dynamic contacts.
Created:
Can't use the contact or contact_status alloc functions
because the objects come and go regardless of the actual state.
Can't use the contact_apply_handler, ast_sip_location_add_contact or
a sorcery created handler because they only get called for dynamic
contacts. Similarly, permanent_uri_handler only gets called for
static contacts.
So, Matt had it right. :) ast_res_pjsip_find_or_create_contact_status is
the only place it can go and not have duplicated code. Both
permanent_uri_handler and contact_apply_handler call find_or_create.
Removed:
Can't use the destructors for the same reason as above. The only
place to put this is in persistent_endpoint_contact_deleted_observer
which I believe is the "correct" place but even that will handle only
dynamic contacts. This doesn't called on shutdown however. There is
no hook to use for static contacts that may be removed because of a
config change while asterisk is in operation.
I moved the cleanup of contact_status from ast_sip_location_delete_contact
to the handler as well.
Status Change and RTT:
Although they worked fine where they were (in update_contact_status) I
moved them to persistent_endpoint_contact_status_observer to make it
more consistent with removed. There was logic there already to detect
a state change.
Finally, fixed a nit in permanent_uri_handler rmudgett reported
eralier.
ASTERISK-25608 #close
Change-Id: I4b56e7dfc3be3baaaf6f1eac5b2068a0b79e357d
Reported-by: George Joseph
Tested-by: George Joseph