28040 Commits

Author SHA1 Message Date
Alexei Gradinari 820ed3d4b3 fix: memory leaks, resource leaks, out of bounds and bugs
ASTERISK-26119 #close

Change-Id: Iecbf7d0f360a021147344c4e83ab242fd1e7512c
2016-06-20 13:08:18 -04:00
zuul 947f76a971 Merge "chan_sip: bigger buffers for headers, better failure mode" 2016-06-16 17:59:32 -05:00
Richard Mudgett 3c80f84cd0 res_pjsip_transport_management.c: Misc cleanups to survive shutdown.
* In unload_module(), reordered destroying things to minimize the window
that the global transports container could be used by other threads on
shutdown.  When shutting down you need to stop things in the opposite
order of creation.

* Put the global transports container into an AO2_GLOBAL_OBJ_STATIC to
eliminate the crash potential by other threads using the container on
shutdown.

* Made struct monitored_transport.sip_received not use
ast_atomic_fetchadd_int() since it is used as a boolean value that is only
set TRUE.  It was previously incremented for every received SIP message
and could theoretically overflow.

* In monitored_transport_state_callback(), allocated the monitored
transport object without a lock since the lock was unused.

* In keepalive_global_loaded(), removed releasing the transports container
if the keepalive_thread could not be started.  I set it up to be tried
again if the user reloads the configuration.

Change-Id: I8d12d16ef564290fa6d25a32334bb5ce8fdf87ff
2016-06-15 14:43:36 -05:00
Richard Mudgett 7c59f2126f res_pjsip.c: Add check that timer actually got scheduled.
Change-Id: Iabaa2e5dccf0762c258101ea0eb1487cf6959ad1
2016-06-14 16:46:49 -05:00
zuul 181766748f Merge "res_pjsip_session.c: Reorganize ast_sip_session_terminate()." 2016-06-14 13:36:41 -05:00
Richard Mudgett 51cc5c31c4 res_rtp_multicast.c: Fix warning message typo.
Change-Id: Ic9928208b9957e09866abe3d9649030942ec52b3
2016-06-13 13:35:08 -05:00
Richard Mudgett 3d0632a9c2 res_pjsip_session.c: Reorganize ast_sip_session_terminate().
Change-Id: I68a2128bcba4830985d2d441e70dfd1ac5bd712b
2016-06-10 17:40:06 -05:00
zuul d9b5aea9c3 Merge "core: Not the configured but granted number of possible file descriptors." 2016-06-10 15:50:35 -05:00
Alexander Traud ac683f13c9 core: Not the configured but granted number of possible file descriptors.
With CLI "core show settings", only the maxfiles parameter from the file
asterisk.conf was shown. If that parameter was not set, nothing was displayed,
although the environment might have set a default number itself. And if maxfiles
was not granted (or only partially granted), the configured maxfiles value was
still shown. Now, the maximum number of possible file descriptors in the
environment is shown.
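
A minimal sketch of how the granted limit can be read, assuming a POSIX system. It uses the standard getrlimit() call with RLIMIT_NOFILE. The helper name granted_max_files() is hypothetical and is not taken from the Asterisk source.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/resource.h>

/*
 * Hypothetical helper (not from the Asterisk source): store the soft
 * limit on open file descriptors granted by the environment.
 * Returns 0 on success, -1 if getrlimit() fails.
 */
static int granted_max_files(rlim_t *granted)
{
	struct rlimit limits;

	assert(granted != NULL);
	if (getrlimit(RLIMIT_NOFILE, &limits) != 0) {
		return -1;
	}
	/* rlim_cur is the current (soft) limit; rlim_max is the hard ceiling. */
	*granted = limits.rlim_cur;
	return 0;
}
```

The soft limit (rlim_cur) is the value that actually bounds the process, so it is the number worth reporting when the configured maxfiles may not have been granted.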

ASTERISK-26097

Change-Id: I2df5c58863b5007b34b77adbe28b885dfcdf7e0b
2016-06-10 21:04:44 +02:00
Joshua Colp d1006faba9 Merge "astfd: With RLIMIT_NOFILE only the current value is sensible." 2016-06-10 13:46:48 -05:00
Joshua Colp 4eb8cf2684 translate: Enables native Packet-Loss Concealment (PLC) for supporting codecs.
This reverts commit 5bfef2a8b4 as it
caused fax test failures.

ASTERISK-25629

Change-Id: I79de974dc4f63a1cafe0d2509169fd9a6b3cbaf4
2016-06-10 12:40:00 -03:00
Alexander Traud 0bf1a53db3 astfd: With RLIMIT_NOFILE only the current value is sensible.
With menuselect "DEBUG_FD_LEAKS" and CLI "core show fd", both the maximum max
and current max of possible file descriptors were shown. Both always show the
same value. To avoid confusing users, only the current maximum is shown now.

ASTERISK-26097

Change-Id: I49cf7952d73aec9e3f6a88942842c39be18380fa
2016-06-10 10:13:20 +02:00
zuul dcb6875428 Merge "cel: Ensure only one dial status per channel exists." 2016-06-09 22:38:52 -05:00
zuul 39e6d80937 Merge "ARI: Ensure proper channel state on operations." 2016-06-09 21:50:07 -05:00
zuul 850a09b099 Merge "test_http_media_cache: Fix failing test." 2016-06-09 21:50:05 -05:00
zuul 88dfcd21b2 Merge "chan_sip: Support auth username for callbackextension feature" 2016-06-09 21:35:42 -05:00
Joshua Colp 9e4efe9e16 Merge "res_pjsip_registrar.c: Eliminate rx REGISTER request race condition." 2016-06-09 16:45:59 -05:00
Joshua Colp 914a1502fa Merge "stasis: Add setting subscription congestion levels." 2016-06-09 16:45:54 -05:00
Joshua Colp 67a45e0a38 Merge "sorcery: Add setting object type congestion levels." 2016-06-09 16:45:48 -05:00
Joshua Colp 48c2c3b8da Merge "taskprocessors: Implement high/low water mark alerts." 2016-06-09 16:45:44 -05:00
Joshua Colp 2b840fbc3f Merge "res_pjsip_session: Use distributor serializer for incoming calls." 2016-06-09 16:45:39 -05:00
Joshua Colp 66e1e0969b Merge "res_pjsip_pubsub.c: Recreate subscriptions using distributor serializer." 2016-06-09 16:45:34 -05:00
Joshua Colp 2b2ca82e71 Merge "res_pjsip_pubsub.c: Use distributor serializer for incoming subscriptions." 2016-06-09 16:45:29 -05:00
Joshua Colp 9acb5e3084 Merge "pjsip_distributor.c: Consistently pick a serializer for messages." 2016-06-09 16:45:24 -05:00
zuul f5ffcb1b72 Merge "pjsip_distributor.c: Ignore messages until fully booted." 2016-06-09 16:17:33 -05:00
Joshua Colp d338343dac cel: Ensure only one dial status per channel exists.
CEL wrongly assumed that a channel would only have a single dial
event on it. This is incorrect. Particularly in a queue each
call attempt to a member will result in a dial event, adding
a new dial status in CEL without removing the old one. This
would cause the container to grow with only one dial status
being removed when the channel went away. The other dial status
entries would remain leaking memory.

This change fixes the memory leak by ensuring that only one dial
status will ever exist for each channel.

The behavior during the scenario where multiple events are received
has also been improved. For failure cases the first failure will
be the dial status. If an answer dial status is received, though,
it will take priority and the dial status for the channel will be
answer.
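
The priority rule described above can be sketched as a small merge function. This is an illustration only: the enum values and the name merge_dial_status() are hypothetical and do not come from the CEL code.

```c
#include <assert.h>

/* Hypothetical dial status values; the real CEL code stores its own data. */
enum dial_status_sketch {
	DIAL_NONE = 0,
	DIAL_BUSY,
	DIAL_NOANSWER,
	DIAL_CONGESTION,
	DIAL_ANSWER
};

/*
 * Combine the status already stored for a channel with a newly received
 * one, so that only a single status is ever kept per channel.
 */
static enum dial_status_sketch merge_dial_status(
	enum dial_status_sketch current, enum dial_status_sketch incoming)
{
	assert(incoming != DIAL_NONE);

	if (incoming == DIAL_ANSWER) {
		/* An answer always takes priority. */
		return DIAL_ANSWER;
	}
	if (current == DIAL_NONE) {
		/* The first failure becomes the dial status. */
		return incoming;
	}
	/* Later failures do not replace what is already stored. */
	return current;
}
```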

Memory usage has also been decreased by storing the minimal
amount of information and the code has been cleaned up slightly.

ASTERISK-25262 #close

Change-Id: I5944eb923db17b6a0faa7317ff6abc9307c009fe
2016-06-09 14:46:04 -05:00
Mark Michelson 1fd3a7849e ARI: Ensure proper channel state on operations.
ARI was recently outfitted with operations to create and dial channels.
This leads to the ability to try funny stuff. You could create a channel
and then immediately try to play back media on it. You could create a
channel, dial it, and while it is ringing attempt to make it continue in
the dialplan.

This commit attempts to fix this by adding a channel state check to
operations that should not be able to operate on outbound channels that
have not yet answered. If a channel is in an invalid state, we will send
a 412 response.
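
A rough sketch of such a state check follows. The types and the function name are hypothetical stand-ins rather than the ARI implementation. Only the 412 response code is taken from the description above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical channel model, not the real ast_channel. */
enum ari_state_sketch { STATE_DOWN, STATE_RINGING, STATE_UP };

struct ari_channel_sketch {
	enum ari_state_sketch state;
	int outbound; /* nonzero if the channel was created and dialed out */
};

#define HTTP_PRECONDITION_FAILED 412

/*
 * Return 0 if the operation may proceed, or 412 if the channel is an
 * outbound channel that has not yet answered.
 */
static int check_operation_allowed(const struct ari_channel_sketch *chan)
{
	assert(chan != NULL);

	if (chan->outbound && chan->state != STATE_UP) {
		return HTTP_PRECONDITION_FAILED;
	}
	return 0;
}
```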

ASTERISK-26047 #close
Reported by Mark Michelson

Change-Id: I2ca51bf9ef2b44a1dc5a73f2d2de35c62c37dfd8
2016-06-09 14:43:15 -05:00
Mark Michelson 10019dc70c test_http_media_cache: Fix failing test.
The retrieve_cache_control_directives test has been failing occasionally
in Jenkins. The apparent failure occurs when attempting to validate the
expiration of the retrieved file.

After reproducing, the problem was pretty clear. At the beginning of the
test, the current time is retrieved. The seconds value of this timestamp
is X. When the file is retrieved, res_http_media_cache calculates the
expiration and in doing so retrieves the current time. In most cases,
since the test executes quickly, it will also retrieve a timestamp with
X seconds. However, if the test starts very near to when the timestamp
seconds are set to increment, res_http_media_cache may retrieve a
timestamp with X+1 seconds instead.

The test attempted to account for this by allowing a tolerance of 1
second when validating the expiration. However, the problem was that the
comparisons being used in the validation used > and < operations. This
meant that values that fell within the tolerance (because they equaled
the upper bound of the tolerance) would fail.

The solution is to use >= and <= operators in the expiration validation.

However, I estimated that while the one second tolerance should be
fine on most machines, it would still be possible on a very slow machine
to end up falling outside the one second tolerance. So I have also
relaxed the tolerance of expiration validation to be three seconds
instead.

The final change here is to add a debug message when validating
expiration so that we can see what values are being compared.
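
The difference between the strict and inclusive comparisons can be illustrated with a small sketch. The function name and signature are hypothetical. Only the inclusive operators and the three second tolerance come from the description above.

```c
#include <assert.h>
#include <time.h>

/* Tolerance in seconds, as described in the commit message. */
#define EXPIRATION_TOLERANCE 3

/*
 * Hypothetical check: return nonzero if 'actual' lies within
 * 'tolerance' seconds of 'expected', bounds included. With strict
 * < and > operators, a value exactly on the bound would be rejected.
 */
static int expiration_within_tolerance(time_t actual, time_t expected,
	int tolerance)
{
	assert(tolerance >= 0);

	return actual >= expected - tolerance
		&& actual <= expected + tolerance;
}
```

With a tolerance of 1, an actual value of X+1 against an expected X now passes, which is exactly the case that failed intermittently.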

ASTERISK-25959 #close
Reported by Joshua Colp

Change-Id: Ic1a0e10722c1c5d276d5a4d6a67136d6ec26c247
2016-06-09 14:25:05 -05:00
zuul 0388c40b8c Merge "chan_pjsip: Lock channel when checking for RTP changes." 2016-06-09 13:53:58 -05:00
George Joseph f0855358a6 cdr.c: Remove assert in base_process_dial_end
Scenario: Caller blonde transfer
Bob calls Charlie who answers.
Bob puts Charlie on hold and calls Alice.
Before Alice answers, Bob transfers Charlie to Alice.

Charlie's channel triggers an assert because he gets an "ANSWERED"
event even though he never dialed anything. With recent changes to dial
events, this is now a valid scenario so the assert needed to be removed.

ASTERISK-26103 #close

Change-Id: I2679b517b696e7952ab7fb29403df9140e7d1de2
2016-06-09 11:03:45 -05:00
Mark Michelson cdb7edbe7b chan_pjsip: Lock channel when checking for RTP changes.
bridge_native_rtp can call into an RTP-capable channel driver in order
for the driver to update information about who the channel is
communicating with. For SIP channel drivers, this means deactivating
RTCP and sending a reinvite so that the endpoints can communicate
directly.

bridge_native_rtp does the right thing and has the channel locked when
calling into the channel driver. chan_pjsip can't alter session
properties in this thread, though. chan_pjsip queues a task on the
session serializer in order to update properties there.

The problem is that this queued task was not locking the channel. This
meant that the queued task could attempt to deactivate RTCP at the same
time that the channel thread was attempting to process an incoming RTCP
packet. This could lead to a crash.

This patch fixes the issue by locking the channel in the queued task
when altering RTP properties.
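
The shape of the fix can be sketched with a plain pthread mutex standing in for the channel lock. Everything here is hypothetical illustration: the struct, the field names, and the task function are not the chan_pjsip code.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a channel; not the real ast_channel. */
struct locked_channel_sketch {
	pthread_mutex_t lock;
	int rtcp_active;
};

static void locked_channel_sketch_init(struct locked_channel_sketch *chan)
{
	assert(chan != NULL);
	pthread_mutex_init(&chan->lock, NULL);
	chan->rtcp_active = 1;
}

/*
 * Task body that would run on the session serializer thread. Taking the
 * channel lock here keeps it from racing with the channel thread, which
 * holds the same lock while processing an incoming RTCP packet.
 */
static int deactivate_rtcp_task(void *data)
{
	struct locked_channel_sketch *chan = data;

	assert(chan != NULL);
	pthread_mutex_lock(&chan->lock);
	chan->rtcp_active = 0;
	pthread_mutex_unlock(&chan->lock);
	return 0;
}
```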

ASTERISK-26092 #close
Reported by Niklas Larsson

Change-Id: I3464e226a3c41f6b915f97891e07fa1599e2a159
2016-06-09 10:43:46 -05:00
Richard Mudgett 04ec9c745e res_pjsip_registrar.c: Eliminate rx REGISTER request race condition.
This patch fixes a race condition processing received REGISTER requests
and their retransmissions caused by REGISTER requests being processed by
two threads.  The "sip_transaction Unable to register REGISTER transaction
(key exists)" message is a notable symptom of this issue.

This issue was more likely to happen before the pjsip/distributor
serializers were created.  Before then, instead of steps one and two below
placing the REGISTER messages into the same pjsip/distributor serializer,
they were placed in random pjsip/default serializers.

1) REGISTER requests come in and get placed on the pjsip/distributor
serializer.

2) Before the first request is processed a retransmission comes in and is
placed on the same pjsip/distributor serializer.

3) The first request goes up the pjsip stack and is then shunted off to
the pjsip/aor/<aor> serializer.

4) Before the first request is completed processing in the pjsip/aor/<aor>
serializer, the second request goes up the pjsip stack and is also shunted
off to the pjsip/aor/<aor> serializer.

5) The first request completes processing and sends out its response.

6) The second request completes processing and tries to send out its
response but pjlib complains that the REGISTER transaction key already
exists.

7) Sadness ensues.

* The race is eliminated by removing the pjsip/aor/<aor> serializer and
continuing the processing in the pjsip/distributor serializer.  Now any
retransmissions queued in the pjsip/distributor serializer will be
processed after the first message is completely processed.

ASTERISK-26088 #close
Reported by:  Richard Mudgett

Change-Id: I842d714346088bf717ea27437f1dd85bff0bab5a
2016-06-09 10:32:07 -05:00
Richard Mudgett dcfef53ee2 stasis: Add setting subscription congestion levels.
Stasis subscriptions and message routers create taskprocessors to process
the event messages.  API calls are needed to be able to set the congestion
levels of these taskprocessors for selected subscriptions and message
routers.

* Updated CDR, CEL, and manager's stasis subscription congestion levels
based upon stress testing.  Increased the congestion levels to reduce the
potential for bursty call setup/teardown activity from triggering the
taskprocessor overload alert.  CDRs in particular need an extra high
congestion level because they can take a while to process the stasis
messages.

ASTERISK-26088
Reported by:  Richard Mudgett

Change-Id: Id0a716394b4eee746dd158acc63d703902450244
2016-06-09 10:32:07 -05:00
Richard Mudgett 4879cd875c sorcery: Add setting object type congestion levels.
Sorcery creates taskprocessors for object types to process object observer
callbacks.  An API call is needed to be able to set the congestion levels
of these taskprocessors for selected object types.

* Updated PJSIP's contact and contact_status sorcery object type observer
default congestion levels based upon stress testing.  Increased the
congestion levels to reduce the potential for bursty register/unregister
and subscribe/unsubscribe activity from triggering the taskprocessor
overload alert.

ASTERISK-26088
Reported by:  Richard Mudgett

Change-Id: I4542e83b556f0714009bfeff89505c801f1218c6
2016-06-09 10:32:07 -05:00
Richard Mudgett 2cd67d5b07 taskprocessors: Implement high/low water mark alerts.
When taskprocessors get backed up, there is a good chance that we are
being overloaded and need to defer adding new work to the system.

* Implemented a high/low water alert mechanism for modules to check if the
system is being overloaded and take appropriate action.  When a
taskprocessor is created it has default congestion levels set.  A
taskprocessor can later have those congestion levels altered for specific
needs if stress testing shows that the taskprocessor is a symptom of
overloading or needs to handle bursty activity without triggering an
overload alert.

* Add CLI "core show taskprocessor" low/high water columns.

* Fixed __allocate_taskprocessor() to not use RAII_VAR().  RAII_VAR() was
never a good thing to use when creating a taskprocessor because of the
nature of how its references needed to be cleaned up on a partial
creation.

* Made res_pjsip's distributor check if the taskprocessor overload alert
is active before placing a message representing brand new work onto a
distributor serializer.
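
A minimal sketch of a high/low water mark alert, assuming the alert is raised when the queue reaches the high water level and cleared only once it drains to the low water level. The struct and function names are hypothetical and are not the taskprocessor API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical queue model; not the real ast_taskprocessor. */
struct tps_sketch {
	long queue_size;
	long low_water;
	long high_water;
	bool alert_active;
};

static void tps_sketch_init(struct tps_sketch *tps, long low_water,
	long high_water)
{
	assert(tps != NULL);
	assert(0 <= low_water && low_water < high_water);
	tps->queue_size = 0;
	tps->low_water = low_water;
	tps->high_water = high_water;
	tps->alert_active = false;
}

/* Queue one task and raise the alert at the high water level. */
static void tps_sketch_push(struct tps_sketch *tps)
{
	++tps->queue_size;
	if (!tps->alert_active && tps->queue_size >= tps->high_water) {
		tps->alert_active = true;
	}
}

/* Finish one task and clear the alert at the low water level. */
static void tps_sketch_pop(struct tps_sketch *tps)
{
	if (tps->queue_size > 0) {
		--tps->queue_size;
	}
	if (tps->alert_active && tps->queue_size <= tps->low_water) {
		tps->alert_active = false;
	}
}

/* Callers check this before adding brand new work to the system. */
static bool tps_sketch_overloaded(const struct tps_sketch *tps)
{
	return tps->alert_active;
}
```

Keeping the two levels apart gives hysteresis: the alert does not flap on and off while the queue hovers around a single threshold.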

ASTERISK-26088
Reported by:  Richard Mudgett

Change-Id: I182f1be603529cd665958661c4c05ff9901825fa
2016-06-09 10:32:07 -05:00
Richard Mudgett c966a035e0 res_pjsip_session: Use distributor serializer for incoming calls.
We must continue using the serializer that the original INVITE came in on
for the dialog.  There may be retransmissions already enqueued in the
original serializer that can result in reentrancy and message sequencing
problems.

Outgoing call legs create the pjsip/outsess/<endpoint> serializers for
their dialogs.

ASTERISK-26088
Reported by:  Richard Mudgett

Change-Id: I24d7948749c582b8045d5389ba3f6588508adbbc
2016-06-09 10:32:06 -05:00
Richard Mudgett 5b7b16a87f res_pjsip_pubsub.c: Recreate subscriptions using distributor serializer.
* Resolves potential reentrancy problems if the system is restarted in the
middle of subscription message transactions.

* Fixes memory leak recreating persistent subscriptions when the
subscription resource tree could not be created.

ASTERISK-26088
Reported by:  Richard Mudgett

Change-Id: I71e34d7ae8ed35a694f1030e820e2548c48697be
2016-06-09 10:32:06 -05:00
Richard Mudgett c2ae49249c res_pjsip_pubsub.c: Use distributor serializer for incoming subscriptions.
We must continue using the serializer that the original SUBSCRIBE came in
on for the dialog.  There may be retransmissions already enqueued in the
original serializer that can result in reentrancy and message sequencing
problems.  The "sip_transaction Unable to register SUBSCRIBE transaction
(key exists)" message is a notable symptom of this issue.

Outgoing subscriptions still create the pjsip/pubsub/<endpoint>
serializers for their dialogs.

ASTERISK-26088
Reported by:  Richard Mudgett

Change-Id: I18b00bb74a56747b2c8c29543a82440b110bf0b0
2016-06-09 10:32:06 -05:00
Richard Mudgett 2ff26e9746 pjsip_distributor.c: Consistently pick a serializer for messages.
Incoming messages that are not part of a dialog or a recognized response
to one of our requests need to be sent to a consistent serializer.  Under
load we may be queueing retransmissions before we can process the original
message.  We don't need to throw these messages onto random serializers
and cause reentrancy and message sequencing problems.

* Created a pool of pjsip/distributor serializers that get picked by
hashing the call-id and remote tag strings of the received messages.

* Made ast_sip_destroy_distributor() destroy items in the reverse order of
creation.
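
Picking a serializer by hashing can be sketched as follows. This is an illustration under stated assumptions: the hash function is a generic djb2-style string hash, and the function names and the idea of passing the pool size as a parameter are hypothetical rather than taken from pjsip_distributor.c.

```c
#include <assert.h>
#include <stddef.h>

/* Continue a djb2-style hash over another string (generic, hypothetical). */
static unsigned int str_hash_add_sketch(const char *str, unsigned int hash)
{
	while (*str) {
		hash = (hash * 33) ^ (unsigned char) *str++;
	}
	return hash;
}

/*
 * Map a message to a slot in the serializer pool. The same call-id and
 * remote tag always give the same slot, so a retransmission is queued
 * behind the original message instead of on a random serializer.
 */
static size_t pick_serializer_sketch(const char *call_id,
	const char *remote_tag, size_t pool_size)
{
	unsigned int hash = 5381;

	assert(call_id != NULL && remote_tag != NULL);
	assert(pool_size > 0);

	hash = str_hash_add_sketch(call_id, hash);
	hash = str_hash_add_sketch(remote_tag, hash);
	return hash % pool_size;
}
```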

ASTERISK-26088
Reported by:  Richard Mudgett

Change-Id: I2ce769389fc060d9f379977f559026fbcb632407
2016-06-09 10:32:06 -05:00
Richard Mudgett df2791da8f pjsip_distributor.c: Ignore messages until fully booted.
We should not be processing any incoming messages until we are fully
booted.  We may not have dialplan or other needed configuration loaded
yet.

ASTERISK-26089 #close
Reported by: Scott Griepentrog

ASTERISK-26088
Reported by:  Richard Mudgett

Change-Id: I584aefb4f34b885a8927e1f13a2c64babd606264
2016-06-09 10:32:06 -05:00
George Joseph d21a77b325 build: Fix ast_sockaddr initialization to be more portable
A change in glibc 2.22 changed the order of the sockaddr_storage
members which caused the places where we do an initialization of
ast_sockaddr with '{ { 0, 0, } }' to fail compilation.  Those
initializers (which we shouldn't have been using anyway) have been
replaced with memsets.
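
The portable form can be sketched like this. The struct below is a hypothetical stand-in that wraps a sockaddr_storage. It is not the real ast_sockaddr definition.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical wrapper around sockaddr_storage; not the real ast_sockaddr. */
struct sockaddr_wrapper_sketch {
	struct sockaddr_storage ss;
	socklen_t len;
};

/*
 * Zero the whole structure without depending on the order or nesting of
 * the members of sockaddr_storage, which a positional initializer such
 * as '{ { 0, 0, } }' silently relies on.
 */
static void sockaddr_wrapper_sketch_clear(struct sockaddr_wrapper_sketch *addr)
{
	assert(addr != NULL);
	memset(addr, 0, sizeof(*addr));
}
```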

Change-Id: Idd1b3b320903d8771bfe221f0b015685de628fa4
2016-06-09 09:50:31 -05:00
Joshua Colp fbece11a0c Merge "translate: Enables native Packet-Loss Concealment (PLC) for supporting codecs." 2016-06-09 07:24:46 -05:00
Joshua Colp 2525563438 Merge "chan_sip: No rtpmap for static RTP payload IDs in SDP." 2016-06-09 04:40:43 -05:00
Joshua Colp 7eb3a3357c Merge "BuildSystem: Avoid 'ar cru' and use 'ar cr' instead." 2016-06-09 04:40:37 -05:00
Joshua Colp 20be856b51 Merge "Detect and use proper libraries for musl toolchains" 2016-06-09 04:40:30 -05:00
Joshua Colp 5c949d009e Merge "Fixes to include signal.h" 2016-06-09 04:40:24 -05:00
Joshua Colp 216f78c0ce Merge "Make use of GLOB_BRACE and GLOB_NOMAGIC optional" 2016-06-09 04:40:14 -05:00
Joshua Colp 6ef3094239 Merge "res_hep_{pjsip|rtcp}: Decline module loads if res_hep had not loaded" 2016-06-08 17:17:38 -05:00
Joshua Colp 1ead09dcb1 Merge "Fix res_search usage" 2016-06-08 14:43:35 -05:00
Joshua Colp 7bcccd4db3 Merge "Fix #include poll.h and sys/cdefs.h" 2016-06-08 14:43:13 -05:00