After a context is detached, the context is not properly cleared. In
addition to releasing the context:
- Reset the context settings (IP, DNS, interface, ...).
- Signal the Active flag as false.
If an error occurs in the sim_pin_query_cb function, pin_type is set
to -1. This causes a segmentation fault in the sim_passwd_name
function, which uses pin_type = -1 as an invalid index. Fix this
issue by handling the error case before calling the sim_passwd_name
function.
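
The guard can be sketched as follows. This is only an illustration:
the table contents and the names passwd_names and query_result_name
are hypothetical and are not taken from the oFono source.

```c
#include <stddef.h>

/* Hypothetical name table standing in for the one sim_passwd_name uses. */
static const char *const passwd_names[] = { "none", "pin", "puk" };

/*
 * Hypothetical callback helper: 'ok' is zero when the PIN query failed,
 * in which case pin_type is -1 and must not be used as an index.
 */
static const char *query_result_name(int ok, int pin_type)
{
	/* Handle the error case first, before any table lookup. */
	if (!ok || pin_type < 0)
		return NULL;

	return passwd_names[pin_type];
}
```

With this ordering, a failed query returns early and the lookup is only
reached with a valid pin_type.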
There was an issue while running LTE when the connection
manager tried to activate the context with CID 1 while
it got automatically activated at the same time with
CID 4.
When the automatic activation happened ofono_gprs_cid_activated
got called which tried to assign the context, but that failed
since the driver context was considered in use
(by the activation call).
Even though it failed, the context was modified:
the cid was set to 0 (making cid 1 leak).
Then release_context got called, which cleared the pointers
assigned to the context.
A bit later the activation callback got called; in my case
activation failed. Due to the failure it tries to clean up
by calling context_settings_free, but unfortunately the pointers
were reset above, causing ofono to segfault due to null pointer
dereferences.
Instead we make sure assign_context does not touch the context
unless it succeeds. Then there is no need to call release_context
if assign fails.
That ensures the context is intact when the activation callback
gets called.
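
A minimal sketch of the "do not touch on failure" pattern, assuming
simplified stand-in types. The struct layouts and the name
assign_context_sketch are hypothetical; the real assign_context in
src/gprs.c differs.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the driver and primary context. */
struct driver_ctx {
	bool inuse;
};

struct pri_ctx {
	unsigned int cid;
	struct driver_ctx *driver;
};

/*
 * Bind ctx to drv with the given cid. All checks happen before any
 * field is written, so on failure ctx is left exactly as it was and
 * the caller has nothing to release.
 */
static bool assign_context_sketch(struct pri_ctx *ctx,
					struct driver_ctx *drv,
					unsigned int cid)
{
	if (drv->inuse)
		return false;	/* nothing has been modified yet */

	drv->inuse = true;
	ctx->driver = drv;
	ctx->cid = cid;

	return true;
}
```

In the scenario above, the failed assignment for cid 4 would leave
cid 1 and the context pointers in place for the later activation
callback.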
03:23:21 ofonod[545]: Aux: < \r\n+CGEV: ME PDN ACT 4\r\n\r\n+CTZE: +04,0,"19/12/10,04:25:03"\r\n
03:23:21 ofonod[545]: drivers/ubloxmodem/network-registration.c:ctze_notify() tz +04 dst 0 time 19/12/10,04:25:03
03:23:21 ofonod[545]: src/network.c:ofono_netreg_time_notify() net time 2019-12-10 04:25:03 utcoff 3600 dst 0
03:23:22 ofonod[545]: Aux: > AT+CGDCONT?\r
03:23:22 ofonod[545]: drivers/ubloxmodem/gprs-context.c:ublox_gprs_activate_primary() cid 1
Connection manager requests activation, will mark the context in use and assign
it cid 1.
03:23:22 ofonod[545]: Aux: < \r\n+CGDCONT: 1,"IP","m2m.tele2.com","",0,0,0,0,0,0\r\n
03:23:22 ofonod[545]: Aux: < +CGDCONT: 4,"IP","m2m.tele2.com.mnc003.mcc248.gprs","100.69.174.133",0,0,0,0,0,0\r\n
03:23:22 ofonod[545]: Aux: < \r\nOK\r\n
03:23:22 ofonod[545]: drivers/atmodem/gprs.c:at_cgdcont_read_cb() ok 1
03:23:22 ofonod[545]: src/gprs.c:ofono_gprs_cid_activated() cid 4
03:23:22 ofonod[545]: Can't assign context to driver for APN.
Since it's marked in use above, we fail to assign it cid 4. When that fails
the cid is cleared and all context pointers are set to NULL.
03:23:22 ofonod[545]: Aux: > AT+CGDCONT=1,"IP","m2m.tele2.com"\r
03:23:22 ofonod[545]: Aux: < \r\nOK\r\n
03:23:22 ofonod[545]: drivers/ubloxmodem/gprs-context.c:cgdcont_cb() ok 1
03:23:22 ofonod[545]: Aux: > AT+CGACT=1,1\r
03:23:22 ofonod[545]: Aux: < \r\n+CME ERROR: 100\r\n
03:23:22 ofonod[545]: drivers/ubloxmodem/gprs-context.c:cgact_enable_cb() ok 0
03:23:22 ofonod[545]: src/gprs.c:pri_activate_callback() 0x853480
03:23:22 ofonod[545]: src/gprs.c:pri_activate_callback() Activating context failed with error: Unknown error
Activation callback, and it failed. Will try to clean up, but the pointers are
NULL'ed...
Dec 10 03:23:22 ofonod[545]: Aborting (signal 11) [/usr/sbin/ofonod]
The intent here was to find the contents of the 3 low order bits
according to Table 11-5 in ETSI 102.221. However, the mask ended up
only grabbing the contents of the 2 low order bits.
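
As a generic illustration of the difference (not the actual oFono
code; the helper names are hypothetical), a 3-bit mask is 0x07 while
a 2-bit mask is 0x03:

```c
#include <stdint.h>

/* Keep the 3 low-order bits (binary 0000 0111). */
static inline uint8_t low3_bits(uint8_t b)
{
	return b & 0x07;
}

/* Keep only the 2 low-order bits (binary 0000 0011). */
static inline uint8_t low2_bits(uint8_t b)
{
	return b & 0x03;
}
```

For an input whose low bits are 101, the 3-bit mask yields 5 while the
2-bit mask yields 1, so any value that uses the third bit is misread.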
There was a race condition where a context might be
registered before the netreg status updates to LTE.
The code took for granted that the context is activated after
the technology update. With this change, either order is
accepted.
It's incorrect to fiddle with the driver attach state when
attaching. When attaching, the state is transitioning,
and the correct state will now always be assigned at
the end of the attach process, regardless of the result.
Currently there is an issue if the attach state changes and
there are active contexts whose driver does not
implement detach_shutdown.
In that case we just release the context (clears CID and
active state), but nothing is signalled on D-Bus or towards
the modem.
Ofono is then out of sync with both the connection manager
and the modem. This manifests itself later on if the modem
changes the state of the context: ofono will not find it
since the CID is cleared, and the connection manager won't
be notified.
In the same way as we consider the driver attached when the
gprs status indicates we are registered, we should consider
it detached when the status indicates unregistration.
If we don't, then we would not always recover from the case
when detaching the driver fails. We would just revert
the driver attached status to true, and "ignore" it if the status
indicates the opposite when we check the registration status
afterwards.
Commits 1fd419e5b4 and 0167c3339c introduced logic that
treated ofono_gprs_cid_activated as an 'attaching' state.
Since gprs_attached_update now guarantees that we
will not get attached without having a context activated
in LTE, this is not needed anymore. It also potentially
interferes in case the driver was actually attaching.
Since we have a different condition for the attach state
when running on LTE, we should consider it in gprs_attached_update.
Previously this was done only in some instances. For example, if
the driver got detached from GPRS but was now running on LTE with a
context up, we would be detached.
There is an issue if a context gets auto activated early:
provisioning might not have run yet, for instance,
so a "new" context is created, which might be duplicated
by a provisioning context later.
So ignore the activated contexts until gprs is ready;
it then calls the driver to list the active contexts.
There are cases where the gprs status might be updated to, for instance,
"unknown" while LTE is the bearer.
In that case we should not set the attach state to FALSE,
since when running LTE the context activation reflects the attached
state.
Previously the valid "unknown" netreg status was set
during startup, but it's a bit problematic for gprs.
There might be cases where an LTE context is activated
before netreg is finished updating its status,
resulting in gprs taking faulty actions.
Instead we set the status to -1 until we are updated
with a known value.
During the time the status is -1, gprs postpones actions until
the status is valid (>= 0).
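
The postponing can be sketched as below. All names here (struct
gprs_sketch and its functions) are hypothetical and only illustrate
the idea of deferring work while the status is -1.

```c
#include <stdbool.h>

/* Hypothetical state: netreg_status is -1 until a known value arrives. */
struct gprs_sketch {
	int netreg_status;
	bool pending;		/* an action was requested too early */
	int actions_run;	/* counts actions actually performed */
};

static void gprs_sketch_init(struct gprs_sketch *g)
{
	g->netreg_status = -1;
	g->pending = false;
	g->actions_run = 0;
}

/* Request an action; defer it while the status is still unset. */
static void gprs_sketch_request(struct gprs_sketch *g)
{
	if (g->netreg_status < 0) {
		g->pending = true;
		return;
	}

	g->actions_run++;
}

/* Status update from netreg; run a deferred action once valid. */
static void gprs_sketch_status_notify(struct gprs_sketch *g, int status)
{
	g->netreg_status = status;

	if (status >= 0 && g->pending) {
		g->pending = false;
		g->actions_run++;
	}
}
```

A request made before the first status update is only remembered, and
is carried out when the first valid status arrives.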
When oFono is built with --enable-external-ell, the compiler for some
reason does not generate a debug section on some systems. This is due
to the fact that l_debug is never called. However, ell also does not
call l_debug, yet when built-in ell is used, the section is created by
the compiler.
For now work around this by adding a no-op l_debug() call in main.c.
The real fix is to migrate all of the oFono logging functionality to use
ell instead.
We pass in the maximum size of the buffer to the read system call. On
the astronomically unlikely chance that we indeed read a full buffer
of data, the subsequent assignment will overflow it. Fix this by
passing sizeof(buf) - 1 to the read system call instead.
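
A minimal sketch of the pattern, assuming the subsequent assignment is
a terminating NUL written at the returned length. The helper name
read_string is hypothetical.

```c
#include <sys/types.h>
#include <unistd.h>

/*
 * Read at most size - 1 bytes from fd into buf and NUL-terminate.
 * Reserving one byte guarantees that buf[n] is always inside the
 * buffer, even when read() returns the maximum it was asked for.
 */
static ssize_t read_string(int fd, char *buf, size_t size)
{
	ssize_t n;

	if (size == 0)
		return -1;

	n = read(fd, buf, size - 1);
	if (n < 0)
		return -1;

	buf[n] = '\0';

	return n;
}
```

Had the full size been passed to read(), a completely filled buffer
would make buf[n] write one byte past the end.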
This fix is similar to the one in the following commit,
but fixes allocation for context ids after ap's are
read from settings.
commit c3fdf6a7c5
Author: Denis Kenzior <denkenz@gmail.com>
Date: Thu Jan 3 17:17:21 2019 -0600
gprs: Fix allocation of context id