#1027 Race condition in LDAP provider
Closed: Invalid None Opened 12 years ago by brhellman.

Kubuntu 11.04
KDE 4.7.1
network-manager 0.8.4~git.20110319t175609.d14809b-0ubuntu3
kdm 4:4.7.1-0ubuntu2~natty1~ppa2
sssd 1.5.13-0ubuntu1~natty

When using WiFi, sssd prevents the system from shutting down: the system hangs. The only ways to get it to shut down are:

  • Stop sssd (but only if I have a pre-existing shell with root privileges, which is never)
  • Push the power button, which forces the shutdown.

If I'm on a LAN connection the system shuts down fine. In the attached sssd_LDAP.log, look for "WORKING" and "BROKE 8:43" to find where I had issues.

The issue only occurs when I have access to the LDAP tree and am on WiFi; without access I don't experience the problem.


I tried reproducing the issue on F15. I installed KDE and logged in via kdm, and everything worked fine, both logout and restart. I was running kdm-4.6.5-5.fc15; the rest of the KDE stack is on 4.6.5-1. I made sure SSSD was online during the test.

I suspect this is not an SSSD bug and I would recommend raising this issue with Ubuntu via Launchpad.

Adding Jan, our team's KDE user, to CC. Can you confirm that we're not seeing the issue on Fedora (or on Ubuntu, if you still have one handy)?

cc: => jzeleny

I managed to sort of reproduce the issue. Steps to reproduce:

  1. Start the system
  2. Connect to a WiFi network
  3. Log in as a remote user (i.e. a user in LDAP/KRB). You can log in on a different tty, it doesn't matter. This step is important; otherwise everything works fine.
  4. Try to shut down the system

What happens is that PAM somehow freezes completely, and it is possible neither to log in nor to log out. The system unfreezes and shuts down after ~60 seconds.

Ok, after digging a bit deeper, I found out what happens. It is an interesting race condition. During the shutdown, SSSD receives a getAccountInfo request for a user (in my case it was the local user). An LDAP query is issued for this user and, right after the query is sent, an SBUS resInit event is received (the network status has changed).

This causes the current LDAP connection to be terminated and a new one to be created, so no response from the server is ever received. As a result, no response is sent to the responder, the provider loses track of the request, and the whole process basically freezes (well, not really, it responds to pings, but that's about it) until a timeout in the PAM responder kicks in.

After this closer inspection I take back my comment that logging in as a remote user is necessary to reproduce this bug. In fact, just configuring a remote domain should do the trick.
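
To make the race described above easier to follow, here is a minimal sketch of it as a toy model in Python. It is not SSSD code: the names `Connection`, `Provider`, `on_res_init` and `simulate` are all made up for illustration, and the only assumption is that the callback for an in-flight query lives on the connection that sent it.

```python
# Toy model of the race; none of these names exist in SSSD.

class Connection:
    """A fake LDAP connection that remembers callbacks for in-flight queries."""

    def __init__(self):
        self.pending = {}      # msgid -> callback waiting for the server's reply
        self.next_msgid = 1

    def send_query(self, callback):
        msgid = self.next_msgid
        self.next_msgid += 1
        self.pending[msgid] = callback
        return msgid

    def close(self):
        # Terminating the connection silently forgets the in-flight queries.
        self.pending.clear()

    def deliver_reply(self, msgid, result):
        # A reply for a msgid that is no longer tracked is dropped.
        callback = self.pending.pop(msgid, None)
        if callback is not None:
            callback(result)


class Provider:
    def __init__(self):
        self.conn = Connection()

    def get_account_info(self, user, reply_to_responder):
        return self.conn.send_query(
            lambda result: reply_to_responder(user, result))

    def on_res_init(self):
        # Network status changed: drop the old connection and create a new one.
        self.conn.close()
        self.conn = Connection()


def simulate(network_change):
    """Return the answers the responder receives for one request."""
    answers = []
    provider = Provider()
    conn = provider.conn
    msgid = provider.get_account_info(
        "alice", lambda user, result: answers.append((user, result)))
    if network_change:
        provider.on_res_init()   # arrives right after the query was sent
    conn.deliver_reply(msgid, "found")   # reply addressed to the original connection
    return answers
```

Without the network change the responder gets its answer; with it, the list of answers stays empty, which corresponds to the responder waiting until its own timeout fires.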

Fields changed

summary: SSSD prevents system shutdown on KDE => Race condition in LDAP provider

Why are we going online when the session is shutting down?
Please investigate.

milestone: NEEDS_TRIAGE => SSSD 1.8.0

Replying to [comment:4 jzeleny]:

Ok, after digging a bit deeper, I found out what happens. It is an interesting race condition. During the shutdown, SSSD receives a getAccountInfo request for a user (in my case it was the local user). An LDAP query is issued for this user and, right after the query is sent, an SBUS resInit event is received (the network status has changed).

This causes the current LDAP connection to be terminated and a new one to be created, so no response from the server is ever received. As a result, no response is sent to the responder, the provider loses track of the request, and the whole process basically freezes (well, not really, it responds to pings, but that's about it) until a timeout in the PAM responder kicks in.

After this closer inspection I take back my comment that logging in as a remote user is necessary to reproduce this bug. In fact, just configuring a remote domain should do the trick.

Configuring only a remote domain does nothing for me. I still get the error, and only when logged in as an LDAP user. If I log in as a local user, things work flawlessly. If there is more information I can provide, please let me know and I'll get whatever you need.

Thanks,
Brian

Fields changed

rhbz: => 0

Fields changed

blockedby: =>
blocking: =>
feature_milestone: =>
owner: somebody => jzeleny

Recommending that we defer this. The only known case where this happens is shutdown on KDE.

milestone: SSSD 1.8.0 (LTM) => NEEDS_TRIAGE

Fields changed

milestone: NEEDS_TRIAGE => SSSD 1.9.0
priority: major => minor

Current status of related work: we already have code in SSSD that marks connections with a "disconnecting" flag. What is needed right now is the following change:

When going offline, the provider should not destroy all connection-related structures, as this won't disconnect all existing operations. Instead, it should mark all connections as "disconnecting" and wait for all existing operations to finish. In the meantime it should also reject all new requests with an "offline" response.

Also, I'm not sure whether there is code that destroys a connection upon finishing an operation only if that operation is the only/last one associated with the connection. That might be necessary as well, although I'm not 100% sure about it.
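
The proposed change can be sketched roughly as follows. This is a toy Python model, not the SSSD implementation: `DrainingConnection`, `go_offline` and `demo` are hypothetical names, and the sketch assumes every pending operation eventually gets a reply (it does not model a timeout for replies that never arrive).

```python
# Toy model of the proposed behaviour; none of these names exist in SSSD.

class DrainingConnection:
    """Connection that drains in-flight operations before it is destroyed."""

    def __init__(self):
        self.pending = {}          # msgid -> callback waiting for a reply
        self.next_msgid = 1
        self.disconnecting = False
        self.closed = False

    def send_query(self, callback):
        # While disconnecting (or closed), reject new requests as "offline".
        if self.disconnecting or self.closed:
            callback("offline")
            return None
        msgid = self.next_msgid
        self.next_msgid += 1
        self.pending[msgid] = callback
        return msgid

    def go_offline(self):
        # Mark the connection instead of destroying it right away.
        self.disconnecting = True
        self._close_if_idle()

    def deliver_reply(self, msgid, result):
        callback = self.pending.pop(msgid, None)
        if callback is not None:
            callback(result)
        self._close_if_idle()

    def _close_if_idle(self):
        # Destroy the connection only once the last operation has finished.
        if self.disconnecting and not self.pending:
            self.closed = True


def demo():
    results = []
    conn = DrainingConnection()
    first = conn.send_query(results.append)    # in flight when we go offline
    conn.go_offline()
    second = conn.send_query(results.append)   # rejected immediately
    conn.deliver_reply(first, "found")         # the old operation still completes
    return results, conn.closed, second
```

In this model the request issued before going offline still gets its answer, the request issued afterwards is answered with "offline" at once, and the connection is closed only after the last pending operation has finished.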

Fields changed

milestone: SSSD 1.9.0 => SSSD 1.9.1

Fields changed

milestone: SSSD 1.9.1 => SSSD 1.9.2

Fields changed

milestone: SSSD 1.9.2 => SSSD 1.9.3

Not critical for 1.9.3

design: =>
design_review: => 0
fedora_test_page: =>
milestone: SSSD 1.9.3 => SSSD 1.9.4

Dropping the investigation/documentation tasks to trivial. These can be deferred if needed.

priority: minor => trivial

Moving the docs task to 1.9.5

milestone: SSSD 1.9.4 => SSSD 1.9.5

Please investigate together with #1507.

owner: jzeleny => okos
selected: =>

Fields changed

milestone: SSSD 1.9.5 => SSSD 1.11 beta
review: => 0

Fields changed

changelog: =>
milestone: SSSD 1.13 beta => SSSD Deferred

Documenting the sdap_ops is already tracked in a different ticket; I suggest we just close this one.

mark: => 0
review: 0 => 1
sensitive: => 0

Fields changed

resolution: => worksforme
status: new => closed

Metadata Update from @brhellman:
- Issue assigned to okos
- Issue set to the milestone: SSSD Patches welcome

7 years ago

SSSD is moving from Pagure to Github. This means that new issues and pull requests
will be accepted only in SSSD's github repository.

This issue has been cloned to Github and is available here:
- https://github.com/SSSD/sssd/issues/2069

If you want to receive further updates on the issue, please navigate to the github issue
and click on subscribe button.

Thank you for understanding. We apologize for all inconvenience.
