Ticket was cloned from Red Hat Bugzilla (product Red Hat Enterprise Linux 6): Bug 918394
Description of problem:
When we clear the SSSD cache with sss_cache -U, sss_cache -G, or sss_cache -u <login>, the sssd_nss process opens a few more fds each time. When the process reaches its fd_limit, sssd runs at 99% CPU and the system becomes unresponsive for every user-related task.

Version-Release number of selected component (if applicable):
    rpm -qa | grep sssd
    sssd-tools-1.9.2-82.el6.x86_64
    sssd-client-1.9.2-82.el6.x86_64
    sssd-1.9.2-82.el6.x86_64

How reproducible:
Every time we run sss_cache -U or sss_cache -u <login>, the number of open files increases, up to the fd_limit. Then sssd runs at 99% CPU and NSS no longer works.

Steps to Reproduce:
1. service sssd start #start service
2. watch "lsof -p `ps -ef | grep sssd_nss | grep -v grep | perl -l -a -n -F"\s+" -e 'print $F[1]'` | wc -l" #watch fds
3. sss_cache -U #clear cache several times and watch the number of fds

Actual results:
Increasing number of fds for the sssd_nss process

Expected results:
Constant number of fds for the sssd_nss process

Additional info:
The leaking fds all point to these files, lsof output:
    sssd_nss 2090 root 8176u REG 8,1 6806312 3424241 /var/lib/sss/mc/passwd (deleted)
    sssd_nss 2090 root 8177u REG 8,1 5206312 3424243 /var/lib/sss/mc/group (deleted)
    sssd_nss 2090 root 8178u REG 8,1 6806312 3424242 /var/lib/sss/mc/passwd (deleted)
    sssd_nss 2090 root 8179u REG 8,1 5206312 3424245 /var/lib/sss/mc/group (deleted)
    sssd_nss 2090 root 8180u REG 8,1 6806312 3424247 /var/lib/sss/mc/passwd (deleted)
    sssd_nss 2090 root 8181u REG 8,1 6806312 3424244 /var/lib/sss/mc/passwd (deleted)
    sssd_nss 2090 root 8182u REG 8,1 5206312 3424246 /var/lib/sss/mc/group (deleted)
    sssd_nss 2090 root 8183u REG 8,1 5206312 3424248 /var/lib/sss/mc/group (deleted)
    sssd_nss 2090 root 8184u REG 8,1 5206312 3424250 /var/lib/sss/mc/group (deleted)
    sssd_nss 2090 root 8185u REG 8,1 6806312 3424251 /var/lib/sss/mc/passwd (deleted)
    sssd_nss 2090 root 8186u REG 8,1 5206312 3424252 /var/lib/sss/mc/group (deleted)
    sssd_nss 2090 root 8187u REG 8,1 6806312 3424253 /var/lib/sss/mc/passwd (deleted)
    sssd_nss 2090 root 8188u REG 8,1 5206312 3424254 /var/lib/sss/mc/group (deleted)
    sssd_nss 2090 root 8189u REG 8,1 6806312 11493377 /var/lib/sss/mc/passwd (deleted)
    sssd_nss 2090 root 8190u REG 8,1 6806312 3424255 /var/lib/sss/mc/passwd (deleted)
    sssd_nss 2090 root 8191u REG 8,1 5206312 3424256 /var/lib/sss/mc/group (deleted)

The reason for the CPU usage is the error handling after epoll_wait(): the listening socket stays readable while accept() keeps failing with EMFILE, so the loop spins. strace output:
    epoll_wait(5, {{EPOLLIN, {u32=24633616, u64=24633616}}}, 1, 40403) = 1
    accept(23, 0x149b38e0, [110]) = -1 EMFILE (Too many open files)
    epoll_wait(5, {{EPOLLIN, {u32=24633616, u64=24633616}}}, 1, 40403) = 1
    accept(23, 0x149b38e0, [110]) = -1 EMFILE (Too many open files)
    epoll_wait(5, {{EPOLLIN, {u32=24633616, u64=24633616}}}, 1, 40403) = 1
    accept(23, 0x149b38e0, [110]) = -1 EMFILE (Too many open files)
    epoll_wait(5, {{EPOLLIN, {u32=24633616, u64=24633616}}}, 1, 40403) = 1
    accept(23, 0x149b38e0, [110]) = -1 EMFILE (Too many open files)
    epoll_wait(5, {{EPOLLIN, {u32=24633616, u64=24633616}}}, 1, 40403) = 1
    accept(23, 0x149b38e0, [110]) = -1 EMFILE (Too many open files)
    epoll_wait(5, {{EPOLLIN, {u32=24633616, u64=24633616}}}, 1, 40403) = 1
    accept(23, 0x149b38e0, [110]) = -1 EMFILE (Too many open files)
    epoll_wait(5, {{EPOLLIN, {u32=24633616, u64=24633616}}}, 1, 40376) = 1

Workaround:
We set fd_limit in the [nss] section of sssd.conf to a deliberately excessive value and have our NMS restart sssd when the fd count approaches the limit.
    [nss]
    entry_negative_timeout = 0
    debug_level = 0x1310
    fd_limit=200000

This is not yet fixed in the packages in this repo:
    [sssd-1.9-RHEL6.3]
    name=SSSD 1.9.x built for latest stable RHEL
    baseurl=http://repos.fedorapeople.org/repos/jhrozek/sssd/epel-6/$basearch/
    enabled=1
    skip_if_unavailable=1
    gpgcheck=0
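The fd growth described above can also be tracked without lsof. The following is a minimal sketch, not part of the original report: it assumes a Linux system, where each entry in /proc/<pid>/fd is a symlink and the kernel appends " (deleted)" to the target of an unlinked file. The function name count_deleted_fds and its proc_root parameter are hypothetical, chosen only for illustration.

```python
import os

DELETED_SUFFIX = " (deleted)"


def count_deleted_fds(pid, proc_root="/proc"):
    """Return (total_fds, deleted) for one process.

    `deleted` maps each unlinked file path to the number of open
    descriptors still pointing at it. `proc_root` is a hypothetical
    parameter that only exists to make the helper testable.
    """
    fd_dir = os.path.join(proc_root, str(pid), "fd")
    total = 0
    deleted = {}
    for name in os.listdir(fd_dir):
        total += 1
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
        if target.endswith(DELETED_SUFFIX):
            path = target[: -len(DELETED_SUFFIX)]
            deleted[path] = deleted.get(path, 0) + 1
    return total, deleted
```

Calling count_deleted_fds with the sssd_nss PID before and after each sss_cache -U run should show whether the counts for /var/lib/sss/mc/passwd and /var/lib/sss/mc/group keep growing. Reading another process's fd directory normally requires root or the same user as that process.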
Fields changed
blockedby: =>
blocking: =>
coverity: =>
design: =>
design_review: => 0
feature_milestone: =>
fedora_test_page: =>
owner: somebody => mzidek
selected: =>
testsupdated: => 0
patch: 0 => 1
milestone: NEEDS_TRIAGE => SSSD 1.9.5
resolution: => fixed
status: new => closed
Metadata Update from @jhrozek:
- Issue assigned to mzidek
- Issue set to the milestone: SSSD 1.9.5
SSSD is moving from Pagure to GitHub. This means that new issues and pull requests will be accepted only in SSSD's GitHub repository.
This issue has been cloned to GitHub and is available here:
- https://github.com/SSSD/sssd/issues/2868
If you want to receive further updates on the issue, please navigate to the GitHub issue and click the subscribe button.
Thank you for understanding. We apologize for any inconvenience.