sssd is configured against Active Directory.
sssd_be crashed dumping core:
Core was generated by `/usr/libexec/sssd/sssd_be --domain default --debug-to-files'.
Program terminated with signal 11, Segmentation fault.
#0  0x0000000000411a93 in fo_set_port_status (server=0x50eef600, status=PORT_WORKING) at src/providers/fail_over.c:1332
1332            if (!siter->common || !siter->common->name) continue;
(gdb) list
1327        /* It is possible to introduce duplicates when expanding SRV results
1328         * into fo_server structures. Find the duplicates and set the same
1329         * status */
1330        DLIST_FOR_EACH(siter, server->service->server_list) {
1331            if (siter == server) continue;
1332            if (!siter->common || !siter->common->name) continue;
1333
1334            if (siter->port == server->port &&
1335                (strcasecmp(siter->common->name, server->common->name) == 0)) {
1336                DEBUG(7, ("Marking port %d of duplicate server '%s' as '%s'\n",
(gdb) print siter->common
$1 = (struct server_common *) 0xa0
(gdb) print siter->common->name
Cannot access memory at address 0xc0
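The gdb output above shows why the guard on line 1332 does not help: `siter->common` is 0xa0, which is not NULL but is not a valid address either, so the second half of the condition dereferences it and faults at 0xc0. The following is a minimal sketch of that arithmetic. The struct layout is a hypothetical stand-in (four pointer-sized fields ahead of `name`), chosen only because it reproduces the 0x20 gap seen in the trace on a 64-bit build; it is not copied from the SSSD source, and all function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct server_common. The field order is an
 * assumption for illustration, not taken from the SSSD source. */
struct server_common_sketch {
    int *refcount;                       /* assumed bookkeeping pointer */
    void *ctx;                           /* assumed back-pointer */
    struct server_common_sketch *prev;   /* assumed list links */
    struct server_common_sketch *next;
    char *name;                          /* fifth pointer-sized slot */
};

/* Byte offset of 'name' inside the sketch struct. */
size_t name_offset(void)
{
    return offsetof(struct server_common_sketch, name);
}

/* Address that would be read for common->name, given the raw pointer
 * value of 'common'. Pure arithmetic: nothing is dereferenced. */
uintptr_t name_field_address(uintptr_t common_value)
{
    assert(common_value <= UINTPTR_MAX - name_offset());
    return common_value + name_offset();
}

/* Mirrors the first half of the guard on line 1332: it only rejects NULL,
 * so a small garbage value such as 0xa0 passes. */
int passes_null_check(uintptr_t common_value)
{
    return common_value != 0;
}
```

With 8-byte pointers, `name_field_address(0xa0)` works out to 0xa0 + 0x20 = 0xc0, matching the address gdb could not read. If the real layout differs, the same reasoning applies with a different offset: a NULL check cannot detect a stale or corrupted pointer.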
/var/log/secure:
Oct 7 04:26:01 blah crond[6022]: pam_sss(crond:account): Request to sssd failed. Timer expired
The core file has been retained (but it is large, 1.3 Gbytes), and an sssd_default.log at log level 9 is available.
An easier-to-read log from gdb is attached: gdb-log
Sorry for not asking this sooner, but do you still have the SSSD logs from when the bug happened? It would be very helpful to see what name resolution SSSD performed at the time.
Also, if you still have the core file, can you examine some data structures for me, please?
I would like to see the following from inside the fo_set_port_status() function:
print server->service->ctx
print *server->service->ctx
print *server->service->ctx->server_common_list
Thank you!
Fields changed
milestone: NEEDS_TRIAGE => SSSD 1.7.0
priority: critical => blocker
owner: somebody => jzeleny
Besides information jhrozek asked for earlier, I'd also greatly appreciate a reproducer, i.e. sanitized config file and steps you had to perform to induce this segfault. I'd like a core file of my own so I could inspect the code in detail.
Thanks Jan
There has been no activity for some time in this ticket. I'd like to ask you once more for the additional information we requested. If no more info is provided, I'll close the ticket as worksforme.
Replying to [comment:7 jzeleny]:
Sorry for not getting back to you, I hadn't seen the movement on this ticket. I've still got the sssd logs and the core dumps, but not the matching build of 1.6.1 I had installed at the time, so I'm not sure how much value they have. I've not got the matching /var/log/secure, which makes lining up the timings of when things went wrong against the 4.4 Gbyte sssd_default.log a little fun.
I upgraded to 1.6.3 and have not seen this problem again. I've left in place a script that monitors the logs for this failure, so should be able to catch it again if it happens in future. Before it was happening every week or two on a heavily loaded system, so it should crop up again soon enough if the problem's not fixed.
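The monitoring script itself is not shown in this ticket. As a rough sketch of what such a check could look like in C, the functions below flag syslog lines that resemble the pam_sss timeout quoted from /var/log/secure earlier. Treating those two substrings as the failure signature is an assumption, and the function names are made up for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 if a single log line looks like the pam_sss timeout quoted in
 * this ticket, 0 otherwise. Both substrings come from that log line; using
 * them as the signature of the failure is an assumption. */
int is_sssd_timeout_line(const char *line)
{
    if (line == NULL) {
        return 0;
    }
    return strstr(line, "pam_sss(") != NULL &&
           strstr(line, "Request to sssd failed. Timer expired") != NULL;
}

/* Counts matching lines in an already opened log stream. Lines longer than
 * the buffer are read in pieces, which is acceptable for a sketch. */
size_t count_sssd_timeouts(FILE *fp)
{
    char line[4096];
    size_t hits = 0;

    while (fgets(line, sizeof line, fp) != NULL) {
        if (is_sssd_timeout_line(line)) {
            hits++;
        }
    }
    return hits;
}
```

A caller would open /var/log/secure with `fopen`, pass the stream to `count_sssd_timeouts`, and alert when the count is non-zero. Matching on sssd_be crash messages instead would need a different signature, which this ticket does not provide.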
I have had crashes of sssd_be since, but they've all recovered gracefully.
jh
I don't have a reliable reproducer, unfortunately, and there's no obvious pattern. The machine sits in service with a reasonable number of users coming in and out over ssh. Over the last month (a mix of the old 1.6.1 and the newer 1.6.3) sssd_be has crashed 9 times. What log level would be useful?
sssd.conf:
[sssd]
config_file_version = 2
reconnection_retries = 3
sbus_timeout = 30
services = nss, pam
domains = default

[nss]
filter_groups = root
filter_users = root
reconnection_retries = 3

[pam]
reconnection_retries = 3

[domain/default]
lookup_family_order=ipv4_only
auth_provider = krb5
cache_credentials = false
krb5_realm = EXAMPLE.COM
chpass_provider = krb5
id_provider = ldap
dns_discovery_domain = EXAMPLE.COM
krb5_validate = true
krb5_renew_interval = 300
min_id = 100
access_provider = simple
simple_allow_groups = a_group
enumerate = false
ldap_force_upper_case_realm = True
ldap_schema = rfc2307bis
ldap_referrals = false
ldap_search_base = dc=example,dc=com
ldap_sasl_mech = gssapi
ldap_pwd_policy = none
ldap_user_object_class = user
ldap_user_name = sAMAccountName
ldap_user_uid_number = msSFU30UidNumber
ldap_user_gid_number = primaryGroupID
ldap_user_gecos = displayName
ldap_user_home_directory = msSFU30HomeDirectory
ldap_user_shell = msSFU30LoginShell
ldap_user_principal = userPrincipalName
ldap_group_object_class = group
ldap_group_name = cn
ldap_group_gid_number = msSFU30GidNumber
ldap_group_search_base = ou=blah,ou=blah,dc=example,dc=com
status: new => assigned
I'm going to close this one, as the patch which probably fixes this has been pushed to master. Please feel free to reopen if the error persists on your system.
Fixed in: d4d9091
resolution: => fixed
status: assigned => closed
rhbz: => 0
Metadata Update from @prefect:
- Issue assigned to jzeleny
- Issue set to the milestone: SSSD 1.7.0
SSSD is moving from Pagure to GitHub. This means that new issues and pull requests will be accepted only in SSSD's GitHub repository.
This issue has been cloned to GitHub and is available here: https://github.com/SSSD/sssd/issues/2079
If you want to receive further updates on the issue, please navigate to the GitHub issue and click the subscribe button.
Thank you for understanding. We apologize for any inconvenience.