Seems to be related to orphaned tombstone entries. Not sure whether it is the tombstone entry itself that is being replicated, or whether the crash happens while attempting URP with the tombstone entry on the consumer side.
stack of the crashed thread 389stacktrace.txt
This is an example of an orphaned entry:
{{{
rdn: nsuniqueid=cb711181-397411e1-8d5b8e63-25f4fc76,cn=KDC
nsUniqueId: cb711181-397411e1-8d5b8e63-25f4fc76
objectClass;vucsn-4f08b6eb000200150000: nsContainer
objectClass;vucsn-4f08b6eb000200150000: ipaConfigObject
objectClass;vucsn-4f08b6eb000200150000: top
objectClass;vucsn-4f10d589002500040000: nsTombstone
ipaConfigString;vucsn-4f08b6eb000200150000: enabledService
ipaConfigString;vucsn-4f08b6eb000200150000: startOrder 10
cn;vucsn-4f08b6eb000200150000;mdcsn-4f08b6eb000200150000: KDC
creatorsName;vucsn-4f08b6eb000200150000: cn=directory manager
modifiersName;vucsn-4f08b6eb000200150000: cn=directory manager
createTimestamp;vucsn-4f08b6eb000200150000: 20120107211605Z
modifyTimestamp;vucsn-4f08b6eb000200150000: 20120107211605Z
parentid: 7385
entryid: 7390
entryusn: 1005398
nsParentUniqueId: 86e57683-397411e1-8d5b8e63-25f4fc76
nscpEntryDN: cn=kdc,cn=fqdn.domain.tld,cn=masters,cn=ipa,cn=etc,dc=domain,dc=tld
}}}
The id 7385 does not exist in the database. Neither 7390 nor 7385 were in the entryrdn index.
output from valgrind valgrind.out.txt
The main cause of the crash appears to be the way urp_fixup_add_entry passes the to-be-added entry to slapi_add_entry_internal_set_pb. The entry is consumed by the add operation, but the URP code continues to access it afterwards.
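The ownership problem described above can be illustrated with a minimal, self-contained sketch. It does not use the real 389-ds API: `entry_t`, `entry_new`, `entry_free`, `add_entry_consuming`, and `fixup_add_entry` are hypothetical stand-ins, and the real add operation's behavior is only assumed to be "takes ownership of the entry". The sketch shows one way a caller could avoid touching or freeing an entry once the add operation has consumed it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an LDAP entry; NOT the real Slapi_Entry. */
typedef struct {
    char *dn;
} entry_t;

/* Allocate an entry holding a private copy of dn. */
static entry_t *entry_new(const char *dn)
{
    entry_t *e = malloc(sizeof(*e));
    if (e == NULL) {
        return NULL;
    }
    size_t n = strlen(dn) + 1;
    e->dn = malloc(n);
    if (e->dn == NULL) {
        free(e);
        return NULL;
    }
    memcpy(e->dn, dn, n);
    return e;
}

/* Free the entry and clear the caller's pointer; safe on NULL. */
static void entry_free(entry_t **ep)
{
    if (ep != NULL && *ep != NULL) {
        free((*ep)->dn);
        free(*ep);
        *ep = NULL;
    }
}

/*
 * Hypothetical add operation that CONSUMES the entry: once it returns,
 * the entry no longer belongs to the caller. Taking entry_t ** lets it
 * clear the caller's pointer so a stale access is a NULL access, not a
 * use-after-free.
 */
static int add_entry_consuming(entry_t **ep)
{
    if (ep == NULL || *ep == NULL) {
        return -1;
    }
    /* ... the entry would be stored here ... */
    entry_free(ep);
    return 0;
}

/*
 * Caller pattern: copy out anything needed later BEFORE handing the
 * entry over, and do not free it again afterwards.
 */
static int fixup_add_entry(const char *dn, char *dn_copy, size_t dn_copy_len)
{
    assert(dn_copy != NULL && dn_copy_len > 0);

    entry_t *e = entry_new(dn);
    if (e == NULL) {
        return -1;
    }
    strncpy(dn_copy, e->dn, dn_copy_len - 1);
    dn_copy[dn_copy_len - 1] = '\0';

    int rc = add_entry_consuming(&e);
    /* e is NULL here: no further access and no second free. */
    return rc;
}
```

Clearing the caller's pointer inside the consuming function is a design choice of this sketch, not something the ticket says the server does; it only makes the "consumed" contract visible at the call site.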
git patch file (master) 0001-Trac-Ticket-298-crash-when-replicating-orphaned-tomb.patch
Fix description:
1. The cause of the crash was freeing a to-be-added entry in tombstone_to_glue even though the entry is consumed in slapi_add_entry_internal_set_pb/slapi_add_internal_pb. This patch removes the redundant slapi_entry_free from tombstone_to_glue.
2. Introduced is_suffix_dn_ext, which takes an is_tombstone flag so that the proper parent sdn of a tombstoned entry can be obtained.
3. The logic handling an ancestor tombstone was broken. In _entryrdn_insert_key, when _entryrdn_get_tombstone_elem found a child node, the code immediately checked whether the node was a tombstone. That check should have been done in the next loop iteration.
4. Reduced the repeated "WARNING: bad entry: ID ##" messages.
Reviewed by Rich (Thank you!!!)
Pushed to master.
$ git checkout master
Switched to branch 'master'
[nhosoi@kiki ldapserver 1964]$ git merge 298
Updating 85d44fc..2e5ee4d
Fast-forward
 ldap/servers/plugins/replication/urp.c | 13 +++++--
 ldap/servers/plugins/replication/urp.h | 1 +
 ldap/servers/plugins/replication/urp_tombstone.c | 3 +-
 ldap/servers/slapd/add.c | 13 ++++---
 ldap/servers/slapd/back-ldbm/import-threads.c | 4 +-
 ldap/servers/slapd/back-ldbm/import.c | 11 +++---
 ldap/servers/slapd/back-ldbm/import.h | 11 ++++--
 ldap/servers/slapd/back-ldbm/ldbm_add.c | 3 +-
 ldap/servers/slapd/back-ldbm/ldbm_entryrdn.c | 43 ++++++++--------------
 ldap/servers/slapd/back-ldbm/ldif2ldbm.c | 4 ++-
 10 files changed, 53 insertions(+), 53 deletions(-)
$ git push
Counting objects: 35, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (18/18), done.
Writing objects: 100% (18/18), 2.68 KiB, done.
Total 18 (delta 16), reused 0 (delta 0)
To ssh://git.fedorahosted.org/git/389/ds.git
 85d44fc..2e5ee4d master -> master
Cherry-picked and pushed to 389-ds-base-1.2.10, as well.
$ git cherry-pick 2e5ee4d
[ds1210 214b58a] Trac Ticket #298 - crash when replicating orphaned tombstone entry
 10 files changed, 53 insertions(+), 53 deletions(-)
$ git push origin ds1210:389-ds-base-1.2.10
Counting objects: 35, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (18/18), done.
Writing objects: 100% (18/18), 2.70 KiB, done.
Total 18 (delta 16), reused 0 (delta 0)
To ssh://git.fedorahosted.org/git/389/ds.git
 f676eb1..d7e1c25 ds1210 -> 389-ds-base-1.2.10
Note: test case is automated.
commit changeset:6939b2d/389-ds-base
Author: Rich Megginson rmeggins@redhat.com
Date: Wed Feb 22 16:26:05 2012 -0700
Fix description: The previous fix for item 4 (reducing repeated "WARNING: bad entry: ID ##" messages) introduced a regression that caused the import to crash. This commit fixes the crash by restoring the bad entry ID logic.
Reviewed by: nhosoi (Thanks!)
commit changeset:48ba947/389-ds-base
Author: Rich Megginson rmeggins@redhat.com
Date: Wed Feb 22 16:26:05 2012 -0700
1.2.10 branch
commit changeset:d7e1c25/389-ds-base
Author: Noriko Hosoi nhosoi@redhat.com
Date: Tue Feb 21 10:17:00 2012 -0800
1.2.10 branch (cherry picked from commit changeset:2e5ee4d/389-ds-base)
Ticket has been cloned to Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=796770
Added initial screened field value.
Metadata Update from @nhosoi: - Issue assigned to nhosoi - Issue set to the milestone: 1.2.10.2
389-ds-base is moving from Pagure to Github. This means that new issues and pull requests will be accepted only in 389-ds-base's github repository.
This issue has been cloned to Github and is available here: - https://github.com/389ds/389-ds-base/issues/298
If you want to receive further updates on the issue, please navigate to the GitHub issue and click the subscribe button.
Thank you for understanding. We apologize for all inconvenience.
Metadata Update from @spichugi: - Issue close_status updated to: wontfix (was: Fixed)