Hi, this ticket looks to be very similar to what's being seen in https://fedorahosted.org/389/ticket/47696 , which is still open at this time. Perhaps the logs and traces here can help that ticket. If this is the wrong process for creating associated tickets, then I apologise.
We have a pair of 389 DS masters replicating with each other. Very often, while deleting a subtree containing around 10,000 entries, the LDAP client receives error 66 (LDAP_NOT_ALLOWED_ON_NONLEAF) partway through. In the errors log we see messages like:
{{{
[02/Apr/2014:11:53:08 +0100] entryrdn-index - _entryrdn_delete_key: Failed to remove cn=fax; has children
[02/Apr/2014:11:53:08 +0100] - database index operation failed BAD 1031, err=-1 Unknown error: -1
}}}
Afterwards, looking at the parent entry with an LDAP browser, we see no (apparent) child entries. If the DS is restarted, the entry (in this case cn=Fax,..) reappears. If you try to delete that entry you get an operations error; if you then try to delete it a second time, the delete succeeds.
I've managed to reproduce this reliably in my test environment; however, it contains sensitive data, and we have also added items to the schema. When the tree is imported back into the DS, the delete fails at the same entry. Narrowing it down, it appears that deleting just that single entry is enough to trigger the error.
I managed to capture the moment of deletion with full error logging enabled; this is attached. The DNs of interest in this log are tpUuid=a219426d-0add-482c-a405-2cfd1e137a9b,cn=Fax,gcUID=00036448,ou=Subscribers,gcUID=00000067,o=teamphone.com and its parent.
What also might be useful is that I have taken a dbscan trace between each delete operation (on another item with the same behaviour). The start (clean) state is:
{{{
..
id 11001
	rdn: cn=Fax
	nsUniqueId: e2d0ebb0-ba8511e3-b8dab01a-d8064075
	cn: Fax
	gcCOS: unrestricted
	gcLanguage: 1
	gcProgramNbr: Fax
	gcViewerClass: FaxRenderer
	objectClass: gcService
	objectClass: top
	objectClass: tpParameters
	owner: gcUID=0001ab1c,ou=Subscribers,gcUID=00000067,o=Teamphone.com
	tpNumberParameter0: 0
	tpNumberParameter1: 0
	tpNumberParameter10: 0
	tpNumberParameter11: 0
	tpNumberParameter12: 0
	tpNumberParameter13: 0
	tpNumberParameter14: 0
	tpNumberParameter15: 0
	tpNumberParameter2: 0
	tpNumberParameter3: 0
	tpNumberParameter4: 0
	tpNumberParameter5: 0
	tpNumberParameter6: 0
	tpNumberParameter7: 0
	tpNumberParameter8: 0
	tpNumberParameter9: 0
	creatorsName: cn=directory manager
	modifiersName: cn=directory manager
	createTimestamp: 20140402164347Z
	modifyTimestamp: 20140402164347Z
	parentid: 11000
	entryid: 11001
	numSubordinates: 1
..
id 11863
	rdn: tpUuid=89510da7-b042-47de-8b83-13ad878220fa
	nsUniqueId: e2d0ef0e-ba8511e3-b8dab01a-d8064075
	objectClass: gcRouting
	objectClass: top
	telephoneNumber: +442031625004
	tpTargetService: cn=Fax,gcUID=0001ab1c,ou=Subscribers,gcUID=00000067,o=Teamphone.com
	tpUuid: 89510da7-b042-47de-8b83-13ad878220fa
	creatorsName: cn=directory manager
	modifiersName: cn=directory manager
	createTimestamp: 20140402164355Z
	modifyTimestamp: 20140402164355Z
	parentid: 11001
	entryid: 11863
..
}}}
After tpUuid=89510da7-b042-47de-8b83-13ad878220fa,cn=Fax,gcUID=0001ab1c,ou=Subscribers,gcUID=00000067,o=Teamphone.com is deleted the dbscan trace difference is:
{{{
298383c298383
< numSubordinates: 1
---
> tombstoneNumSubordinates: 1
326932c326932,326933
< rdn: tpUuid=89510da7-b042-47de-8b83-13ad878220fa
---
> rdn: nsuniqueid=e2d0ef0e-ba8511e3-b8dab01a-d8064075,tpUuid=89510da7-b042-47de-
> 8b83-13ad878220fa
326935a326937
> objectClass;vucsn-533d622b000000010000: nsTombstone
326945a326948,326950
> nsParentUniqueId: e2d0ebb0-ba8511e3-b8dab01a-d8064075
> nscpEntryDN: tpuuid=89510da7-b042-47de-8b83-13ad878220fa,cn=fax,gcuid=0001ab1c
> ,ou=subscribers,gcuid=00000067,o=teamphone.com
374010c374015,374016
< nsds50ruv: {replica 1 ldap://chrisds.ds.com:489}
---
> nsds50ruv: {replica 1 ldap://chrisds.ds.com:489} 533d622b000000010000 533d622b
> 000000010000
374011a374018
> nsruvReplicaLastModified: {replica 1 ldap://chrisds.ds.com:489} 533d6228
}}}
After cn=Fax,gcUID=0001ab1c,ou=Subscribers,gcUID=00000067,o=Teamphone.com is deleted (via the two delete attempts described above) the trace difference is:
{{{
298347c298347,298348
< numSubordinates: 5
---
> numSubordinates: 4
> tombstoneNumSubordinates: 1
298350c298351,298352
< rdn: cn=Fax
---
> rdn: nsuniqueid=e2d0ebb0-ba8511e3-b8dab01a-d8064075,nsuniqueid=e2d0ebb0-ba8511
> e3-b8dab01a-d8064075
298359a298362
> objectClass;vucsn-533d6274000000010000: nsTombstone
298383a298387,298389
> nsParentUniqueId: e2d0ebaf-ba8511e3-b8dab01a-d8064075
> nscpEntryDN: nsuniqueid=e2d0ebb0-ba8511e3-b8dab01a-d8064075,cn=fax,gcuid=0001a
> b1c,ou=subscribers,gcuid=00000067,o=teamphone.com
374015c374021
< nsds50ruv: {replica 1 ldap://chrisds.ds.com:489} 533d622b000000010000 533d622b
---
> nsds50ruv: {replica 1 ldap://chrisds.ds.com:489} 533d622b000000010000 533d6274
374018c374024
< nsruvReplicaLastModified: {replica 1 ldap://chrisds.ds.com:489} 533d6228
---
> nsruvReplicaLastModified: {replica 1 ldap://chrisds.ds.com:489} 533d6271
}}}
Note how the new rdn contains the entry's own nsUniqueId repeated twice (and note that it is the entry's own nsUniqueId, not the parent's).
The version is 389-ds-base-1.2.11.15-32.el6_5.x86_64 on CentOS 6.5.
errors log containing full tracing errors-during-delete.txt.gz
I've been doing some investigating on this, and I think I've found the area that might be causing the problem.
I think the problem occurs if strlen("cn=fax") <= strlen("C11001"), i.e. if the length of the normalised RDN is no greater than the length of the RDN_INDEX_CHILD key.
Notice how the way this index key is created is inconsistent in the size/ulen members:
{{{
keybuf = slapi_ch_smprintf("%c%u", RDN_INDEX_CHILD, targetid);
key.data = keybuf;
key.size = key.ulen = strlen(keybuf) + 1;
}}}
versus:
{{{
keybuf = slapi_ch_smprintf("%c%u", RDN_INDEX_CHILD, id);
key.data = (void *)keybuf;
key.size = key.ulen = strlen(nrdn) + 1;
}}}
If I apply this patch then that fixes my problem:
{{{
diff -Npru 389-ds-base-1.2.11.15.orig/ldap/servers/slapd/back-ldbm/ldbm_entryrdn.c 389-ds-base-1.2.11.15/ldap/servers/slapd/back-ldbm/ldbm_entryrdn.c
--- 389-ds-base-1.2.11.15.orig/ldap/servers/slapd/back-ldbm/ldbm_entryrdn.c	2014-04-08 12:00:40.006725972 +0100
+++ 389-ds-base-1.2.11.15/ldap/servers/slapd/back-ldbm/ldbm_entryrdn.c	2014-04-11 16:06:48.892476801 +0100
@@ -2616,7 +2616,7 @@ _entryrdn_insert_key(backend *be,
         /* E.g., C1 */
         keybuf = slapi_ch_smprintf("%c%u", RDN_INDEX_CHILD, workid);
         key.data = keybuf;
-        key.size = key.ulen = strlen(keybuf) + 1;
+        key.size = key.ulen = strlen(nrdn) + 1;
         key.flags = DB_DBT_USERMEM;
 
         tmpsrdn = srdn;
}}}
Note that this is not the full fix, as it breaks ldif2db import - however it might be a helpful pointer to finding the real solution.
Looking at the latest version of 1.2.11, I only see "nrdn" being used for the key length in _entryrdn_delete_key() - not in _entryrdn_insert_key().
Anyway, using the strlen of nrdn is wrong. We should never use nrdn for the key length; instead we should use the keybuf generated by slapi_ch_smprintf(): {{{key.size = key.ulen = strlen(keybuf) + 1;}}}
I cannot reproduce this issue (I've been trying), but if you can, could you please try using "key.size = key.ulen = strlen(keybuf) + 1;" for all cases where we are building an rdn child key? In my code base, 1.2.11, I only see nrdn being used in _entryrdn_delete_key(), but if it's in other places in your code, it should be changed there as well. Please let me know if it helps - thanks.
I tried it with this patch, but it still failed - in fact it made the problem worse, as now more elements in the tree exhibit the problem:
{{{
diff -Npru 389-ds-base-1.2.11.15.orig/ldap/servers/slapd/back-ldbm/ldbm_entryrdn.c 389-ds-base-1.2.11.15/ldap/servers/slapd/back-ldbm/ldbm_entryrdn.c
--- 389-ds-base-1.2.11.15.orig/ldap/servers/slapd/back-ldbm/ldbm_entryrdn.c	2014-04-08 12:00:40.006725972 +0100
+++ 389-ds-base-1.2.11.15/ldap/servers/slapd/back-ldbm/ldbm_entryrdn.c	2014-04-11 17:47:34.947470699 +0100
@@ -2826,7 +2826,7 @@ _entryrdn_delete_key(backend *be,
     /* check if the target element has a child or not */
     keybuf = slapi_ch_smprintf("%c%u", RDN_INDEX_CHILD, id);
     key.data = (void *)keybuf;
-    key.size = key.ulen = strlen(nrdn) + 1;
+    key.size = key.ulen = strlen(keybuf) + 1;
     key.flags = DB_DBT_USERMEM;
 
     memset(&data, 0, sizeof(data));
}}}
I'm not familiar enough (yet) with the Berkeley DB API - I'm wondering whether one of the calls involving RDN_INDEX_CHILD keys is making modifications to the memory pointed to by {{{key.data}}}. I'll keep investigating.
In the original report, this is definitely a bug:
{{{
298350c298351,298352
< rdn: cn=Fax
---
> rdn: nsuniqueid=e2d0ebb0-ba8511e3-b8dab01a-d8064075,nsuniqueid=e2d0ebb0-ba8511
> e3-b8dab01a-d8064075
}}}
I should change the subject, but this looks like a dup of the following ticket (the tombstoned-tombstone issue), which I'm working on now: https://fedorahosted.org/389/ticket/47696
I ran this with Valgrind, in case it showed anything. Comparing a working case against the failing case, I get this trace when deleting the leaf entry:
{{{
==8782== Thread 44:
==8782== Conditional jump or move depends on uninitialised value(s)
==8782==    at 0xC50DDA8: __log_putr (log_put.c:731)
==8782==    by 0xC50EB88: __log_put (log_put.c:463)
==8782==    by 0xC4C4E68: __db_addrem_log (db_auto.c:257)
==8782==    by 0xC4D2147: __db_pitem (db_dup.c:130)
==8782==    by 0xC42A701: __bam_iitem (bt_put.c:376)
==8782==    by 0xC425237: __bamc_put (bt_cursor.c:1960)
==8782==    by 0xC4CEA82: __dbc_put (db_cam.c:1520)
==8782==    by 0xC4D3FC0: __dbc_put_pp (db_iface.c:2456)
==8782==    by 0x100BB950: _entryrdn_put_data (ldbm_entryrdn.c:1948)
==8782==    by 0x100C1B1A: _entryrdn_insert_key (ldbm_entryrdn.c:2102)
==8782==    by 0x100C23A0: entryrdn_index_entry (ldbm_entryrdn.c:295)
==8782==    by 0x100BA286: ldbm_back_delete (ldbm_delete.c:808)
==8782==    by 0x4E7B522: ??? (in /usr/lib64/dirsrv/libslapd.so.0.0.0)
==8782==    by 0x4E7B860: do_delete (in /usr/lib64/dirsrv/libslapd.so.0.0.0)
==8782==    by 0x4141D3: ??? (in /usr/sbin/ns-slapd)
==8782==    by 0x6A929A5: _pt_root (ptthread.c:204)
==8782==    by 0x70CB9D0: start_thread (pthread_create.c:301)
==8782==    by 0x73C9B6C: clone (clone.S:115)
}}}
I'm not sure whether the above is useful, but it led me to the thought in comment 1.
It seems to be in the area where the leaf is converted into a tombstone, and the child index for this tombstone is added to its parent. Perhaps in the case where the parent rdn is short it causes some sort of corruption of the index that eventually leads to the tombstoned tombstone issue when the parent (now itself a leaf) is deleted.
Replying to [comment:5 chatfield]:
This trace is benign. This is not the droid you're looking for. Unfortunately, valgrind does not have a high signal-to-noise ratio; there can be noise like this that you simply have to ignore.
I've just checked the entryrdn db files (using dbscan -R) after deleting the original leaf entry (id 11863), and it appears that everything is being updated as it should.
Maybe this theory is why we're getting the tombstoned tombstone issue: When the client deletes cn=Fax above, we get the "has children" error (current cause unknown) in the log, and the error code back to the client. Then the transaction gets aborted, so the disk view is correct. Having checked the client view at this point (by reloading the subtree), the cn=Fax entry is gone. However, if at this point the client hadn't refreshed, they could attempt to delete the same entry. If you do this you get this in the errors log:
{{{
[14/Apr/2014:16:03:49 +0100] entryrdn-index - _entryrdn_delete_key: Failed to remove cn=fax; has children
[14/Apr/2014:16:03:49 +0100] - database index operation failed BAD 1031, err=-1 Unknown error: -1
[14/Apr/2014:16:03:53 +0100] entryrdn-index - _entryrdn_delete_key: Failed to remove nsuniqueid=e2d0ebb0-ba8511e3-b8dab01a-d8064075; has children
[14/Apr/2014:16:03:53 +0100] - database index operation failed BAD 1031, err=-1 Unknown error: -1
}}}
This suggests the cache/memory view wasn't fully reset when the previous transaction was aborted. It would seem the entry in the cache is the version that has been partially converted to a tombstone. Note also that if the server is restarted after the first delete of cn=Fax, the entry becomes visible to client searches again.
As an addition to comment 7, this cached entry of a partially converted tombstone is in a state where a delete can be attempted on it if the client knows the original dn. This means that it may be converted into a tombstone a second time, hence the double tombstone rdn.
Another theory, this time a guess at the cause of the "has children" error: It looks like the change in comment 3 is a valid one. However, making that change has revealed something that might have been working "accidentally". Normally the length of the rdn is greater than the length of the index key. When the size/ulen fields were set from strlen(nrdn) while the data field was set to keybuf, Berkeley DB was probably searching for a child index key that was actually "C12345\0xyz" rather than "C12345". This yields DB_NOTFOUND, which allows the delete to continue. However in the case above, the nrdn is the same length as the key, which means the child index lookup does find an entry, which causes the no-child check to abort the transaction.
Ok, I think I have the steps to recreate:
 * Set up a pair of servers, ldap1 and ldap2
 * Set up replication between them
 * Fill the tree so that there are just over 100 entries in the complete tree (i.e. we want the subsequent entry ids to be 3 digits long)
 * Create an entry within the tree, o=xy (it's important that its rdn is 4 characters long)
 * Create an entry under this (it doesn't matter in this case what kind you create)
 * From your ldap client, delete the subtree starting at o=xy
 * You should get the "has children" error
 * Theoretically, if you then delete the o=xy entry again, you will get the double tombstone
If you'd like me to prepare and upload an example tree that demonstrates the problem, just let me know. Hopefully that would be useful for you.
Replying to [comment:11 chatfield]:
Yes, it'd help our debugging a lot. We've been working on the related issue #47750 etc., and I'd like to run your test against the debugged version. Could you please provide us the test ldif file(s) you are using for these steps?
 * Fill the tree so that there are just over 100 entries
 * Create an entry within the tree o=xy
 * Create an entry under this
Thank you, in advance.
Here is a simpler version of the file as described in the recreation steps. To recreate, follow this:
 * Prepare a fresh 389 DS with a base DN of o=a
 * Enable changelog and (multi-master) replication for o=a (you don't need a second DS, nor an agreement set up)
 * Import the sample ldif and restart the DS
 * Using a client tool, delete "ou=child,o=x,o=a"
 * Using a client tool, delete "o=x,o=a"
You should see the second delete fail, and warnings in the errors log.
attachment sample.ldif
Thank you for the reproducer! I could duplicate the bug.
Bug description: When checking whether a node has a child entry, the code only checked the first position, which was normally "deleted" once there were no more children. But in some cases a tombstoned child was placed there. When that occurred, even though there were no live children any more, _entryrdn_delete_key returned "has children" and the delete operation failed.
Fix description: This patch checks all the children of the to-be-deleted node; if there is no child, or all of them are tombstones, it proceeds to the next step. The patch also fixes a typo reported by chatfield (Thank you!!)
git patch file (master) 0001-Ticket-47764-Problem-with-deletion-while-replicated.patch
Thanks for the patch! I've tried it on my test box (CentOS 6.5/EPEL with all the current updates), applying the patch against the source rpm. I had to change one line to make it work for me:
{{{
if (!slapi_is_special_rdn(childelem->rdn_elem_nrdn_rdn, RDN_IS_TOMBSTONE)) {
}}}
becomes:
{{{
if (NULL == strchr((char *)childelem->rdn_elem_nrdn_rdn, ',')) {
}}}
Good news - it solves the case I'm seeing!
While playing around with this I saw a potential leak in Valgrind:
{{{
==26812== 60,394 bytes in 13,709 blocks are definitely lost in loss record 1,625 of 1,676
==26812==    at 0x4C279EE: malloc (vg_replace_malloc.c:270)
==26812==    by 0x4E77BBA: slapi_ch_malloc (ch_malloc.c:155)
==26812==    by 0x4E85A8F: slapi_entry_attr_get_charptr (entry.c:2730)
==26812==    by 0x100B97E7: ldbm_back_delete (ldbm_delete.c:329)
==26812==    by 0x4E7B522: op_shared_delete (delete.c:364)
==26812==    by 0x4E7B860: do_delete (delete.c:128)
==26812==    by 0x4141D3: connection_threadmain (connection.c:584)
==26812==    by 0x6A929A5: _pt_root (ptthread.c:204)
==26812==    by 0x70CB9D0: start_thread (pthread_create.c:301)
==26812==    by 0x73C9B6C: clone (clone.S:115)
}}}
It looks like in this line in ldbm_delete.c:
{{{
char *pid_str = slapi_entry_attr_get_charptr(e->ep_entry, LDBM_PARENTID_STR);
}}}
pid_str is never freed. Having said that, I don't think the patch causes this. Also, it might have already been fixed in the Git tree.
Anyway, great work!
Thank you for reviewing and trying the patch! Yes, this patch was created on top of the fix in which slapi_is_special_rdn was added (please note that both patches are waiting for reviews...): https://fedorahosted.org/389/ticket/47750 As you can see, the patch for #47750 is huge and invasive... I ran my tests with the two patches applied under valgrind and thought it was leak-clean, but I might have missed it. I'm definitely going to take a look at the code you pointed out. Thanks! --noriko
I had fixed the pid_str leak in a previous patch for 1.2.11. So this patch is good to go, ack.
No wonder I didn't see it. Thanks a lot, Mark!
Reviewed by chatfield and Mark (Thank you, both of you!!)
Pushed to master: 160cb3f..832253e master -> master commit 832253e
Pushed to 389-ds-base-1.3.2: 3f933fe..86b34ca 389-ds-base-1.3.2 -> 389-ds-base-1.3.2 commit 86b34ca
Pushed to 389-ds-base-1.3.1: 68b52b7..efa23ce 389-ds-base-1.3.1 -> 389-ds-base-1.3.1 commit efa23ced2a6e3de3389d9b801329066f511bc38c
Pushed to 389-ds-base-1.2.11: 9e5b333..ca407f8 389-ds-base-1.2.11 -> 389-ds-base-1.2.11 commit ca407f8
Ticket has been cloned to Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1097001
Ticket has been cloned to Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1097002
Metadata Update from @nhosoi:
 - Issue assigned to nhosoi
 - Issue set to the milestone: 1.2.11.30
389-ds-base is moving from Pagure to Github. This means that new issues and pull requests will be accepted only in 389-ds-base's github repository.
This issue has been cloned to Github and is available here: - https://github.com/389ds/389-ds-base/issues/1096
If you want to receive further updates on the issue, please navigate to the github issue and click on subscribe button.
Thank you for understanding. We apologize for all inconvenience.
Metadata Update from @spichugi:
 - Issue close_status updated to: wontfix (was: Fixed)