https://bugzilla.redhat.com/show_bug.cgi?id=510182
Description of problem:

The setup is a single instance with the DNA plugin enabled. The dnaNextValue counter gets incremented even if the addition of the user fails, creating gaps in number assignment and leaving valid numbers in the range unused.

Version-Release number of selected component (if applicable): RHDS 9.0

How reproducible: Every time

Steps to Reproduce:
1. Enable the DNA plugin.
2. Add the entry below to enable DNA for the uidNumber attribute.
{{{
dn: cn=Account UIDs,cn=Distributed Numeric Assignment Plugin,cn=plugins,cn=config
objectClass: top
objectClass: extensibleObject
cn: Account UIDs
dnatype: uidNumber
dnafilter: (objectclass=posixAccount)
dnascope: dc=pnq,dc=redhat,dc=com
dnanextvalue: 1
dnamaxvalue: 1300
}}}
3. Restart the server.
4. Add a user and verify that the DNA plugin works as expected and that dnaNextValue gets incremented. (This happens correctly.)
5. Add a user without a gidNumber attribute, which will fail since DNA is not enabled for gidNumber in this setup.
6. Check dnaNextValue. It will have been incremented even though the user addition failed.

Actual results: dnaNextValue gets incremented even if the user addition fails.

Expected results: dnaNextValue should not get incremented if the user addition fails.

Additional info:
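The failure mode can be sketched in a few lines of C: the counter advances before the ADD is known to succeed, so a failed ADD burns a value and leaves a gap. This is purely an illustrative toy model, not server code; all names here are invented.

```c
/* Toy model of the reported bug.  Not real server code. */
struct toy_dna {
    long next_value;
    long max_value;
};

/* Buggy order of operations: allocate first, then attempt the add.  On a
 * failed add the value has already been consumed, leaving a gap. */
long buggy_assign(struct toy_dna *d, int add_will_succeed)
{
    if (d->next_value > d->max_value)
        return -1;                /* range exhausted */
    long v = d->next_value++;     /* counter moves unconditionally */
    return add_will_succeed ? v : -1;
}
```

Running the reproduction steps against this model shows the gap: a successful add gets 1, a failed add silently consumes 2, and the next successful add gets 3.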
batch move to milestone 1.3
I'm not sure that we can catch 100% of the cases where we have already grabbed the next free value from the range without adding some way of putting a value back if the operation ultimately fails. This would require quite a bit of surgery since we would have to handle the case where the free values are not a contiguous range.
I think the right solution here is to grab the next free value from the range at the betxnpreoperation stage now that we have transaction support at the plug-in level. This will catch the majority of cases where a pre-op plug-in rejects an operation after we have already allocated a value.
I noticed that I can refine some of the code in the fix:
{{{
/* If we have a value, see if it's the magic value. */
if (bv) {
    len = strlen(DNA_NEEDS_UPDATE);
    if (len == bv->bv_len) {
        if (!slapi_UTF8NCASECMP(bv->bv_val, DNA_NEEDS_UPDATE, len)) {
            slapi_ch_array_add(&types_to_generate, slapi_ch_strdup(type));
        }
    }
}
}}}
We don't need to get the strlen of a fixed value, so some of this can be cleaned up.
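A minimal sketch of the suggested cleanup, using stand-in types (the real code uses struct berval and slapi_UTF8NCASECMP(), and the DNA_NEEDS_UPDATE value below is a placeholder, not the real definition): since DNA_NEEDS_UPDATE is a fixed string literal, sizeof(...) - 1 gives its length at compile time, the strlen() call goes away, and the nested ifs collapse into one condition.

```c
#include <string.h>
#include <strings.h>   /* strncasecmp */

/* Placeholder value; the real macro lives in dna.c. */
#define DNA_NEEDS_UPDATE "magic-regen"

/* Stand-in for the LDAP struct berval. */
struct berval {
    unsigned long bv_len;
    char *bv_val;
};

/* Single-condition version: compile-time length, no strlen(). */
int needs_update(const struct berval *bv)
{
    return bv != NULL &&
           bv->bv_len == sizeof(DNA_NEEDS_UPDATE) - 1 &&
           strncasecmp(bv->bv_val, DNA_NEEDS_UPDATE, bv->bv_len) == 0;
}
```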
Working on a new revision...
Mark
The change to ldbm_back_modify() is not sufficient. Consider the case where you get a DB_LOCK_DEADLOCK == retval and have to abort/start the transaction again. In this case, you will want to apply your mods to the original mods list, not the one that may have been modified in a previous loop iteration. So you will need to make a copy of the original mods list and restore it if DB_LOCK_DEADLOCK == retval and you have to retry the transaction. The same applies if you are changing SLAPI_ADD_ENTRY or SLAPI_MODIFY_EXISTING_ENTRY - you would need to restore the original pre/post operation entries and re-apply your changes if looping again.
instead of doing the cleanup to restore the original adding entry/mods at each error condition, can you just put it at the top of the big loop in the abort condition, e.g. something like this:
{{{
for (retry_count = 0; retry_count < RETRY_TIMES; retry_count++) {
    if (txn.back_txn_txn && (txn.back_txn_txn != parent_txn)) {
        dblayer_txn_abort(li, &txn);
        slapi_pblock_set(pb, SLAPI_TXN, parent_txn);
        /* Reset mods/entry, abort, and re-try */
        ldap_mods_free(mods, 1);
        slapi_pblock_set(pb, SLAPI_MODIFY_MODS, copy_mods(mods_original));
        backentry_free(&ec);
        slapi_pblock_set(pb, SLAPI_MODIFY_EXISTING_ENTRY, ec_original->ep_entry);
        ec = ec_original;
        if ((ec_original = backentry_dup(e)) == NULL) {
            ldap_result_code = LDAP_OPERATIONS_ERROR;
            goto error_return;
        }
    }
}}}
For mods, you may also have to handle the case where the plugin actually changes the mods pointer - I believe there are some plugins that actually do {{{ slapi_pblock_set(pb, SLAPI_MODIFY_MODS, newmods); }}} so you'll have to use slapi_pblock_get(pb, SLAPI_MODIFY_MODS, &mods); to get them first before you free them and restore them to the original. With entries, I don't think that happens, but you could do a search for places where slapi_pblock_set.*SLAPI_ADD_ENTRY or SLAPI_MODIFY_EXISTING_ENTRY is called.
You'll also have to add similar code to ldbm_delete and ldbm_modrdn in case there are plugins that modify the pblock parameters.
This is why I didn't do this when I originally implemented transaction support - this is quite messy. I just figured it would be easier for the plugins to call the internal operations directly rather than try to muck about with the parameters, and have to keep track of the originals in case of looping.
Can I ask you not to mix spaces and tabs?
{{{
backentry_free(&ec);
slapi_pblock_set(pb, SLAPI_MODRDN_EXISTING_ENTRY, original_entry->ep_entry);
ec = original_entry;
}}}
You copied lots of entries and DNs... We were trying to reduce mallocs as much as possible for the performance improvement task. Please just keep it in mind...
Replying to [comment:10 nhosoi]:
> Can I ask you not to mix spaces and tabs?
>
> {{{
> backentry_free(&ec);
> slapi_pblock_set(pb, SLAPI_MODRDN_EXISTING_ENTRY, original_entry->ep_entry);
> ec = original_entry;
> }}}
>
> You copied lots of entries and DNs... We were trying to reduce mallocs as much as possible for the performance improvement task. Please just keep it in mind...
Yeah, it's tough - if we are going to allow the betxn plugins to modify their arguments at all, we have to make copies and restore them in the retry case. Or, betxn plugins would have to know if they applied their updates to the plugin arguments (i.e. the mods list) already. That would require adding a lot of logic to plugins. Or we could make plugins aware that they are being called as part of a retry. I don't know if we can just say "betxn plugins cannot modify their arguments - they should be treated as read-only". They would be allowed to perform nested internal operations of their own, which would be included in the transaction as a nested transaction.
There is one "special" case - the code that does the ldbm_txn_ruv_modify_context() code - this allows the replication plugin to "piggyback" the mods to the RUV tombstone entry along with the rest of the writes in the transaction.
Thanks for the explanation. It makes sense. I agree that plugins should be free from the backend transaction logic.
final revision 0001-Ticket-211-dnaNextValue-gets-incremented-even-if-the.2.patch
Thanks for the review, Rich and Noriko. I removed the tabs from the fix. Sorry, I've been working on that, but it's just an old habit.
Yeah, I know it's a lot of grabbing and freeing of memory, but we only do most of it if we hit the deadlock. I do think the plugin interaction needs to be discussed further, outside of this bug; this bug just happened to expose the issue. In fact, most of the fix in the backend is proactive. We don't need most of it as of today, but if you do it for one operation, you should do it for all of them.
Fix is pushed...
{{{
[mareynol@localhost ldap]$ git merge ticket211
Updating e2a1090..69c9f3b
Fast-forward
 ldap/servers/plugins/dna/dna.c             | 616 +++++++++++++++++++++-------
 ldap/servers/slapd/back-ldbm/ldbm_add.c    |  15 +
 ldap/servers/slapd/back-ldbm/ldbm_delete.c |  15 +
 ldap/servers/slapd/back-ldbm/ldbm_modify.c |  39 ++-
 ldap/servers/slapd/back-ldbm/ldbm_modrdn.c |  88 ++++-
 ldap/servers/slapd/operation.c             |   2 +-
 ldap/servers/slapd/slapi-private.h         |   1 +
 7 files changed, 614 insertions(+), 162 deletions(-)
}}}
{{{
[mareynol@localhost ldap]$ git push
Counting objects: 29, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (15/15), done.
Writing objects: 100% (15/15), 6.44 KiB, done.
Total 15 (delta 13), reused 0 (delta 0)
To ssh://git.fedorahosted.org/git/389/ds.git
   e2a1090..69c9f3b  master -> master
}}}
Additional patch 0001-ticket-211-Use-of-uninitialized-variables-in-ldbm_ba.patch
Pushed additional patch to master. Thanks to Noriko for her review!
{{{
Counting objects: 13, done.
Delta compression using up to 2 threads.
Compressing objects: 100% (7/7), done.
Writing objects: 100% (7/7), 720 bytes, done.
Total 7 (delta 5), reused 0 (delta 0)
To ssh://git.fedorahosted.org/git/389/ds.git
   78a349b..f5930b8  master -> master
}}}
git patch file (master) 0001-Ticket-211-Avoid-preop-range-requests-non-DNA-operat.patch
Author: Nathan Kinder <nkinder@redhat.com>
Date:   Wed Mar 7 07:50:24 2012 -0800
Ticket 211 - Avoid preop range requests for non-DNA operations

The previous patch for ticket 211 moved range requests to the preop callback. The check for a range overflow was happening for any operation, not just those that would require DNA to generate a value. This could cause operations to be rejected when they shouldn't be, due to failed range extension requests. This patch moves the range extension request so that it only occurs if the incoming operation would require DNA to generate a value.

In addition, we need to check if we've passed the threshold in the preop callback. This check has been added.

Lastly, we need to avoid doing a range extension request from within a transaction. Even though it is an unlikely case, there was still the possibility of the betxnpreop callback triggering a range request. I changed the code to make a range request impossible while we are within a transaction.
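The gating described in the commit message boils down to three conditions. A minimal sketch (the function name and signature are invented for illustration; the real logic lives in dna.c):

```c
/* Only request a range extension when (a) this operation would actually
 * make DNA generate a value, (b) the remaining values are at or below the
 * configured threshold, and (c) we are not inside a backend transaction.
 * Illustrative only; not the real dna.c code. */
int should_request_range(int will_generate_value,
                         long remaining,
                         long threshold,
                         int in_txn)
{
    if (!will_generate_value)
        return 0;   /* non-DNA operation: never extend */
    if (in_txn)
        return 0;   /* never extend from within a transaction */
    return remaining <= threshold;
}
```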
Reviewed and tested by nhosoi. Pushing the patch on behalf of Nathan.
{{{
$ git push
Counting objects: 13, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (7/7), done.
Writing objects: 100% (7/7), 1.88 KiB, done.
Total 7 (delta 5), reused 0 (delta 0)
To ssh://git.fedorahosted.org/git/389/ds.git
   0f21a5b..e2fd5bf  master -> master
}}}
originally targeted for 1.2.11.rc1, but actually in the 1.2.11.a1 release
Ticket has been cloned to Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=834064
Added initial screened field value.
Metadata Update from @nkinder: - Issue assigned to mreynolds - Issue set to the milestone: 1.2.11.a1
389-ds-base is moving from Pagure to GitHub. This means that new issues and pull requests will be accepted only in 389-ds-base's GitHub repository.
This issue has been cloned to GitHub and is available here: - https://github.com/389ds/389-ds-base/issues/211
If you want to receive further updates on the issue, please navigate to the GitHub issue and click on the subscribe button.
Thank you for understanding. We apologize for any inconvenience.
Metadata Update from @spichugi: - Issue close_status updated to: wontfix (was: Fixed)