#211 dnaNextValue gets incremented even if the user addition fails
Closed: wontfix None Opened 12 years ago by rmeggins.

https://bugzilla.redhat.com/show_bug.cgi?id=510182

Description of problem:
The setup is a single instance DNA plugin enabled. The dnaNextValue counter
gets incremented even if there is a failure in addition of the user. Therefore
creating gaps in number assignment and resulting in non utilization of a valid
number in range.

Version-Release number of selected component (if applicable):
RHDS 9.0

How reproducible:
Everytime

Steps to Reproduce:
1. Enable DNA Plugin
2. Add the entry below, to enable DNA for uidNumber attribute.
dn: cn=Account UIDs,cn=Distributed Numeric Assignment
Plugin,cn=plugins,cn=config
objectClass: top
objectClass: extensibleObject
cn: Account UIDs
dnatype: uidNumber
dnafilter: (objectclass=posixAccount)
dnascope: dc=pnq,dc=redhat,dc=com
dnanextvalue: 1
dnamaxvalue: 1300

3. Restart server.
4. Add a user and see if the DNA plugin is working as expected and the
dnaNextValue number get incremented. (This is happening correctly).
5. Add a user without gidNumber attribute, which will fail as DNA is not
enabled for gidNumber in this setup.
6. Check dnaNextValue. It would have got incremented even though the user
addition is failed.

Actual results:
dnaNextValue get incremented even if the user addition fails.

Expected results:
dnaNextValue should not get incremented if the user addition fails.

Additional info:

batch move to milestone 1.3

I'm not sure that we can catch 100% of the cases where we have already grabbed the next free value from the range without adding some way of putting a value back if the operation ultimately fails. This would require quite a bit of surgery since we would have to handle the case where the free values are not a contiguous range.

I think the right solution here is to grab the next free value from the range at the betxnpreoperation stage now that we have transaction support at the plug-in level. This will catch the majority of cases where a pre-op plug-in rejects an operation after we have already allocated a value.

I noticed that I can refine some of the code in the fix:

3391                                 /* If we have a value, see if it's the magic value. */ 
3392                                 if (bv) { 
3393                                     len = strlen(DNA_NEEDS_UPDATE); 
3394                                     if (len == bv->bv_len) { 
3395                                         if (!slapi_UTF8NCASECMP(bv->bv_val,DNA_NEEDS_UPDATE,len)) { 
3396                                             slapi_ch_array_add(&types_to_generate, slapi_ch_strdup(type)); 
3397                                         } 
3398                                     }

We don't need to get the strlen of a fixed value, so some of this can be cleaned up.

Working on a new revision...

Mark

The change to ldbm_back_modify() is not sufficient. Consider the case where you get a DB_LOCK_DEADLOCK == retval and have to abort/start the transaction again. In this case, you will want to apply your mods to the original mods list, not the one that may have been modified in a previous loop iteration. So you will need to make a copy of the original mods list and restore it if DB_LOCK_DEADLOCK == retval and you have to retry the transaction. The same applies if you are changing SLAPI_ADD_ENTRY or SLAPI_MODIFY_EXISTING_ENTRY - you would need to restore the original pre/post operation entries and re-apply your changes if looping again.

instead of doing the cleanup to restore the original adding entry/mods at each error condition, can you just put it at the top of the big loop in the abort condition e.g. something like this:
{{{
for (retry_count = 0; retry_count < RETRY_TIMES; retry_count++) {

    if (txn.back_txn_txn && (txn.back_txn_txn != parent_txn)) {
        dblayer_txn_abort(li,&txn);
        slapi_pblock_set(pb, SLAPI_TXN, parent_txn);
                /* Reset mods/entry, Abort and re-try */ 
                    ldap_mods_free(mods, 1); 
                    slapi_pblock_set(pb, SLAPI_MODIFY_MODS, copy_mods(mods_original)); 
                    backentry_free(&ec); 
                    slapi_pblock_set( pb, SLAPI_MODIFY_EXISTING_ENTRY, ec_original->ep_entry ); 
                    ec = ec_original; 
                    if ( (ec_original = backentry_dup( e )) == NULL ) { 
                            ldap_result_code= LDAP_OPERATIONS_ERROR; 
                            goto error_return; 
                    }

}}}

For mods, you may also have to handle the case where the plugin actually changes the mods pointer - I believe there are some plugins that actually do
{{{
slapi_pblock_set(pb, SLAPI_MODIFY_MODS, newmods);
}}}
so you'll have to use slapi_pblock_get(pb, SLAPI_MODIFY_MODS, &mods); to get them first before you free them and restore them to the original. With entries, I don't think that happens, but you could do a search for places where slapi_pblock_set.*SLAPI_ADD_ENTRY or SLAPI_MODIFY_EXISTING_ENTRY is called.

You'll also have to add similar code to ldbm_delete and ldbm_modrdn in case there are plugins that modify the pblock parameters.

This is why I didn't do this when I originally implemented transaction support - this is quite messy. I just figured it would be easier for the plugins to call the internal operations directly rather than try to muck about with the parameters, and have to keep track of the originals in case of looping.

Can I ask not to mix the spaces and tabs?

754             backentry_free(&ec); 
755                         slapi_pblock_set( pb, SLAPI_MODRDN_EXISTING_ENTRY, original_entry->ep_entry ); 
756                         ec = original_entry;

You copied lots of entries and DNs... We were trying to reduce the malloc as much as possible for the performance improvement task. Just please have it in mind...

Replying to [comment:10 nhosoi]:

Can I ask not to mix the spaces and tabs?

754 backentry_free(&ec);
755 slapi_pblock_set( pb, SLAPI_MODRDN_EXISTING_ENTRY, original_entry->ep_entry );
756 ec = original_entry;

You copied lots of entries and DNs... We were trying to reduce the malloc as much as possible for the performance improvement task. Just please have it in mind...

Yeah, it's tough - if we are going to allow the betxn plugins to modify their arguments at all, we have to make copies and restore them in the retry case. Or, betxn plugins would have to know if they applied their updates to the plugin arguments (i.e. the mods list) already. That would require adding a lot of logic to plugins. Or we could make plugins aware that they are being called as part of a retry.
I don't know if we can just say "betxn plugins cannot modify their arguments - they should be treated as read-only". They would be allowed to perform nested internal operations of their own, which would be included in the transaction as a nested transaction.

There is one "special" case - the code that does the ldbm_txn_ruv_modify_context() code - this allows the replication plugin to "piggyback" the mods to the RUV tombstone entry along with the rest of the writes in the transaction.

Thanks for the explanation. It makes sense. I agree that plugins should be free from the backend transaction logic.

Thanks for the review Rich and Noriko. I removed the tabs from the fix. Sorry I've been working on that, but it's just a old habit.

Yeah I know it's a lot of grabbing and freeing of memory, but we only do most of it if we hit the deadlock. I do think the plugin interaction needs to be discussed further - outside of this bug. This bug just kind of exposed the issue. In fact most of the fix in the backend is all proactive. We don't need to do most of it as of today, but if you do it for one operation, you should do it for all of them.

Fix is pushed...

[mareynol@localhost ldap]$ git merge ticket211
Updating e2a1090..69c9f3b
Fast-forward
ldap/servers/plugins/dna/dna.c | 616 +++++++++++++++++++++-------
ldap/servers/slapd/back-ldbm/ldbm_add.c | 15 +
ldap/servers/slapd/back-ldbm/ldbm_delete.c | 15 +
ldap/servers/slapd/back-ldbm/ldbm_modify.c | 39 ++-
ldap/servers/slapd/back-ldbm/ldbm_modrdn.c | 88 ++++-
ldap/servers/slapd/operation.c | 2 +-
ldap/servers/slapd/slapi-private.h | 1 +
7 files changed, 614 insertions(+), 162 deletions(-)

[mareynol@localhost ldap]$ git push
Counting objects: 29, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (15/15), done.
Writing objects: 100% (15/15), 6.44 KiB, done.
Total 15 (delta 13), reused 0 (delta 0)
To ssh://git.fedorahosted.org/git/389/ds.git
e2a1090..69c9f3b master -> master

Pushed additional patch to master. Thanks to Noriko for her review!

Counting objects: 13, done.
Delta compression using up to 2 threads.
Compressing objects: 100% (7/7), done.
Writing objects: 100% (7/7), 720 bytes, done.
Total 7 (delta 5), reused 0 (delta 0)
To ssh://git.fedorahosted.org/git/389/ds.git
78a349b..f5930b8 master -> master

Author: Nathan Kinder nkinder@redhat.com
Date: Wed Mar 7 07:50:24 2012 -0800

Ticket 211 - Avoid preop range requests non-DNA operations

The previous patch for ticket 211 moved range requests to the preop
callback.  The checking for a range overflow is happening for any
operation, not just those that would require DNA to generate a
value.  This can cause issues with operations being rejected when
they shouldn't be due to failed range extension requests.  This
patch moves the range extension request to only occur if the incoming
operation would require DNA to generate a value.

In addition, we need to check if we've bypassed the threshold in
the preop callback.  This check has been added.

Lastly, we need to avoid doing a range extension request from within
a transaction.  Even though it is an unlikely case, there was still
the possibility of the betxnpreop triggering a range request.  I
changed the code to make a range request impossible while we are
within a transaction.

Reviewed and tested by nhosoi. Pushing the patch on behalf of Nathan.
$ git push
Counting objects: 13, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (7/7), done.
Writing objects: 100% (7/7), 1.88 KiB, done.
Total 7 (delta 5), reused 0 (delta 0)
To ssh://git.fedorahosted.org/git/389/ds.git
0f21a5b..e2fd5bf master -> master

originally targeted for 1.2.11.rc1, but actually in the 1.2.11.a1 release

Added initial screened field value.

Metadata Update from @nkinder:
- Issue assigned to mreynolds
- Issue set to the milestone: 1.2.11.a1

7 years ago

389-ds-base is moving from Pagure to Github. This means that new issues and pull requests
will be accepted only in 389-ds-base's github repository.

This issue has been cloned to Github and is available here:
- https://github.com/389ds/389-ds-base/issues/211

If you want to receive further updates on the issue, please navigate to the github issue
and click on subscribe button.

Thank you for understanding. We apologize for all inconvenience.

Metadata Update from @spichugi:
- Issue close_status updated to: wontfix (was: Fixed)

3 years ago

Login to comment on this ticket.

Metadata