systemtap analysis shows a lot of contention for this lock - investigate a lock-free way to access this e.g. get the value at the beginning of the operation and store in thread local storage, use hazard pointers/ref counting, others
Since the be_suffix is accessed as part of the plugin API, it is not possible to use local thread storage/hazard pointers. Used read/write lock instead.
no lock version 0001-Ticket-509-lock-free-access-to-be-be_suffixlock.patch
The suffixlock is still present in the latest patch. Otherwise, looks good.
Replying to [comment:5 rmeggins]:
Rich, Ludwig pointed out that we still need serialization for adding suffixes - so the lock is still needed for "adds", but adds are extremely rare and I feel this lock is acceptable.
Ludwig's comment:
And I think be_addsuffix is not safe.
If two threads try to add a suffix, both can get the same current count, the both set the new suffix be->be_suffix[count]= slapi_sdn_dup(suffix);
Replying to [comment:6 mreynolds]:
Replying to [comment:5 rmeggins]: The suffixlock is still present in the latest patch. Otherwise, looks good. Rich, Ludwig pointed out that we still need serialization for adding suffixes - so the lock is still needed for "adds", but adds are extremely rare and I feel this lock is acceptable. Ludwig's comment: And I think be_addsuffix is not safe. If two threads try to add a suffix, both can get the same current count, the both set the new suffix be->be_suffix[count]= slapi_sdn_dup(suffix);
How can both get the same count? In the patch, suffixlock is acquired before count is assigned or incremented. The mutex prevents another thread from accessing suffixcounter.
and then both increment count. the increment is atomic, but the assignment could be done to the wrong index.
Replying to [comment:7 rmeggins]:
Replying to [comment:6 mreynolds]: Replying to [comment:5 rmeggins]: The suffixlock is still present in the latest patch. Otherwise, looks good. Rich, Ludwig pointed out that we still need serialization for adding suffixes - so the lock is still needed for "adds", but adds are extremely rare and I feel this lock is acceptable. Ludwig's comment: And I think be_addsuffix is not safe. If two threads try to add a suffix, both can get the same current count, the both set the new suffix be->be_suffix[count]= slapi_sdn_dup(suffix); How can both get the same count? In the patch, suffixlock is acquired before count is assigned or incremented. The mutex prevents another thread from accessing suffixcounter. I have since revised the patch, the patch Ludwig reviewed did not have any locking. So I added the locking back to the "add" function. and then both increment count. the increment is atomic, but the assignment could be done to the wrong index.
How can both get the same count? In the patch, suffixlock is acquired before count is assigned or incremented. The mutex prevents another thread from accessing suffixcounter. I have since revised the patch, the patch Ludwig reviewed did not have any locking. So I added the locking back to the "add" function.
git merge ticket509 Merge made by recursive. ldap/servers/slapd/backend.c | 34 ++++++++++++++++++---------------- ldap/servers/slapd/backend_manager.c | 30 +++++++++++++++--------------- ldap/servers/slapd/slap.h | 2 +- 3 files changed, 34 insertions(+), 32 deletions(-)
git push origin master Counting objects: 24, done. Delta compression using up to 4 threads. Compressing objects: 100% (13/13), done. Writing objects: 100% (13/13), 1.72 KiB, done. Total 13 (delta 9), reused 0 (delta 0) To ssh://git.fedorahosted.org/git/389/ds.git 4e9aab8..cd6a6dd master -> master
Why was this a merge commit? We should avoid merge commits if at all possible.
Replying to [comment:11 rmeggins]:
Sorry I forgot to rebase before I did the merge.
I still think it is not safe. If you don't have the lock in slapi_be_issuffix and traverse be_suffix an other thread in be_addsuffix could have reallocated be_suffix
The last comment also applies to slapi_lookup_instance_name_by_suffix()
There is this comment in backend.c: {{{ / * The caller may use the returned pointer without holding the * be_suffixlock since we never remove suffixes from the array. * The Slapi_DN pointer will always be valid even though the array * itself may be changing due to the addition of a suffix. / }}} I think the assumption is that slapi_ch_realloc() in be_addsuffix will alloc be->be_suffix "in place". If this is false, then we must not allow two threads to access be->be_suffix. The traditional way to do this is with a mutex which we already have. If we are trying to get rid of contention on this mutex, we will have to use some lock-free technique like hazard pointers or thread local storage. I think it should be possible to do that, even though the be suffix functions are part of the public API.
I don't know how else we can do this. From what I read, hazard pointers need local thread storage, and since plugins can create their own threads this doesn't seem to be an option.
Should I just revert the fix for now?
I read the comment that caller can use the pointer to slapi_dn without holding the lock since it will never be removed. A realloc will copy the contents of the old alloc to the new, but I think there is no guarantee that the memory can just be extended, so we need the lock when iterating be_suffix.
Replying to [comment:17 mreynolds]:
If we need the pointers set up at the time of thread creation, we can force plugin writers to use a slapi thread creation function wrapper, and initialize the thread local data for the hazard pointers in that function. If there are threads not created by this function, then all bets are off.
Or perhaps there is some other lock-free technique for doing this that we have not investigated.
Or just add back the locking - the slapi counter stuff is ok.
Replying to [comment:18 lkrispen]:
Then there is no guarantee that a caller of slapi_be_getsuffix will have it's Slapi_DN changed out from under it, and either the caller will have to hold be->be_suffixlock for the duration of the use of the Slapi_DN* returned by slapi_be_getsuffix, or we will have to find a lock-free way to do this.
Replying to [comment:19 rmeggins]:
Replying to [comment:17 mreynolds]: I don't know how else we can do this. From what I read, hazard pointers need local thread storage, and since plugins can create their own threads this doesn't seem to be an option. If we need the pointers set up at the time of thread creation, we can force plugin writers to use a slapi thread creation function wrapper, and initialize the thread local data for the hazard pointers in that function. If there are threads not created by this function, then all bets are off. Or perhaps there is some other lock-free technique for doing this that we have not investigated. Should I just revert the fix for now? Or just add back the locking - the slapi counter stuff is ok. We don't need the counter is we are using the locks again. I guess I can go back to my original fix of using read/write lockers.
Or just add back the locking - the slapi counter stuff is ok. We don't need the counter is we are using the locks again. I guess I can go back to my original fix of using read/write lockers.
Another option would be to use a linked list for be_suffix instead of reallocating the array after each add.
Replying to [comment:22 mreynolds]:
Good idea - avoid the realloc
Looks good. Please change roles_cache.c and ldbm_fetch_subtrees() to use slapi_be_getsuffix(be, 0) to get the first element of the suffix list rather than accessing the suffix list directly. We don't want to expose the implementation details outside of backend.c and backend_manager.c
Overall, the patch looks good to me. 3 minor comments...
1) variable i declared at line 180 is not initialized? {{{ 169 173 slapi_be_issuffix( const Slapi_Backend be, const Slapi_DN suffix ) [...] 179 if ( slapi_sdn_compare( be->be_suffix[i], suffix ) == 0) 180 { 181 r= 1; 180 int i, count; 181 182 count = slapi_counter_get_value(be->be_suffixcounter); 183 list = be->be_suffixlist; 184 while(list && i < count){ }}}
2) You may want to update the comment. ;) {{{ 223 / 224 * The caller may use the returned pointer without holding the 225 * be_suffixlock since we never remove suffixes from the array. <== 226 * The Slapi_DN pointer will always be valid even though the array 227 * itself may be changing due to the addition of a suffix. 228 / 229 const Slapi_DN 230 slapi_be_getsuffix(Slapi_Backend be,int n) }}}
3) variable sdn is not needed any more? {{{ 229 242 const Slapi_DN * 230 243 slapi_be_getsuffix(Slapi_Backend be,int n) 231 244 { 245 struct suffixlist list; 232 246 Slapi_DN *sdn = NULL; }}}
One more... I'm checking the compiler warnings now. Variable "count" at the line 208 is not used any more...
… … be_addsuffix(Slapi_Backend be,const Slapi_DN suffix) 197 204 { 198 205 if (be->be_state != BE_STATE_DELETED) 199 206 { 207 struct suffixlist new_suffix, list; 200 208 int count; <==
Amendment to use a linked list instead of a pointer array 0001-Ticket-509-lock-free-access-to-be-be_suffixlock.2.patch
git merge ticket509 Updating fc80262..1b6f39c Fast-forward ldap/servers/plugins/roles/roles_cache.c | 4 +- ldap/servers/slapd/back-ldbm/ldif2ldbm.c | 2 +- ldap/servers/slapd/backend.c | 81 ++++++++++++++++++----------- ldap/servers/slapd/backend_manager.c | 19 ++++--- ldap/servers/slapd/slap.h | 8 ++- 5 files changed, 71 insertions(+), 43 deletions(-)
[mareynol@localhost ds]$ git push origin master Counting objects: 25, done. Delta compression using up to 4 threads. Compressing objects: 100% (13/13), done. Writing objects: 100% (13/13), 1.96 KiB, done. Total 13 (delta 11), reused 0 (delta 0) To ssh://git.fedorahosted.org/git/389/ds.git fc80262..1b6f39c master -> master
Fix Description: Previous commit 1b6f39c had a malloc size problem. Instead of the size of pointer, the size of struct suffixlist has to be allocated.
{{{ Valgrind output: ==1702== Invalid write of size 8 ==1702== at 0x4C5A790: be_addsuffix (backend.c:211) ==1702== by 0x4CD6975: init_schema_dse_ext (schema.c:4163) ==1702== by 0x4CD6A76: init_schema_dse (schema.c:4195) ==1702== by 0x41E26B: setup_internal_backends (fedse.c:1770) ==1702== by 0x42029C: main (main.c:834) ==1702== Address 0x812c3c8 is 0 bytes after a block of size 8 alloc'd ==1702== at 0x4A0881C: malloc (vg_replace_malloc.c:270) ==1702== by 0x4C5D9C7: slapi_ch_malloc (ch_malloc.c:155) ==1702== by 0x4C5A774: be_addsuffix (backend.c:209) ==1702== by 0x4CD6975: init_schema_dse_ext (schema.c:4163) ==1702== by 0x4CD6A76: init_schema_dse (schema.c:4195) ==1702== by 0x41E26B: setup_internal_backends (fedse.c:1770) ==1702== by 0x42029C: main (main.c:834) }}}
git patch file (master, 389-ds-base-1.3.0) 0001-Ticket-509-lock-free-access-to-be-be_suffixlock.3.patch
Reviewed by Mark (Thank you!!)
Pushed to master: commit 5bd932e
Pushed to 389-ds-base-1.3.0: commit 69d2dd5
Metadata Update from @lkrispen: - Issue assigned to mreynolds - Issue set to the milestone: 1.3.0
389-ds-base is moving from Pagure to Github. This means that new issues and pull requests will be accepted only in 389-ds-base's github repository.
This issue has been cloned to Github and is available here: - https://github.com/389ds/389-ds-base/issues/509
If you want to receive further updates on the issue, please navigate to the github issue and click on subscribe button.
subscribe
Thank you for understanding. We apologize for all inconvenience.
Metadata Update from @spichugi: - Issue close_status updated to: wontfix (was: Fixed)
Login to comment on this ticket.