In IPA upgrade this deadlock was encountered:
Thread 4 (Thread 0x7f27f97e2700 (LWP 11679)): #0 __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135 #1 0x00007f282fa1a1af in _L_lock_1026 () from /lib64/libpthread.so.0 #2 0x00007f282fa1a151 in __GI___pthread_mutex_lock (mutex=mutex@entry=0x7f2833f46d18) at ../nptl/pthread_mutex_lock.c:134 #3 0x00007f28300722c9 in PR_Lock (lock=lock@entry=0x7f2833f46d18) at ../../../nspr/pr/src/pthreads/ptsynch.c:174 #4 0x00007f2830072883 in PR_EnterMonitor (mon=0x7f2833f46d10) at ../../../nspr/pr/src/pthreads/ptsynch.c:529 #5 0x00007f282612bff6 in dblayer_txn_begin () from /usr/lib64/dirsrv/plugins/libback-ldbm.so #6 0x00007f282614bba3 in ldbm_back_add () from /usr/lib64/dirsrv/plugins/libback-ldbm.so #7 0x00007f2831c4ec1a in op_shared_add () from /usr/lib64/dirsrv/libslapd.so.0 #8 0x00007f2831c4f4a3 in add_internal_pb () from /usr/lib64/dirsrv/libslapd.so.0 #9 0x00007f28249daf0d in retrocl_postob () from /usr/lib64/dirsrv/plugins/libretrocl-plugin.so #10 0x00007f2831ca7cb0 in plugin_call_func () from /usr/lib64/dirsrv/libslapd.so.0 #11 0x00007f2831ca7f08 in plugin_call_plugins () from /usr/lib64/dirsrv/libslapd.so.0 #12 0x00007f2826166b91 in ldbm_back_modify () from /usr/lib64/dirsrv/plugins/libback-ldbm.so #13 0x00007f2831c94fa1 in op_shared_modify () from /usr/lib64/dirsrv/libslapd.so.0 #14 0x00007f2831c95a54 in modify_internal_pb () from /usr/lib64/dirsrv/libslapd.so.0 #15 0x00007f28258446a1 in memberof_fix_memberof_callback () from /usr/lib64/dirsrv/plugins/libmemberof-plugin.so #16 0x00007f2831cbc3cd in send_ldap_search_entry_ext () from /usr/lib64/dirsrv/libslapd.so.0 #17 0x00007f2831cbcc0c in send_ldap_search_entry () from /usr/lib64/dirsrv/libslapd.so.0 #18 0x00007f2831c9a593 in iterate.isra.0.constprop.3 () from /usr/lib64/dirsrv/libslapd.so.0 #19 0x00007f2831c9a73a in send_results_ext.constprop.2 () from /usr/lib64/dirsrv/libslapd.so.0 #20 0x00007f2831c9be21 in op_shared_search () from /usr/lib64/dirsrv/libslapd.so.0 #21 0x00007f2831cabc8e in search_internal_callback_pb () from /usr/lib64/dirsrv/libslapd.so.0 #22 0x00007f2825845ef7 in memberof_fixup_task_thread () from /usr/lib64/dirsrv/plugins/libmemberof-plugin.so #23 0x00007f2830077c10 in _pt_root (arg=0x7f28140485e0) at ../../../nspr/pr/src/pthreads/ptthread.c:204 #24 0x00007f282fa17f33 in start_thread (arg=0x7f27f97e2700) at pthread_create.c:309 #25 0x00007f282f745ead in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111 Thread 36 (Thread 0x7f283204f700 (LWP 11643)): #0 __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135 #1 0x00007f282fa1a1af in _L_lock_1026 () from /lib64/libpthread.so.0 #2 0x00007f282fa1a151 in __GI___pthread_mutex_lock (mutex=mutex@entry=0x7f2833f9fd38) at ../nptl/pthread_mutex_lock.c:134 #3 0x00007f28300722c9 in PR_Lock (lock=lock@entry=0x7f2833f9fd38) at ../../../nspr/pr/src/pthreads/ptsynch.c:174 #4 0x00007f2830072883 in PR_EnterMonitor (mon=0x7f2833f9fd30) at ../../../nspr/pr/src/pthreads/ptsynch.c:529 #5 0x00007f2825846eca in memberof_postop_del () from /usr/lib64/dirsrv/plugins/libmemberof-plugin.so #6 0x00007f2831ca7cb0 in plugin_call_func () from /usr/lib64/dirsrv/libslapd.so.0 #7 0x00007f2831ca7f08 in plugin_call_plugins () from /usr/lib64/dirsrv/libslapd.so.0 #8 0x00007f28261594de in ldbm_back_delete () from /usr/lib64/dirsrv/plugins/libback-ldbm.so #9 0x00007f2831c5beb0 in op_shared_delete () from /usr/lib64/dirsrv/libslapd.so.0 #10 0x00007f2831c5c062 in delete_internal_pb () from /usr/lib64/dirsrv/libslapd.so.0 #11 0x00007f28249db6de in delete_changerecord () from /usr/lib64/dirsrv/plugins/libretrocl-plugin.so #12 0x00007f28249db9b1 in changelog_trim_thread_fn () from /usr/lib64/dirsrv/plugins/libretrocl-plugin.so #13 0x00007f2830077c10 in _pt_root (arg=0x7f28100008e0) at ../../../nspr/pr/src/pthreads/ptthread.c:204 #14 0x00007f282fa17f33 in start_thread (arg=0x7f283204f700) at pthread_create.c:309 #15 0x00007f282f745ead in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
There is a scope for memberof defined:
memberofentryscope: dc=ipa,dc=dev,dc=lan
which should exclude cn=changelog, but the check is only done after acquiring the memberof lock.
I think for postop_add and postop_del this check coould be done when memberof_oktodo() is called
Ludwig, setting the next 1.3.2 to the milestone. Please feel free to adjust it to your plan. Thanks!
Is this still an issue in the IPA upgrade (from which version to which)?
If so, could you put the 389-ds-base version and the stacktraces with the matched debuginfo (there are some functions which options and lines are missing...)?
Setting to FUTURE.
Comment by Ludwig: global lock and lmdb redesign should handle this
Version: 389-ds-base-1.3.3.1-15.el7_1.x86_64
Replying to [comment:10 mreynolds]: Thanks to Mark for picking up this ticket. Setting Milestone to 1.3.4.3...
attachment 0001-Ticket-47931-memberOf-retrocl-deadlocks.patch
Hi Mark!
I'm reviewing your fix. First, I'm trying to understand the "fix description"... The deadlock among plug-ins is always hard to solve, isn't it? ;) {{{ Bug Description: When concurrently updating multiple backends the memberOf and retrocl plugins can deadlock on each other. This is caused by the required retrocl lock, and the db lock on the changenumber index in the retrocl db.
Fix Description: Added scoping to the retrocl that allows subtrees/suffixes to be included or excluded. Also moved the existing memberOf scoping outside of its global lock. }}} So, the deadlock was caused by the memberof scoping inside the global lock and retrocl lock (retrocl_internal_lock or retrocl_cn_lock?)? And basically just moving the memberof scoping out solves the deadlock?
Thank you for your explanation, Mark! Now I understood the design of your fix.
2 minor issues... 1. Is 'i' supposed to get incremented? {{{ 645 / Check if the the entry include scope is a child of the sdn / 646 static Slapi_DN 647 memberof_scope_is_child_of_dn(MemberOfConfig config, Slapi_DN sdn) 648 { 649 int i = 0; 650 651 while(config->entryScopes && config->entryScopes[i]){ 652 if(slapi_sdn_issuffix(config->entryScopes[i], sdn)){ 653 return config->entryScopes[i]; 654 } 655 } 656 return NULL; 657 } }}} 2. Is retro changelog applicable to the dynamic plug-in? If yes, retrocl_includes|excludes could leak? {{{ 48 static Slapi_DN retrocl_includes = NULL; 49 static Slapi_DN retrocl_excludes = NULL; 391 retrocl_excludes = (Slapi_DN )slapi_ch_calloc(sizeof(Slapi_DN *),num_vals+1); 411 retrocl_includes = (Slapi_DN )slapi_ch_calloc(sizeof(Slapi_DN ),num_vals+1); }}}
Replying to [comment:16 nhosoi]:
Thank you for your explanation, Mark! Now I understood the design of your fix. 2 minor issues... 1. Is 'i' supposed to get incremented? {{{ 645 / Check if the the entry include scope is a child of the sdn / 646 static Slapi_DN 647 memberof_scope_is_child_of_dn(MemberOfConfig config, Slapi_DN *sdn) 648 { 649 int i = 0; 650 651 while(config->entryScopes && config->entryScopes[i]){ 652 if(slapi_sdn_issuffix(config->entryScopes[i], sdn)){ 653 return config->entryScopes[i]; 654 } 655 } 656 return NULL; 657 } }}}
2 minor issues... 1. Is 'i' supposed to get incremented? {{{ 645 / Check if the the entry include scope is a child of the sdn / 646 static Slapi_DN 647 memberof_scope_is_child_of_dn(MemberOfConfig config, Slapi_DN *sdn) 648 { 649 int i = 0; 650 651 while(config->entryScopes && config->entryScopes[i]){ 652 if(slapi_sdn_issuffix(config->entryScopes[i], sdn)){ 653 return config->entryScopes[i]; 654 } 655 } 656 return NULL; 657 } }}}
Good catch
Is retro changelog applicable to the dynamic plug-in? If yes, retrocl_includes|excludes could leak? {{{ 48 static Slapi_DN retrocl_includes = NULL; 49 static Slapi_DN retrocl_excludes = NULL; 391 retrocl_excludes = (Slapi_DN )slapi_ch_calloc(sizeof(Slapi_DN *),num_vals+1); 411 retrocl_includes = (Slapi_DN )slapi_ch_calloc(sizeof(Slapi_DN *),num_vals+1); }}}
This is freed in the plugin stop function, and that's how dynamic plugins work. Any change to plugin's config triggers the plugin to be restarted. So this is not leaked.
revision 0001-Ticket-47931-memberOf-retrocl-deadlocks.2.patch
New patch attached...
Hi Mark, thanks for waiting. Please push it!
6b138a2..fd959ac master -> master commit fd959ac Author: Mark Reynolds mreynolds@redhat.com Date: Tue Aug 4 12:19:31 2015 -0400
e62b481..d810847 389-ds-base-1.3.4 -> 389-ds-base-1.3.4 commit d810847
Coverity issues found, reopening...
Fix coverity issues 0002-Ticket-47931-Fix-coverity-issues.patch
fd959ac..5daea97 master -> master commit 5daea97 Author: Mark Reynolds mreynolds@redhat.com Date: Mon Aug 10 10:42:40 2015 -0400
d810847..9a0047e 389-ds-base-1.3.4 -> 389-ds-base-1.3.4 commit 9a0047e
08eabd5..f86dab3 389-ds-base-1.3.3 -> 389-ds-base-1.3.3 commit f86dab3
f86dab3..e6343e6 389-ds-base-1.3.3 -> 389-ds-base-1.3.3 commit e6343e6
Debug logging was accidentally left in the source - it has now been removed:
29c09a5..db7153f master -> master commit db7153f Author: Mark Reynolds mreynolds@redhat.com Date: Wed Aug 19 10:03:50 2015 -0400
2c5e0d5..1781280 389-ds-base-1.3.4 -> 389-ds-base-1.3.4 commit 1781280
895dc4f..5c48f28 389-ds-base-1.3.3 -> 389-ds-base-1.3.3 commit 5c48f28
lib389 test script 0001-Ticket-47931-lib389-test-memberof-and-retrocl-deadlo.patch
Replying to [comment:27 mreynolds]: Probably, a silly question... :p Could you tell me where test_ticket47931 is called?
Replying to [comment:28 nhosoi]:
Simon modified create-ticket.py to make new ticket tests more compliant with the pytest way. So now in the main function we call
{{{ pytest.main("-s %s" % CURRENT_FILE) }}}
Which executes all the functions that start with "test".
I see. Thanks!
Thanks Noriko!
4277f2b..48470e7 master -> master commit 48470e7 Author: Mark Reynolds mreynolds@redhat.com Date: Fri Aug 28 15:01:58 2015 -0400
8c9c2fc..22e4de7 389-ds-base-1.3.4 -> 389-ds-base-1.3.4 commit 22e4de7
f10e2e6..e7d3b2f 389-ds-base-1.3.3 -> 389-ds-base-1.3.3 commit e7d3b2f
Metadata Update from @mreynolds: - Issue assigned to mreynolds - Issue set to the milestone: 1.3.3.13
389-ds-base is moving from Pagure to Github. This means that new issues and pull requests will be accepted only in 389-ds-base's github repository.
This issue has been cloned to Github and is available here: - https://github.com/389ds/389-ds-base/issues/1262
If you want to receive further updates on the issue, please navigate to the github issue and click on subscribe button.
subscribe
Thank you for understanding. We apologize for all inconvenience.
Metadata Update from @spichugi: - Issue close_status updated to: wontfix (was: Fixed)
Login to comment on this ticket.