The access control plugin did not do any cleanup when the server shutdown. This created a lot of noise in our valgrind reports.
I applied your patch against the master branch, then import in the setup phase fails due to this double free and I cannot setup the server... [02/Jan/2014:15:10:41 -0800] - All database threads now stopped ==22906== Invalid free() / delete / delete[] / realloc() ==22906== at 0x4A077E6: free (vg_replace_malloc.c:446) ==22906== by 0x4C61D2F: slapi_ch_free (ch_malloc.c:363) ==22906== by 0x4C61DBC: slapi_ch_free_string (ch_malloc.c:396) ==22906== by 0x112C47D0: dblayer_post_close (dblayer.c:2931) ==22906== by 0x112C48F0: dblayer_close (dblayer.c:2974) ==22906== by 0x112E0302: import_main_offline (import.c:1455) ==22906== by 0x112E0D76: ldbm_back_ldif2ldbm_deluxe (import.c:1719) ==22906== by 0x11320FBC: ldbm_back_ldif2ldbm (ldif2ldbm.c:772) ==22906== by 0x131828: slapd_exemode_ldif2db (main.c:2265) ==22906== by 0x12EDD6: main (main.c:972) ==22906== Address 0x133d2290 is 0 bytes inside a block of size 32 free'd ==22906== at 0x4A077E6: free (vg_replace_malloc.c:446) ==22906== by 0x4C61D2F: slapi_ch_free (ch_malloc.c:363) ==22906== by 0x4C60EC4: charray_free (charray.c:238) ==22906== by 0x112C47B4: dblayer_post_close (dblayer.c:2928) ==22906== by 0x112C48F0: dblayer_close (dblayer.c:2974) ==22906== by 0x112E0302: import_main_offline (import.c:1455) ==22906== by 0x112E0D76: ldbm_back_ldif2ldbm_deluxe (import.c:1719) ==22906== by 0x11320FBC: ldbm_back_ldif2ldbm (ldif2ldbm.c:772) ==22906== by 0x131828: slapd_exemode_ldif2db (main.c:2265) ==22906== by 0x12EDD6: main (main.c:972) ==22906==
Replying to [comment:2 nhosoi]:
So only offline imports trigger the crash. I have addressed this. The new patch is attached.
Thanks Mark
acl related output from valgrind val_acl.txt
Thank you, Mark. I could setup the server. Since I'm debugging another acl related ticket (#47571), your patch is very helpful. I ran some tests against the server with your patch, I still get the attached leaks. My test is modifying some ACIs and ran search using the ACIs. I guess the cases are not covered by your fixes, but if they could be also eliminated, it'd be nice. Please note that this output is from the pure master branch + your patch (not including my fixes).
Replying to [comment:4 nhosoi]:
Thanks Noriko I'll look into this right away! I would like to get this checked in today as I'll be on vacation starting Monday...
Replying to [comment:5 mreynolds]:
How nice! Your vacation starts next Monday! If that's the case, you could push the current patch. It is in the good shape. You clearly noted "when just starting and stopping the server" in the git comment. You have my ack.
Have a great time!
Replying to [comment:6 nhosoi]:
Replying to [comment:5 mreynolds]: Thanks Noriko I'll look into this right away! I would like to get this checked in today as I'll be on vacation starting Monday... How nice! Your vacation starts next Monday! If that's the case, you could push the current patch. It is in the good shape. You clearly noted "when just starting and stopping the server" in the git comment. You have my ack. Have a great time!
The day isn't over yet :-)
So, I can not reproduce any ACI leaks. I modified the anonymous aci, and then did some anonymous searches that engaged that ACI. No leaks. Can you provide your exact test case please?
test script tar ball test.tar
Replying to [comment:7 mreynolds]:
Sure. Attached is a tar ball containing my test cases. Need to modify a bit, though.
I used the port 10389 instead of 389. Also, the suffix is "o=my.com".
Steps: 1. add entries in mycom3.ldif 2. run "sh acltest.sh"
Please ignore the test result. It requires my patch to pass...
Replying to [comment:8 nhosoi]:
Replying to [comment:7 mreynolds]: So, I can not reproduce any ACI leaks. I modified the anonymous aci, and then did some anonymous searches that engaged that ACI. No leaks. Can you provide your exact test case please? Sure. Attached is a tar ball containing my test cases. Need to modify a bit, though. I used the port 10389 instead of 389. Also, the suffix is "o=my.com". Steps: 1. add entries in mycom3.ldif 2. run "sh acltest.sh" Please ignore the test result. It requires my patch to pass...
It couldn't find mymodify(), but I manually added all the acis before running the test. I'm not sure if that invalidated the testcase, but I didn't see any leaks with my new patch. I was expecting to see the ones about the factory extensions - as I will be addressing that in a different ticket(I hope).
Can you test this patch again? Thanks!
Replying to [comment:9 mreynolds]:
It couldn't find mymodify(), Oops, sorry! It's just one line script... :p {{{ ldapmodify -x -h localhost -p 10389 -D 'cn=directory manager' -w <mypw> $@ }}} but I manually added all the acis before running the test. I'm not sure if that invalidated the testcase, but I didn't see any leaks with my new patch. I was expecting to see the ones about the factory extensions - as I will be addressing that in a different ticket(I hope). Can you test this patch again? Thanks!
It couldn't find mymodify(), Oops, sorry! It's just one line script... :p {{{ ldapmodify -x -h localhost -p 10389 -D 'cn=directory manager' -w <mypw> $@ }}} but I manually added all the acis before running the test. I'm not sure if that invalidated the testcase, but I didn't see any leaks with my new patch. I was expecting to see the ones about the factory extensions - as I will be addressing that in a different ticket(I hope).
I'm attaching the result. It eliminated leaks from aclanom.c.
acl related output from valgrind 2 val_acl.txt.1
revision 0001-Ticket-47654-Cleanup-old-memory-leaks-reported-from-.patch
Replying to [comment:10 nhosoi]:
Replying to [comment:9 mreynolds]: It couldn't find mymodify(), Oops, sorry! It's just one line script... :p {{{ ldapmodify -x -h localhost -p 10389 -D 'cn=directory manager' -w <mypw> $@ }}} but I manually added all the acis before running the test. I'm not sure if that invalidated the testcase, but I didn't see any leaks with my new patch. I was expecting to see the ones about the factory extensions - as I will be addressing that in a different ticket(I hope). Can you test this patch again? Thanks! I'm attaching the result. It eliminated leaks from aclanom.c.
New patch attached. It does not address the factory extension leaks, or the lock array leaks.
The lock array leak is much more complicated, and needs more investigation. The issue is the lock array can not be freed during the plugin shutdown as acl extensions are still associated with some connections, so it must be freed in daemon.c after freeing the connection table. The other option is not using a lock array, and creating/deleting a rwlock for each connection - not so ideal.
The work I am doing for ticket 47451 should address the factory extension leaks. I am thinking I will also try and address the array leak in that ticket as well.
If you don't mind doing one more test with the latest patch that would be great!
Thank you, Mark!! The result looks pretty good. Needless to say I think, but val_acl.txt.2 is the latest output. :)
298 1620 17094 /tmp/val_acl.txt 164 886 9247 /tmp/val_acl.txt.1 116 628 6596 /tmp/val_acl.txt.2
Replying to [comment:12 nhosoi]:
Thank you, Mark!! The result looks pretty good. Needless to say I think, but val_acl.txt.2 is the latest output. :) wc /tmp/val_acl.txt* 298 1620 17094 /tmp/val_acl.txt 164 886 9247 /tmp/val_acl.txt.1 116 628 6596 /tmp/val_acl.txt.2
Thank you, can I see val_acl.txt.2?
acl related output from valgrind 3 val_acl.txt.2
git merge aci-leak Updating a9cd4e7..1a1b4f8 Fast-forward include/base/pool.h | 5 +- include/libaccess/aclproto.h | 4 ++ include/libaccess/las.h | 4 ++ ldap/servers/plugins/acl/acl.h | 7 +++ ldap/servers/plugins/acl/acl_ext.c | 44 +++++++++++++++++-- ldap/servers/plugins/acl/aclanom.c | 19 +++++---- ldap/servers/plugins/acl/aclgroup.c | 7 +++ ldap/servers/plugins/acl/aclinit.c | 2 +- ldap/servers/plugins/acl/acllist.c | 54 +++++++++++++++++++----- ldap/servers/plugins/acl/aclparse.c | 20 ++++---- ldap/servers/plugins/acl/aclplugin.c | 12 +++++- ldap/servers/plugins/retrocl/retrocl.c | 3 +- ldap/servers/slapd/back-ldbm/dblayer.c | 6 +++ ldap/servers/slapd/back-ldbm/proto-back-ldbm.h | 1 + ldap/servers/slapd/back-ldbm/vlv.c | 10 ++++ ldap/servers/slapd/daemon.c | 1 + ldap/servers/slapd/schema.c | 9 ++++ ldap/servers/slapd/slapi-private.h | 3 + lib/base/pool.cpp | 44 +++++++++---------- lib/libaccess/acl.yy.cpp | 5 ++ lib/libaccess/aclcache.cpp | 24 +++++++++- lib/libaccess/aclscan.h | 1 + lib/libaccess/acltools.cpp | 9 ++++ lib/libaccess/permhash.h | 5 +- lib/libaccess/register.cpp | 47 ++++++++++++++++++-- 25 files changed, 273 insertions(+), 73 deletions(-)
git push origin master a9cd4e7..1a1b4f8 master -> master
commit 1a1b4f8 Author: Mark Reynolds mreynolds@redhat.com Date: Fri Jan 3 17:09:49 2014 -0500
Leaving open for now, as we might want to push to to 1.3.2 and 1.3.1...
Unfortunately, this free is causing double-free with DBLAYER_NORMAL_MODE, too. {{{ ldap/servers/slapd/back-ldbm/dblayer.c 2931 if(dbmode == DBLAYER_NORMAL_MODE){ 2932 slapi_ch_free_string(&priv->dblayer_home_directory); 2933 } 2934 }}}
{{{ ==3256== Invalid free() / delete / delete[] / realloc() ==3256== at 0x4A077E6: free (vg_replace_malloc.c:446) ==3256== by 0x4C620CB: slapi_ch_free (ch_malloc.c:363) ==3256== by 0x4C62158: slapi_ch_free_string (ch_malloc.c:396) ==3256== by 0x112E87D6: dblayer_post_close (dblayer.c:2932) ==3256== by 0x112E8923: dblayer_close (dblayer.c:2978) ==3256== by 0x112E13B6: ldbm_back_close (close.c:62) ==3256== by 0x4CC34E3: plugin_call_func (plugin.c:1474) ==3256== by 0x4CC33D0: plugin_call_one (plugin.c:1442) ==3256== by 0x4CC3227: plugin_dependency_closeall (plugin.c:1373) ==3256== by 0x4CC331C: plugin_closeall (plugin.c:1420) ==3256== by 0x126C80: slapd_daemon (daemon.c:1336) ==3256== by 0x12F503: main (main.c:1273) ==3256== Address 0x13fac370 is 0 bytes inside a block of size 28 free'd ==3256== at 0x4A077E6: free (vg_replace_malloc.c:446) ==3256== by 0x4C620CB: slapi_ch_free (ch_malloc.c:363) ==3256== by 0x4C61260: charray_free (charray.c:238) ==3256== by 0x112E87B4: dblayer_post_close (dblayer.c:2928) ==3256== by 0x112E8923: dblayer_close (dblayer.c:2978) ==3256== by 0x112E13B6: ldbm_back_close (close.c:62) ==3256== by 0x4CC34E3: plugin_call_func (plugin.c:1474) ==3256== by 0x4CC33D0: plugin_call_one (plugin.c:1442) ==3256== by 0x4CC3227: plugin_dependency_closeall (plugin.c:1373) ==3256== by 0x4CC331C: plugin_closeall (plugin.c:1420) ==3256== by 0x126C80: slapd_daemon (daemon.c:1336) ==3256== by 0x12F503: main (main.c:1273) }}}
You may want to stash this debug output. ;) 2967 LDAPDebug1Arg(LDAP_DEBUG_ANY,"MARK dbmode %d\n",dbmode);
Replying to [comment:15 nhosoi]:
Unfortunately, this free is causing double-free with DBLAYER_NORMAL_MODE, too. {{{ ldap/servers/slapd/back-ldbm/dblayer.c 2931 if(dbmode == DBLAYER_NORMAL_MODE){ 2932 slapi_ch_free_string(&priv->dblayer_home_directory); 2933 } 2934 }}} {{{ ==3256== Invalid free() / delete / delete[] / realloc() ==3256== at 0x4A077E6: free (vg_replace_malloc.c:446) ==3256== by 0x4C620CB: slapi_ch_free (ch_malloc.c:363) ==3256== by 0x4C62158: slapi_ch_free_string (ch_malloc.c:396) ==3256== by 0x112E87D6: dblayer_post_close (dblayer.c:2932) ==3256== by 0x112E8923: dblayer_close (dblayer.c:2978) ==3256== by 0x112E13B6: ldbm_back_close (close.c:62) ==3256== by 0x4CC34E3: plugin_call_func (plugin.c:1474) ==3256== by 0x4CC33D0: plugin_call_one (plugin.c:1442) ==3256== by 0x4CC3227: plugin_dependency_closeall (plugin.c:1373) ==3256== by 0x4CC331C: plugin_closeall (plugin.c:1420) ==3256== by 0x126C80: slapd_daemon (daemon.c:1336) ==3256== by 0x12F503: main (main.c:1273) ==3256== Address 0x13fac370 is 0 bytes inside a block of size 28 free'd ==3256== at 0x4A077E6: free (vg_replace_malloc.c:446) ==3256== by 0x4C620CB: slapi_ch_free (ch_malloc.c:363) ==3256== by 0x4C61260: charray_free (charray.c:238) ==3256== by 0x112E87B4: dblayer_post_close (dblayer.c:2928) ==3256== by 0x112E8923: dblayer_close (dblayer.c:2978) ==3256== by 0x112E13B6: ldbm_back_close (close.c:62) ==3256== by 0x4CC34E3: plugin_call_func (plugin.c:1474) ==3256== by 0x4CC33D0: plugin_call_one (plugin.c:1442) ==3256== by 0x4CC3227: plugin_dependency_closeall (plugin.c:1373) ==3256== by 0x4CC331C: plugin_closeall (plugin.c:1420) ==3256== by 0x126C80: slapd_daemon (daemon.c:1336) ==3256== by 0x12F503: main (main.c:1273) }}}
How are you reproducing this? I do not see this double free in my testing.
attachment 0001-Ticket-47654-fix-double-free.patch
git merge ticket47654 Updating f83f7fe..26aad75 Fast-forward ldap/servers/slapd/back-ldbm/dblayer.c | 8 +++----- 1 files changed, 3 insertions(+), 5 deletions(-)
git push origin master f83f7fe..26aad75 master -> master
commit 26aad75 Author: Mark Reynolds mreynolds@redhat.com Date: Thu Jan 16 10:36:19 2014 -0500
vlv_close() only frees a vlv search lock which causes a deadlock during online bak2db. Reopening ticket...
ok, but is there some place where these could be freed safely, to avoid crashes and to avoid valgrind noise?
Fix regression 0001-Ticket-47654-Fix-regression-deadlock-crash.patch
Replying to [comment:26 rmeggins]:
Yes, just needed to make sure we are shutting the server down, and not just the backend(for backend tasks). New patch attached...
For the dblayer stuff - are these variables associated with the ldbm plugin? If so, should we delete them when shutting down the ldbm plugin? I'm just thinking about the work you did for loading/unloading plugins - these seem like variables we should free when unloading the ldbm plugin (and re-allocating them when loading the plugin).
As for the vlv variable - I'm not sure what the lifetime is supposed to be, if it is associated with a plugin or the server in general. If the latter, then deleting at shutdown is ok.
Replying to [comment:28 rmeggins]:
Dynamic plugins do not allow ldbm database plugins to be dynamically restarted. It would of been a an even "huger" change to try and do that. I decided that if you are making such a dramatic change to DS, that a restart should be required(and acceptable).
Each backend has its own vlv lock. vlv_init() is smart enough to know when the lock already exists, or not. So this should not be an issue.
git merge ticket47654 Updating 3a66ec7..16e5ce7 Fast-forward ldap/servers/slapd/back-ldbm/dblayer.c | 15 +++++++++++----
git push origin master 3a66ec7..16e5ce7 master -> master
commit 16e5ce7 Author: Mark Reynolds mreynolds@redhat.com Date: Wed Jul 9 17:39:38 2014 -0400
Metadata Update from @rmeggins: - Issue assigned to mreynolds - Issue set to the milestone: 1.3.3 - 1/14 (January)
389-ds-base is moving from Pagure to Github. This means that new issues and pull requests will be accepted only in 389-ds-base's github repository.
This issue has been cloned to Github and is available here: - https://github.com/389ds/389-ds-base/issues/991
If you want to receive further updates on the issue, please navigate to the github issue and click on subscribe button.
subscribe
Thank you for understanding. We apologize for all inconvenience.
Metadata Update from @spichugi: - Issue close_status updated to: wontfix (was: Fixed)
Login to comment on this ticket.