#47654 Cleanup old memory leaks reported from valgrind
Closed: wontfix None Opened 10 years ago by mreynolds.

The access control plugin did not do any cleanup when the server shutdown. This created a lot of noise in our valgrind reports.


I applied your patch against the master branch, then import in the setup phase fails due to this double free and I cannot setup the server...
[02/Jan/2014:15:10:41 -0800] - All database threads now stopped
==22906== Invalid free() / delete / delete[] / realloc()
==22906== at 0x4A077E6: free (vg_replace_malloc.c:446)
==22906== by 0x4C61D2F: slapi_ch_free (ch_malloc.c:363)
==22906== by 0x4C61DBC: slapi_ch_free_string (ch_malloc.c:396)
==22906== by 0x112C47D0: dblayer_post_close (dblayer.c:2931)
==22906== by 0x112C48F0: dblayer_close (dblayer.c:2974)
==22906== by 0x112E0302: import_main_offline (import.c:1455)
==22906== by 0x112E0D76: ldbm_back_ldif2ldbm_deluxe (import.c:1719)
==22906== by 0x11320FBC: ldbm_back_ldif2ldbm (ldif2ldbm.c:772)
==22906== by 0x131828: slapd_exemode_ldif2db (main.c:2265)
==22906== by 0x12EDD6: main (main.c:972)
==22906== Address 0x133d2290 is 0 bytes inside a block of size 32 free'd
==22906== at 0x4A077E6: free (vg_replace_malloc.c:446)
==22906== by 0x4C61D2F: slapi_ch_free (ch_malloc.c:363)
==22906== by 0x4C60EC4: charray_free (charray.c:238)
==22906== by 0x112C47B4: dblayer_post_close (dblayer.c:2928)
==22906== by 0x112C48F0: dblayer_close (dblayer.c:2974)
==22906== by 0x112E0302: import_main_offline (import.c:1455)
==22906== by 0x112E0D76: ldbm_back_ldif2ldbm_deluxe (import.c:1719)
==22906== by 0x11320FBC: ldbm_back_ldif2ldbm (ldif2ldbm.c:772)
==22906== by 0x131828: slapd_exemode_ldif2db (main.c:2265)
==22906== by 0x12EDD6: main (main.c:972)
==22906==

Replying to [comment:2 nhosoi]:

I applied your patch against the master branch, then import in the setup phase fails due to this double free and I cannot setup the server...
[02/Jan/2014:15:10:41 -0800] - All database threads now stopped
==22906== Invalid free() / delete / delete[] / realloc()
==22906== at 0x4A077E6: free (vg_replace_malloc.c:446)
==22906== by 0x4C61D2F: slapi_ch_free (ch_malloc.c:363)
==22906== by 0x4C61DBC: slapi_ch_free_string (ch_malloc.c:396)
==22906== by 0x112C47D0: dblayer_post_close (dblayer.c:2931)
==22906== by 0x112C48F0: dblayer_close (dblayer.c:2974)
==22906== by 0x112E0302: import_main_offline (import.c:1455)
==22906== by 0x112E0D76: ldbm_back_ldif2ldbm_deluxe (import.c:1719)
==22906== by 0x11320FBC: ldbm_back_ldif2ldbm (ldif2ldbm.c:772)
==22906== by 0x131828: slapd_exemode_ldif2db (main.c:2265)
==22906== by 0x12EDD6: main (main.c:972)
==22906== Address 0x133d2290 is 0 bytes inside a block of size 32 free'd
==22906== at 0x4A077E6: free (vg_replace_malloc.c:446)
==22906== by 0x4C61D2F: slapi_ch_free (ch_malloc.c:363)
==22906== by 0x4C60EC4: charray_free (charray.c:238)
==22906== by 0x112C47B4: dblayer_post_close (dblayer.c:2928)
==22906== by 0x112C48F0: dblayer_close (dblayer.c:2974)
==22906== by 0x112E0302: import_main_offline (import.c:1455)
==22906== by 0x112E0D76: ldbm_back_ldif2ldbm_deluxe (import.c:1719)
==22906== by 0x11320FBC: ldbm_back_ldif2ldbm (ldif2ldbm.c:772)
==22906== by 0x131828: slapd_exemode_ldif2db (main.c:2265)
==22906== by 0x12EDD6: main (main.c:972)
==22906==

So only offline imports trigger the crash. I have addressed this. The new patch is attached.

Thanks
Mark

acl related output from valgrind
val_acl.txt

Thank you, Mark. I could setup the server. Since I'm debugging another acl related ticket (#47571), your patch is very helpful. I ran some tests against the server with your patch, I still get the attached leaks. My test is modifying some ACIs and ran search using the ACIs. I guess the cases are not covered by your fixes, but if they could be also eliminated, it'd be nice. Please note that this output is from the pure master branch + your patch (not including my fixes).

Replying to [comment:4 nhosoi]:

Thank you, Mark. I could setup the server. Since I'm debugging another acl related ticket (#47571), your patch is very helpful. I ran some tests against the server with your patch, I still get the attached leaks. My test is modifying some ACIs and ran search using the ACIs. I guess the cases are not covered by your fixes, but if they could be also eliminated, it'd be nice. Please note that this output is from the pure master branch + your patch (not including my fixes).

Thanks Noriko I'll look into this right away! I would like to get this checked in today as I'll be on vacation starting Monday...

Replying to [comment:5 mreynolds]:

Thanks Noriko I'll look into this right away! I would like to get this checked in today as I'll be on vacation starting Monday...

How nice! Your vacation starts next Monday! If that's the case, you could push the current patch. It is in the good shape. You clearly noted "when just starting and stopping the server" in the git comment. You have my ack.

Have a great time!

Replying to [comment:6 nhosoi]:

Replying to [comment:5 mreynolds]:

Thanks Noriko I'll look into this right away! I would like to get this checked in today as I'll be on vacation starting Monday...

How nice! Your vacation starts next Monday! If that's the case, you could push the current patch. It is in the good shape. You clearly noted "when just starting and stopping the server" in the git comment. You have my ack.

Have a great time!

The day isn't over yet :-)

So, I can not reproduce any ACI leaks. I modified the anonymous aci, and then did some anonymous searches that engaged that ACI. No leaks. Can you provide your exact test case please?

Replying to [comment:7 mreynolds]:

So, I can not reproduce any ACI leaks. I modified the anonymous aci, and then did some anonymous searches that engaged that ACI. No leaks. Can you provide your exact test case please?

Sure. Attached is a tar ball containing my test cases. Need to modify a bit, though.

I used the port 10389 instead of 389. Also, the suffix is "o=my.com".

Steps:
1. add entries in mycom3.ldif
2. run "sh acltest.sh"

Please ignore the test result. It requires my patch to pass...

Replying to [comment:8 nhosoi]:

Replying to [comment:7 mreynolds]:

So, I can not reproduce any ACI leaks. I modified the anonymous aci, and then did some anonymous searches that engaged that ACI. No leaks. Can you provide your exact test case please?

Sure. Attached is a tar ball containing my test cases. Need to modify a bit, though.

I used the port 10389 instead of 389. Also, the suffix is "o=my.com".

Steps:
1. add entries in mycom3.ldif
2. run "sh acltest.sh"

Please ignore the test result. It requires my patch to pass...

It couldn't find mymodify(), but I manually added all the acis before running the test. I'm not sure if that invalidated the testcase, but I didn't see any leaks with my new patch. I was expecting to see the ones about the factory extensions - as I will be addressing that in a different ticket(I hope).

Can you test this patch again? Thanks!

Replying to [comment:9 mreynolds]:

It couldn't find mymodify(),
Oops, sorry! It's just one line script... :p
{{{
ldapmodify -x -h localhost -p 10389 -D 'cn=directory manager' -w <mypw> $@
}}}
but I manually added all the acis before running the test. I'm not sure if that invalidated the testcase, but I didn't see any leaks with my new patch. I was expecting to see the ones about the factory extensions - as I will be addressing that in a different ticket(I hope).

Can you test this patch again? Thanks!

I'm attaching the result. It eliminated leaks from aclanom.c.

acl related output from valgrind 2
val_acl.txt.1

Replying to [comment:10 nhosoi]:

Replying to [comment:9 mreynolds]:

It couldn't find mymodify(),
Oops, sorry! It's just one line script... :p
{{{
ldapmodify -x -h localhost -p 10389 -D 'cn=directory manager' -w <mypw> $@
}}}
but I manually added all the acis before running the test. I'm not sure if that invalidated the testcase, but I didn't see any leaks with my new patch. I was expecting to see the ones about the factory extensions - as I will be addressing that in a different ticket(I hope).

Can you test this patch again? Thanks!

I'm attaching the result. It eliminated leaks from aclanom.c.

New patch attached. It does not address the factory extension leaks, or the lock array leaks.

The lock array leak is much more complicated, and needs more investigation. The issue is the lock array can not be freed during the plugin shutdown as acl extensions are still associated with some connections, so it must be freed in daemon.c after freeing the connection table. The other option is not using a lock array, and creating/deleting a rwlock for each connection - not so ideal.

The work I am doing for ticket 47451 should address the factory extension leaks. I am thinking I will also try and address the array leak in that ticket as well.

If you don't mind doing one more test with the latest patch that would be great!

Thank you, Mark!! The result looks pretty good. Needless to say I think, but val_acl.txt.2 is the latest output. :)

wc /tmp/val_acl.txt*

298 1620 17094 /tmp/val_acl.txt
164 886 9247 /tmp/val_acl.txt.1
116 628 6596 /tmp/val_acl.txt.2

Replying to [comment:12 nhosoi]:

Thank you, Mark!! The result looks pretty good. Needless to say I think, but val_acl.txt.2 is the latest output. :)

wc /tmp/val_acl.txt*

298 1620 17094 /tmp/val_acl.txt
164 886 9247 /tmp/val_acl.txt.1
116 628 6596 /tmp/val_acl.txt.2

Thank you, can I see val_acl.txt.2?

acl related output from valgrind 3
val_acl.txt.2

git merge aci-leak
Updating a9cd4e7..1a1b4f8
Fast-forward
include/base/pool.h | 5 +-
include/libaccess/aclproto.h | 4 ++
include/libaccess/las.h | 4 ++
ldap/servers/plugins/acl/acl.h | 7 +++
ldap/servers/plugins/acl/acl_ext.c | 44 +++++++++++++++++--
ldap/servers/plugins/acl/aclanom.c | 19 +++++----
ldap/servers/plugins/acl/aclgroup.c | 7 +++
ldap/servers/plugins/acl/aclinit.c | 2 +-
ldap/servers/plugins/acl/acllist.c | 54 +++++++++++++++++++-----
ldap/servers/plugins/acl/aclparse.c | 20 ++++----
ldap/servers/plugins/acl/aclplugin.c | 12 +++++-
ldap/servers/plugins/retrocl/retrocl.c | 3 +-
ldap/servers/slapd/back-ldbm/dblayer.c | 6 +++
ldap/servers/slapd/back-ldbm/proto-back-ldbm.h | 1 +
ldap/servers/slapd/back-ldbm/vlv.c | 10 ++++
ldap/servers/slapd/daemon.c | 1 +
ldap/servers/slapd/schema.c | 9 ++++
ldap/servers/slapd/slapi-private.h | 3 +
lib/base/pool.cpp | 44 +++++++++----------
lib/libaccess/acl.yy.cpp | 5 ++
lib/libaccess/aclcache.cpp | 24 +++++++++-
lib/libaccess/aclscan.h | 1 +
lib/libaccess/acltools.cpp | 9 ++++
lib/libaccess/permhash.h | 5 +-
lib/libaccess/register.cpp | 47 ++++++++++++++++++--
25 files changed, 273 insertions(+), 73 deletions(-)

git push origin master
a9cd4e7..1a1b4f8 master -> master

commit 1a1b4f8
Author: Mark Reynolds mreynolds@redhat.com
Date: Fri Jan 3 17:09:49 2014 -0500

Leaving open for now, as we might want to push to to 1.3.2 and 1.3.1...

Unfortunately, this free is causing double-free with DBLAYER_NORMAL_MODE, too.
{{{
ldap/servers/slapd/back-ldbm/dblayer.c
2931 if(dbmode == DBLAYER_NORMAL_MODE){
2932 slapi_ch_free_string(&priv->dblayer_home_directory);
2933 }
2934
}}}

{{{
==3256== Invalid free() / delete / delete[] / realloc()
==3256== at 0x4A077E6: free (vg_replace_malloc.c:446)
==3256== by 0x4C620CB: slapi_ch_free (ch_malloc.c:363)
==3256== by 0x4C62158: slapi_ch_free_string (ch_malloc.c:396)
==3256== by 0x112E87D6: dblayer_post_close (dblayer.c:2932)
==3256== by 0x112E8923: dblayer_close (dblayer.c:2978)
==3256== by 0x112E13B6: ldbm_back_close (close.c:62)
==3256== by 0x4CC34E3: plugin_call_func (plugin.c:1474)
==3256== by 0x4CC33D0: plugin_call_one (plugin.c:1442)
==3256== by 0x4CC3227: plugin_dependency_closeall (plugin.c:1373)
==3256== by 0x4CC331C: plugin_closeall (plugin.c:1420)
==3256== by 0x126C80: slapd_daemon (daemon.c:1336)
==3256== by 0x12F503: main (main.c:1273)
==3256== Address 0x13fac370 is 0 bytes inside a block of size 28 free'd
==3256== at 0x4A077E6: free (vg_replace_malloc.c:446)
==3256== by 0x4C620CB: slapi_ch_free (ch_malloc.c:363)
==3256== by 0x4C61260: charray_free (charray.c:238)
==3256== by 0x112E87B4: dblayer_post_close (dblayer.c:2928)
==3256== by 0x112E8923: dblayer_close (dblayer.c:2978)
==3256== by 0x112E13B6: ldbm_back_close (close.c:62)
==3256== by 0x4CC34E3: plugin_call_func (plugin.c:1474)
==3256== by 0x4CC33D0: plugin_call_one (plugin.c:1442)
==3256== by 0x4CC3227: plugin_dependency_closeall (plugin.c:1373)
==3256== by 0x4CC331C: plugin_closeall (plugin.c:1420)
==3256== by 0x126C80: slapd_daemon (daemon.c:1336)
==3256== by 0x12F503: main (main.c:1273)
}}}

You may want to stash this debug output. ;)
2967 LDAPDebug1Arg(LDAP_DEBUG_ANY,"MARK dbmode %d\n",dbmode);

Replying to [comment:15 nhosoi]:

Unfortunately, this free is causing double-free with DBLAYER_NORMAL_MODE, too.
{{{
ldap/servers/slapd/back-ldbm/dblayer.c
2931 if(dbmode == DBLAYER_NORMAL_MODE){
2932 slapi_ch_free_string(&priv->dblayer_home_directory);
2933 }
2934
}}}

{{{
==3256== Invalid free() / delete / delete[] / realloc()
==3256== at 0x4A077E6: free (vg_replace_malloc.c:446)
==3256== by 0x4C620CB: slapi_ch_free (ch_malloc.c:363)
==3256== by 0x4C62158: slapi_ch_free_string (ch_malloc.c:396)
==3256== by 0x112E87D6: dblayer_post_close (dblayer.c:2932)
==3256== by 0x112E8923: dblayer_close (dblayer.c:2978)
==3256== by 0x112E13B6: ldbm_back_close (close.c:62)
==3256== by 0x4CC34E3: plugin_call_func (plugin.c:1474)
==3256== by 0x4CC33D0: plugin_call_one (plugin.c:1442)
==3256== by 0x4CC3227: plugin_dependency_closeall (plugin.c:1373)
==3256== by 0x4CC331C: plugin_closeall (plugin.c:1420)
==3256== by 0x126C80: slapd_daemon (daemon.c:1336)
==3256== by 0x12F503: main (main.c:1273)
==3256== Address 0x13fac370 is 0 bytes inside a block of size 28 free'd
==3256== at 0x4A077E6: free (vg_replace_malloc.c:446)
==3256== by 0x4C620CB: slapi_ch_free (ch_malloc.c:363)
==3256== by 0x4C61260: charray_free (charray.c:238)
==3256== by 0x112E87B4: dblayer_post_close (dblayer.c:2928)
==3256== by 0x112E8923: dblayer_close (dblayer.c:2978)
==3256== by 0x112E13B6: ldbm_back_close (close.c:62)
==3256== by 0x4CC34E3: plugin_call_func (plugin.c:1474)
==3256== by 0x4CC33D0: plugin_call_one (plugin.c:1442)
==3256== by 0x4CC3227: plugin_dependency_closeall (plugin.c:1373)
==3256== by 0x4CC331C: plugin_closeall (plugin.c:1420)
==3256== by 0x126C80: slapd_daemon (daemon.c:1336)
==3256== by 0x12F503: main (main.c:1273)
}}}

How are you reproducing this? I do not see this double free in my testing.

git merge ticket47654
Updating f83f7fe..26aad75
Fast-forward
ldap/servers/slapd/back-ldbm/dblayer.c | 8 +++-----
1 files changed, 3 insertions(+), 5 deletions(-)

git push origin master
f83f7fe..26aad75 master -> master

commit 26aad75
Author: Mark Reynolds mreynolds@redhat.com
Date: Thu Jan 16 10:36:19 2014 -0500

vlv_close() only frees a vlv search lock which causes a deadlock during online bak2db. Reopening ticket...

ok, but is there some place where these could be freed safely, to avoid crashes and to avoid valgrind noise?

Replying to [comment:26 rmeggins]:

ok, but is there some place where these could be freed safely, to avoid crashes and to avoid valgrind noise?

Yes, just needed to make sure we are shutting the server down, and not just the backend(for backend tasks). New patch attached...

For the dblayer stuff - are these variables associated with the ldbm plugin? If so, should we delete them when shutting down the ldbm plugin? I'm just thinking about the work you did for loading/unloading plugins - these seem like variables we should free when unloading the ldbm plugin (and re-allocating them when loading the plugin).

As for the vlv variable - I'm not sure what the lifetime is supposed to be, if it is associated with a plugin or the server in general. If the latter, then deleting at shutdown is ok.

Replying to [comment:28 rmeggins]:

For the dblayer stuff - are these variables associated with the ldbm plugin? If so, should we delete them when shutting down the ldbm plugin? I'm just thinking about the work you did for loading/unloading plugins - these seem like variables we should free when unloading the ldbm plugin (and re-allocating them when loading the plugin).

Dynamic plugins do not allow ldbm database plugins to be dynamically restarted. It would of been a an even "huger" change to try and do that. I decided that if you are making such a dramatic change to DS, that a restart should be required(and acceptable).

As for the vlv variable - I'm not sure what the lifetime is supposed to be, if it is associated with a plugin or the server in general. If the latter, then deleting at shutdown is ok.

Each backend has its own vlv lock. vlv_init() is smart enough to know when the lock already exists, or not. So this should not be an issue.

git merge ticket47654
Updating 3a66ec7..16e5ce7
Fast-forward
ldap/servers/slapd/back-ldbm/dblayer.c | 15 +++++++++++----

git push origin master
3a66ec7..16e5ce7 master -> master

commit 16e5ce7
Author: Mark Reynolds mreynolds@redhat.com
Date: Wed Jul 9 17:39:38 2014 -0400

Metadata Update from @rmeggins:
- Issue assigned to mreynolds
- Issue set to the milestone: 1.3.3 - 1/14 (January)

7 years ago

389-ds-base is moving from Pagure to Github. This means that new issues and pull requests
will be accepted only in 389-ds-base's github repository.

This issue has been cloned to Github and is available here:
- https://github.com/389ds/389-ds-base/issues/991

If you want to receive further updates on the issue, please navigate to the github issue
and click on subscribe button.

Thank you for understanding. We apologize for all inconvenience.

Metadata Update from @spichugi:
- Issue close_status updated to: wontfix (was: Fixed)

3 years ago

Login to comment on this ticket.

Metadata