#254 rhds81 modrdn operation and 100% cpu use in replication
Closed: wontfix. Opened 12 years ago by rmeggins.

https://bugzilla.redhat.com/show_bug.cgi?id=748575

Description of problem:

High CPU use investigation, likely related to successive modrdn operations.

The customer's environment is 2x MMR and 3 consumers, using SSL, with a report of 100%
CPU use on one master and then later on all replicas; the CPU use never comes back down.

SSL does not seem to be related; it seems to be possible to reproduce the problem
without SSL.

in the customer's words:
"
* RHDS develops a backlog in processing updates/writes, compared to the
  OpenLDAP servers. (Size of the backlog is logged by application
  developers)
* RHDS systems start showing ns-slapd processes at ~100% utilisation.
  * Situation can take up to several weeks to develop
  * Situation appears fairly abruptly
  * Read performance remains
  * Write/update backlog
  * These characteristics at first appear typical for a threading/deadlock
    situation,
  * BUT
    * Writes/updates continue to happen
    * Log files continue to be updated
"

One customer-provided core file showed
nothing unusual: a single active thread (thread number 23) and, in the access log,
some corresponding modrdn operations.

The only suspicious item seems to be the multiple modrdn operations that are
performed very often on a few DNs; see the "Additional info" section for a sample.

An access log excerpt covering a sample of 5447 seconds (roughly an hour and a half)
contains 3667 modrdn operations, i.e. about one modrdn every 1.49 seconds.
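
A quick way to extract the same count from an access log sample (a sketch only; the log
path depends on the instance name and is purely illustrative here):

grep -c " MODRDN " /var/log/dirsrv/slapd-INSTANCE/access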

It is possible to reproduce the high CPU use when continuously issuing on the order of
500 to 700 modrdn operations on the MMR replicas.

The explanation had been that when an entry's DN is modified, the following occurs:
1)  nscpEntry information is written to the correct csn
2) the main database (id2entry) is modified
3) the CN database is modified
4) indexes are updated
5) changelog written with nscpEntry state information
As the programmatic changes accumulate in state information, the amount of data
that is required for that specific entry grows (but we do not seem to notice
this last part).
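
One way to check how much state a given entry has accumulated is to request its
nscpEntryWSI operational attribute while bound as Directory Manager; a minimal sketch,
assuming the test entry and credentials from the reproduction steps below:

ldapsearch -xLLL -h 10.12.56.81 -p 389 -D "cn=directory manager" -w password -b "dc=example,dc=com" "(uid=testuser1)" nscpEntryWSI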

Proposed short-term actions, either:

1) do not use the DN to represent the status of the entry/system, which is
unnecessarily expensive for the system, and instead use a single attribute; this
may not be possible in the customer's applications,
or
2) remove multi-master and go single-master with multiple hubs/replicas, with a
very small value for nsds5ReplicaPurgeDelay, so that state information is purged
much more quickly, fewer state changes are kept, and there is no large
nscpEntryWSI specific to MMR conflict resolution (see the sketch below).
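
As an illustration of option 2, a minimal sketch of lowering the purge delay on the
replica configuration entry; the suffix and credentials match the reproduction steps
below, and the 3600-second value is only an example, not a recommendation from this
ticket:

ldapmodify -x -h 10.12.56.81 -p 389 -D "cn=directory manager" -w password <<EOF
dn: cn=replica,cn="dc=example,dc=com",cn=mapping tree,cn=config
changetype: modify
replace: nsds5ReplicaPurgeDelay
nsds5ReplicaPurgeDelay: 3600
EOF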


The nscpEntryWSI values from the customer and from testing do not show large entries
over time, only limited information, example:

Version-Release number of selected component (if applicable):

Red Hat Enterprise Linux Server release 5.5 (Tikanga)
redhat-ds-base-8.2.6-1.el5dsrv


How reproducible:


Steps to Reproduce:

1. set system environment
Red Hat Enterprise Linux Server release 5.4 (Tikanga)
Linux dell-p690-01.rhts.eng.bos.redhat.com 2.6.18-164.el5 #1 SMP Tue Aug 18
15:51:48 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux
4GB ram

2. set application environment
redhat-ds-base-8.2.6-1.el5dsrv
2x MMR and 4 consumers (2x MMR can be enough)

3. set test environment
add an LDIF file with the modrdn modify statements (a non-interactive way to create
the same file is sketched after its contents):

vi t.ldif
dn: uid=testuser1,ou=people,dc=example,dc=com
changetype: modrdn
newrdn: uid=testuser2
deleteoldrdn: 1

dn: uid=testuser2,ou=people,dc=example,dc=com
changetype: modrdn
newrdn: uid=testuser1
deleteoldrdn: 1
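
A non-interactive way to create the same file, assuming a bash shell (a sketch
equivalent to the vi session above):

cat > t.ldif <<EOF
dn: uid=testuser1,ou=people,dc=example,dc=com
changetype: modrdn
newrdn: uid=testuser2
deleteoldrdn: 1

dn: uid=testuser2,ou=people,dc=example,dc=com
changetype: modrdn
newrdn: uid=testuser1
deleteoldrdn: 1
EOF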

4. test modrdn and replication:

time ldapmodify -x -h 10.12.56.81 -p 389 -D "cn=directory manager" -w password -a -f t.ldif

ldapsearch -xLLL -D "cn=directory manager" -w password -b "dc=example,dc=com" uid=testuser2

ldapdelete -x -D "cn=directory manager" -w password uid=testuser2,ou=people,dc=example,dc=com

5. repeat in a loop until the cpu gets high:

collect the ns-slapd PIDs:
ps ax|grep ns-slapd

use the PIDs of the 2 masters and 2 consumers in this command:

for i in `seq 1 1000`; do echo $i; time ldapmodify -x -h 10.12.56.81 -p 389 -D "cn=directory manager" -w password -a -f t.ldif; top -b -n 1 -p 26320 -p 26737 -p 26461 -p 26530; sleep 1; echo "-----"; echo ""; done >> t.out.txt 2>&1
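
The same loop can also be made to stop on its own once the servers get busy. A rough
sketch, assuming bash and a procps-style ps, using the same illustrative PIDs as above;
note that ps reports %CPU averaged over the process lifetime, so this is only a coarse
trigger:

PIDS="26320,26737,26461,26530"
for i in `seq 1 1000`; do
  echo $i
  time ldapmodify -x -h 10.12.56.81 -p 389 -D "cn=directory manager" -w password -a -f t.ldif
  top -b -n 1 -p 26320 -p 26737 -p 26461 -p 26530
  # stop as soon as any monitored ns-slapd is reported above 90% CPU
  if ps -o %cpu= -p "$PIDS" | awk '$1 > 90 {found=1} END {exit !found}'; then
    echo "CPU threshold crossed at iteration $i"
    break
  fi
  sleep 1
  echo "-----"; echo ""
done >> t.out.txt 2>&1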


Actual results:

After 500 to 700 modrdn operations, depending on the test system, the high CPU usage
appeared, as well as replication timeout messages, like those seen by the customer.

Also noticed that replication timeout messages, similar to the ones reported by the
customer, were still appearing even 12 hours after the test loop completed in a case
with 10K modrdn operations, although in the test case the system load came back down
to nothing overnight. Mine were like this:

[21/Oct/2011:08:19:24 -0400] NSMMReplicationPlugin - agmt="cn=m1toc1"
(ca1:393): Replication bind with SIMPLE auth resumed
[21/Oct/2011:08:24:26 -0400] - repl5_inc_waitfor_async_results timed out
waiting for responses: 43 242
[21/Oct/2011:08:25:16 -0400] NSMMReplicationPlugin - agmt="cn=m1toc3"
(ca1:395): Unable to receive the response for a startReplication extended
operation to consumer (Timed out). Will retry later.
[21/Oct/2011:08:27:27 -0400] NSMMReplicationPlugin - agmt="cn=m1toc4"
(ca1:396): Unable to receive the response for a startReplication extended
operation to consumer (Timed out). Will retry later.
[21/Oct/2011:08:28:25 -0400] NSMMReplicationPlugin - agmt="cn=m1tom2"
(ca1:390): Warning: unable to receive endReplication extended operation
response (Timed out)
[21/Oct/2011:08:29:11 -0400] NSMMReplicationPlugin - agmt="cn=m1toc2"
(ca1:394): Warning: unable to receive endReplication extended operation
response (Timed out)
[21/Oct/2011:08:33:25 -0400] NSMMReplicationPlugin - agmt="cn=m1tom2"
(ca1:390): Replication bind with SIMPLE auth resumed
[21/Oct/2011:08:34:11 -0400] NSMMReplicationPlugin - agmt="cn=m1toc2"
(ca1:394): Replication bind with SIMPLE auth resumed
[21/Oct/2011:08:34:27 -0400] NSMMReplicationPlugin - agmt="cn=m1toc1"
(ca1:393): Warning: unable to receive endReplication extended operation
response (Timed out)
[21/Oct/2011:08:38:27 -0400] - repl5_inc_waitfor_async_results timed out
waiting for responses: 49 232
[21/Oct/2011:08:39:13 -0400] - repl5_inc_waitfor_async_results timed out
waiting for responses: 48 167
[21/Oct/2011:08:39:27 -0400] NSMMReplicationPlugin - agmt="cn=m1toc1"
(ca1:393): Replication bind with SIMPLE auth resumed



Expected results:


Additional info:

log sample:

The findings so far seem to point to some log entries we noted a couple of days
ago.

commit changeset:c077173/389-ds-base
Author: Rich Megginson rmeggins@redhat.com
Date: Fri Oct 28 12:51:08 2011 -0600
Reviewed by: nhosoi (Thanks!)
Branch: master
Fix Description: The entry code takes great pains to add or insert in
ascending order the dncsn to the e_dncsnset. But the only place this is
ever used is in the call to entry_get_dncsn, which just calls
csnset_get_last_csn, which just gets the last one in the list. Since the
last one in the list is the one with the greatest csn, and the function
csnset_update_csn already stores the greatest csn, we can just use that
function, rather than using csnset_add and insert and building a
potentially very large list. This should also have the effect of making
importing a replica init LDIF faster.
Platforms tested: RHEL6 x86_64
Flag Day: no
Doc impact: no

commit changeset:2346712/389-ds-base
Author: Rich Megginson rmeggins@redhat.com
Date: Wed Oct 26 17:13:28 2011 -0600
Reviewed by: nhosoi (Thanks!)
Branch: master
Fix Description: The modrdn operation causes the entry to be copied
multiple times by calling slapi_entry_dup. This in turn does a csnset_dup
of the e_dncsnset in the entry. This function was very inefficient. It
would simply call csnset_add_csn, which would iterate to the end of the
linked list for every addition. Once you get several thousand items in
the list, it has to iterate to the end of several thousand items each
time. I changed it to keep track of the last item in the list, and just
add the new item to the end of the list. This improves the performance
quite a bit, but the cpu still gets pegged at a high percentage
eventually, it just takes longer to reach that point.
Platforms tested: RHEL6 x86_64
Flag Day: no
Doc impact: no


Added initial screened field value.

Metadata Update from @nkinder:
- Issue assigned to rmeggins
- Issue set to the milestone: 1.2.10

7 years ago

389-ds-base is moving from Pagure to Github. This means that new issues and pull requests
will be accepted only in 389-ds-base's github repository.

This issue has been cloned to Github and is available here:
- https://github.com/389ds/389-ds-base/issues/254

If you want to receive further updates on the issue, please navigate to the github issue
and click on the subscribe button.

Thank you for your understanding. We apologize for any inconvenience.

Metadata Update from @spichugi:
- Issue close_status updated to: wontfix (was: Fixed)

3 years ago

