#2213 ipa-replica-manage re-initialize causes ALL Servers to rerun memberof fixup
Closed: Fixed None Opened 12 years ago by jraquino.

I have a multimaster infrastructure with 3 core FreeIPA servers and 10 supporting (procedurally read-only) FreeIPA servers.

I notice that occasionally 1 of the systems starts producing errors filling up /var/log/dirsrv/slapd-DOMAIN-COM/errors:
Replica has a different generation ID than the local data
(I suspect this is due to ntp problems that I am trying to work out)
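To gauge how often the error above shows up, a small helper along these lines could count matching lines in the errors log. This is only a sketch: the function name is made up for illustration, and it assumes the message text appears verbatim on a single log line.

```python
# Message text copied from the error quoted in this ticket.
GENERATION_ID_ERROR = "Replica has a different generation ID than the local data"


def count_generation_id_errors(log_lines):
    """Count log lines that contain the generation ID mismatch message.

    `log_lines` can be any iterable of strings, for example an open file.
    """
    return sum(1 for line in log_lines if GENERATION_ID_ERROR in line)
```

It could be called with an open handle on /var/log/dirsrv/slapd-DOMAIN-COM/errors, since a text file iterates line by line.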

http://www.centos.org/docs/5/html/CDS/ag/8.0/Managing_Replication-Troubleshooting_Replication_Related_Problems.html

^ This document suggests that I should re-initialize the problematic system from one of the core master servers.

Upon doing so, I am finding that the CPUs on all 13 servers spike to 100% of 1 core while they re-process memberof data. Even though these systems have many cores, the intense, single-threaded nature of this process causes a performance hit in all 13 data centers for all clients.

Am I reading the documentation wrong? Shouldn't a re-initialization of the problematic host only cause a replication: master -> slave + slave memberof fixup?

This seems like a fairly severe performance-affecting bug.

How to reproduce:

Set up a 3-participant FreeIPA replica build.
1 master -> 2 slaves

Perform an ipa-replica-manage re-initialize --from=master on one of the slaves.

Notice that the other slave performs a memberof fixup

NOTE: This is an exponential problem: the more hosts/users/groups/hostgroups/hbacrules/sudorules you have, the longer the fixup runs and the more noticeable the performance impact is.


Pretty easy to replicate the CPU effects as well.

Add 1000+ hosts
Add Hostgroup
Add HBAC Rule
Add all 1000+ hosts to a hostgroup
Add hostgroup to HBAC Rule

This should cause each host to receive a memberof attribute for the hostgroup, the managed nisnetgroup, and the hbacrule.
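The recipe above could be scripted roughly as follows. This is a sketch under assumptions, not a tested reproducer: the hostnames, hostgroup name, and rule name are made up, and the `ipa` subcommands and flags shown (`host-add --force`, `hostgroup-add --desc`, `hbacrule-add`, `hostgroup-add-member --hosts`, `hbacrule-add-host --hostgroups`) may need adjusting for the FreeIPA version in use. The function only builds the argument lists; it does not run anything.

```python
def build_repro_commands(host_count=1000, domain="example.com",
                         hostgroup="repro-hosts", hbac_rule="repro-rule"):
    """Return the ipa CLI invocations for the recipe as argument lists.

    All names are hypothetical placeholders chosen for illustration.
    """
    hosts = [f"host{i:04d}.{domain}" for i in range(1, host_count + 1)]
    commands = []
    # Add the hosts; --force is assumed to skip the DNS check for made-up names.
    commands += [["ipa", "host-add", host, "--force"] for host in hosts]
    # Add the hostgroup and the HBAC rule.
    commands.append(["ipa", "hostgroup-add", hostgroup,
                     "--desc=memberof fixup repro"])
    commands.append(["ipa", "hbacrule-add", hbac_rule])
    # Add every host to the hostgroup, then the hostgroup to the HBAC rule.
    commands += [["ipa", "hostgroup-add-member", hostgroup, f"--hosts={host}"]
                 for host in hosts]
    commands.append(["ipa", "hbacrule-add-host", hbac_rule,
                     f"--hostgroups={hostgroup}"])
    return commands
```

Each returned list could be passed to `subprocess.run(cmd, check=True)` on a machine with a valid Kerberos ticket for an admin user.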

When performing a memberof fixup, the above recipe should be enough to spike the CPU and have it sit there for quite a while.

This ticket is similar to: https://fedorahosted.org/freeipa/ticket/2199

The above ticket was created with the assumption that only the reinitialized system itself was performing the memberof fixup

Ticket 2213 is to address the fact that ALL replicas appear to perform the task.

After further inspection, it appears that the initial re-initialize causes the replica system to perform 1 memberof fixup. That fixup in turn replicates back to the master, which triggers all other slaves to update every object the first memberof fixup touched.

This is a different issue. The problem here is that replication agreements do not have nsDS5ReplicatedAttributeList set properly.
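
To illustrate what setting nsDS5ReplicatedAttributeList on an agreement might look like, the sketch below builds an LDIF modify record using the 389 Directory Server fractional replication syntax, `(objectclass=*) $ EXCLUDE <attrs>`. It assumes memberof is the attribute to exclude, which this ticket implies but does not spell out; the function name is hypothetical, and the agreement DN must be supplied by the caller because it varies per deployment.

```python
def fractional_replication_ldif(agreement_dn, excluded_attrs=("memberof",)):
    """Build an LDIF modify record that sets nsDS5ReplicatedAttributeList.

    `agreement_dn` is the DN of an existing replication agreement entry.
    `excluded_attrs` defaults to memberof, which is an assumption made
    for this illustration.
    """
    if not excluded_attrs:
        raise ValueError("at least one attribute must be excluded")
    value = "(objectclass=*) $ EXCLUDE " + " ".join(excluded_attrs)
    return "\n".join([
        f"dn: {agreement_dn}",
        "changetype: modify",
        "replace: nsDS5ReplicatedAttributeList",
        f"nsDS5ReplicatedAttributeList: {value}",
        "",  # trailing newline terminates the record
    ])
```

The resulting text could be fed to `ldapmodify` against each server that holds the agreement; whether that alone resolves the behavior described here has not been verified.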

Metadata Update from @jraquino:
- Issue assigned to simo
- Issue set to the milestone: FreeIPA 2.2 Core Effort - 2012/01

7 years ago
