I have a multimaster infrastructure with 3 core FreeIPA servers and 10 supporting (procedurally read-only) FreeIPA servers.
Occasionally one of the systems starts filling /var/log/dirsrv/slapd-DOMAIN-COM/errors with the error: Replica has a different generation ID than the local data. (I suspect this is due to NTP problems that I am trying to work out.)
http://www.centos.org/docs/5/html/CDS/ag/8.0/Managing_Replication-Troubleshooting_Replication_Related_Problems.html
^ This document suggests that I should re-initialize the problematic system from one of the core master servers.
After doing so, I find that the CPUs on all 13 servers spike to 100% of one core while they re-process memberof data. Even though these systems have many cores, the intense, single-threaded nature of this process causes a performance hit for all clients in all 13 data centers.
Am I reading the documentation wrong? Shouldn't a re-initialization of the problematic host only cause a replication: master -> slave + slave memberof fixup?
This seems like a fairly severe performance-affecting bug.
How to reproduce:
Set up a 3-participant FreeIPA replica build: 1 master -> 2 slaves.
Perform an ipa-replica-manage re-initialize --from=master on one of the slaves.
Notice that the other slave performs a memberof fixup
NOTE: This is an exponential problem. The more hosts, users, groups, hostgroups, HBAC rules, and sudo rules you have, the longer the fixup takes and the more noticeable the performance impact.
Pretty easy to replicate the CPU effects as well.
Add 1000+ hosts
Add a hostgroup
Add an HBAC rule
Add all 1000+ hosts to the hostgroup
Add the hostgroup to the HBAC rule
This should cause each host to receive a memberof attribute for the hostgroup, the managed nisnetgroup, and the hbacrule.
When performing a memberof fixup, the above recipe should be enough to spike the CPU and keep it there for quite a while.
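The recipe above could be scripted roughly as follows. This is only a dry-run sketch: it prints the ipa commands rather than running them, and the host names, the domain, the hostgroup name, and the HBAC rule name are all placeholders I made up for illustration. It assumes you already hold a valid admin Kerberos ticket when you actually run the generated commands.

```shell
#!/bin/sh
# Dry-run sketch of the reproduction recipe: prints ipa commands, runs nothing.
# All names below are hypothetical placeholders, not values from this ticket.
HOST_COUNT="${HOST_COUNT:-1000}"
DOMAIN="${DOMAIN:-example.com}"
HOSTGROUP="${HOSTGROUP:-memberof-test}"
HBACRULE="${HBACRULE:-memberof-test-rule}"

emit_recipe() {
    count="$1"
    # Create the hostgroup and the HBAC rule first.
    echo "ipa hostgroup-add $HOSTGROUP --desc='memberof fixup test'"
    echo "ipa hbacrule-add $HBACRULE"
    i=1
    while [ "$i" -le "$count" ]; do
        host="testhost$i.$DOMAIN"
        # --force lets the host be added without a matching DNS record.
        echo "ipa host-add $host --force"
        echo "ipa hostgroup-add-member $HOSTGROUP --hosts=$host"
        i=$((i + 1))
    done
    # Attaching the hostgroup to the rule gives every host the extra memberof value.
    echo "ipa hbacrule-add-host $HBACRULE --hostgroups=$HOSTGROUP"
}

emit_recipe "$HOST_COUNT"
```

Review the output first, then pipe it to sh on a test master if it looks right. Adding hosts to the hostgroup one at a time is slow but keeps the sketch simple.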
This ticket is similar to: https://fedorahosted.org/freeipa/ticket/2199
The above ticket was created with the assumption that only the reinitialized system itself was performing the memberof fixup
Ticket 2213 is to address the fact that ALL replicas appear to perform the task.
After further inspection, it appears that the initial re-initialize causes the replica to perform one memberof fixup. That fixup in turn replicates back to the master, which triggers all other slaves to update every object the first memberof fixup touched.
Closed as a duplicate of https://fedorahosted.org/freeipa/ticket/2199
This is a different issue. The problem here is that replication agreements do not have nsDS5ReplicatedAttributeList set properly.
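To see whether a given agreement is affected, one could check the attribute value directly. The sketch below is a hypothetical helper, not part of FreeIPA. It assumes that "set properly" means a fractional replication list that excludes memberof, using the 389 Directory Server `(objectclass=*) $ EXCLUDE ...` syntax. The ticket does not spell out the exact value, so treat that as my assumption.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) if an nsDS5ReplicatedAttributeList
# value excludes memberof. Assumes the 389 DS fractional replication syntax
# "(objectclass=*) $ EXCLUDE attr1 attr2 ...".
excludes_memberof() {
    # Lowercase the value and pad it with spaces so whole words can be matched.
    value=" $(printf '%s' "$1" | tr '[:upper:]' '[:lower:]') "
    case "$value" in
        *'$ exclude'*' memberof '*) return 0 ;;
        *) return 1 ;;
    esac
}

# The value could be read from each agreement with something like:
#   ldapsearch -x -D "cn=Directory Manager" -W -b "cn=mapping tree,cn=config" \
#     "(objectClass=nsDS5ReplicationAgreement)" nsDS5ReplicatedAttributeList
sample='(objectclass=*) $ EXCLUDE memberof'
if excludes_memberof "$sample"; then
    echo "memberof is excluded from replication"
else
    echo "memberof is still replicated"
fi
```

An agreement with no nsDS5ReplicatedAttributeList at all yields an empty value, which the helper reports as still replicating memberof.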
Ticket has been cloned to Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=772150
Fixed in: 0d3cd4c
Fixed in ipa-2-2 branch: d20a11a
Metadata Update from @jraquino: - Issue assigned to simo - Issue set to the milestone: FreeIPA 2.2 Core Effort - 2012/01