#5510 After an upgrade, DNA plugin tries/can request new ranges from the wrong replica
Opened 8 years ago by tbordaz. Modified 7 years ago

The upgrade install script creates an invalid shared config entry for the master:
dn: dnaHostname=vm-053.REALM+dnaPortNum=0,cn=posix-ids,cn=dna,cn=ipa,cn=etc,SUFFIX

This entry is a kind of duplicate of the valid shared config entry:
dn: dnaHostname=vm-053.REALM+dnaPortNum=389,cn=posix-ids,cn=dna,cn=ipa,cn=etc,SUFFIX

The reason is that the upgrade script temporarily disables the listeners (__disable_listeners) by setting the normal server port to 'nsslapd-port: 0' (a side effect of https://fedorahosted.org/freeipa/ticket/4925).

The consequence is that this invalid shared config entry will always be the first candidate when another server requests a range.
This is because it advertises the highest number of remaining values (dnaRemainingValues) and that number is never updated.
So we can see this kind of log on a server requesting a range:

[03/Dec/2015:12:08:32 +0100] dna-plugin - dna_get_remote_config_info: Using LDAP protocol, but the non-secure port is not defined.
[03/Dec/2015:12:08:32 +0100] dna-plugin - dna_request_range: Unable to retrieve replica bind credentials.

If the connection protocol ('dnaRemoteConnProtocol') is LDAP, the requester will not be able to contact the server through this entry, so the impact of the bug is minor.

If the connection protocol is SSL, the requester will request a range from this server even if it does not have the highest number of remaining values.
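
To make the two cases concrete, here is a minimal sketch of the selection behaviour described above. It is a simplified model written in Python for illustration, not the plugin's actual C code: the names `SharedConfig`, `reachable_port`, `pick_range_donor` and `ticket_entries` are hypothetical, and the assumption is that the requester walks the shared config entries by decreasing dnaRemainingValues and uses the first one it can connect to. The values are the ones from the SSL scenario further down in this ticket.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class SharedConfig:
    hostname: str      # dnaHostname
    port: int          # dnaPortNum (0 means "not defined")
    secure_port: int   # dnaSecurePortNum
    remaining: int     # dnaRemainingValues
    protocol: str      # dnaRemoteConnProtocol: "LDAP" or "SSL"


def reachable_port(entry: SharedConfig) -> Optional[int]:
    """Return the port a requester would connect to, or None if unusable."""
    if entry.protocol == "SSL":
        return entry.secure_port or None
    return entry.port or None


def pick_range_donor(entries: List[SharedConfig],
                     requester: str) -> Optional[Tuple[str, int]]:
    """Assumed model: try servers by decreasing remaining values."""
    candidates = sorted(
        (e for e in entries if e.hostname != requester),
        key=lambda e: e.remaining,
        reverse=True,
    )
    for entry in candidates:
        port = reachable_port(entry)
        if port is not None:
            return entry.hostname, port
    return None


def ticket_entries(dummy_protocol: str) -> List[SharedConfig]:
    """Shared config entries as listed in this ticket (SSL scenario)."""
    return [
        SharedConfig("vm-053.REALM", 0, 636, 200000, dummy_protocol),  # dummy
        SharedConfig("vm-053.REALM", 389, 636, 25000, "LDAP"),
        SharedConfig("vm-058-102.REALM", 389, 636, 25499, "LDAP"),
        SharedConfig("vm-058-043.REALM", 389, 636, 2, "LDAP"),
    ]


if __name__ == "__main__":
    for proto in ("LDAP", "SSL"):
        print(proto, pick_range_donor(ticket_entries(proto), "vm-058-043.REALM"))
```

Under this model, with the dummy entry set to LDAP the requester skips it (port 0) and falls back to vm-058-102, which is the correct donor. With the dummy entry set to SSL it connects to vm-053 on port 636, although the real vm-053 entry has fewer remaining values than vm-058-102.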


Steps to reproduce

    << install + upgrade topology >>
Master
dnf install freeipa-server-4.2.2-1.fc23
ipa-server-install
ipa-replica-prepare
dnf install freeipa-server-4.2.3-1.1.fc23

Client #1
dnf install freeipa-server-4.2.2-1.fc23
ipa-replica-install
dnf install freeipa-server-4.2.3-1.1.fc23

Client #2
dnf install freeipa-server-4.2.2-1.fc23
ipa-replica-install
dnf install freeipa-server-4.2.3-1.1.fc23

    << check the invalid entry >>
(With https://fedorahosted.org/freeipa/ticket/4026 to allow
 DNA req to all servers)

dn: cn=dna,cn=ipa,cn=etc,SUFFIX
objectClass: nsContainer
objectClass: top
cn: dna

dn: cn=posix-ids,cn=dna,cn=ipa,cn=etc,SUFFIX
objectClass: nsContainer
objectClass: top
cn: posix-ids

    <<< dummy entry >>>
dn: dnaHostname=vm-053.REALM+dnaPortNum=0,cn=posix-ids,cn=dna,cn=ipa,cn=etc,SUFFIX
objectClass: dnaSharedConfig
objectClass: top
dnaHostname: vm-053.REALM
dnaPortNum: 0
dnaSecurePortNum: 636
dnaRemainingValues: 200000
dnaRemoteConnProtocol: LDAP
dnaRemoteBindMethod: SASL/GSSAPI

dn: dnaHostname=vm-053.REALM+dnaPortNum=389,cn=posix-ids,cn=dna,cn=ipa,cn=etc,SUFFIX
objectClass: dnaSharedConfig
objectClass: top
dnaHostname: vm-053.REALM
dnaPortNum: 389
dnaSecurePortNum: 636
dnaRemainingValues: 200000
dnaRemoteConnProtocol: LDAP
dnaRemoteBindMethod: SASL/GSSAPI

dn: dnaHostname=vm-058-102.REALM+dnaPortNum=389,cn=posix-ids,cn=dna,cn=ipa,cn=etc,SUFFIX
objectClass: dnaSharedConfig
objectClass: top
dnaHostname: vm-058-102.REALM
dnaPortNum: 389
dnaSecurePortNum: 636
dnaRemainingValues: 0
dnaRemoteConnProtocol: LDAP
dnaRemoteBindMethod: SASL/GSSAPI

============================================================
    <<< get ranges from the two replicas >>>
vm-058-102
ipa user-add --first=t --last=b tb102-1

ldapsearch -LLL -D "cn=directory manager" -w Secret123 -b "cn=dna,cn=ipa,cn=etc,SUFFIX" "(objectClass=dnaSharedConfig)" dnaRemainingValues
dn: dnaHostname=vm-053.REALM+dnaPortNum=0,cn=posix-ids,cn=dna,cn=ipa,cn=etc,SUFFIX
dnaRemainingValues: 200000

dn: dnaHostname=vm-053.REALM+dnaPortNum=389,cn=posix-ids,cn=dna,cn=ipa,cn=etc,SUFFIX
dnaRemainingValues: 100000  <----- granted range

dn: dnaHostname=vm-058-102.REALM+dnaPortNum=389,cn=posix-ids,cn=dna,cn=ipa,cn=etc,SUFFIX
dnaRemainingValues: 99999   <----- requested range

dn: dnaHostname=vm-058-043.REALM+dnaPortNum=389,cn=posix-ids,cn=dna,cn=ipa,cn=etc,SUFFIX
dnaRemainingValues: 0

vm-058-043
ipa user-add --first=t --last=b tb43-1

dn: dnaHostname=vm-053.REALM+dnaPortNum=0,cn=posix-ids,cn=dna,cn=ipa,cn=etc,SUFFIX
dnaRemainingValues: 200000

dn: dnaHostname=vm-053.REALM+dnaPortNum=389,cn=posix-ids,cn=dna,cn=ipa,cn=etc,SUFFIX
dnaRemainingValues: 50000   <----- granted range

dn: dnaHostname=vm-058-102.REALM+dnaPortNum=389,cn=posix-ids,cn=dna,cn=ipa,cn=etc,SUFFIX
dnaRemainingValues: 99999

dn: dnaHostname=vm-058-043.REALM+dnaPortNum=389,cn=posix-ids,cn=dna,cn=ipa,cn=etc,SUFFIX
dnaRemainingValues: 49999   <----- requested range

=============================================================
        << simulate exhausted range on vm-058-043>>
vm-058-043
ldapsearch -LLL -h localhost -p 389 -D "cn=directory manager" -w Secret123 -b "cn=Posix IDs,cn=Distributed Numeric Assignment Plugin,cn=plugins,cn=config" dnaMaxValue dnaNextValue
dn: cn=Posix IDs,cn=Distributed Numeric Assignment Plugin,cn=plugins,cn=config
dnaMaxValue: 862499999
dnaNextValue: 862450001

ldapmodify -h localhost -p  389 -D "cn=directory manager" -w Secret123
dn: cn=Posix IDs,cn=Distributed Numeric Assignment Plugin,cn=plugins,cn=config
changetype: modify
replace: dnaNextValue
dnaNextValue: 862499990

ldapsearch -LLL -D "cn=directory manager" -w Secret123 -b "cn=dna,cn=ipa,cn=etc,SUFFIX" "(objectClass=dnaSharedConfig)" dnaRemainingValues
dn: dnaHostname=vm-053.REALM+dnaPortNum=0,cn=posix-ids,cn=dna,cn=ipa,cn=etc,SUFFIX
dnaRemainingValues: 200000

dn: dnaHostname=vm-053.REALM+dnaPortNum=389,cn=posix-ids,cn=dna,cn=ipa,cn=etc,SUFFIX
dnaRemainingValues: 50000

dn: dnaHostname=vm-058-102.REALM+dnaPortNum=389,cn=posix-ids,cn=dna,cn=ipa,cn=etc,SUFFIX
dnaRemainingValues: 99999

dn: dnaHostname=vm-058-043.REALM+dnaPortNum=389,cn=posix-ids,cn=dna,cn=ipa,cn=etc,SUFFIX
dnaRemainingValues: 10

=============================================================
        << Trigger vm-058-043 to be a requester of a new range.
           It actually finds 'dnaHostname=vm-053.REALM+dnaPortNum=0' to be the highest
           but fails to contact it >>
vm-058-043
ipa user-add --first=t --last=b tb43-[2-10]

[03/Dec/2015:12:08:32 +0100] dna-plugin - dna_get_remote_config_info: Using LDAP protocol, but the non-secure port is not defined.
[03/Dec/2015:12:08:32 +0100] dna-plugin - dna_request_range: Unable to retrieve replica bind credentials.


        << so it got it from vm-058-102 >>
dn: dnaHostname=vm-053.REALM+dnaPortNum=0,cn=posix-ids,cn=dna,cn=ipa,cn=etc,SUFFIX
dnaRemainingValues: 200000

dn: dnaHostname=vm-053.REALM+dnaPortNum=389,cn=posix-ids,cn=dna,cn=ipa,cn=etc,SUFFIX
dnaRemainingValues: 50000

dn: dnaHostname=vm-058-102.REALM+dnaPortNum=389,cn=posix-ids,cn=dna,cn=ipa,cn=etc,SUFFIX
dnaRemainingValues: 50499   <---------- 99999 ->50499

dn: dnaHostname=vm-058-043.REALM+dnaPortNum=389,cn=posix-ids,cn=dna,cn=ipa,cn=etc,SUFFIX
dnaRemainingValues: 49509

=============================================================

        << If we change the protocol of the dummy entry to SSL,
           then the wrong server is reached >>

dn: dnaHostname=vm-053.REALM+dnaPortNum=0,cn=posix-id
 s,cn=dna,cn=ipa,cn=etc,SUFFIX
dnaRemoteConnProtocol: SSL  <----   protocol set to SSL, so the requester connects using dnaSecurePortNum
dnaRemoteBindMethod: SASL/GSSAPI
objectClass: dnaSharedConfig
objectClass: top
dnaHostname: vm-053.REALM
dnaPortNum: 0
dnaSecurePortNum: 636
dnaRemainingValues: 200000

    << exhaust vm-058-043 ranges >>
dn: dnaHostname=vm-053.REALM+dnaPortNum=0,cn=posix-ids,cn=dna,cn=ipa,cn=etc,SUFFIX
dnaRemainingValues: 200000

dn: dnaHostname=vm-053.REALM+dnaPortNum=389,cn=posix-ids,cn=dna,cn=ipa,cn=etc,SUFFIX
dnaRemainingValues: 25000

dn: dnaHostname=vm-058-102.REALM+dnaPortNum=389,cn=posix-ids,cn=dna,cn=ipa,cn=etc,SUFFIX
dnaRemainingValues: 25499  <--- Should request vm-058-102

dn: dnaHostname=vm-058-043.REALM+dnaPortNum=389,cn=posix-ids,cn=dna,cn=ipa,cn=etc,SUFFIX
dnaRemainingValues: 2  <------ exhausted

ipa user-add --first=thierry --last=bordaz tb43-27

dn: dnaHostname=vm-053.REALM+dnaPortNum=0,cn=posix-ids,cn=dna,cn=ipa,cn=etc,SUFFIX
dnaRemainingValues: 200000

dn: dnaHostname=vm-053.REALM+dnaPortNum=389,cn=posix-ids,cn=dna,cn=ipa,cn=etc,SUFFIX
dnaRemainingValues: 12500   <--------- requested vm-053 because the dummy entry  had 200000 remaining values

dn: dnaHostname=vm-058-102.REALM+dnaPortNum=389,cn=posix-ids,cn=dna,cn=ipa,cn=etc,SUFFIX
dnaRemainingValues: 25499

dn: dnaHostname=vm-058-043.REALM+dnaPortNum=389,cn=posix-ids,cn=dna,cn=ipa,cn=etc,SUFFIX
dnaRemainingValues: 12501   <---------- received from vm-053 instead of vm-058-102


Note: The remaining values of the dummy entry are unchanged, so it will always remain
the first requested host

dn: dnaHostname=vm-053.REALM+dnaPortNum=0,cn=posix-ids,cn=dna,cn=ipa,cn=etc,SUFFIX
dnaRemainingValues: 200000

I think this ticket should be reassigned to 389-ds.

Actually, the upgrade procedure disables normal/secure ports to only allow ldapi access.
This is done by setting 'nsslapd-port: 0' and 'nsslapd-security: off' (upgradedsinstance.py:__disable_listeners).

The problem is that the 389-ds DNA plugin recreates the shared config entry although those ports have been disabled. The DNA plugin should check 'nsslapd-security: off' (in addition to 'nsslapd-port: 0') to decide whether to skip this shared config entry, instead of doing a lookup of 'nsslapd-secureport'.
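
The proposed check could look roughly like the sketch below. It is Python for illustration only, and the function name `should_publish_shared_config` is hypothetical. It assumes the plugin can read the current values of 'nsslapd-port' and 'nsslapd-security' at the time it would create the shared config entry.

```python
def should_publish_shared_config(nsslapd_port: int, nsslapd_security: str) -> bool:
    """Decide whether a shared config entry should be (re)created.

    Assumption from this ticket: when the normal port is 0 AND security is
    off, the server is only reachable over ldapi (upgrade mode), so it must
    not advertise itself as a range donor.
    """
    port_enabled = nsslapd_port != 0
    security_enabled = nsslapd_security.strip().lower() == "on"
    return port_enabled or security_enabled


if __name__ == "__main__":
    # State set by __disable_listeners during the upgrade: skip the entry.
    print(should_publish_shared_config(0, "off"))
    # Normal running server: publish the entry.
    print(should_publish_shared_config(389, "on"))
```

The point of the design is that the decision uses the enabled/disabled state of the listeners rather than the mere presence of a secure port number, which stays at 636 in the configuration even while security is switched off.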

There remains the problem of how to get rid of the dummy entries that have already been created. Should clearing the side effect of this 389-ds bug be part of the IPA upgrade script?
I think so.

As long as the dummy entry exists, the DNA request may hit the wrong host.

Should a 389-ds plugin reflect IPA-specific upgrade behavior? It doesn't sound right, though it could be the easiest way.

Hi Petr,

I think there are two issues:

  • Creation of the dummy entries. At first I thought it should be handled in the IPA upgrade script, but thinking further I believe it is a bug in 389-ds. At that time both the normal and secure ports are disabled, so the DNA plugin should not create the shared config entry. I am fine with changing the component to 389-ds.

  • Since 4.2, the upgrade procedure has created such dummy entries. The problem is that those dummy entries could direct the DNA range request to the wrong server (as the dummy entry will keep an unchanged number of remaining values). We need to clear those entries if they exist. Should this be documented only, or implemented in an IPA script?
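
Whichever option is chosen, the dummy entries first have to be identified. As a rough sketch, assuming they can be recognised by 'dnaPortNum=0' in the RDN as in the listings above, the following Python helper extracts their DNs from ldapsearch LDIF output. The function names are hypothetical and the DN parsing is deliberately naive: it does not handle escaped commas or base64-encoded DNs.

```python
from typing import List


def unfold(ldif_text: str) -> List[str]:
    """Join LDIF continuation lines (lines starting with a single space)."""
    lines: List[str] = []
    for raw in ldif_text.splitlines():
        if raw.startswith(" ") and lines:
            lines[-1] += raw[1:]
        else:
            lines.append(raw)
    return lines


def find_dummy_dns(ldif_text: str) -> List[str]:
    """Return DNs of shared config entries whose RDN carries dnaPortNum=0."""
    dummy: List[str] = []
    for line in unfold(ldif_text):
        if not line.lower().startswith("dn: "):
            continue
        dn = line[4:].strip()
        first_rdn = dn.split(",", 1)[0]
        # Multi-valued RDN: dnaHostname=...+dnaPortNum=...
        parts = dict(
            (name.strip().lower(), value.strip())
            for name, value in (p.split("=", 1) for p in first_rdn.split("+") if "=" in p)
        )
        if "dnahostname" in parts and parts.get("dnaportnum") == "0":
            dummy.append(dn)
    return dummy


SAMPLE = """\
dn: dnaHostname=vm-053.REALM+dnaPortNum=0,cn=posix-id
 s,cn=dna,cn=ipa,cn=etc,SUFFIX
dnaRemainingValues: 200000

dn: dnaHostname=vm-053.REALM+dnaPortNum=389,cn=posix-ids,cn=dna,cn=ipa,cn=etc,SUFFIX
dnaRemainingValues: 12500
"""

if __name__ == "__main__":
    for dn in find_dummy_dns(SAMPLE):
        print(dn)
```

The resulting DNs could then be listed in documentation for manual removal or fed to a delete step in an upgrade script; which of the two is appropriate is the open question above.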

Metadata Update from @tbordaz:
- Issue assigned to tbordaz
- Issue set to the milestone: FreeIPA 4.5 backlog

7 years ago
