#2434 KRA fails to install with IPA in some cases
Closed: migrated 3 years ago by dmoluguw. Opened 7 years ago by mbasti.

Please see FreeIPA ticket: https://fedorahosted.org/freeipa/ticket/6096

This behavior is happening in our test automation. If you need additional info please contact me.


NOTE: This ticket may be resolved by the 10.3.5 builds;
needs to be retested with these before determination
of the status of this ticket -- 10.3.6 or closed fixed.

The problem cannot be reproduced with PKI 10.3.5.

It still does not work with:

pki-base-10.3.5-1.fc24.noarch
pki-base-java-10.3.5-1.fc24.noarch
pki-ca-10.3.5-1.fc24.noarch
pki-kra-10.3.5-1.fc24.noarch

Martin,

Could you provide the exact commands to reproduce the problem? Please attach the input files too if any (e.g. PKCS #12 file). Thanks.

According to Martin, so far this problem only happens in automated tests. No actual user has encountered the problem yet. Due to the rarity of the problem, the priority is lowered. We still need a reproducer (including input files) to debug the problem and verify the fix later.

I haven't been able to find a more minimal reproducer (maybe this is the minimal one).

Steps to reproduce:

* [master] ipa-server-install --setup-dns
* [master] ipa-kra-install


* [replica0] ipa-replica-install (against master)
* [replica0] ipa-ca-install
* [replica0] ipa-kra-install
* [replica0] ipa-dns-install


* [replica1] ipa-replica-install --setup-ca (against master)
* [replica1] ipa-kra-install  <-----failed here

I was able to reproduce it manually, but I don't know if this is 100% reproducible. Please note that replica0 must be installed too; without it I couldn't reproduce the problem.

JFTR: this was reproduced with domain level 1 (default for IPA 4.3+)

I can confirm that this happens only when there are at least 3 servers with KRA; with 2 installs it works.

Shorter reproducer:

[master] ipa-server-install
[master] ipa-kra-install
[replica0] ipa-replica-install --setup-ca
[replica0] ipa-kra-install
[replica1] ipa-replica-install --setup-ca
[replica1] ipa-kra-install

I still cannot reproduce the problem with the above steps. Could you attach the CA and the KRA debug logs from all machines? Thanks.

Per CS/DS meeting of 09/12/2016: 10.4 (major)

Logs are too big, trac refuses to save them, I provided logs directly to Endi.

OK, so what's going on here involves an authorization error due to replication timing. To explain the problem, and the possible solution, I need to explain a bit about how authorization works during the install process.

When you attempt to clone a Dogtag subsystem, the installer on the replica contacts the security domain CA, provides credentials and obtains a session ID, which it uses as a token. At the same time, a database entry is created on the security domain CA for the session (referenced by the session ID).

Now, during the install, whenever the replica needs something from another Dogtag subsystem, it provides this session ID to that system. That system then contacts the security domain and verifies that the session ID corresponds to an active installation session, and validates details like the user/system of the token bearer etc.

An example of this is as follows: When cloning a KRA, the KRA replica needs some configuration parameters from the master KRA. The replica provides the master KRA the session ID, and the master KRA validates the session ID by contacting the security domain (as configured on the master KRA).
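The flow described above can be modelled with a small sketch. This is only an illustration of the described behavior, not Dogtag code: the class names `SecurityDomain` and `Subsystem`, their methods, and the `admin` user are all hypothetical.

```python
import secrets


class SecurityDomain:
    """Toy stand-in for the session table kept by the security domain CA."""

    def __init__(self):
        self.sessions = {}  # session ID -> user that opened the session

    def issue(self, user):
        # Create a session record and hand its ID back as the token.
        session_id = secrets.token_hex(8)
        self.sessions[session_id] = user
        return session_id

    def validate(self, session_id, user):
        # A token is valid only if a matching session record exists.
        return self.sessions.get(session_id) == user


class Subsystem:
    """Toy subsystem that defers token checks to its configured security domain."""

    def __init__(self, name, security_domain):
        self.name = name
        self.security_domain = security_domain

    def get_config(self, session_id, user):
        if not self.security_domain.validate(session_id, user):
            raise PermissionError(f"{self.name}: invalid session")
        return {"source": self.name}


sd = SecurityDomain()
master_kra = Subsystem("master KRA", sd)

# The replica installer obtains a token, then presents it to the master KRA.
token = sd.issue("admin")
config = master_kra.get_config(token, "admin")
```

In this model the check succeeds only because the subsystem asks the same security domain that issued the token.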

OK -- so now let's understand what is going wrong.

Initially you have one PKI instance with a KRA and CA. (master CA/KRA). In the CS.cfg of each subsystem is a parameter securitydomain.host which points to the master instance.

Now, let's create the first replica CA. The replica CA contacts the security domain on master CA to get a token. When it asks the master CA for some config parameters, it provides the token - which the master CA checks against its own database. At the end of the install, the replica changes its securitydomain.host to point to itself.

Then we create a replica KRA. Once again, we contact the security domain on the master to get a token - and the KRA on the master checks its own db to see if the token is valid. Because it's a KRA and not a CA, though, the replica KRA still points to the master when the installation completes.

So, now we have

master CA (SD points to master CA)

master KRA (SD points to master CA)

replica1 CA (SD points to replica1 CA)

replica1 KRA (SD points to master CA)

Now we clone replica1 CA to create replica2 CA. In this case, replica2 contacts replica1 CA for the security domain, and replica1 verifies the token against its own database (as its SD points to replica1 CA). At the end of the install, the SD for replica2 is changed to replica2 CA.

Now, we try to clone replica1 KRA to create replica2 KRA. Replica2 KRA contacts the security domain on replica1 CA and gets a token issued by replica1 CA. When verifying the token, however, replica1 KRA checks its own security domain - which points to master CA.

Now normally this isn't a problem - because the databases for replica1 CA and master CA are replicated.
As long as enough time has elapsed, the session record created on replica1 CA will have been replicated to master CA.

But occasionally, it seems that a validation is required before the session record is replicated - causing an authorization failure - as we see in this case.
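The timing window can be sketched as follows. This is a toy model under the assumption that replication is a delayed copy of new session records from one CA database to the other; the `Database` class and the session ID value are hypothetical and not part of Dogtag or the directory server.

```python
class Database:
    """Toy CA database; replicate_to() models delayed replication."""

    def __init__(self):
        self.sessions = set()
        self.pending = []  # records written locally but not yet replicated

    def add_session(self, session_id):
        self.sessions.add(session_id)
        self.pending.append(session_id)

    def replicate_to(self, peer):
        # Ship all pending records to the peer, then clear the queue.
        peer.sessions.update(self.pending)
        self.pending.clear()


master_ca_db = Database()
replica1_ca_db = Database()

# replica2 obtains its token from replica1 CA.
replica1_ca_db.add_session("session-123")

# replica1 KRA's security domain points to master CA, so the check goes there.
valid_before_replication = "session-123" in master_ca_db.sessions

# Once replication catches up, the same check succeeds.
replica1_ca_db.replicate_to(master_ca_db)
valid_after_replication = "session-123" in master_ca_db.sessions
```

A validation that arrives before `replicate_to` has run is the failing case described above.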

So, how do we fix this?

The simple solution is to set the security domain of the KRA to the CA on the same host at the end of the install. So, if we do that --

replica1 CA (SD points to replica1 CA)

replica1 KRA (SD points to master CA)

becomes:

replica1 CA (SD points to replica1 CA)

replica1 KRA (SD points to replica1 CA)

The end result is that the token is issued and verified from the same instance - and the same db instance. So we no longer need to worry about the vagaries of replication.

This of course takes advantage of the unique way IPA has set up Dogtag - in that whenever there is a KRA, there is necessarily a CA too, and that all KRAs and CAs are clones. We can't assume this in general, which is why this fix needs to happen in IPA and not in Dogtag.

A more general solution to this probably means revamping how we use tokens -- maybe using signed tokens for instance so that no validation is required. But this is slated for 10.4.

So, the takeaway is - to fix the problem:
1. The securitydomain.host parameter in the KRA should be pointed to the CA on the same host. This should be done in IPA install code after pkispawn is complete.
2. You should also update existing KRA instances as above.
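As a rough illustration of step 1, the sketch below rewrites the securitydomain.host line in the text of a CS.cfg file. It assumes CS.cfg is a flat `key=value` properties file; the helper name `set_security_domain_host`, the sample content and the host names are hypothetical, and this is not the actual IPA installer code.

```python
def set_security_domain_host(cs_cfg_text, host):
    """Return CS.cfg text with securitydomain.host set to `host`."""
    key = "securitydomain.host"
    lines = cs_cfg_text.splitlines()
    found = False
    for i, line in enumerate(lines):
        name, sep, _ = line.partition("=")
        if sep and name.strip() == key:
            lines[i] = f"{key}={host}"
            found = True
    if not found:
        # Append the parameter if the file did not define it yet.
        lines.append(f"{key}={host}")
    return "\n".join(lines) + "\n"


# Hypothetical sample content and host names, for illustration only.
sample = "securitydomain.host=master.example.test\nsecuritydomain.name=IPA\n"
updated = set_security_domain_host(sample, "replica1.example.test")
```

The function works on text rather than on a file path, so the caller decides where the KRA's CS.cfg lives and when to write it back.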

Thanks for info:

  1. But the installation failed in pkispawn, so how can this be done afterwards? Can we somehow edit the config for pkispawn?
  2. Is it possible to avoid upgrades? This issue should affect only new installations, shouldn't it?

Metadata Update from @mbasti:
- Issue assigned to edewata
- Issue set to the milestone: UNTRIAGED

7 years ago

Dogtag PKI is moving from Pagure issues to GitHub issues. This means that existing or new
issues will be reported and tracked through Dogtag PKI's GitHub Issue tracker.

This issue has been cloned to GitHub and is available here:
https://github.com/dogtagpki/pki/issues/2554

If you want to receive further updates on the issue, please navigate to the
GitHub issue and click on Subscribe button.

Thank you for understanding, and we apologize for any inconvenience.

Metadata Update from @dmoluguw:
- Issue close_status updated to: migrated
- Issue status updated to: Closed (was: Open)

3 years ago
