#933 F20 clone pkispawn crashes when master is dogtag9
Closed: Invalid None Opened 10 years ago by mkosek.

master: ipa-server-3.0.0-37.el6.x86_64, pki-ca-9.0.3-32.el6.noarch[[BR]]
replica: freeipa-server-3.3.90GITa582bf9-0.fc20.x86_64, pki-ca-10.1.1-1.fc20.noarch

ipareplica-ca-install.log:

2014-03-26T08:09:15Z DEBUG   [2/18]: configuring certificate server instance
2014-03-26T08:09:15Z DEBUG Contents of pkispawn configuration file (/tmp/tmpO5PP2q):
[CA]
pki_security_domain_name = IPA
pki_enable_proxy = True
pki_restart_configured_instance = False
pki_backup_keys = True
pki_backup_password = XXXXXXXX
pki_client_database_dir = /tmp/tmp-XkC3M3
pki_client_database_password = XXXXXXXX
pki_client_database_purge = False
pki_client_pkcs12_password = XXXXXXXX
pki_admin_name = admin
pki_admin_uid = admin
pki_admin_email = root@localhost
pki_admin_password = XXXXXXXX
pki_admin_nickname = ipa-ca-agent
pki_admin_subject_dn = cn=ipa-ca-agent,O=EXAMPLE.COM
pki_client_admin_cert_p12 = /root/ca-agent.p12
pki_ds_ldap_port = 389
pki_ds_password = XXXXXXXX
pki_ds_base_dn = o=ipaca
pki_ds_database = ipaca
pki_subsystem_subject_dn = cn=CA Subsystem,O=EXAMPLE.COM
pki_ocsp_signing_subject_dn = cn=OCSP Subsystem,O=EXAMPLE.COM
pki_ssl_server_subject_dn = cn=vm-236.example.com,O=EXAMPLE.COM
pki_audit_signing_subject_dn = cn=CA Audit,O=EXAMPLE.COM
pki_ca_signing_subject_dn = cn=Certificate Authority,O=EXAMPLE.COM
pki_subsystem_nickname = subsystemCert cert-pki-ca
pki_ocsp_signing_nickname = ocspSigningCert cert-pki-ca
pki_ssl_server_nickname = Server-Cert cert-pki-ca
pki_audit_signing_nickname = auditSigningCert cert-pki-ca
pki_ca_signing_nickname = caSigningCert cert-pki-ca
pki_security_domain_hostname = vm-086.example.com
pki_security_domain_https_port = 443
pki_security_domain_user = admin
pki_security_domain_password = XXXXXXXX
pki_clone = True
pki_clone_pkcs12_path = /tmp/ca.p12
pki_clone_pkcs12_password = XXXXXXXX
pki_clone_replication_security = TLS
pki_clone_replication_master_port = 7389
pki_clone_replication_clone_port = 389
pki_clone_replicate_schema = False
pki_clone_uri = https://vm-086.example.com:443

2014-03-26T08:09:15Z DEBUG Starting external process
2014-03-26T08:09:15Z DEBUG args=/usr/sbin/pkispawn -s CA -f /tmp/tmpO5PP2q
2014-03-26T08:10:58Z DEBUG Process finished, return code=1
2014-03-26T08:10:58Z DEBUG stdout=Loading deployment configuration from /tmp/tmpO5PP2q.
Installing CA into /var/lib/pki/pki-tomcat.
Storing deployment configuration into /etc/sysconfig/pki/tomcat/pki-tomcat/ca/deployment.cfg.

Installation failed.


2014-03-26T08:10:58Z DEBUG stderr=pkispawn    : WARNING  ....... unabae to validate security domain user/password through REST interface. Interface not available
pkispawn    : ERROR    ....... Exception from Java Configuration Servlet: Failed to obtain configuration entries from the master for cloning org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 50; White spaces are required between publicId and systemId.

2014-03-26T08:10:58Z CRITICAL failed to configure ca instance Command '/usr/sbin/pkispawn -s CA -f /tmp/tmpO5PP2q' returned non-zero exit status 1

pki-ca-spawn log:

2014-03-26 09:09:24 pkispawn    : INFO     ....... executing 'systemctl daemon-reload'
2014-03-26 09:09:24 pkispawn    : INFO     ....... executing 'systemctl start pki-tomcatd@pki-tomcat.service'
2014-03-26 09:09:24 pkispawn    : DEBUG    ........... No connection - server may still be down
2014-03-26 09:09:24 pkispawn    : DEBUG    ........... No connection - exception thrown: Cannot connect to proxy. Socket error: [Errno 111] Connection refused.
2014-03-26 09:09:25 pkispawn    : DEBUG    ........... No connection - server may still be down
2014-03-26 09:09:25 pkispawn    : DEBUG    ........... No connection - exception thrown: Cannot connect to proxy. Socket error: [Errno 111] Connection refused.
2014-03-26 09:09:26 pkispawn    : DEBUG    ........... No connection - server may still be down
2014-03-26 09:09:26 pkispawn    : DEBUG    ........... No connection - exception thrown: Cannot connect to proxy. Socket error: [Errno 111] Connection refused.
2014-03-26 09:09:27 pkispawn    : DEBUG    ........... No connection - server may still be down
2014-03-26 09:09:27 pkispawn    : DEBUG    ........... No connection - exception thrown: Cannot connect to proxy. Socket error: [Errno 111] Connection refused.
2014-03-26 09:09:39 pkispawn    : DEBUG    ........... <?xml version="1.0" encoding="UTF-8" standalone="no"?><XMLResponse><State>0</State><Type>CA</Type><Status>running</Status><Version>10.1.1-1.fc20</Version></XMLResponse>
2014-03-26 09:09:40 pkispawn    : INFO     ....... constructing PKI configuration data.
2014-03-26 09:09:40 pkispawn    : INFO     ....... configuring PKI configuration data.
2014-03-26 09:10:58 pkispawn    : ERROR    ....... Exception from Java Configuration Servlet: Failed to obtain configuration entries from the master for cloning org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 50; White spaces are required between publicId and systemId.
2014-03-26 09:10:58 pkispawn    : DEBUG    ....... Error Type: HTTPError
2014-03-26 09:10:58 pkispawn    : DEBUG    ....... Error Message: 500 Server Error: Internal Server Error
2014-03-26 09:10:58 pkispawn    : DEBUG    .......   File "/usr/sbin/pkispawn", line 463, in main
    rv = instance.spawn(deployer)
  File "/usr/lib/python2.7/site-packages/pki/server/deployment/scriptlets/configuration.py", line 126, in spawn
    json.dumps(data, cls=pki.encoder.CustomTypeEncoder))
  File "/usr/lib/python2.7/site-packages/pki/server/deployment/pkihelper.py", line 3194, in configure_pki_data
    response = client.configure(data)
  File "/usr/lib/python2.7/site-packages/pki/system.py", line 80, in configure
    r = self.connection.post('/rest/installer/configure', data, headers)
  File "/usr/lib/python2.7/site-packages/pki/client.py", line 64, in post
    r.raise_for_status()
  File "/usr/lib/python2.7/site-packages/requests/models.py", line 683, in raise_for_status
    raise HTTPError(http_error_msg, response=self)

master access log:

10.0.0.236 - - [26/Mar/2014:04:04:21 -0400] "GET /ca/rest/securityDomain/domainInfo HTTP/1.1" 404 328
10.0.0.236 - - [26/Mar/2014:04:04:23 -0400] "GET /ca/admin/ca/getDomainXML HTTP/1.1" 200 1160
10.0.0.236 - - [26/Mar/2014:04:04:26 -0400] "GET /ca/rest/account/login HTTP/1.1" 404 316
10.0.0.236 - - [26/Mar/2014:04:04:46 -0400] "POST /ca/admin/ca/getCertChain HTTP/1.0" 200 1458
10.0.0.236 - - [26/Mar/2014:04:04:49 -0400] "GET /ca/rest/account/login HTTP/1.1" 404 316
10.0.0.236 - - [26/Mar/2014:04:04:50 -0400] "POST /ca/admin/ca/getCookie HTTP/1.1" 200 4102
10.0.0.236 - - [26/Mar/2014:04:04:52 -0400] "POST /ca/admin/ca/getDomainXML HTTP/1.0" 200 1160
10.0.0.236 - - [26/Mar/2014:04:04:54 -0400] "POST /ca/admin/ca/getCertChain HTTP/1.0" 200 1458
10.0.0.236 - - [26/Mar/2014:04:04:54 -0400] "POST /ca/admin/ca/updateNumberRange HTTP/1.0" 404 324
10.0.0.86 - - [26/Mar/2014:04:04:56 -0400] "POST /ca/ee/ca/tokenAuthenticate HTTP/1.0" 200 154
10.0.0.236 - - [26/Mar/2014:04:04:55 -0400] "POST /ca/ee/ca/updateNumberRange HTTP/1.0" 200 164
10.0.0.236 - - [26/Mar/2014:04:04:58 -0400] "POST /ca/admin/ca/updateNumberRange HTTP/1.0" 404 324
10.0.0.236 - - [26/Mar/2014:04:04:59 -0400] "POST /ca/ee/ca/updateNumberRange HTTP/1.0" 503 420
10.0.0.86 - - [26/Mar/2014:04:05:00 -0400] "POST /ca/ee/ca/tokenAuthenticate HTTP/1.0" 500 631

Per discussions with alee and nkinder, we have slated this for the 10.1.2 milestone.

I have been able to replicate this error utilizing an instance of Dogtag 9 on RHEL 6 (master) and an instance of Dogtag 10.1 on Fedora 20 (clone). I am now in the process of debugging this issue.

Replying to [comment:3 mharmsen]:

I have been able to replicate this error utilizing an instance of Dogtag 9 on RHEL 6 (master) and an instance of Dogtag 10.1 on Fedora 20 (clone). I am now in the process of debugging this issue.

I believe that I may have been mistaken regarding replication of this issue simply using Dogtag (probably some sort of configuration issue), but upon alee's advice, I was able to replicate the issue by setting up some VMs and using IPA:

RHEL 6 master (vm-125.example.com):

    # rpm -q ipa-server
    ipa-server-3.0.0-37.el6.x86_64

    # rpm -q pki-ca
    pki-ca-9.0.3-32.el6.noarch

    # sudo ipa-server-install --setup-dns

    # sudo ipa-replica-prepare vm-100.example.com

    Copied '/var/lib/ipa/replica-info-vm-100.example.com.gpg' to the same
    location on the replica machine (vm-100.example.com).

Fedora 20 replica (vm-100.example.com):

    # rpm -q freeipa-server
    freeipa-server-3.3.4-3.fc20.x86_64 (not your GIT version)

    # rpm -q pki-ca
    pki-ca-10.1.1-1.fc20.noarch

    # sudo ipa-replica-install --ip-address=10.16.78.100 --setup-ca
      /var/lib/ipa/replica-info-vm-100.example.com.gpg
      --skip-schema-check

      NOTE:  I encountered some difficulty related to DNS, and also
             had to use the '--skip-schema-check' in order to allow
             the replica to successfully install a directory server. 

This basically produced the error that you were seeing.

Checking the '/var/log/pki/pki-tomcat/ca/debug' file of the cloned CA on the replica machine,
I noticed a familiar issue -- the master CA appeared to be rejecting the certificate request
to sign the SSL server cert from the clone.

Looking back on the RHEL 6 master, I noticed that the changes required by FREEIPA TRAC Ticket #4265 - Proxy change required on Master IPA Server to allow IPA Replica when using Dogtag had not been applied to the file '/etc/httpd/conf.d/ipa-pki-proxy.conf' (looks like a bug was filed for RHEL 7, but not for RHEL 6).

I added |/ca/ee/ca/profileSubmit (include a caret after the pipe) to the end of the 'LocationMatch' matches for ee port, and restarted httpd.

After uninstalling the replica, I was able to successfully install the replica without any crashes.

Please apply this change to your example and retest.

If this fixes the issue, then please file a RHEL 6 bug for ipa-server-3.0.0 to have this fix applied to the IPA server present on RHEL 6.

When I re-tested the issue and updated the PKI proxy, installation completed.

In my previous test, I was mostly confused by the following PKI error:

2014-03-26 09:10:58 pkispawn    : ERROR    ....... Exception from Java Configuration Servlet: Failed to obtain configuration entries from the master for cloning org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 50; White spaces are required between publicId and systemId.

As it did not gave a hint that this is the problem.

Anyway, I will file a RHEL-6 bug for adding the proxy, thanks for testing.

Replying to [comment:5 mkosek]:

When I re-tested the issue and updated the PKI proxy, installation completed.

In my previous test, I was mostly confused by the following PKI error:
2014-03-26 09:10:58 pkispawn : ERROR ....... Exception from Java Configuration Servlet: Failed to obtain configuration entries from the master for cloning org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 50; White spaces are required between publicId and systemId.

As it did not gave a hint that this is the problem.

Anyway, I will file a RHEL-6 bug for adding the proxy, thanks for testing.

Although this ticket will be closed as 'worksforme', PKI TRAC Ticket #942 - Cleanup error reporting by ConfigurationUtils.java functions has been filed to address this confusing error message.

Metadata Update from @mkosek:
- Issue assigned to mharmsen
- Issue set to the milestone: 10.1.2

7 years ago

Dogtag PKI is moving from Pagure issues to GitHub issues. This means that existing or new
issues will be reported and tracked through Dogtag PKI's GitHub Issue tracker.

This issue has been cloned to GitHub and is available here:
https://github.com/dogtagpki/pki/issues/1499

If you want to receive further updates on the issue, please navigate to the
GitHub issue and click on Subscribe button.

Thank you for understanding, and we apologize for any inconvenience.

Login to comment on this ticket.

Metadata