#47901 After total init, nsds5replicaLastInitStatus can report an erroneous error status (like 'Referral')
Closed: wontfix None Opened 9 years ago by tbordaz.

During a total init, the replica agreement can report a status like:

10 Total update abortedLDAP error: Referral

The status is actually the return code (cb_data.rc) of send_entry/conn_send_extended_operation. These return codes are ConnResult values, not LDAP error codes.

For example, rc=CONN_TIMEOUT=10, but it is logged as LDAP error: Referral, because 10 is also the numeric value of LDAP_REFERRAL.
This confuses the diagnosis of the total init failure.
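
The mix-up can be reproduced in isolation. The following is a minimal sketch, not the server code: the enum is an illustrative subset in which only CONN_TIMEOUT = 10 comes from this ticket, and both conn_result_to_string and ldap_code_to_string are hypothetical helpers (the latter stands in for an LDAP error-to-text lookup and only knows the standard result codes 0 and 10).

```c
#include <stdio.h>

/* Illustrative subset of connection results. Only CONN_TIMEOUT = 10 is
 * taken from the ticket; the success member is an assumption. */
typedef enum {
    CONN_OPERATION_SUCCESS = 0,
    CONN_TIMEOUT = 10
} ConnResult;

/* Stand-in for an LDAP error-to-text lookup (hypothetical helper).
 * In the LDAP result code table, 0 is success and 10 is referral. */
const char *
ldap_code_to_string(int rc)
{
    switch (rc) {
    case 0:
        return "Success";
    case 10:
        return "Referral";
    default:
        return "Unknown error";
    }
}

/* Hypothetical helper that interprets the value in its own number space. */
const char *
conn_result_to_string(ConnResult rc)
{
    switch (rc) {
    case CONN_OPERATION_SUCCESS:
        return "operation success";
    case CONN_TIMEOUT:
        return "time out";
    default:
        return "unknown connection error";
    }
}
```

Passing CONN_TIMEOUT to ldap_code_to_string yields "Referral", which is the misleading status described above, while conn_result_to_string yields "time out" for the same value.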


thierry bordaz wrote:

This bug appears from time to time during IPA initialization of a replica: the initialization fails.
The reported status does not help in understanding the reason for the failure, even though it is a configuration issue.
The fix carries no risk and should be easy, so it is worth backporting to 1.3.2.

For now, setting the milestone to 1.3.3.

The fix is not easy to test because it requires triggering a network error.
To test it, I attached a debugger to the master process and set a breakpoint in see_if_write_available.
Start a full update; once the breakpoint in see_if_write_available has been hit 2-3 times, step past the PR_poll (https://git.fedorahosted.org/cgit/389/ds.git/tree/ldap/servers/plugins/replication/repl5_connection.c#n549), set rc=0 (timeout), and continue.

The full update will fail and it will log:

{{{
nsds5replicaLastInitStatus: 10 connection error: time out - Total update aborted
}}}

Do you need to log the connection error in case of success, or would

connrc ? " - " : "", connrc ? connmsg : ""

suffice?
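
For illustration, the conditional-argument idiom suggested here can be sketched as follows. This is not the committed patch: format_init_status is a hypothetical function, its argument order and format string are assumptions chosen to reproduce the status line shown earlier, and the success message text is invented for the example.

```c
#include <stdio.h>

/* Hypothetical formatter: emits the connection error text and the " - "
 * separator only when connrc is non-zero, so a successful init does not
 * carry an empty "connection error" fragment. */
int
format_init_status(char *buf, size_t size, int connrc,
                   const char *connmsg, const char *message)
{
    return snprintf(buf, size, "%d %s%s%s",
                    connrc,
                    connrc ? connmsg : "",
                    connrc ? " - " : "",
                    message);
}
```

Called with connrc=10, connmsg="connection error: time out" and message="Total update aborted", the buffer holds "10 connection error: time out - Total update aborted". With connrc=0 the connection part is omitted and only the code and message remain.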

ack - would also like a review from Ludwig

git merge ticket47901
Updating 4e39dbb..c70b88d
Fast-forward
ldap/servers/plugins/replication/repl5.h | 9 ++++++---
ldap/servers/plugins/replication/repl5_agmt.c | 30 ++++++++++++++++++++++++------
ldap/servers/plugins/replication/repl5_agmtlist.c | 2 +-
ldap/servers/plugins/replication/repl5_connection.c | 31 +++++++++++++++++++++++++++++++
ldap/servers/plugins/replication/repl5_protocol.c | 2 +-
ldap/servers/plugins/replication/repl5_tot_protocol.c | 34 +++++++++++++++++++---------------
ldap/servers/plugins/replication/windows_tot_protocol.c | 12 ++++++------
7 files changed, 88 insertions(+), 32 deletions(-)

git push origin master
Counting objects: 25, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (13/13), done.
Writing objects: 100% (13/13), 2.75 KiB, done.
Total 13 (delta 11), reused 0 (delta 0)
To ssh://git.fedorahosted.org/git/389/ds.git
4e39dbb..c70b88d master -> master

commit c70b88d
Author: Thierry bordaz (tbordaz) tbordaz@redhat.com
Date: Tue Oct 14 10:44:09 2014 +0200

'''push 1.3.3 branch'''

git push origin 389-ds-base-1.3.3
Counting objects: 25, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (13/13), done.
Writing objects: 100% (13/13), 2.76 KiB, done.
Total 13 (delta 11), reused 0 (delta 0)
To ssh://git.fedorahosted.org/git/389/ds.git
20888a6..1bf51f8 389-ds-base-1.3.3 -> 389-ds-base-1.3.3

commit 1bf51f8
Author: Thierry bordaz (tbordaz) tbordaz@redhat.com
Date: Tue Oct 14 10:44:09 2014 +0200

Metadata Update from @rmeggins:
- Issue assigned to tbordaz
- Issue set to the milestone: 1.3.3 backlog

7 years ago

389-ds-base is moving from Pagure to Github. This means that new issues and pull requests
will be accepted only in 389-ds-base's github repository.

This issue has been cloned to Github and is available here:
- https://github.com/389ds/389-ds-base/issues/1232

If you want to receive further updates on the issue, please navigate to the github issue
and click on subscribe button.

Thank you for understanding. We apologize for all inconvenience.

Metadata Update from @spichugi:
- Issue close_status updated to: wontfix (was: Fixed)

3 years ago
