#47388 [RFE] Support 'Content Synchronization Operation' (SyncRepl) - RFC 4533
Closed: wontfix None Opened 10 years ago by pspacek.

FreeIPA (bind-dyndb-ldap component) does synchronization between LDAP and external DNS database. Persistent search has some drawbacks for this use case:

Persistent search deficiencies

  1. Persistent search doesn't offer a 'signal'/'indication' that all existing records have already been sent to the client and that the client is now waiting for updates. (I.e. an equivalent of the 'Sync Info Message' from [[http://tools.ietf.org/html/rfc4533#section-3.4.1|RFC 4533 section 3.4.1]].)
    There seems to be a workaround for this problem, but it complicates the client application: https://lists.fedoraproject.org/pipermail/389-users/2013-June/015990.html

  2. The client application has to dump the content of the whole LDAP sub-tree to maintain consistency between LDAP and its own state (e.g. an application-specific database). This dump has to be redone after any connection failure (and reconnection).
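To illustrate the first deficiency: with RFC 4533 the server marks the end of the initial content, so a client can tell initial records apart from live updates. The following is a minimal C sketch of that bookkeeping only. The names `sync_client`, `on_entry` and `on_refresh_done` are hypothetical and do not come from libldap or 389 DS; wiring them to real protocol messages is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical client-side state; not part of libldap or 389 DS. */
struct sync_client {
    bool   refresh_done;    /* set once the server signals the end of refresh */
    size_t initial_entries; /* entries received during the refresh stage */
    size_t live_updates;    /* entries received during the persist stage */
};

/* Call for every entry the server sends. */
void on_entry(struct sync_client *c)
{
    assert(c != NULL);
    if (c->refresh_done)
        c->live_updates++;
    else
        c->initial_entries++;
}

/* Call when the "refresh done" indication arrives. */
void on_refresh_done(struct sync_client *c)
{
    assert(c != NULL);
    c->refresh_done = true;
}
```

With plain persistent search there is no event that could trigger `on_refresh_done`, which is why the client cannot know when its local copy is complete.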

RFC 4533 use cases

  1. More effective bind-dyndb-ldap.

  2. (Potentially) A migration path to/from OpenLDAP?


I realized that [[http://tools.ietf.org/html/rfc4533|RFC 4533]] depends on entryUUID from [[http://tools.ietf.org/html/rfc4530|RFC 4530]].

Does it create any problem? It seems that nsUniqueId is basically entryUUID. I don't know if uuidMatch and uuidOrderingMatch are supported.

Replying to [comment:3 pspacek]:

> I realized that [[http://tools.ietf.org/html/rfc4533|RFC 4533]] depends on entryUUID from [[http://tools.ietf.org/html/rfc4530|RFC 4530]].
>
> Does it create any problem? It seems that nsUniqueId is basically entryUUID.

Not exactly. The format is slightly different. We'll have to investigate how to handle this as part of working on the ticket.

> I don't know if uuidMatch and uuidOrderingMatch are supported.

They are not, but we can add support as part of the ticket work.

example client - works with OpenLDAP server
ldap_syncrepl.c

Added a patch with an initial implementation of RFC 4533 as a plugin. Some pieces are still missing, especially update synchronization with cookies and a final specification for the cookie.

The next step is to write a doc about the implementation and have it reviewed, define the missing pieces, and perform more tests.

Could you provide an experimental build to me, please? I would like to hand out the build to the bind-dyndb-ldap reviewer to test interoperability.

I don't care about crashes and risk of data loss, it would be only for testing purposes. Build for Fedora 19 x86_64 would be exactly what I need ... :-)

Thank you!

Ludwig, I applied your patch 0001-Initial-implementation-of-rfc4533-as-plugin on top of 36f3f34 and it doesn't work for me (without cookie, refreshAndPersist mode).

I will attach my testing data to the ticket as files test1-*. I used the program ldap_syncrepl.c as the client; I will also attach its output along with a traffic capture.

There is no difference if I do bind as cn=directory manager or uid=admin,cn=users,cn=accounts,dc=ipa,dc=test, so it doesn't seem like an ACI problem.

I used the same client as is attached to this ticket, but I commented out the 'cookie' part:
{{{
#define INITIAL_SYNC_COOKIE "rid=000,csn=20130710145447.752775Z#000000#000#000000"
/* comment out following line if you want to start without cookie */
- sync_ctx.ls_cookie = ber_bvstrdup(INITIAL_SYNC_COOKIE);
+// sync_ctx.ls_cookie = ber_bvstrdup(INITIAL_SYNC_COOKIE);
}}}

I left the filter and scope at their default values. With the default settings, OpenLDAP returns the whole sub-tree.

This is work in progress. The patch was meant to save an intermediate state, so it is no surprise that not everything works. I mainly tested with cookies, and that follows a different code path.

Thanks for testing and providing the results. I will work on it.

I completed an implementation according to the doc:
http://port389.org/wiki/Content_synchronization_plugin
and sent it out for review, patch is attached.

After basic testing with 0001-Implement-RFC-4533-as-plugin-version-1.patch, I think that there is a problem in the Initial Content Poll phase ([http://tools.ietf.org/html/rfc4533#section-3.3.1 RFC 4533 section 3.3.1]), i.e. when the client connects without a cookie:

State field in Sync State Control has incorrect value during Initial Content Poll phase:
389 returns phase = LDAP_SYNC_CAPI_PRESENT (0x0)
OpenLDAP returns phase = LDAP_SYNC_CAPI_ADD (0x1)

I think that OpenLDAP behaves correctly, because RFC in section 3.3.1 prescribes state add.
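For reference, the numeric values in the client output above match the state enumeration of the Sync State Control in RFC 4533: present (0), add (1), modify (2), delete (3). Below is a small C sketch of how a client might map each state to an action on its local copy. The `local_action` enum and `action_for_state` are hypothetical names used only for illustration; this is not code from ldap_syncrepl.c or from the plugin.

```c
/* State values of the Sync State Control, as enumerated in RFC 4533. */
enum sync_state {
    SYNC_STATE_PRESENT = 0,
    SYNC_STATE_ADD     = 1,
    SYNC_STATE_MODIFY  = 2,
    SYNC_STATE_DELETE  = 3
};

/* Hypothetical actions a client could take on its local copy. */
enum local_action {
    ACTION_KEEP,   /* entry unchanged: keep the existing local copy */
    ACTION_STORE,  /* create or overwrite the local copy */
    ACTION_REMOVE, /* drop the local copy */
    ACTION_REJECT  /* unknown state value */
};

/* Map a received state value to the action on the local copy. */
enum local_action action_for_state(int state)
{
    switch (state) {
    case SYNC_STATE_PRESENT: return ACTION_KEEP;
    case SYNC_STATE_ADD:
    case SYNC_STATE_MODIFY:  return ACTION_STORE;
    case SYNC_STATE_DELETE:  return ACTION_REMOVE;
    default:                 return ACTION_REJECT;
    }
}
```

Under this mapping, a client that connects without a cookie has no local copy to keep, so receiving state present instead of add during the initial content poll would leave its store empty.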

See attached files openldap-331.pcap and 389ds-331.pcap. Data loaded in both servers are the same.

I will be happy to test next version of the patch!

traffic according to RFC4533 section 3.3.1 - 389 DS, patch 0001-Implement-RFC-4533-as-plugin-version-1.patch
389ds-331.pcap

traffic according to RFC4533 section 3.3.1 - OpenLDAP openldap-servers-2.4.35-5.fc19.x86_64
openldap-331.pcap

The other problem relates to stateful operation. The cookie (as described in http://port389.org/wiki/Content_synchronization_plugin) doesn't contain the <changenumber> part.

At the moment, the cookie looks like this: vm-126.idm.lab.eng.brq.redhat.com:389#cn=directory manager:cn=dns-short,dc=ipa,dc=test:(objectClass=*)#(null) (length 109 bytes).

Maybe it is intentional. I'm okay with that because I don't use stateful synchronization at the moment; I'm just giving you a heads-up.
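As an illustration of the cookie shape shown above, here is a hedged C sketch that extracts the trailing change number and rejects cookies that do not carry one, such as the `(null)` case. The layout is inferred only from the sample cookie in this comment, and `cookie_change_number` is a hypothetical helper, not the plugin's actual parser.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Extract the trailing change number from a cookie shaped like the sample
 * above: "<host:port>#<bind dn>:<base>:<filter>#<number>".
 * Returns 0 on success, -1 if the cookie has no usable change number
 * (for example a trailing "(null)" or a string without any '#').
 * Hypothetical helper; the layout is an assumption based on the sample.
 */
int cookie_change_number(const char *cookie, unsigned long *out)
{
    const char *hash;
    char *end;
    unsigned long value;

    if (cookie == NULL || out == NULL)
        return -1;
    hash = strrchr(cookie, '#');           /* last field holds the number */
    if (hash == NULL || hash[1] < '0' || hash[1] > '9')
        return -1;                         /* also rejects signs and spaces */
    errno = 0;
    value = strtoul(hash + 1, &end, 10);
    if (errno != 0 || *end != '\0')
        return -1;                         /* overflow or trailing garbage */
    *out = value;
    return 0;
}
```

Validating the cookie before using it means a malformed value leads to an error return instead of undefined behaviour further down the code path.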

Thanks for testing.

I will check the sync state control in the refresh phase.

Regarding the cookie, did you enable and configure the retro changelog and have at least one change? The changenumber is taken from the retro changelog.

Replying to [comment:17 lkrispen]:

> Regarding the cookie, did you enable and configure the retro changelog and have at least one change - the changenumber is taken from the retro changelog?

Your suspicion is correct, I didn't enable the retro changelog. The cookie (at the end of the `refresh` phase) seems fine with the configuration described in https://access.redhat.com/site/documentation/en-US/Red_Hat_Directory_Server/9.0/html/Administration_Guide/Managing_Replication-Using_the_Retro_Changelog_Plug_in.html.

The documentation states this:
{{{
To use the retro changelog plug-in, the Directory Server 9.0 instance must be configured as a single-master replica.
}}}

Can it work with FreeIPA? FreeIPA uses multi-master replication all the time. Could you elaborate on the worst cases, please?

I found two other issues regarding cookies:

1) In `refreshAndPersist` mode, starting without any cookie: A cookie is returned at the end of the `refresh` phase, but the cookie value doesn't change when a change is made in the replicated part of the tree.

E.g., see that value `15` is there all the time:
{{{
ldap_sync_intermediate:
    cookie 'vm-126.idm.lab.eng.brq.redhat.com:389#cn=directory manager:cn=dns-short,dc=ipa,dc=test:(objectClass=*)#15' (length 105 bytes)
    phase: LDAP_SYNC_CAPI_DONE (0x50) => refresh phase is complete
==================== refresh done ====================
==================== persist begins ====================
ldap_sync_search_entry:
    cookie 'vm-126.idm.lab.eng.brq.redhat.com:389#cn=directory manager:cn=dns-short,dc=ipa,dc=test:(objectClass=*)#15' (length 105 bytes)
    phase: LDAP_SYNC_CAPI_DELETE (0x3)
    entryUUID: 24a68682-1471-11e3-b806-d0d6bf892731 (length 16 bytes = 128 bits)
    DN: idnsName=rec-new,idnsName=u.test.,cn=dns-short,dc=ipa,dc=test
ldap_sync_search_entry:
    cookie 'vm-126.idm.lab.eng.brq.redhat.com:389#cn=directory manager:cn=dns-short,dc=ipa,dc=test:(objectClass=*)#15' (length 105 bytes)
    phase: LDAP_SYNC_CAPI_ADD (0x1)
    entryUUID: 7d761f02-1471-11e3-8b87-d0d6bf892731 (length 16 bytes = 128 bits)
    DN: idnsName=rec-new,idnsName=u.test.,cn=dns-short,dc=ipa,dc=test
ldap_sync_search_entry:
    cookie 'vm-126.idm.lab.eng.brq.redhat.com:389#cn=directory manager:cn=dns-short,dc=ipa,dc=test:(objectClass=*)#15' (length 105 bytes)
    phase: LDAP_SYNC_CAPI_MODIFY (0x2)
    entryUUID: 7d761f02-1471-11e3-8b87-d0d6bf892731 (length 16 bytes = 128 bits)
    DN: idnsName=rec-new,idnsName=u.test.,cn=dns-short,dc=ipa,dc=test
}}}

This IMHO breaks the synchronization between client and server; there should be a new cookie value after each change.

OpenLDAP provides a new cookie value after each change in the replicated part of the tree:
{{{
ldap_sync_intermediate:
    cookie 'rid=000,csn=20130902162530.886987Z#000000#000#000000' (length 52 bytes)
    phase: LDAP_SYNC_CAPI_DONE (0x50) => refresh phase is complete
==================== refresh done ====================
==================== persist begins ====================
ldap_sync_search_entry:
    cookie 'rid=000,csn=20130903082243.880256Z#000000#000#000000' (length 52 bytes)
    phase: LDAP_SYNC_CAPI_ADD (0x1)
    entryUUID: c0489e10-a8bd-1032-8c2f-913463457c65 (length 16 bytes = 128 bits)
    DN: idnsName=rec-new,idnsName=u.test.,cn=dns-short,dc=ipa,dc=test
ldap_sync_search_entry:
    cookie 'rid=000,csn=20130903082347.702613Z#000000#000#000000' (length 52 bytes)
    phase: LDAP_SYNC_CAPI_MODIFY (0x2)
    entryUUID: c0489e10-a8bd-1032-8c2f-913463457c65 (length 16 bytes = 128 bits)
    DN: idnsName=rec-new,idnsName=u.test.,cn=dns-short,dc=ipa,dc=test
}}}

[http://tools.ietf.org/html/rfc4533#section-3.4.2 RFC 4533 section 3.4.2: persist Stage] states that a new cookie is optional, so the server could theoretically provide a cookie only at the end of the refresh phase, but it seems strange to me.

----

2) The second problem is related to stateful `refreshAndPersist`, i.e. when the client provides an initial cookie value. It seems that the server crashes when the `changenumber` specified by the client has a `changeType` other than `add`. (I tested it with `modify` and `delete`; it crashes in both cases.)

The error log contains this:
{{{
[03/Sep/2013:10:42:45 +0200] content-sync-plugin - Retro Changelog does not provied nsuniquedid.Check RCL plugin configuration.
}}}

Back trace:
{{{
(gdb) bt
#0 0x00007f4b8c3f8a81 in __strlen_sse2_pminub () from /lib64/libc.so.6
#1 0x00007f4b8e6f644d in slapi_value_set_string_passin (value=value@entry=0x7f4b74d4a710, strVal=0x0) at ldap/servers/slapd/value.c:381
#2 0x00007f4b8e6f64d8 in slapi_value_set_string (value=value@entry=0x7f4b74d4a710, strVal=strVal@entry=0x0) at ldap/servers/slapd/value.c:370
#3 0x00007f4b8e6f814e in valueset_add_string (a=0x7f4b5000ec40, vs=0x7f4b5000ec48, s=s@entry=0x0, t=t@entry=0 '\000', csn=csn@entry=0x0) at ldap/servers/slapd/valueset.c:1188
#4 0x00007f4b8e6922a8 in slapi_entry_add_string (e=e@entry=0x7f4b50019ae0, type=type@entry=0x7f4b852edfc9 "nsuniqueid", value=value@entry=0x0) at ldap/servers/slapd/entry.c:2792
#5 0x00007f4b852ec61f in sync_deleted_entry_from_changelog (cl_entry=cl_entry@entry=0x7f4b580096c0) at ldap/servers/plugins/sync/sync_refresh.c:394
#6 0x00007f4b852ec962 in sync_read_entry_from_changelog (cl_entry=0x7f4b580096c0, cb_data=0x7f4b74d50f20) at ldap/servers/plugins/sync/sync_refresh.c:514
#7 0x00007f4b8e6de30d in send_ldap_search_entry_ext (pb=pb@entry=0x7f4b50030890, e=<optimized out>, ectrls=ectrls@entry=0x0, attrs=0x0, attrsonly=0, send_result=send_result@entry=0, nentries=nentries@entry=0, urls=urls@entry=0x0) at ldap/servers/slapd/result.c:1516
#8 0x00007f4b8e6deb6c in send_ldap_search_entry (pb=pb@entry=0x7f4b50030890, e=<optimized out>, ectrls=ectrls@entry=0x0, attrs=<optimized out>, attrsonly=<optimized out>) at ldap/servers/slapd/result.c:1075
#9 0x00007f4b8e6c0389 in iterate (pb=pb@entry=0x7f4b50030890, pnentries=pnentries@entry=0x7f4b74d4ab48, pagesize=pagesize@entry=-1, pr_statp=pr_statp@entry=0x7f4b74d4aac4, be=0x7f4b50030890, send_result=1) at ldap/servers/slapd/opshared.c:1444
#10 0x00007f4b8e6c081d in send_results_ext (pb=pb@entry=0x7f4b50030890, nentries=nentries@entry=0x7f4b74d4ab48, pagesize=-1, pr_stat=pr_stat@entry=0x7f4b74d4aac4, send_result=1) at ldap/servers/slapd/opshared.c:1682
#11 0x00007f4b8e6c1ef1 in op_shared_search (pb=pb@entry=0x7f4b50030890, send_result=send_result@entry=1) at ldap/servers/slapd/opshared.c:853
#12 0x00007f4b8e6cedde in search_internal_callback_pb (pb=pb@entry=0x7f4b50030890, callback_data=callback_data@entry=0x7f4b74d50f20, prc=prc@entry=0x0, psec=psec@entry=0x7f4b852ec630 <sync_read_entry_from_changelog>, prec=prec@entry=0x0) at ldap/servers/slapd/plugin_internal_op.c:812
#13 0x00007f4b8e6cf2d9 in slapi_search_internal_callback_pb (pb=pb@entry=0x7f4b50030890, callback_data=callback_data@entry=0x7f4b74d50f20, prc=prc@entry=0x0, psec=psec@entry=0x7f4b852ec630 <sync_read_entry_from_changelog>, prec=prec@entry=0x0) at ldap/servers/slapd/plugin_internal_op.c:593
#14 0x00007f4b852ecdba in sync_refresh_update_content (pb=pb@entry=0x7f4b74d57ae0, client_cookie=0x7f4b5000c9e0, server_cookie=0x7f4b50006060) at ldap/servers/plugins/sync/sync_refresh.c:261
#15 0x00007f4b852ecf9d in sync_srch_refresh_pre_search (pb=0x7f4b74d57ae0) at ldap/servers/plugins/sync/sync_refresh.c:106
#16 0x00007f4b8e6cade5 in plugin_call_func (list=0x7f4b8f9e3630, operation=operation@entry=403, pb=pb@entry=0x7f4b74d57ae0, call_one=call_one@entry=0) at ldap/servers/slapd/plugin.c:1462
#17 0x00007f4b8e6caf4a in plugin_call_list (pb=0x7f4b74d57ae0, operation=403, list=<optimized out>) at ldap/servers/slapd/plugin.c:1424
#18 plugin_call_plugins (pb=pb@entry=0x7f4b74d57ae0, whichfunction=whichfunction@entry=403) at ldap/servers/slapd/plugin.c:398
#19 0x00007f4b8e6c1776 in op_shared_search (pb=pb@entry=0x7f4b74d57ae0, send_result=send_result@entry=1) at ldap/servers/slapd/opshared.c:565
#20 0x00007f4b8eb9aabd in do_search (pb=pb@entry=0x7f4b74d57ae0) at ldap/servers/slapd/search.c:413
#21 0x00007f4b8eb8a859 in connection_dispatch_operation (pb=0x7f4b74d57ae0, op=0x7f4b8ff33f80, conn=0x7f4b7555fd40) at ldap/servers/slapd/connection.c:677
#22 connection_threadmain () at ldap/servers/slapd/connection.c:2503
#23 0x00007f4b8ccbceb1 in _pt_root () from /lib64/libnspr4.so
#24 0x00007f4b8c65ec53 in start_thread () from /lib64/libpthread.so.0
#25 0x00007f4b8c38c13d in clone () from /lib64/libc.so.6
}}}

This modification and a server restart didn't help:
{{{
dn: cn=Retro Changelog Plugin,cn=plugins,cn=config
changetype: modify
add: nsslapd-attribute
nsslapd-attribute: nsUniqueId
}}}

(I modified the configuration, restarted the server, did an object add and delete, and used the latest changelog number to be sure that the changelog entry was generated after the configuration change.)

Could you try to configure the RCL to include nsuniqueid as documented here:

http://port389.org/wiki/Content_synchronization_plugin#Access_to_nsuniqueid

But be aware that it will only be effective for new changes.

The update of the change info in the cookie also depends on the presence of the targetuniqueid in the changelog entry.
The changes are queued in the postop plugins of the mod operation, but there is no knowledge of the corresponding changenumber, so when the entry is sent and the cookie has to be updated, the RCL is searched for (&(changenumber>latest_cookie_cnr)(targetuniqueid=entry_nsuniqueid))
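The search described above can be sketched as a filter builder in C. Two assumptions are labeled here and in the code: the LDAP string filter syntax has `>=` but no strict `>`, so "greater than N" is written as `>= N+1` for integer change numbers, and the unique id is assumed to contain no characters that need filter escaping. `build_rcl_filter` is a hypothetical helper, not a function of the plugin.

```c
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Build the retro changelog search filter for one entry.
 * Assumption: "changenumber > N" is expressed as "changenumber>=N+1",
 * because LDAP string filters only offer >= and <=.
 * Assumption: entry_nsuniqueid needs no filter escaping.
 * Returns 0 on success, -1 on bad arguments or if buf is too small.
 */
int build_rcl_filter(char *buf, size_t buflen,
                     unsigned long latest_cookie_cnr,
                     const char *entry_nsuniqueid)
{
    int n;

    if (buf == NULL || buflen == 0 || entry_nsuniqueid == NULL)
        return -1;
    if (latest_cookie_cnr == ULONG_MAX)
        return -1;                      /* N+1 would wrap around */
    n = snprintf(buf, buflen,
                 "(&(changenumber>=%lu)(targetuniqueid=%s))",
                 latest_cookie_cnr + 1, entry_nsuniqueid);
    if (n < 0 || (size_t)n >= buflen)
        return -1;                      /* encoding error or truncation */
    return 0;
}
```

For a cookie change number of 15, this produces a filter that matches changelog entries 16 and later for the given entry.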

In multimaster replication, changes can be applied on different masters simultaneously. When they are replicated, the update resolution protocol ensures that the data is consistent on all servers and that the changes are applied in CSN order.
So if givenname is replaced by NNN on master A and by XXX on master B, the CSNs determine which change was the last one and has to be kept.
Since the Retro Changelog only logs the modify operation and not the result of update resolution, it can become unreliable in multimaster, and therefore it is said to be usable only in a single-master environment.

But SyncRepl does not send the changes from the changelog; it uses the changelog only to detect which entries have been changed and then sends the full entry, so the latest, correct version wins.
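The "latest CSN wins" rule from the givenname example can be sketched in a few lines of C. This is only an illustration of the ordering idea, not the server's update resolution code: it assumes CSNs are fixed-width strings that sort chronologically under `strcmp()`, and `struct change` and `later_change` are hypothetical names.

```c
#include <string.h>

/* Hypothetical record of one conflicting update. */
struct change {
    const char *csn;   /* assumed fixed-width, so strcmp() orders by time */
    const char *value; /* new givenname written by this change */
};

/* Return the change that has to be kept: the one with the greater CSN. */
const struct change *later_change(const struct change *a,
                                  const struct change *b)
{
    return strcmp(a->csn, b->csn) >= 0 ? a : b;
}
```

Because the full entry is sent rather than the logged modification, the client only ever sees the value chosen by this kind of comparison, regardless of the order in which the changelog recorded the two operations.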

I did configuration according to http://port389.org/wiki/Content_synchronization_plugin#Access_to_nsuniqueid and the crash didn't happen again.

After that, I can still see the problem with the cookie value update: the cookie value after any update is the same as the value returned at the end of the refresh phase.

To summarize our discussion on IRC:
- If you specify a cookie to start with, it works and the cookie gets incremented.
- If you do a full refresh and persist, the cookie is not incremented. This needs investigation.


To sum up known problems:
- Server crashes if nsuniqueid is not stored in the Retro Changelog.
- Server crashes if the cookie value is garbage, e.g. x.
- State in the sync control is incorrect as described in comment:15. This seems to be solved by 0001-Implement-RFC-4533-as-plugin-version-2.patch (it needs more testing).
- The cookie value is not updated after an update. This happens only in refreshAndPersist mode if the client didn't provide an initial cookie.

I uploaded a new version (3) of the implementation.

It should address all the issues raised and adds the following:
- the sync control is now in the dse.ldif and can have an aci to control use of it
- the maximum number of concurrent persistent sync requests can be configured in the plugin as nsslapd-pluginArg0

We will run into problems, sooner or later, with the fact that sync_acquire_connection/sync_release_connection are not implemented. We should expose as little as possible to slapi in order to implement the missing functionality (i.e. not expose the entire slapi_connection internals).

There is some code duplicated from 389 slapi - sync_get_attr_value_from_entry instead of slapi_entry_attr_get_charptr() - but I suppose that is necessary in order to be as slapi backwards compatible as possible

  • connection management: I postponed the implementation of acquire/release until I run into problems, to see what the minimum requirement is. So far everything seems to work fine: persistent requests can be terminated, the server can be cleanly shut down, and valgrind doesn't show any operation- or connection-related leaks.
    I agree that there will be a need to handle connections, but I will need to run more load tests to see where the problems are.

  • slapi_entry_attr_get_charptr: I just copied code from retrocl and didn't think of this function. There is no reason not to use it, and I will change it in the next version of the patch.

Hi Ludwig,

from an application developer's view, is there any impact on the server (maintaining the cookies) when syncrepl is used? Does this impact (if any) grow with the number of clients?

The update of the cookie requires an internal search of the retro changelog to get the changenumber corresponding to a specific update. The RFC does not require that the cookie is updated with every change sent, but at the moment it is updated with every entry sent.
If this is noticeably too much, we can update the cookie only every n entries.

If there are more concurrent syncrepl clients, there are more update operations; the cookies are maintained per session.
But the number of concurrent persistent sync clients can and should be limited.
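The "update the cookie only every n entries" idea could look roughly like the per-session counter below. This is a sketch under assumptions, not the plugin's code: `cookie_throttle` and `entry_sent` are hypothetical names, and an interval of 0 or 1 is taken to mean the current behaviour of refreshing the cookie with every entry.

```c
#include <stdbool.h>

/* Hypothetical per-session throttle state. */
struct cookie_throttle {
    unsigned int interval; /* refresh the cookie every `interval` entries */
    unsigned int pending;  /* entries sent since the last cookie refresh */
};

/*
 * Record one sent entry. Returns true when the cookie should be refreshed,
 * i.e. when the retro changelog lookup is worth doing for this entry.
 */
bool entry_sent(struct cookie_throttle *t)
{
    t->pending++;
    if (t->interval <= 1 || t->pending >= t->interval) {
        t->pending = 0;
        return true;
    }
    return false;
}
```

The trade-off is that a client reconnecting with its last received cookie may be sent up to n-1 entries it has already seen, in exchange for fewer internal changelog searches per session.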

Committed after applying the requested changes from review:
- add plugin dependency to retro cl
- remove unneeded function
- clean up code (tabs, spaces, braces)
- activate error logging, which was still commented out

$ git merge ticket47388
Updating 4d20922..1e7c62d
Fast-forward
Makefile.am | 14 ++++
Makefile.in | 103 +++++++++++++++++++++--
ldap/admin/src/scripts/50contentsync.ldif | 22 +++++
ldap/ldif/template-dse.ldif.in | 21 +++++
ldap/servers/plugins/sync/sync.h | 196 +++++++++++++++++++++++++++++++++++++++++++
ldap/servers/plugins/sync/sync_init.c | 174 +++++++++++++++++++++++++++++++++++++++
ldap/servers/plugins/sync/sync_persist.c | 693 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
ldap/servers/plugins/sync/sync_refresh.c | 732 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
ldap/servers/plugins/sync/sync_util.c | 685 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
ldap/servers/slapd/connection.c | 7 +-
ldap/servers/slapd/libslapd.def | 4 +
ldap/servers/slapd/operation.c | 58 +++++++++++++
ldap/servers/slapd/plugin.c | 9 ++
ldap/servers/slapd/proto-slap.h | 2 +
ldap/servers/slapd/result.c | 80 +++++++++++++++++-
ldap/servers/slapd/slapi-plugin.h | 9 ++
16 files changed, 2799 insertions(+), 10 deletions(-)
create mode 100644 ldap/admin/src/scripts/50contentsync.ldif
create mode 100644 ldap/servers/plugins/sync/sync.h
create mode 100644 ldap/servers/plugins/sync/sync_init.c
create mode 100644 ldap/servers/plugins/sync/sync_persist.c
create mode 100644 ldap/servers/plugins/sync/sync_refresh.c
create mode 100644 ldap/servers/plugins/sync/sync_util.c
$ git push origin master
Counting objects: 46, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (26/26), done.
Writing objects: 100% (27/27), 28.54 KiB, done.
Total 27 (delta 16), reused 0 (delta 0)
To ssh://git.fedorahosted.org/git/389/ds.git
4d20922..1e7c62d master -> master

Thank you for your effort, guys!

Do you plan to release some version with support for RFC 4533 to Fedora 20?

It would be great, because FreeIPA would like to release new bind-dyndb-ldap (version 4.x) to Fedora 20 - and RFC 4533 is a requirement for that.

Replying to [comment:32 pspacek]:

> Thank you for your effort, guys!
>
> Do you plan to release some version with support for RFC 4533 to Fedora 20?

Yes. 389-ds-base-1.3.2 is planned for Fedora 20. We just finished the tickets for 1.3.2. We are moving into the testing phase.

> It would be great, because FreeIPA would like to release new bind-dyndb-ldap (version 4.x) to Fedora 20 - and RFC 4533 is a requirement for that.

Metadata Update from @rmeggins:
- Issue assigned to lkrispen
- Issue set to the milestone: 1.3.2 - 09/13 (September)

7 years ago

389-ds-base is moving from Pagure to Github. This means that new issues and pull requests
will be accepted only in 389-ds-base's github repository.

This issue has been cloned to Github and is available here:
- https://github.com/389ds/389-ds-base/issues/725

If you want to receive further updates on the issue, please navigate to the github issue
and click on subscribe button.

Thank you for understanding. We apologize for all inconvenience.

Metadata Update from @spichugi:
- Issue close_status updated to: wontfix (was: Fixed)

3 years ago
