When frequently updating an attribute, all the deleted values are kept until they eventually are purged. But they "most likely" never will be needed in update resolution.
The task is to review update/conflict resolution to decide if it is "most likely" or "definitely not", so that the deleted values could be immediately purged. A side effect could be to create a design doc for the current replication protocol and identify other areas for improvement
I looked up old design documents and there is a scenario which requires to keep the deleted values with their deletion csn. The scenario is quite artificial, but needs to be handled. It involves three masters, where in parrallel on one master a value "v" of the naming attribute is deleted and on the two others there is a modrdn operation on each, the one with the smaller csn makes the value "v" distinguished. If then the order in which the mods are received are not in csn order there could be an inconsistent state if it is not recorded that "v" was deleted with a csn later than the modrdn operations.
So to handle this situation the deleted values need to be kept until they are purged. Need to look into the calculation of the purge csn and if the purge delay can be reduced
I started to write a doc on entry state resolution, which will explain which data are required, how the current code handles this and what could be optimized.
attachment 0001-Reduce-replication-meta-data-stored-in-entry-cf-tick.patch
The current status of the document is here: http://port389.org/wiki/Entry_State_Resolution
The current state of the implementation is in the attached patch file.
What still needs to be done: - complete the test cases and create test scripts - verify acceptance test results, some tests fail since they examine deleted values and value csns based on the old algorithm, need to modify these test cases. - there is one test failing because server seems to crash, need to investigate this - code also contains fix for ticket #602 by setting subsequence numbers, investigate if this is correct for all usages of csn_compare or if csn_compare needs to be differentiated
attachment 0001-New-Reduce-replication-meta-data-stored-in-entry-cf-tick.patch
verified TET results, failing tests are failing because they expect deleted values which are no longer kept or becasue of the presence of subsequence numberes. If the patch is accepted TET tests should be modified.
{{{ 1728 valueset_add_string (&a->a_deleted_values, "", CSN_TYPE_VALUE_DELETED, a->a_deletioncsn); }}} do we need to account for this when calculating the size e.g. for the entry size for the entry cache?
Otherwise, looks good. We will need to test this quite a bit.
This value is added eventually when the entry is written in id2entry_add, it will then added to the cache and its size will be calculated and take this into a account like the other changes to the entry.
yes, testing will go on
git merge ticket569 Updating 79346de..c7f6f16 Fast-forward ldap/servers/slapd/entry.c | 10 ++++ ldap/servers/slapd/entrywsi.c | 553 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++------------------------------------------------------------ ldap/servers/slapd/proto-slap.h | 1 + ldap/servers/slapd/valueset.c | 9 +++ 4 files changed, 378 insertions(+), 195 deletions(-) $ git push origin master Counting objects: 17, done. Delta compression using up to 4 threads. Compressing objects: 100% (9/9), done. Writing objects: 100% (9/9), 3.12 KiB, done. Total 9 (delta 7), reused 0 (delta 0) To ssh://git.fedorahosted.org/git/389/ds.git 79346de..c7f6f16 master -> master
git patch file (master) 0001-Ticket-569-examine-replication-code-to-reduce-amount.patch
Description: commit c7f6f16 made this change: In case the deleted value list in an attribute is empty: * this means the entry is deleted and has no more attributes, * when writing the attr to disk we would loose the AD-csn. * Add an empty value to the set of deleted values. This will * never be seen by any client. It will never be moved to the * present values and is only used to preserve the AD-csn. The AD-csn size was not counted for the buffer size to allocate. This patch adds the size.
Reviewed by Nathan (Thank you!!)
Pushed to master: commit ca02529
This breaks single valued attribute updates {{{ dn: cn=test,dc=example,dc=com objectclass: extensibleObject employeenumber: value1 }}}
then do ldapmodify {{{ changetype: modify add: employeenumber employeenumber: value2 - delete: employeenumber employeenumber: value1 }}}
The end result is this: {{{ dn: cn=test,dc=example,dc=com }}} The modify deletes the employeenumber attribute. it should have {{{ employeenumber: value2 }}}
attachment 0001-Fix-regression-from-fix-for-569-when-the-last-value-.patch
ack
$ git merge ticket569-regression Updating 161243d..b73f1e8 Fast-forward ldap/servers/slapd/entrywsi.c | 2 ++ 1 file changed, 2 insertions(+)
$ git push origin master Counting objects: 11, done. Delta compression using up to 4 threads. Compressing objects: 100% (6/6), done. Writing objects: 100% (6/6), 625 bytes, done. Total 6 (delta 4), reused 0 (delta 0) To ssh://git.fedorahosted.org/git/389/ds.git 161243d..b73f1e8 master -> master
Ticket has been cloned to Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1044216
Metadata Update from @lkrispen: - Issue assigned to lkrispen - Issue set to the milestone: 1.3.2 - 05/13 (May)
389-ds-base is moving from Pagure to Github. This means that new issues and pull requests will be accepted only in 389-ds-base's github repository.
This issue has been cloned to Github and is available here: - https://github.com/389ds/389-ds-base/issues/569
If you want to receive further updates on the issue, please navigate to the github issue and click on subscribe button.
subscribe
Thank you for understanding. We apologize for all inconvenience.
Metadata Update from @spichugi: - Issue close_status updated to: wontfix (was: Fixed)
Login to comment on this ticket.