wiki:BIND9/Design/RBTDB

Re-use of BIND's RBT DB

Overview

Patched BIND has an API for database back-ends. Bind-dyndb-ldap re-implements a large part of this API, but all functions required for DNSSEC support are missing and the overall functionality is limited.

BIND's native database implementation is called RBTDB (Red-Black Tree Database). RBTDB implements the whole API and supports DNSSEC, IXFR, etc.

The plan is to drop most of the code from our database implementation and re-use RBTDB as much as possible.

Discussion:

Use Cases

  • DNSSEC support will require significantly less code in bind-dyndb-ldap

We will get support for other RBTDB features (DNSSEC, IXFR, etc.) 'for free'.

Design

  • For each LDAP DB maintained by bind-dyndb-ldap: create an internal RBTDB instance and hide it inside the LDAP DB instance (see the creation sketch after the block diagram). E.g.:
     typedef struct {
            dns_db_t                        common;
            isc_refcount_t                  refs;
            ldap_instance_t                 *ldap_inst;
    +       dns_db_t                        *rbtdb;
     } ldapdb_t;
    
    • The new instance will be empty, i.e. without any data. It has to be populated with records from LDAP.
  • Remove our implementation of all functions in ldap_driver.c and turn most of the functions into thin wrappers around RBTDB:
     static isc_result_t
     allrdatasets(dns_db_t *db, dns_dbnode_t *node, dns_dbversion_t *version,
                 isc_stdtime_t now, dns_rdatasetiter_t **iteratorp)
     {
           ldapdb_t *ldapdb = (ldapdb_t *) db;
    
           REQUIRE(VALID_LDAPDB(ldapdb));
    
    +      return dns_db_allrdatasets(ldapdb->rbtdb, node, version, now, iteratorp);
    -      [our implementation]
     }
    

Block diagram follows (blue parts are controlled by bind-dyndb-ldap): [block diagram for RBTDB integration]
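
A minimal sketch of the creation step from the Design list above; the helper name ldapdb_create_rbtdb and its exact parameters are illustrative, not the real driver code:
     static isc_result_t
     ldapdb_create_rbtdb(ldapdb_t *ldapdb, isc_mem_t *mctx,
                         dns_name_t *origin, dns_rdataclass_t rdclass)
     {
           REQUIRE(VALID_LDAPDB(ldapdb));
           REQUIRE(ldapdb->rbtdb == NULL);

           /* "rbt" selects BIND's native red-black tree database. */
           return dns_db_create(mctx, "rbt", origin, dns_dbtype_zone,
                                rdclass, 0, NULL, &ldapdb->rbtdb);
     }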

The main problem is how to dump data from LDAP to the internal/hidden RBTDB instance and how to maintain consistency when changes are made in LDAP. There are several sub-problems:

BIND start-up sequence

(This optimization is tracked by ticket 124.)

  1. Load the saved (cached) version of zones from the filesystem if they are present (see the loading sketch after this list).
    • This ensures quick start-up and saves computing power (think about re-creating DNSSEC signatures).
    • A zone will not be served to clients if the data in the filesystem cache are older than X seconds (the interval is specified by the SOA expiry field of the particular zone).
    • BIND will start with dynamic updates disabled. This ensures that no changes get lost during the initial database synchronization.
  2. Dump all data from LDAP.
  3. Compute differences between the data from LDAP and the cached version of zones.
  4. Apply all differences to the cached version.
  5. Consistency between LDAP and the local database is restored: enable dynamic updates.
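
A minimal sketch of step 1 (loading the cached zone), assuming the cache was previously dumped in master-file format to a path derived from the new directory option; the helper name ldapdb_load_cache and the cache_path parameter are illustrative:
     static isc_result_t
     ldapdb_load_cache(ldapdb_t *ldapdb, const char *cache_path)
     {
           /*
            * cache_path would be something like "<directory>/<zone>.db".
            * dns_db_load() parses the master file and fills the
            * (still empty) hidden RBTDB instance.
            */
           return dns_db_load(ldapdb->rbtdb, cache_path);
     }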

Initial database synchronization

Fortunately, the 389 DS team decided to support RFC 4533 (so-called syncrepl; see 389 DS ticket #47388). This will save us a lot of headaches caused by persistent search deficiencies.

The current plan is to use refreshAndPersist mode from RFC 4533.

This allows us to store the syncCookie returned by the LDAP server and resume the synchronization process after a restart, re-connection, etc. As a result, we don't need to dump the content of the whole database during each BIND restart.

Syncrepl puts a new requirement on the LDAP client: Bind-dyndb-ldap has to be able to map entryUUID to the associated entry in RBTDB.

We can create an auxiliary RBTDB and store the entryUUID => DNS name mapping inside it. This RBTDB will be stored to and loaded from the filesystem like any other RBTDB.

Entry renaming/moddn handling

(This work is tracked by ticket 123.) An entry was renamed if a received change notification contains an entryUUID and some DN, but that entryUUID is already mapped to a DNS name which doesn't match the name derived directly from the DN. In that case, the old name will be completely deleted from RBTDB and the new name will be filled with the entry's data.
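
A sketch of the rename check, assuming the auxiliary entryUUID => DNS name mapping described above is available behind a hypothetical uuid_to_dnsname() lookup (change_is_rename is likewise an illustrative name):
     static isc_boolean_t
     change_is_rename(ldapdb_t *ldapdb, const char *entryuuid,
                      const dns_name_t *newname)
     {
           dns_fixedname_t fixed;
           dns_name_t *oldname;

           dns_fixedname_init(&fixed);
           oldname = dns_fixedname_name(&fixed);

           /* uuid_to_dnsname() is a hypothetical lookup in the auxiliary
            * RBTDB; failure means the entryUUID was not seen before. */
           if (uuid_to_dnsname(ldapdb, entryuuid, oldname) != ISC_R_SUCCESS)
                 return ISC_FALSE;

           /* A known entryUUID with a different name means a rename:
            * delete the old name from RBTDB and re-add under newname. */
           return ISC_TF(!dns_name_equal(oldname, newname));
     }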

Run-time changes made directly at LDAP level

The content of a changed LDAP entry is received by the plugin via syncrepl. The plugin has to synchronize the records in RBTDB with the received entry.

DNS dynamic updates

We can intercept calls to dns_db_addrdataset() and dns_db_deleterdataset(), modify the LDAP DB first and then the RBT DB. The entry change notification (ECN) from LDAP will be propagated back to BIND via syncrepl and then applied again (usually with no effect).
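
A sketch of the interception for this "write to both" variant; write_rdataset_to_ldap() is a hypothetical helper which translates the rdataset into an LDAP modification:
     static isc_result_t
     addrdataset(dns_db_t *db, dns_dbnode_t *node, dns_dbversion_t *version,
                 isc_stdtime_t now, dns_rdataset_t *rdataset,
                 unsigned int options, dns_rdataset_t *addedrdataset)
     {
           ldapdb_t *ldapdb = (ldapdb_t *) db;
           isc_result_t result;

           REQUIRE(VALID_LDAPDB(ldapdb));

           /* 1. Write the change to LDAP (hypothetical helper). */
           result = write_rdataset_to_ldap(ldapdb->ldap_inst, node, rdataset);
           if (result != ISC_R_SUCCESS)
                 return result;

           /* 2. Apply the same change to the hidden RBTDB so that queries
            *    see it immediately, without waiting for the ECN. */
           return dns_db_addrdataset(ldapdb->rbtdb, node, version, now,
                                     rdataset, options, addedrdataset);
     }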

Race conditions

There is a potential for race conditions, e.g. multiple successive changes to a single entry (i.e. a DNS name) made by BIND:

  1. The first change from BIND is written to LDAP and RBTDB.
  2. The second change from BIND is written to LDAP and RBTDB.
  3. The first ECN is received from LDAP by BIND: RBTDB is synchronized to the state denoted by the ECN, so the second change is discarded.
  4. The second entry change notification is received and consistency is restored.

The other option is not to write directly to RBTDB, but this approach has another problem:

  1. The LDAP DB is updated by BIND, but the RBT DB is not updated at the same time.
    • Queries between the moment of the update and the ECN from LDAP will return old results.
  2. BIND receives the ECN from LDAP.
  3. The change is applied to the RBT DB.
    • Clients can see the new data from this moment.

Update filtering based on the modifiersName attribute is not feasible, because modifiersName is not updated on delete.

Periodical re-synchronization

(This work is tracked by ticket 125.) During the initial discussion we decided to implement periodical LDAP->RBTDB re-synchronization. It should ensure that all discrepancies between LDAP and RBTDB are eventually resolved.

Principle

  1. Get an RBTDB iterator for one particular database (a sketch follows this list).
  2. Do a full sub-tree LDAP search for the particular zone in LDAP.
  3. Go through the sorted lists of names and do a name-by-name comparison.
  4. Data from LDAP always win.
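
A minimal sketch of the RBTDB side of the walk (steps 1 and 3); resync_walk_rbtdb and compare_with_ldap are hypothetical helpers, the latter standing for the name-by-name comparison against the sorted LDAP search result:
     static isc_result_t
     resync_walk_rbtdb(dns_db_t *rbtdb)
     {
           dns_dbiterator_t *iter = NULL;
           dns_dbnode_t *node = NULL;
           dns_fixedname_t fixed;
           dns_name_t *name;
           isc_result_t result;

           dns_fixedname_init(&fixed);
           name = dns_fixedname_name(&fixed);

           result = dns_db_createiterator(rbtdb, 0, &iter);
           if (result != ISC_R_SUCCESS)
                 return result;

           for (result = dns_dbiterator_first(iter);
                result == ISC_R_SUCCESS;
                result = dns_dbiterator_next(iter)) {
                 result = dns_dbiterator_current(iter, &node, name);
                 if (result != ISC_R_SUCCESS)
                       break;
                 /* Compare this name with the LDAP data; on any mismatch
                  * the data from LDAP always win. */
                 compare_with_ldap(name, node);
                 dns_db_detachnode(rbtdb, &node);
           }

           dns_dbiterator_destroy(&iter);
           return (result == ISC_R_NOMORE) ? ISC_R_SUCCESS : result;
     }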

Questions

  • Do we still need periodical re-synchronization if the 389 DS team implements RFC 4533 (syncrepl)? It wasn't considered in the initial design.
  • What about dynamic updates during re-synchronization?
  • How to get a sorted list of entries from LDAP? Use LDAP server-side sorting? Do we have the necessary indices?

How to determine the re-synchronization interval?

Provide resync_interval_min and resync_interval_max configuration options. Start with some initial value (= the minimum?) and double the interval if no discrepancies were found. Divide the interval by 2 in case of any error. The new value has to stay within the interval [resync_interval_min, resync_interval_max] (see the sketch below).

  • Question: Is it a good idea?
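
An illustrative sketch of the proposed adaptation; the function name adjust_resync_interval is made up, and the growth/shrink factors are taken from the description above:
     static unsigned int
     adjust_resync_interval(unsigned int interval, unsigned int interval_min,
                            unsigned int interval_max, isc_boolean_t had_error)
     {
           if (had_error)
                 interval /= 2;    /* any error or discrepancy: resync sooner */
           else
                 interval *= 2;    /* no discrepancies found: relax */

           /* Keep the result within [resync_interval_min, resync_interval_max]. */
           if (interval < interval_min)
                 interval = interval_min;
           if (interval > interval_max)
                 interval = interval_max;

           return interval;
     }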

_location special case

(This work is tracked by ticket 126.) The FreeIPA project proposed a DNS-based location discovery mechanism. It requires automatic generation of _location records.

The idea is that each domain which holds an A/AAAA record should have its own _location sub-domain which points to selected servers.

For example, the record host.example.com. A 192.0.2.1 automatically generates the sub-domain _location.host.example.com. DNAME newyork._locations.example.com. The value newyork._locations.example.com. is configured per-server.

The synchronization process has to ensure that each domain which holds an A/AAAA record has its own _location sub-domain. A value specified explicitly in LDAP wins over the default value.

(Filesystem) cache maintenance

Questions: How often should we save the cache from operating memory to disk?

  • On shutdown only?
  • On start-up (after initial synchronization) and on shutdown?
  • Periodically? How often? At the end of periodical re-synchronization?
  • Each N updates?
  • If N % of the database was changed? (pspacek's favorite)
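
Whichever policy is chosen, saving the cache boils down to dumping the current version of the hidden RBTDB instance to a master file. A minimal sketch (the helper name ldapdb_save_cache and the cache_path parameter are illustrative):
     static isc_result_t
     ldapdb_save_cache(ldapdb_t *ldapdb, const char *cache_path)
     {
           dns_dbversion_t *version = NULL;
           isc_result_t result;

           /* Dump the currently committed version of the RBTDB. */
           dns_db_currentversion(ldapdb->rbtdb, &version);
           result = dns_db_dump(ldapdb->rbtdb, version, cache_path);
           dns_db_closeversion(ldapdb->rbtdb, &version, ISC_FALSE);

           return result;
     }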

Implementation

Initial implementation has some limitations:

  • #123: LDAP MODRDN (rename) is not supported
  • #124: Startup with a big amount of data in LDAP is slow
  • #125: Periodical re-synchronization is not implemented
  • #126: Support per-server _location records for FreeIPA sites
  • #127: Zones enabled at run-time are not loaded properly
  • #128: Records deleted while the connection to LDAP is down are not refreshed properly
  • #134: A child DNS zone is corrupted if the parent zone is hosted on the same server

Feature Management

This feature doesn't require special management. The directory, resync_interval_min and resync_interval_max options are provided for special cases. The default values should work for all users.

Major configuration options and enablement

New options in /etc/named.conf:

  • directory specifies a filesystem path where cached zones are stored.
  • resync_interval_min and resync_interval_max control periodical re-synchronization as described above.
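
An illustrative named.conf fragment, assuming the dynamic-db/arg syntax used by the patched BIND; the instance name, URI, base and interval values are placeholders only:
     dynamic-db "example_ldap" {
           library "ldap.so";
           arg "uri ldap://ldap.example.com";
           arg "base cn=dns,dc=example,dc=com";
           arg "directory /var/named/dyndb-ldap/example_ldap";
           arg "resync_interval_min 60";
           arg "resync_interval_max 3600";
     };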

The existing SOA expiry field in each zone specifies the longest time interval for which data from the cache can be served to clients even if the connection to LDAP is down.

Replication

No impact on replication.

Updates and Upgrades

No impact on updates and upgrades.

Dependencies

This feature depends on 389 with support for RFC 4533 (so-called syncrepl). See 389 DS ticket #47388.

External Impact

No impact on other development teams and components.

Backup and Restore

The path specified by the directory option has to exist and be writeable by named. It is not necessary to back up the content of the cache.

Test Plan

Test scenarios will be transformed into test cases for FreeIPA Continuous Integration during the implementation or review phase.

RFE Author

Petr Spacek <pspacek@…>
