#2511 sssd SRV hardcoded timeouts (and general HA gripes)
Closed: Duplicate None Opened 9 years ago by gprocunier.

Given an environment that consists of multiple authentication nodes, you can configure sssd to reach them via an ordered list or via SRV records.

The issue here is that neither of these methods scales well.

Example 1 - ordered list

ipa_server = a, b, c, d
krb5_server = a, b, c, d
ldap_uri = a, b, c, d

Given 3500 servers with this configuration, all of those servers will use server A until it fails, and then move on to B, then C, etc.

This can hit your environment like a wrecking ball as the entire load moves from one server to the next.
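
To make the ordered-list behavior concrete, here is a minimal sketch in Python. It is not SSSD code; `pick_server` and `load_by_server` are hypothetical helpers that only model "use the first server in the list that is still up", which is the behavior described above.

```python
from collections import Counter


def pick_server(ordered_servers, down):
    """Return the first server in the list that is not marked down."""
    for server in ordered_servers:
        if server not in down:
            return server
    return None  # every server in the list is down


def load_by_server(num_clients, ordered_servers, down):
    """Count how many clients land on each server under ordered failover."""
    return Counter(
        pick_server(ordered_servers, down) for _ in range(num_clients)
    )


servers = ["a", "b", "c", "d"]

# All 3500 clients share the same list, so they all pick the same server.
print(load_by_server(3500, servers, down=set()))   # Counter({'a': 3500})

# When "a" fails, the whole load lands on "b" at once.
print(load_by_server(3500, servers, down={"a"}))   # Counter({'b': 3500})
```

Because every client evaluates the same list in the same order, the load is never spread out; it only moves in one piece.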

Example 2 - Replace the ordered list with SRV records

This seems better, except that you guys hardcode a time limit and don't respect SRV TTLs.

src/providers/data_provider_fo.c:73
opts->srv_retry_timeout = 14400;

At a minimum SSSD should respect SRV TTLs so I can try and round robin.
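
What I mean by respecting the TTL, as a rough Python sketch. This is not how SSSD is implemented; `SrvCache` and the `resolve` callable are made up for illustration, and `resolve` is assumed to return the SRV targets together with the TTL from the DNS answer.

```python
import time

FIXED_RETRY_SECONDS = 14400  # the hardcoded value quoted above, for comparison


class SrvCache:
    """Cache SRV lookups and re-resolve once the record's own TTL expires."""

    def __init__(self, resolve, clock=time.monotonic):
        # resolve: hypothetical callable, name -> (targets, ttl_seconds)
        self._resolve = resolve
        self._clock = clock
        self._entries = {}  # name -> (targets, expires_at)

    def lookup(self, name):
        now = self._clock()
        cached = self._entries.get(name)
        if cached is not None and now < cached[1]:
            return cached[0]  # still within the TTL, reuse the answer
        targets, ttl = self._resolve(name)
        # Expire after the TTL from DNS, not after FIXED_RETRY_SECONDS.
        self._entries[name] = (targets, now + ttl)
        return targets
```

With a 60 second TTL on the SRV record, a client would pick up a changed server set after about a minute instead of after four hours, which is what makes DNS-side round robin usable.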

Ideally (!) there should be some sort of HA/load-balancing scheme for lists of providers.

Help :)


Sure, but that ticket was opened 20 months ago (with no public updates) and I have provided two use cases where the current behavior is causing problems in our environment.

In the ordered list scenario, given enough servers you can create a Denial of Service from volume.

Fields changed

milestone: NEEDS_TRIAGE => SSSD 1.13 beta

This ticket and #1884 were requested via a downstream support case. Moving back to NEEDS_TRIAGE.

milestone: SSSD 1.13 beta => NEEDS_TRIAGE

Fields changed

owner: somebody => jhrozek
status: new => assigned

Fields changed

milestone: NEEDS_TRIAGE => SSSD 1.12.4

We're not going to implement example 1 per se in 1.12, but something similar instead -- in 1.13, we're going to implement ticket #2499 which would handle the HA scenario using a single host name that resolves into multiple IP addresses.

For 1.12, I've just sent a patch for ticket #1884 that implements honoring the SRV TTL values. I'm therefore closing this bug as a duplicate of #1884. Please let me know if you'd like some test packages (just the RHEL/Fedora release is fine) and I'll build them.

Thank you for your patience.

resolution: => duplicate
status: assigned => closed

Metadata Update from @gprocunier:
- Issue assigned to jhrozek
- Issue set to the milestone: SSSD 1.12.4

7 years ago

SSSD is moving from Pagure to Github. This means that new issues and pull requests
will be accepted only in SSSD's github repository.

This issue has been cloned to Github and is available here:
- https://github.com/SSSD/sssd/issues/3553

If you want to receive further updates on the issue, please navigate to the github issue
and click on subscribe button.

Thank you for understanding. We apologize for all inconvenience.
