Ticket #1570 (new defect)

Opened 4 years ago

Last modified 11 months ago

Keep track of per-PID/cgroup/process group fds in the PAM responder.

Reported by: jhrozek Owned by: somebody
Priority: minor Milestone: SSSD Future releases (no date set yet)
Component: SSSD Version: 1.9.1
Keywords: Cc:
Blocked By: Blocking:
Sensitive: no Tests Updated: no
Coverity Bug: Patch Submitted: no
Red Hat Bugzilla: 826192 , 887931, 887936, 887938 Design link:
Feature Milestone:
Design review: no Fedora test page:
Chosen: Candidate to push out: no
Release Notes:
Temp mark: no


This is another idea that came out of an IRC discussion between me, Sumit and Simo. Simo proposed that we keep track of open fds and when the count of open file descriptors reaches a certain limit, then kill the oldest one (=the one that had been open the longest). The question is, what should be used as the key in such structure - we can group the fds by PID, UID, process group or cgroup. The limit needs to be tunable.

Full discussion follows:

7:13 < simo> jhrozek: ok closing idle connections is ok
17:13 < simo> but does not protect you from malicious or simply badly misbehaving clients
17:14 < simo> jhrozek: I think we need to protect ourselves a bit more
17:14 < simo> and that can be relatively easily done with not too great changes 
17:14 < jhrozek> well, we would reach the fd limit soon with a misbehaving application
17:14 < simo> the way to do it is to keep a per-pid list of open sockets
17:14 < simo> perhaps in a hash table
17:15 < simo> if the same pid tries to open more than X connection we simply go and kill the least used one to make space
17:15 < simo> it also means keeping a timestamp associated with the fd that marks when a connection was used last
17:15 < simo> this way we can have a tunable parm that says something like: no more than 15 connections per pid
17:16 < jhrozek> wouldn't a malicious application simply fork and open a new fd in a child?
17:16 < simo> a very bad app could still fork children though ...
17:16 < simo> actually we could have a limit per process group
17:16 < simo> or even per user
17:16 < simo> but exempt root
17:16 < simo> or maybe we should have a combination of limits
17:17 < jhrozek> I think we should keep the limits simple
17:17 < simo> per-pid + per-cgroup + per-user
17:17 < jhrozek> to make sure we're not denying legitimate access
17:17 < simo> jhrozek: you can, but then sssd_pam can be easily abused and DoSed
17:18 < simo> all it takes is a user running a bash script that forks a child and cat the pam pipe
17:18 < simo> and presto you will use all FDs
17:18 < simo> so some limits that prevents that should be put in place on the server side
17:21 < jhrozek> OK
17:22 < jhrozek> I'll write it up into a ticket, but I still think the limits should be kept simple at least for configurations's sake
17:22 < jhrozek> X allowed connections per process group is much easier to understand than a combination of factors
17:22 < simo> jhrozek: oh yeah we need to try to make them genrally not something you want to actually explicitly configure unless you have so weird setup
17:22 < simo> jhrozek: yeah we can start simple and see if that suffices
17:23 < simo> jhrozek: I would think of using cgroups though
17:23 < simo> but not sure
17:23 < simo> the problem is that it is easy to create a new process group as a user
17:23 < simo> so it won't be effective
17:23 < jhrozek> I know little about cgroups
17:23 < simo> maybe we should simply have a relatively large X per-user
17:23 < simo> like 50 per-uid
17:23 < simo> tunable
17:24 < simo> root exampted
17:24 < simo> *exempted
17:24 < simo> for now
17:24 < jhrozek> and your proposal was to keep a timestamp or keep them ordered based on time and when the limit is reached, then go and kill the last one?
17:26 < simo> jhrozek: the first one (oldest) one
17:26 < simo> I assume that a misbehaving app is simply leaking fds
17:26 < simo> so killing the oldest should be safe
17:27 < jhrozek> simo: right, that's what I meant by "last".
17:27 < simo> for active DoSs I do not care which one is killed, they are all bad connections

Change History

comment:1 Changed 4 years ago by pbrezina

I'd do this generic for all responders not just PAM. It should be "easily" hidden inside responder common code.

comment:2 Changed 4 years ago by dpal

It seems that hash and collection interfaces from the ding-libs would be good primitives to use. It will be a two level object. The top level object the hash table of users. The key is the user and the value is the collection object. Collection will store the list of pids in the order of creation.

comment:3 Changed 4 years ago by jhrozek

  • Red Hat Bugzilla set to [https://bugzilla.redhat.com/show_bug.cgi?id=826192 826192]

Linked to Bugzilla bug: https://bugzilla.redhat.com/show_bug.cgi?id=826192 (Red Hat Enterprise Linux 6)

comment:4 Changed 4 years ago by dpal

  • Red Hat Bugzilla changed from [https://bugzilla.redhat.com/show_bug.cgi?id=826192 826192] to [https://bugzilla.redhat.com/show_bug.cgi?id=826192 826192] todo
  • Milestone changed from NEEDS_TRIAGE to SSSD 1.11 beta

comment:5 Changed 4 years ago by jhrozek

  • Red Hat Bugzilla changed from [https://bugzilla.redhat.com/show_bug.cgi?id=826192 826192] todo to [https://bugzilla.redhat.com/show_bug.cgi?id=826192 826192] , [https://bugzilla.redhat.com/show_bug.cgi?id=887931 887931]

Ticket has been cloned to Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=887931

comment:6 Changed 4 years ago by jhrozek

  • Red Hat Bugzilla changed from [https://bugzilla.redhat.com/show_bug.cgi?id=826192 826192] , [https://bugzilla.redhat.com/show_bug.cgi?id=887931 887931] to [https://bugzilla.redhat.com/show_bug.cgi?id=826192 826192] , [https://bugzilla.redhat.com/show_bug.cgi?id=887931 887931], [https://bugzilla.redhat.com/show_bug.cgi?id=887936 887936]

Ticket has been cloned to Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=887936

comment:7 Changed 4 years ago by jhrozek

  • Red Hat Bugzilla changed from [https://bugzilla.redhat.com/show_bug.cgi?id=826192 826192] , [https://bugzilla.redhat.com/show_bug.cgi?id=887931 887931], [https://bugzilla.redhat.com/show_bug.cgi?id=887936 887936] to [https://bugzilla.redhat.com/show_bug.cgi?id=826192 826192] , [https://bugzilla.redhat.com/show_bug.cgi?id=887931 887931], [https://bugzilla.redhat.com/show_bug.cgi?id=887936 887936], [https://bugzilla.redhat.com/show_bug.cgi?id=887938 887938]

Ticket has been cloned to Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=887938

comment:8 Changed 4 years ago by dpal

  • Milestone changed from SSSD 1.11 beta to SSSD 1.12 beta

comment:9 Changed 18 months ago by jhrozek

  • Priority changed from major to minor
  • Sensitive unset
  • Candidate to push out unset
  • Design review unset
  • Temp mark unset

comment:10 Changed 11 months ago by jhrozek

  • Milestone changed from SSSD 1.14 beta to SSSD 1.15 beta

Might make sense, but not in 1.14..

Note: See TracTickets for help on using tickets.