#967 Proposal for automated Fedora packaging cleanup policy.
Closed None Opened 11 years ago by johannbg.

Here is what I consider a concrete proposal to improve the situation.

The only way we can determine whether a contributor is active is to check whether he or she has logged into our infrastructure, and it just so happens that we have a Python script that checks exactly that [1]. So all we need is a way to match users who have not logged in for X amount of time (people can debate what that should be, or you simply decide) with the package(s) they own, plus a simple email to the person in question politely asking him or her to log into FAS and click a big red "I'M ACTIVE" button within a certain amount of time. That can be before the cron job is run again or whatever you decide; I don't care. I do care, however, that it's done before the next development cycle begins, or no later than alpha freeze.

If the maintainer does not respond within a given time, we should revoke that individual's maintainership and orphan the package.

If the component in question has co-maintainer(s), then that person (or one of them) should be made the primary maintainer/owner of the component.

If there is no co-maintainer on the component in question, the now-orphaned package should be advertised on -devel so that someone else can pick it up and continue maintaining it within the distribution.

If no individual has offered to maintain the package by alpha freeze, it will be dropped from the release.

With my QA hat on, I argue that the component being orphaned should go through the review process again, since in a perfect pony world we would like confirmation that the individual who picks it up is in active communication with upstream.

Alpha freeze might be too late, but the only way we can find out is to try it and see how it works out (with the process in place we can always move it earlier in the development cycle).

You can argue all you want about me basing this proposal on logins instead of packages or an individual's component(s), but at the end of the day, to me the fact is that if you have not logged on to any of our infrastructure bits for quite some time, then you just have to accept that you are a Fedora "User", not a "Contributor"...

[1] https://github.com/pypingou/fedora-active-user


Adding meeting keyword.

If we do automate this, I wonder if we couldn't just have something that runs daily/weekly and keeps track of process...

  1. send email to possibly inactive
  2. user still inactive since X
  3. drop from packager, reassign packages/orphan
  4. If orphan, mail devel list.
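A minimal sketch of how one pass of such a periodic job might handle a single packager, following the four steps above. The names `Packager` and `step`, the action strings, and the idea of handing a package to its first co-maintainer are all assumptions for illustration, not existing Fedora tooling; a real job would read activity and ownership data from FAS and the package database.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Packager:
    name: str
    days_inactive: int  # days since last recorded activity (supplied by the caller)
    # package name -> list of co-maintainers (hypothetical data model)
    packages: Dict[str, List[str]] = field(default_factory=dict)
    warned: bool = False


def step(p: Packager, threshold: int) -> List[str]:
    """Run one pass of the job for one packager; return the actions to take."""
    if p.days_inactive < threshold:
        p.warned = False  # any recent activity resets the process
        return []
    if not p.warned:
        # Step 1: first time over the threshold, only send a warning.
        p.warned = True
        return [f"email {p.name}: possibly inactive"]
    # Steps 2-4: still inactive on a later pass, so reassign or orphan.
    actions: List[str] = []
    for pkg, comaintainers in sorted(p.packages.items()):
        if comaintainers:
            actions.append(f"reassign {pkg} to {comaintainers[0]}")
        else:
            actions.append(f"orphan {pkg}")
            actions.append(f"mail devel list: {pkg} orphaned")
    actions.append(f"drop {p.name} from packager group")
    p.packages = {}
    return actions
```

Keeping the `warned` flag on the record means the job can run daily or weekly without re-sending the first email, and retirement of orphaned packages stays a separate once-per-cycle task.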

We can leave all the retirement/dropping to the once per cycle time.

I think a good initial step would be to have a report generated from datanommer, running through all the accounts in the FAS packager group and listing each one's last commit to git. (Datanommer doesn't have a lot of data right now, as it has not been running very long, but we can go ahead and see what the report shows; it will be more useful as time goes on.)

We could then extend this such that when a threshold is crossed "packager has not committed to git in 60 days" (or whatever arbitrary time), we trigger an event to email the inactive maintainer warning them that they are hitting phase one on the inactive process. In lieu of building a big "alive button" web app, they could simply commit to git to reset their count. If they hit the second marker (e.g. 90 days), then they get warned that they have 10 days to either commit or have their packages orphaned.
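As a rough sketch of the threshold logic described above, assuming the example values from this comment (60 days, 90 days, then 10 more days) and a hypothetical `phase` function that is handed the date of the packager's last git commit:

```python
from datetime import date, timedelta

# Example values taken from the comment above; all of them are arbitrary.
WARN_AFTER = timedelta(days=60)   # phase one: first warning
FINAL_AFTER = timedelta(days=90)  # second marker: final warning
GRACE = timedelta(days=10)        # time left to commit after the final warning


def phase(last_commit: date, today: date) -> str:
    """Classify a packager by how long ago they last committed to git."""
    idle = today - last_commit
    if idle >= FINAL_AFTER + GRACE:
        return "orphan"
    if idle >= FINAL_AFTER:
        return "final-warning"
    if idle >= WARN_AFTER:
        return "warning"
    return "active"
```

Because the result depends only on the last commit date, a new commit resets the count automatically and no separate "alive button" state has to be stored.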

There should be some logic overlap here with how we would need to handle badges, for example.

Based on mailing-list discussion, it would probably be good to exclude packagers that have neither any bugs reported against their packages nor any other "reason for activity". (What other reasons are there? The FTBFS report files bugs, doesn't it?)

We can make a datanommer report. Here's a preliminary script put together this afternoon: https://gist.github.com/4019299 It shows ~1250 contributors active since datanommer was turned on using data only from bodhi, the wiki, and FAS edits. That list is not limited to the packager group.

Getting git data tied to FAS has been tricky, but nirik suggested a good solution just today. I'll put it in place after the freeze.

We don't have any Bugzilla data in datanommer yet. The RH team has been working on a scheme to export an AMQP stream to us, so we should have it in the future.

It probably makes sense to use one process to determine whether a maintainer is unresponsive and another, run daily, to take action following https://fedoraproject.org/wiki/Policy_for_nonresponsive_package_maintainers (which basically means that if a bug sits uncommented in status "new" for three weeks, the component is orphaned).

And this process must be finished somewhere between branched and alpha freeze; that is, there must be a set date after which no more packages are orphaned/removed from the development release.
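A minimal sketch of the detection half only, assuming bug data is already available as simple records. The `Bug` type, its fields, and the `"NEW"` status string are hypothetical stand-ins; nothing here talks to Bugzilla, and the three-week limit is the simplified reading given in this comment rather than the full policy.

```python
from datetime import date, timedelta
from typing import Iterable, NamedTuple, Set


class Bug(NamedTuple):
    component: str
    status: str               # e.g. "NEW", "ASSIGNED" (assumed values)
    opened: date
    maintainer_comments: int  # comments left by the maintainer


def components_to_flag(
    bugs: Iterable[Bug],
    today: date,
    limit: timedelta = timedelta(weeks=3),
) -> Set[str]:
    """Return components with a bug left uncommented in NEW for `limit` or longer."""
    return {
        b.component
        for b in bugs
        if b.status == "NEW"
        and b.maintainer_comments == 0
        and today - b.opened >= limit
    }
```

The output is only a set of candidates; whatever takes action on them would be the second, separate process.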

As a maintainer, I would be angry if someone forced me to log in somewhere and click that I'm active when I am. I'm afraid we would lose maintainers who update their packages from time to time and who don't have any bugs.

The policy on open bugs without a response was discussed once. People can have different reasons for leaving bugs open without a response, for example development work or fixing a different bug. Bugzilla is a bug tracker, not a support tool. You can't expect developers to post in all their new bugs "Thanks for the report, I don't have time, but I could try to debug it in a month" or "I don't know what's wrong."

If you insist on clicking somewhere, then please no more than once after each Fedora release.

After the discussion on the devel list, some developers would like to have co-maintainership of packages. So what if we use the script https://github.com/pypingou/fedora-active-user every time someone asks FESCo for ACLs on a package?
I'd rather give maintainers permission to fix things than punish our developers.

We will revisit the ticket after we gather specific data/use cases.

Have this data and these use cases been gathered? It's been 15 months.

I would suspect not. On a related note, our upstream release monitoring tool is mandatory, right?

Are we somehow keeping track of what we ship versus what upstream ships, or whether upstream ships anything at all, so that we can tell whether upstream is dead, slow to release, or still releasing for all components? Or is that yet another process that is optional when it should be mandatory, left to the "convenience" of the packager/maintainer at the cost of everybody else?

Seriously we as a distribution should be able to easily find out if both the maintainers and upstream are alive and active.

In any case, we in QA will be working around this limitation of our infrastructure team, and of the distribution as a whole, by adding the "lack of tools to figure this out" to the manual inspection process for the packages that fall under our release criteria.

This has not moved far enough up my todo list, sorry.

No, upstream release monitoring is not mandatory.

IMHO, any automated tooling in this area should take into account a number of factors, and simply note packagers/packages that are 'of concern' to look at. There are many cases here, and it's not a black and white world.

Replying to [comment:11 kevin]:

> This has not moved far enough up my todo list, sorry.
>
> No, upstream release monitoring is not mandatory.
>
> IMHO, any automated tooling in this area should take into account a number of factors, and simply note packagers/packages that are 'of concern' to look at. There are many cases here, and it's not a black and white world.

I would think that each package we ship is 'of concern' to either the maintainer and/or its consumers, but you are right that we need to start small, since there is no way we can cover the whole 14k range we have grown to in one fell swoop.

That is how I intend to proceed with the QA community proposal roshi and I are working on: ensure we cover the most "important bits" (core/base) first, then gradually expand to the "least important" from a testing/triaging perspective, and grow that coverage bit by bit, per component and/or comps group.

Hi,

A couple of months ago (to the day), I looked at the activity of the
people in the packager group on FAS [1].
The graphs presented in that blog post just show the number of days between the
day I ran the script and the day of that user's last action on datagrepper.

So 2 months ago the statistics were:
* There are 1476 users in the packager group
* 224 were active today (day 0)
* 878 (59.5%) were active over the last 30 days
* 386 (26.2%) were not active for the last 100 days
* 296 (20%) were not active for the last 200 days
* 253 (17%) had no activity registered by fedmsg.
* The oldest activity registered is from 308 days ago.

So 75% to 80% of the members of the packager group are active users. By active I
mean they have had at least one action logged on the fedmsg bus, which might be on
their packages or just updating their personal biography on the wiki or tagging a
package on fedora-tagger.

This is interesting but does not mean much, since we do not remove people from
the group. So these 20-25% might just be users who left, announced it, and no
longer maintain any packages in Fedora, so it might be all good.

So today, I re-ran the script to get newer data and started to look at it by
querying the pkgdb2 API (whose data is from ~10 days ago).

The output looks like:

296 packagers have not been active for more than 200 days
116 packagers have not been active for more than 200 days and are still POC on at least one package
180 packagers have not been active for more than 200 days and are not POC on any packages

246 packagers have not been active for more than 300 days
91 packagers have not been active for more than 300 days and are still POC on at least one package
155 packagers have not been active for more than 300 days and are not POC on any packages

So that means that we have 91 users in the packager group who are still POC on
at least one package and did not perform any action registered on fedmsg for
300 days or more.

These might be very low-maintenance packages, but it might also be that these
people have left the project and that we should reassign their packages.
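The split reported above can be reproduced with a simple cross-reference. This is a sketch under assumptions, not the script that was actually run: `split_by_poc` and both input mappings are hypothetical, with activity data standing in for what datagrepper returns and POC counts for what pkgdb2 returns. Users with no recorded activity at all would need to be passed in with a large day count.

```python
from typing import Dict, Set, Tuple


def split_by_poc(
    days_inactive: Dict[str, int],  # user -> days since last recorded action
    poc_counts: Dict[str, int],     # user -> number of packages they are POC on
    threshold: int,
) -> Tuple[Set[str], Set[str]]:
    """Split users inactive for more than `threshold` days by POC status."""
    inactive = {user for user, days in days_inactive.items() if days > threshold}
    with_poc = {user for user in inactive if poc_counts.get(user, 0) > 0}
    return with_poc, inactive - with_poc
```

Running it once with a threshold of 200 and once with 300 would give the two groups of figures listed in the output.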

Is this something we want to investigate further?

[1] http://blog.pingoured.fr/index.php?post/2013/12/18/Fedora-packagers-activity

Full list of these users:

''(note that the numbers provided here are per branch, so 3 packages means 1 package in 3 branches (devel, F20, F19), and 5 packages means 1 package in 5 branches (Fedora, EPEL))''

Number of packages for which each packager is listed as POC while not having been active for 300 days:
amdunn 21
rafalzaq 3
bamccaig 3
tmoertel 3
gospo 5
renep 5
gunnersrini 4
xning 5
magnu5 3
cerberus 6
grandcross 2
ekkis 5
timlank 5
jeffg 5
geofflevand 1
ndowens 7
krnowak 6
endur 3
avienda 3
masahase 6
progdan 3
stefan 6
musolinoa 8
jaydg 6
ebrand 5
pebenito 3
jrowens 3
fangq 9
nyrk71 3
sthiell 5
codehotter 1
furby 3
cgrau 13
amatubu 3
mstone 1
davidcornette 15
trausche 5
dignan 11
talcite 10
jolsa 4
ngompa 15
erikos 30
eallen 4
schmidtw 5
meyering 3
lucabotti 2
mzazrive 5
grue 3
imntreal 22
asaf 6
jasper 12
elanthis 11
imain 3
bogado 11
warthog9 5
plindner 4
als 5
sseago 5
aleksey2005 9
dbrasher 6
jmoyer 3
jgarzik 9
mbacovsk 11
wilmer 3
udushlivy 5
jomara 4
mebourne 6
lfaraone 3
uwog 18
jgold 3
icheishvili 10
jmatthews 5
sxw 41
cattelan 3
lyosnorezel 10
alcapcom 3
bagnara 3
roshansingh 11
bjrosen 5
tcolles 5
ravenoak 5
micklweiss 6
mnagy 10
ewan 3
brenton 18
llim 3
mintojoseph 12
dcm 4
davidf 2
arnd 4
bskeggs 3

I looked at some of the maintainers, and some simply maintain software that is useful but doesn't have any new releases. Also, some maintain upstream projects (they are upstream) and don't do many releases (or it might be one release per year). I'm not sure how to distinguish those who are really inactive from those who are simply content with their software.

The package might be fine but these people did not do anything on Fedora (no badge, no wiki edit).

Maybe it's all fine, maybe we do not have anyone willing to react upon new releases/bugs for these packages.

@pingou: Can you cross off packages which had no bug open against them in those 300 days?

I'm not sure if there is anything actionable in this ticket. From the previous conversation it appears to me that there was no real consensus on how to address the proposal properly. Does anyone have anything to add?

This ticket has been open for three years with the last real activity on it two years ago. I'm closing this ticket as WONTFIX, please reopen it if there is new information.

Reopening ticket.

All the necessary information is there; FESCo just needs to start doing their due diligence as elected representatives and act upon it!

If the elected representatives are too incompetent to do so, I suggest they step down and exclude themselves from further election in the community, and allow people who might be better at these things to step up and start doing what needs to be done rather than closing tickets.

Replying to [comment:21 johannbg]:

> Reopening ticket.
>
> All the necessary information is there; FESCo just needs to start doing their due diligence as elected representatives and act upon it!
>
> If the elected representatives are too incompetent to do so, I suggest they step down and exclude themselves from further election in the community, and allow people who might be better at these things to step up and start doing what needs to be done rather than closing tickets.

Johann, this response was unkind and unjustified. Please moderate your tone in the future.

If you have new information or suggestions on how to clarify the questions raised above (or want to contribute to developing tools to monitor activity), please do so. But the simple fact is that no one but you appears to consider this to be a priority and keeping this open indefinitely is not helping anyone.

If you choose to reopen this ticket, please do so with a calmly presented new proposal for us to discuss. It ''will'' be given appropriate consideration.
