Last modified on 08/06/08 19:21:47

I've been working on a new web-based tool for analyzing Fedora: rpmgrok

It digests built RPMs, analyzing their metadata and payload, and stores the results in a database. There's a web UI for viewing the data, an XML-RPC interface for querying it, and a command-line tool that uses the XML-RPC interface.
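To give a feel for what querying over XML-RPC looks like, here is a minimal sketch using only the Python standard library. It is not rpmgrok's actual API: the method name `get_package`, the `FAKE_DB` contents, and the throwaway local server are all made up for illustration, and the real service would answer from its database.

```python
import threading
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCServer

# Made-up stand-in data; the values are placeholders, not real measurements.
FAKE_DB = {
    "bash": {"name": "bash", "arch": "i386", "file_count": 3},
}


def get_package(name):
    """Hypothetical query method: return stored metadata for one package."""
    return FAKE_DB.get(name, {})


def start_demo_server():
    """Start a throwaway XML-RPC server on a free loopback port."""
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_function(get_package, "get_package")
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def query_package(url, name):
    """Client side: roughly what a command-line tool would do per query."""
    proxy = xmlrpc.client.ServerProxy(url)
    return proxy.get_package(name)


if __name__ == "__main__":
    server = start_demo_server()
    host, port = server.server_address
    print(query_package("http://%s:%d/" % (host, port), "bash"))
    server.shutdown()
    server.server_close()
```

The client half is the interesting part: `ServerProxy` turns an attribute access into a remote call, so a command-line wrapper can stay very thin.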

I've got a prototype running on:

More info (e.g. source code) can be seen at

The idea is to provide a new way for Fedora developers, testers, and other enthusiasts to track various things across the entire distribution, without having to have a full tree installed. It's probably usable by other Linux distributions.

rpmgrok is Free/Open Source software (licensed under the LGPLv2.1).

What does it track?

Note that, because of my poor CSS, many links don't show up as such in the various table views. You may need to explore with the mouse to find all of the cross-referencing in the web UI.

What's it currently showing?

I queued up an analysis of all of rawhide as of 2008-07-25 on i386: a little over 10000 built packages. It took about a week to process, and about 200 of the analysis jobs failed for one reason or another. See for more info.

So the database currently shows just a snapshot of rawhide from two weeks ago, on one architecture (and is missing about 2% of the packages due to errors).

Ultimately I want to build things up so that we can show time-based trend reports, e.g. the size of a minimal install over time (or whatever).

Help Needed!

Hopefully this looks of interest to people.

I need help with coding, with sysadmin work, with making the UI better, and with things I probably haven't thought of yet. I hope this can be a useful tool for Fedora.

If you're interested in hacking on rpmgrok, get in touch. The README.txt file is hopefully of interest.

It's implemented using TurboGears and SQLAlchemy (specifically SQLAlchemy 0.4, since it relies on the polymorphic inheritance features of that version).

It also has a somewhat general-purpose task scheduler, used to control a pool of worker hosts that do the actual analysis. It ought to be pluggable to do other types of task.
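The general shape of such a pluggable scheduler can be sketched as below. This is an illustration only, not rpmgrok's code: local threads stand in for the pool of worker hosts, and the `Scheduler` class, the `analyze_rpm` handler, and the task type names are all hypothetical. New task types plug in by registering another handler.

```python
import queue
import threading


class Scheduler:
    """Toy scheduler: hands queued tasks to a pool of workers."""

    def __init__(self, num_workers=2):
        self.handlers = {}          # task type -> callable
        self.tasks = queue.Queue()  # pending (task_type, payload) pairs
        self.results = []           # (task_type, payload, outcome)
        self.failures = []          # (task_type, payload, error text)
        self.lock = threading.Lock()
        self.num_workers = num_workers

    def register(self, task_type, handler):
        """Plug in a handler for a new type of task."""
        self.handlers[task_type] = handler

    def submit(self, task_type, payload):
        self.tasks.put((task_type, payload))

    def _worker(self):
        # Drain the queue; this assumes every task was submitted before run().
        while True:
            try:
                task_type, payload = self.tasks.get_nowait()
            except queue.Empty:
                return
            try:
                outcome = self.handlers[task_type](payload)
            except Exception as exc:
                # A failed job is recorded and does not stop the worker.
                with self.lock:
                    self.failures.append((task_type, payload, repr(exc)))
            else:
                with self.lock:
                    self.results.append((task_type, payload, outcome))

    def run(self):
        threads = [threading.Thread(target=self._worker)
                   for _ in range(self.num_workers)]
        for thread in threads:
            thread.start()
        for thread in threads:
            thread.join()


def analyze_rpm(path):
    """Hypothetical handler; a real one would inspect the package."""
    if not path.endswith(".rpm"):
        raise ValueError("not an RPM: %s" % path)
    return {"file": path, "status": "analyzed"}


if __name__ == "__main__":
    scheduler = Scheduler(num_workers=2)
    scheduler.register("analyze_rpm", analyze_rpm)
    for name in ("a.rpm", "b.rpm", "broken.txt"):
        scheduler.submit("analyze_rpm", name)
    scheduler.run()
    print(len(scheduler.results), "succeeded,", len(scheduler.failures), "failed")
```

Recording failures instead of aborting matters for a run over a whole distribution, where a small fraction of jobs can be expected to fail.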

Source Code

Git URLs are:

git:// ssh://

(You need to be in the gitrpmgrok group of the Fedora Account System to have git push privileges; talk to me if you want to get involved.)

Related work

Inspiration includes

  • the OpenGrok project (though that appears to focus on source trees, whereas rpmgrok focuses on built packages)
  • the Debian project's lintian tool