What is rpmgrok?
rpmgrok is a web application for browsing the payloads of a large collection of RPM software packages. It digests a set of input RPMs, analysing the metadata and payload, and storing the results in a database. There's a web UI for viewing the data, an XML-RPC interface for querying it, and a command-line tool for using the XML-RPC interface.
The idea is to provide a new way for Fedora developers, testers, and other enthusiasts to track various things across the entire distribution, without needing a full tree installed. It's probably usable by other Linux distributions.
rpmgrok is Free/Open Source software (licensed under the LGPLv2.1). It's at an early stage of development.
What does it track?
- manifests of all RPMs, so that you can browse the files in packages via a web UI.
- all symbols in binaries/libraries, and the dependencies between them, so that you can see e.g. exactly what calls a particular function. This can also be used to locate instances of static linkage.
- all shared object names, and the dependencies between them
- rpmlint results for all RPMs
- all .desktop files and their fields so that you can e.g. find applications that can handle PDF files
- SLOCcount stats for prepped source trees (e.g. "what % of Fedora is in C/C++/Python?")
- any other kind of thing we want to add (provided there's a sane way to gather it in a script and slurp it into the database, of course...)
- sizes of packages; why is package foo so big?
- all fonts in the distribution, and which packages provide them
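To make the .desktop item above more concrete, here is a minimal sketch of how the "which applications can handle PDF files?" question could be answered from raw .desktop file contents. This is not rpmgrok's actual code: the function names (`parse_desktop_entry`, `find_handlers`) and the sample data are hypothetical, and it only relies on the standard `MimeType` and `Name` keys of the `[Desktop Entry]` group.

```python
import configparser

def parse_desktop_entry(desktop_text):
    """Return (name, mime_types) for the text of one .desktop file."""
    # .desktop files are INI-like. Interpolation is disabled because values
    # such as "Exec=evince %U" contain literal percent signs.
    parser = configparser.ConfigParser(
        interpolation=None,
        delimiters=("=",),
        comment_prefixes=("#",),
        strict=False,
    )
    parser.optionxform = str  # keys in .desktop files are case-sensitive
    parser.read_string(desktop_text)
    if not parser.has_section("Desktop Entry"):
        return "", []
    entry = parser["Desktop Entry"]
    name = entry.get("Name", fallback="")
    # MimeType is a semicolon-separated list, usually with a trailing ";".
    mime_types = [m for m in entry.get("MimeType", fallback="").split(";") if m]
    return name, mime_types

def find_handlers(desktop_files, mime_type):
    """desktop_files maps a path to file contents; return matching app names."""
    names = []
    for text in desktop_files.values():
        name, mime_types = parse_desktop_entry(text)
        if mime_type in mime_types:
            names.append(name)
    return sorted(names)

# Hypothetical sample data standing in for files extracted from RPM payloads.
SAMPLE_FILES = {
    "/usr/share/applications/evince.desktop": (
        "[Desktop Entry]\n"
        "Name=Document Viewer\n"
        "Exec=evince %U\n"
        "MimeType=application/pdf;image/tiff;\n"
    ),
    "/usr/share/applications/gedit.desktop": (
        "[Desktop Entry]\n"
        "Name=Text Editor\n"
        "Exec=gedit %U\n"
        "MimeType=text/plain;\n"
    ),
}

print(find_handlers(SAMPLE_FILES, "application/pdf"))  # ['Document Viewer']
```

In a database-backed tool the parsed fields would presumably be stored once per package and queried afterwards, rather than re-parsed on every lookup.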
Where is it?
A public prototype was viewable at http://publictest7.fedoraproject.org/rpmgrok though this is currently down. I've reopened the hosting ticket so hopefully I'll be able to put a public prototype up again soon.
Dave Malcolm (dmalcolm@…) is the original author.
It doesn't yet have a dedicated mailing list; for now use email@example.com to discuss it.
It's implemented using TurboGears and SQLAlchemy (specifically, sqlalchemy 0.4, since it uses polymorphic inheritance features from that version).
It also has a somewhat general-purpose task scheduler, used to control a pool of worker hosts that do the actual analysis.
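As a rough illustration of the idea of a scheduler feeding a pool of workers, the sketch below hands analysis tasks from a shared queue to a fixed number of workers. It is an assumption-laden toy, not rpmgrok's scheduler: threads stand in for worker hosts, and `run_tasks` and its `analyse` callback are hypothetical names.

```python
import queue
import threading

def run_tasks(tasks, analyse, num_workers=3):
    """Run analyse(task) for every task using a small pool of workers.

    Threads stand in for remote worker hosts in this sketch.
    Returns a dict mapping each task to its result.
    """
    pending = queue.Queue()
    for task in tasks:
        pending.put(task)

    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            try:
                task = pending.get_nowait()
            except queue.Empty:
                return  # no work left, so this worker stops
            outcome = analyse(task)
            with lock:  # guard the shared results dict
                results[task] = outcome

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results

# Example: "analyse" each (hypothetical) RPM file name by measuring its length.
print(run_tasks(["foo-1.0.rpm", "bar-2.1.rpm"], len, num_workers=2))
```

A real scheduler for remote hosts would also need to handle worker failure and task retries, which this sketch leaves out.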
rpmgrok uses git to store its source code.
The source code can be viewed via a web UI at http://git.fedorahosted.org/git/rpmgrok.git
If you're interested in hacking on rpmgrok, get in touch. The README.txt file is hopefully of interest.
You can get anonymous access to the source code via:
git clone git://git.fedorahosted.org/rpmgrok.git
For write access you'll need to be in the gitrpmgrok group in the Fedora Account System to have push privileges; talk to me if you want to get involved. In that case, the invocation is:
git clone ssh://git.fedorahosted.org/git/rpmgrok.git
- the OpenGrok project (though that appears to focus on source trees, whereas rpmgrok focuses on built binary packages)
- the Debian project's lintian tool