Last modified on 06/06/08 03:41:24

Online application database (Amber)

The goal is to have an online application browsing and install experience for Fedora 10 that is rich and useful.

Why online?

Online makes it easier for users to submit content. We'll need to show lots of data about applications, and we aren't going to pay anyone to create all that content. We'll get 'seed' data from existing sources (such as comps and desktop files), but to make the app database useful, we have to provide an easy way for users to submit and maintain content. Requiring an internet connection shouldn't be a problem in today's world, especially since installing the applications themselves will (usually) require an internet connection.

Where does the software come from?

For the first pass, just the Fedora repositories. We can look at adding support for other distributions and 3rd party repositories in the future. Other distributions may have to be another instance of the web application. The issues with 3rd party repositories are probably more legal and social than technical. We'll use the existing yum and PackageKit framework to actually do the work of installing the packages.

What's an application?

An application is not a package. An application might come in a package along with other applications, or it might be spread out over several packages. An item in the menus (or a desktop file) is a good starting point for defining an application, but there are exceptions.

Who is the target market?

Fedora users. I'm not sure if Fedora has ever defined a target market, but we can probably safely start with Fedora's existing user base of mostly technical users, plus the mythical "Mom" - as in "Linux my mom can use."

Is this only for Fedora?

We aren't going to target other distributions initially. However, we should avoid coupling tightly with Fedora, so other distributions can easily adopt it. The primary issue is with package names. Packages will be named differently in other distributions, and applications may be split differently between packages. This is pretty similar to the issues found going between one version of Fedora and another. The mapping needs to be: Application -> OS version -> Package, not directly from Application -> Package. We should also import and export data in standard formats to make it more generally accessible.
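As a rough sketch of the Application -> OS version -> Package mapping, something like the following could work. The class name, the `packages_by_os` field, and the "fedora-9"/"fedora-10" labels are assumptions made up for illustration; nothing here is a decided schema.

```python
from dataclasses import dataclass, field


@dataclass
class Application:
    """An application, independent of how any one release packages it."""
    name: str
    # Maps an OS version label (hypothetical labels such as "fedora-10")
    # to the packages needed to get this application on that version.
    packages_by_os: dict[str, list[str]] = field(default_factory=dict)

    def packages_for(self, os_version: str) -> list[str]:
        """Return the packages providing this application on os_version."""
        return self.packages_by_os.get(os_version, [])


# Illustrative data only: which release each package name belongs to
# is invented for the example.
mahjong = Application(
    name="Mahjong",
    packages_by_os={
        "fedora-9": ["gnome-games"],
        "fedora-10": ["gnome-games-mahjong"],
    },
)
```

Because the lookup goes through the OS version, another distribution (or another Fedora release) only has to add its own key; the application record itself doesn't change.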

What are the primary use cases?

  1. I want to do 'X' with my computer.
    • Import photos from my camera and put them on Facebook or Flickr.
    • Import videos and put them on youtube.
    • Fix the red-eye in this photo.
    • Connect to MSN/ICQ/AOL
    • Browse the internet
    • Read my email
    • Play a game
    • Write a report
    • Make a presentation
  2. I can already do 'X' with my computer, but maybe I can do it better with different software.
    • This software sucks. Is there something better out there?
    • Someone told me about the great new web browser/email client/game that I should check out.
    • Games are a bit of a special case, I think, because a user is more likely to look for a new game without being dissatisfied with their current games. Not really a primary use-case, though.

How will the apps be organized?

The database organizes things by frequency of use as a top-level property. I think this is wrong for an application installer for a couple of reasons. It doesn't really tell the user what they want to know, and it hides interesting new or little-known applications under the ones that everyone already uses (because they are in the default install). Popularity is important for a lot of reasons, though, just not as a top-level organizational unit.

When a user clicks on the 'Install new applications' button, he's probably either looking for an application to solve a specific problem, or he's just browsing around to see what's available. For the first use case, search should be front-and-center. Something along the lines of: "I'm looking for an application to _". For the browsers, window-shopping is the way to go. This maps pretty closely to 'groups' or 'categories', but in some cases the existing groups we use may not be a good match.

What exactly is an 'Accessory', anyway? Why are painting, drawing, and photo manipulation programs in the same menu as photo organizers? It might be interesting to look at organizing applications from a task-oriented point of view instead of 'groups'. The groups don't necessarily need to be a static list that we make up, or the list of groups from comps or GNOME's application menu. For instance, let users tag applications with "I use this application to ___", and the top 10 or so tags might make a pretty good list of top level categories. This is kind of speculative, and thus harder to get right, so a static list is good enough for a first pass.
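To make the tag idea a bit more concrete, here is a minimal sketch of deriving top-level categories from user tags. The function names and the sample submissions are invented for illustration, and the normalization (lowercasing, collapsing whitespace) is only an assumption about what "the same tag" means.

```python
from collections import Counter


def normalize(tag: str) -> str:
    """Lowercase a tag and collapse runs of whitespace."""
    return " ".join(tag.lower().split())


def top_categories(submissions, limit=10):
    """Return the most frequently submitted tags, most common first.

    submissions is an iterable of (application, tag) pairs, one pair
    per "I use this application to ___" entry.
    """
    counts = Counter(normalize(tag) for _app, tag in submissions)
    return [tag for tag, _count in counts.most_common(limit)]


# Made-up sample data.
submissions = [
    ("gimp", "edit photos"),
    ("f-spot", "organize photos"),
    ("gthumb", "Organize  photos"),
    ("f-spot", "Edit photos"),
    ("gthumb", "edit photos"),
]
```

A real version would probably need to merge near-duplicate tags ("edit photos" vs. "photo editing"), which is part of why this is harder to get right than a static list.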

Back to the example. The top result on that page is GNOME terminal. If I use a terminal in Fedora, that's almost certainly the one I already use, because that's the one that gets installed by default. But if I'm going to an application installer, it should tell me about *other* terminal options. It might prioritize them based upon popularity, group them based upon suitability to task, or show what users 'like me' use - KDE users probably use Konsole, Xfce users probably use...something else.

For instance, Bill remembers once using a terminal replacement that works like the 'terminal' in first person shooters - it pops up when you press a hotkey, then goes away when you press the hotkey again. However, he's re-installed since using it, and doesn't remember what the package is called. He goes to his application installer, and has two choices. First, he could type 'terminal like quake' into the search box. If that doesn't work, he could click on the 'terminal emulators' category and browse the options. Popularity may come into play at this stage. For instance:

Terminal emulators:

You already have GNOME Terminal installed.

Other popular Terminal emulators:

  • kterm
  • tilda
  • Bob's own terminal
  • ...
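A listing like the one above could be produced by splitting a category into "already installed" and "everything else, most popular first". This is only a sketch: the function name is hypothetical and the popularity numbers are made up.

```python
def category_listing(apps, installed, popularity):
    """Split a category into installed apps and other apps by popularity.

    apps: application names in the category.
    installed: set of application names present on the user's machine.
    popularity: name -> score (higher is more popular); missing means 0.
    """
    have = sorted(a for a in apps if a in installed)
    others = sorted(
        (a for a in apps if a not in installed),
        # Most popular first; break ties alphabetically.
        key=lambda a: (-popularity.get(a, 0), a),
    )
    return have, others


# Made-up scores, for illustration only.
terminals = ["Bob's own terminal", "GNOME Terminal", "kterm", "tilda"]
scores = {"GNOME Terminal": 90, "kterm": 50, "tilda": 30, "Bob's own terminal": 1}
have, others = category_listing(terminals, {"GNOME Terminal"}, scores)
```

Note that popularity only orders the "others" list here; it never pushes the default-installed application to the top of what the user is browsing.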

For both browsing and search results, it's probably best to work on a format that lets people browse through results quickly, showing them just enough information so they can decide if they are interested in getting more detail or not. Google does this very well, IMHO.

Another idea is to have a sort of 'shopping cart' to select a group of applications and then install them all.

Suitability for a given task

I like the idea of applications being "Suitable for a task" rather than just being in static groups. I think it makes more sense from the perspective of finding applications that we show apps in terms of what they can do, combined with how well they do it. Writer and glabels both have facilities for printing mailing labels, but glabels probably does it better. The UI might look a bit like:

I want to: Print mailing labels

Writer: Installed - 2 stars [ Screenshot ]

User Comment: "I can't get this to work! Argh!"

glabels: Not installed - 4 stars [ Screenshot ]

User Comment: "Works like a charm!"

Again, speculative or experimental things like this are probably not something we can implement fully up-front, but something to keep in mind as we design and build this.
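For what it's worth, the per-task rating idea is cheap to model. The sketch below assumes each rating is stored as an (application, task, stars) triple, which is my assumption and not a settled design, and it ranks applications for one task by their average stars.

```python
from collections import defaultdict
from statistics import mean


def rank_for_task(ratings, task):
    """Return (application, average stars) pairs for task, best first."""
    stars_by_app = defaultdict(list)
    for app, rated_task, stars in ratings:
        if rated_task == task:
            stars_by_app[app].append(stars)
    return sorted(
        ((app, mean(stars)) for app, stars in stars_by_app.items()),
        # Highest average first; break ties alphabetically.
        key=lambda pair: (-pair[1], pair[0]),
    )


# Sample ratings mirroring the mock-up above; the "write a report"
# rating is invented to show that ratings are per task.
ratings = [
    ("Writer", "print mailing labels", 2),
    ("glabels", "print mailing labels", 4),
    ("Writer", "write a report", 5),
]
ranking = rank_for_task(ratings, "print mailing labels")
```

The same application can rank low for one task and high for another, which is exactly what a single overall rating can't express.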

How is search implemented?

Search is hard. I have two ideas: use a Google site search for the first pass (I'm not sure what the terms of use are offhand), or use something like Lucene, a high-quality open source search engine. Along the lines of the suitability for a given task idea, we might label the search box "I'm looking for software to ___", and weight the search engine results appropriately.
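To illustrate what "weight the results appropriately" might mean, here is a deliberately naive sketch. It is plain word overlap, not Lucene and not anything a real engine would do; the field names, the weights, and the sample descriptions are all made up.

```python
def score(query, app, tag_weight=3.0, description_weight=1.0):
    """Score an app by word overlap, counting task tags more heavily."""
    terms = set(query.lower().split())
    tag_words = {w for tag in app["tags"] for w in tag.lower().split()}
    description_words = set(app["description"].lower().split())
    return (tag_weight * len(terms & tag_words)
            + description_weight * len(terms & description_words))


def search(query, apps):
    """Return names of matching apps, best score first."""
    scored = [(score(query, app), app["name"]) for app in apps]
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [name for points, name in ranked if points > 0]


# Invented sample records.
apps = [
    {"name": "Writer",
     "tags": ["write a report"],
     "description": "Word processor that can also print labels"},
    {"name": "glabels",
     "tags": ["print mailing labels"],
     "description": "Label and business card designer"},
]
results = search("print mailing labels", apps)
```

The point is only that a match against "I use this application to ___" data counts for more than a match against the package description.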

The 'My Fedora' project is working on some search tools which might also be useful in this context.

How does installation work?

From the user's perspective, once he finds an app, he should be able to install it with a minimum of effort. Clicking on the install button should pop up an 'are you sure?' type window, and then go. PackageKit makes the authorization bit fairly straightforward; entering the root password might be a second window if the user hasn't already elected to remember authorization. Since for the first pass we plan to

From an implementation perspective, there are a couple of ways to go about it. On one end of things, we need to know what's installed on the user's computer, and what's available for install. On the other, we need to actually do the install. Browser extensions might allow a lightweight client-side application to pass data back and forth. For installation, we could perhaps use a magic mime type to trigger the installation request.

Privacy - if we associate packages directly with users, we have privacy issues to be concerned about. If we simply aggregate data, then privacy is less of an issue.
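One way to keep to the aggregate-only side of that line is to never store who sent a report, just running totals. A minimal sketch, with a hypothetical class name and no claim that this covers every privacy concern (IP addresses in server logs, for example, are outside its scope):

```python
from collections import Counter


class InstallStats:
    """Aggregate install counts without keeping any per-user record."""

    def __init__(self):
        self._counts = Counter()
        self._reports = 0

    def record(self, installed_apps):
        """Fold one anonymous report into the totals, then forget it."""
        self._counts.update(set(installed_apps))
        self._reports += 1

    def share(self, app):
        """Fraction of reports that had app installed."""
        if self._reports == 0:
            return 0.0
        return self._counts[app] / self._reports


stats = InstallStats()
stats.record(["GNOME Terminal", "tilda"])
stats.record(["GNOME Terminal"])
```

Anything like "what users 'like me' use" needs more than this, since it requires knowing which applications were reported together.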

How do users contribute data?

I think it's fair to require authentication before a user contributes data. Just browsing and installing applications should not require a login, however. I'm personally opposed to creating yet-another-login, so I'm in favor of using OpenID. The existing Fedora Account System uses OpenID, and may soon be an OpenID aggregator. We should avoid having to accept the CLA to contribute application descriptions, however. Some data will be pulled from the original data sources (see below), while other data will be user-editable.

How does the data get populated?

We will probably start with data from the Fedora package database, comps.xml, and .desktop files. However, we absolutely must make sure this data is kept up to date when packages are removed, renamed, split, joined, etc. This is a big challenge, but it has to be in place as soon as possible, or we'll end up in the same situation as crufty data with no good way to update it. We'll need a way to watch changes to packages. Some changes will almost certainly need manual intervention.
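As an example of pulling seed data, .desktop files are INI-like, so Python's standard configparser can read the basics. This is a sketch only: it ignores localized keys such as Name[fr], the sample entry describes a made-up application, and the returned dictionary layout is an assumption.

```python
import configparser


def parse_desktop_entry(text):
    """Extract name, comment, and categories from .desktop file text."""
    # Exec lines contain field codes like %F, so turn interpolation off.
    parser = configparser.ConfigParser(interpolation=None)
    # Keys in .desktop files are case-sensitive; keep them as written.
    parser.optionxform = str
    parser.read_string(text)
    entry = parser["Desktop Entry"]
    categories = entry.get("Categories", fallback="")
    return {
        "name": entry.get("Name"),
        "comment": entry.get("Comment"),
        # Categories is a semicolon-separated list with a trailing ';'.
        "categories": [c for c in categories.split(";") if c],
    }


# A made-up entry for a hypothetical application.
SAMPLE = """[Desktop Entry]
Type=Application
Name=Example Editor
Comment=Edit text files
Exec=example-editor %F
Categories=Utility;TextEditor;
"""
seed = parse_desktop_entry(SAMPLE)
```

This gives a starting name, one-line description, and category list per menu entry, which lines up with the idea that a desktop file is a good first approximation of an application.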

For instance: gnome-terminal is renamed to 'GNOME-terminal'. From a packaging perspective, a new package will be created (GNOME-terminal) that Obsoletes: gnome-terminal < x.y and Provides: gnome-terminal = x.y.

Another example: mahjong is moved from gnome-games to gnome-games-mahjong.

And another: the desktop file for xemacs changes, moving it from the "Office" category to the "Accessories" category (I'm just making up category names here).

We'll need a process or set of processes to monitor the various seed data sources for changes. Another interesting possibility is to just make all of this data editable, and rely on users to fix things when the underlying data changes.
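A monitoring process could start as a comparison between two snapshots of the package list. The sketch below assumes we can get the set of package names for each snapshot plus the Obsoletes data for new packages; the function name and data layout are hypothetical, and version constraints such as "< x.y" are ignored entirely.

```python
def diff_packages(old, new, obsoletes):
    """Compare two package-name snapshots.

    old, new: sets of package names.
    obsoletes: package name -> set of package names it obsoletes.
    """
    removed = old - new
    added = new - old
    renamed = {}
    for package in sorted(added):
        # A new package that obsoletes a vanished one is treated as a rename.
        for old_name in sorted(obsoletes.get(package, set()) & removed):
            renamed[old_name] = package
    return {
        "renamed": renamed,
        "removed": sorted(removed - set(renamed)),
        "added": sorted(added - set(renamed.values())),
    }


# Snapshots based on the examples above.
old = {"gnome-terminal", "gnome-games"}
new = {"GNOME-terminal", "gnome-games", "gnome-games-mahjong"}
changes = diff_packages(old, new, {"GNOME-terminal": {"gnome-terminal"}})
```

The rename can be applied automatically, but the mahjong split only shows up as a new package; working out that an application moved into it is the kind of change that probably needs manual intervention or a look at the packages' file lists.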

Another idea for data population is to display data directly from the upstream application developer. The DOAP data format promises to exchange data about software applications, and may be an avenue to explore. Regardless of whether we use DOAP or not, an intermediate data exchange format is advisable. One possible scenario: we provide the ability to pull data from a DOAP source, where the upstream developer hosts DOAP data on the project's website and we check it periodically. Obviously we can't try to force upstream developers to do this, but there is some value for upstream developers as well - they can control the data we (and other distributions) show users about their project.
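If we did pull from DOAP, reading the basics out of a DOAP RDF/XML file doesn't need much. The sketch below uses only the standard library and handles just the simple case of a Project element with name, shortdesc, and homepage; a real importer would likely want a proper RDF parser. The sample project and its URL are made up.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "doap": "http://usefulinc.com/ns/doap#",
}


def parse_doap(xml_text):
    """Pull name, short description, and homepage from DOAP RDF/XML."""
    root = ET.fromstring(xml_text)
    # The Project may be the root element or wrapped in rdf:RDF.
    if root.tag == "{%s}Project" % NS["doap"]:
        project = root
    else:
        project = root.find("doap:Project", NS)
    if project is None:
        raise ValueError("no doap:Project element found")
    homepage = project.find("doap:homepage", NS)
    return {
        "name": project.findtext("doap:name", namespaces=NS),
        "shortdesc": project.findtext("doap:shortdesc", namespaces=NS),
        # homepage is given as an rdf:resource attribute, not as text.
        "homepage": (homepage.get("{%s}resource" % NS["rdf"])
                     if homepage is not None else None),
    }


# A made-up DOAP document for a hypothetical project.
SAMPLE = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns="http://usefulinc.com/ns/doap#">
  <Project>
    <name>Example Editor</name>
    <shortdesc xml:lang="en">A hypothetical text editor</shortdesc>
    <homepage rdf:resource="http://example.org/editor"/>
  </Project>
</rdf:RDF>"""
upstream = parse_doap(SAMPLE)
```

Data read this way would go into the same intermediate format as everything else, so whether a field came from DOAP, comps, or a user edit stays a detail of the importer.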

What kinds of data do we show?

[1] I like this better than 'ratings' - lots of applications do more than one thing. Emacs is a pretty good text editor, but not necessarily the best IRC client. This might look like:

"I use this application to ___________" and give it "****" stars

We can weight the data entered in this field for search, based on the idea that people are searching for software to 'do stuff'.

[2] We should also provide a free-form comment section for longer comments:

Comments: _______

[3] There are lots of relationships - Freecell is installed when you install Mahjong, since they're both in gnome-games. Writer is part of the Open Office suite. Abiword is an alternative word processor to Writer. We'll want to show these relationships in a way that makes sense, not a 'related applications' list.

[4] Links to the application's homepage, mailing lists, forums, bugs...

[5] A new version of Firefox is available! Click to install!

Post-install help

Where does the user go to install the application? How can the user learn how to use the app?

How is the web app implemented?

TurboGears. It's already got a good following among the Fedora infrastructure team.

Biggest technical issues

  • How does the webapp know what's installed, what's available, and install new packages?
  • How does the webapp stay consistent with the underlying data (the packages in Fedora)?