#3358 migrate the file upload hash algorithm from md5 to sha256
Closed: Fixed None Opened 11 years ago by pwouters.

= bug description =
fedpkg uses md5 instead of sha256, causing problems when running in fips mode.

See further https://bugzilla.redhat.com/show_bug.cgi?id=834740

Use of md5 in the sources file should be converted to sha256 so that md5 is no longer needed in the future and won't cause problems when running in fips mode.


If I'm reading that bug correctly, fedpkg breakage needs to be solved by having python use Private_MD5Init(). This is because fedpkg will need md5 to compute the hash that rpm uses and older rpm versions can only handle MD5.

fedpkg will also be talking to the lookaside cache, though. The lookaside cache currently uses md5 to make unique file paths when files uploaded to it share the same name. An example path is:

/srv/cache/lookaside/pkgs/python/Python-2.7.3.tar.bz2/c57477edd6d18bd9eeca2f21add73919/Python-2.7.3.tar.bz2

If a second version of the Python-2.7.3.tar.bz2 file were uploaded, it would hash differently and would be placed in a different directory.
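To make the path layout concrete, here is a minimal sketch of how such a path could be derived from a file's contents. The helper name `lookaside_path` and the `algorithm` parameter are hypothetical; only the directory layout is taken from the example path above, and the real upload.cgi may build it differently.

```python
import hashlib
from pathlib import PurePosixPath

# Root directory copied from the example path in this ticket.
LOOKASIDE_ROOT = PurePosixPath("/srv/cache/lookaside/pkgs")


def lookaside_path(package, filename, content, algorithm="md5"):
    """Return <root>/<package>/<filename>/<hex digest>/<filename>.

    Hypothetical helper: the digest of the file contents becomes a
    directory name, so two different uploads with the same file name
    land in different directories.
    """
    digest = hashlib.new(algorithm, content).hexdigest()
    return LOOKASIDE_ROOT / package / filename / digest / filename
```

Passing `algorithm="sha256"` gives the same layout with a 64-character digest directory instead of a 32-character one.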

Is this a valid use of md5? If it's not, we'll have to change the hashing algorithm in the lookaside upload.cgi, issue a new fedpkg that uses the new hashing algorithm when uploading to lookaside, compute new hashes on all of the files that are currently in the lookaside cache, and probably change code in koji to use the new hash.

Devil's advocate here. Since this might be non-trivial to implement, why not do the migration only once and switch straight to sha512?

I'm not sure if sha512 offers that much more over sha256. I'd think it more likely the world moves from sha256 to sha3. But, there is no harm in picking sha512 over sha256 either.

The issue for python is that it should expose a "fips ok" and "fips not ok" version of the md5 functions, with the default being the "fips not ok" one.
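For reference, Python releases much newer than this ticket (3.9 and later) let callers mark a hash as not security-relevant through the `usedforsecurity` keyword on `hashlib` constructors. A minimal sketch, assuming a non-cryptographic use such as a cache key; the helper name `md5_non_security` is hypothetical, and the fallback branch only exists so the sketch also runs on older interpreters that lack the keyword.

```python
import hashlib


def md5_non_security(data):
    """Hex MD5 digest for a use the caller declares non-security-relevant."""
    try:
        # Python 3.9+: tells the crypto backend this is not a security use.
        h = hashlib.md5(data, usedforsecurity=False)
    except TypeError:
        # Older interpreters do not accept the keyword at all.
        h = hashlib.md5(data)
    return h.hexdigest()
```

Whether a given use really is non-security-relevant is exactly the question debated in this ticket; the flag does not answer it.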

Migrating fedpkg away from md5 just avoids all issues of "upstream" needing to change code while allowing people to run fedora in fips mode and still build packages.

No, it doesn't. It avoids the issue for fedpkg. It does not avoid the issue for any of the other users of md5 from within python. Lookaside moving to sha256 alone also (once again, if I am reading the fedpkg bug correctly) does not solve the problem for fedpkg as fedpkg uses md5 for multiple things --

  1. to communicate with lookaside (can be solved using the rough steps that I outlined above)
  2. to save a checksum in the sources file (requires work on fedpkg. Might require work on koji depending on how koji uses it.)
  3. to operate on rpm files that are meant for RHEL5 where md5 is the only hash.

We can work on (1) in this ticket but that won't fix fedpkg unless there's a commitment to work on (2) and (3) as well. Alternately, if the use of md5 is okay here (as suggested in the fedpkg bug) then we could also wait for python to grow an additional md5 hash constructor and when fedpkg starts using that, nobody else will need to do anything. (You're perfectly welcome to analyze this and tell us that use of md5 here is a security concern, though -- and with that, we would work on coordinating this change with all the upstreams that this touches).

sgrubb emailed me to let me know this is a security concern. So we will need to start the process of updating uses of md5 to sha256 where we can. I think doing this for lookaside is fairly straightforward. The only design decision we need to make is whether to support md5 in addition to sha256 for compatibility when people have older client-side tools. CC'ing jkeating so he's in the loop.

For the sources file, fedpkg changes should be straightforward, but I'm not sure whether koji changes or coordination are required. CC'ing Dennis for that.

I don't presently see a way to address RHEL5's use of md5 as their hash in rpms. I think this will have to remain md5 and people wanting to generate SRPMS (or rpms but that's easier to work around) for RHEL5 while in FIPS mode are out of luck. (Someone could open a bug for the rpm maintainers to look into it -- rpm probably could use other openssl API to generate md5 hashed rpms even in FIPS mode when the user specifically requests it. I don't know if that's a security concern or not, though. sgrubb?)

Also adding to the Meeting agenda for this week's infrastructure meeting.

Replying to [comment:5 toshio]:

> I don't presently see a way to address RHEL5's use of md5 as their hash in rpms. I think this will have to remain md5 and people wanting to generate SRPMS (or rpms but that's easier to work around) for RHEL5 while in FIPS mode are out of luck. (Someone could open a bug for the rpm maintainers to look into it -- rpm probably could use other openssl API to generate md5 hashed rpms even in FIPS mode when the user specifically requests it. I don't know if that's a security concern or not, though. sgrubb?)

The signatures within RPMs, at least for signed RPMs, are security-relevant and need to be cryptographically secure, and therefore shouldn't use MD5 nowadays. It's perfectly acceptable to refuse to be compatible with RHEL5 in FIPS mode.

As a general guideline, any use of the "alternative API" for MD5 is highly suspect. Yes, in theory there are non-cryptographic uses of MD5 where the API makes sense, but they are very few, perhaps 1% of packages. For example, any time an intentionally-induced collision could break the application design, e.g. because the application assumes that collisions will not happen, the use of MD5 is IMHO security-relevant and the alternative API should therefore not be used.

It is expected and OK that some real-world applications can not be implemented in a fully FIPS-compliant way, and therefore are not possible to do in FIPS mode (OTOH this also in most cases means that these "real-world applications" are insecure or will likely be soon insecure, and need to be migrated to a different algorithm or design soonish).

Fixing RHEL5 is not a concern. It's impossible to fix, or we would have.

I can implement the changes in the lookaside cache and fedpkg. When I brought up migrating the lookaside to sha256sum in the past, I was told that I was wasting time and effort. From memory, the discussion was on #fedora-devel.

The steps would be to make sha256sum-based links for all existing files in lookaside, leaving the existing md5sum-based ones in place; update the upload script to place new files in sha256sum-based locations; and update fedpkg to use sha256sum for sources entries on new files. We possibly want a period of time, say one release, where we still allow md5sum-based usage so that people can update to newer fedpkg builds. As far as koji goes, it's not affected: it doesn't talk to the lookaside cache directly, it calls fedpkg sources to fetch the sources.
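The first step (adding sha256-based links next to the existing md5-based directories) could look roughly like the sketch below. It assumes the `<root>/<package>/<filename>/<md5 hex>/<filename>` layout from the example path earlier in this ticket and uses hard links; the function names are hypothetical, and a real migration might prefer symlinks or a different traversal.

```python
import hashlib
import os


def sha256_of_file(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 and return the hex digest."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def add_sha256_links(root):
    """Hard-link every md5-addressed file to a sha256-addressed location.

    Existing md5 directories are left untouched. Returns the list of
    newly created paths, so a second run returns an empty list.
    """
    created = []
    for package in sorted(os.listdir(root)):
        pkg_dir = os.path.join(root, package)
        if not os.path.isdir(pkg_dir):
            continue
        for filename in sorted(os.listdir(pkg_dir)):
            file_dir = os.path.join(pkg_dir, filename)
            if not os.path.isdir(file_dir):
                continue
            for digest in sorted(os.listdir(file_dir)):
                if len(digest) != 32:  # only md5-named directories
                    continue
                source = os.path.join(file_dir, digest, filename)
                if not os.path.isfile(source):
                    continue
                target_dir = os.path.join(file_dir, sha256_of_file(source))
                target = os.path.join(target_dir, filename)
                if not os.path.exists(target):
                    os.makedirs(target_dir, exist_ok=True)
                    os.link(source, target)
                    created.append(target)
    return created
```

Because the md5 directories stay in place, older fedpkg clients keep working while newer ones can already fetch by sha256.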

The main decision here is whether we have a flag day where we tell everyone they need a newer tool set, or whether we allow for a transition.
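If a transition period is chosen, the client side has to accept both kinds of checksum for a while. A minimal sketch of that idea, assuming the algorithm is inferred from the length of the hex digest (32 characters for md5, 64 for sha256); that inference rule, the `verify_source` name, and the `allow_md5` switch are all hypothetical, not how fedpkg is known to do it.

```python
import hashlib

# Hex digest length -> algorithm name (assumption made for this sketch).
DIGEST_LENGTHS = {32: "md5", 64: "sha256"}


def verify_source(content, expected_hex, allow_md5=True):
    """Check file contents against an md5 or sha256 hex digest.

    With allow_md5=False (after the transition, or on a flag day),
    md5 entries are rejected without being computed.
    """
    algorithm = DIGEST_LENGTHS.get(len(expected_hex))
    if algorithm is None:
        raise ValueError("unrecognised digest length: %d" % len(expected_hex))
    if algorithm == "md5" and not allow_md5:
        return False
    actual = hashlib.new(algorithm, content).hexdigest()
    return actual == expected_hex.lower()
```

Flipping `allow_md5` to `False` is then the only change needed to end the transition period on the client.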

I think the plans listed above are fine. I would advise against anything that will require modifying all the repos, instead make the change happen as new sources are uploaded. Existing sources will need to remain accessible over md5 in order to reproduce any older builds exactly from git hash sums / cvs tags. I don't see this being proposed above so all is good.
