#3214 mirrormanager serving old data
Closed: Fixed None Opened 12 years ago by toshio.

= bug description =

It was reported that the metalink.xml files had out-of-date hashes for repomd.xml, which led to yum updates failing.

  • md5sum on master mirror b54b83c437b027e4dde2c5ec244dfc28 /pub/fedora/linux/updates/16/x86_64/repodata/repomd.xml
  • bapp01's mount had b54b83c437b027e4dde2c5ec244dfc28 /pub/fedora/linux/updates/16/x86_64/repodata/repomd.xml
  • -rw-r--r-- 1 263 mirrors 4588 Mar 26 17:38 repomd.xml
  • md5sum in mirrormanager db 07f7e48d208577e947676b6c2b470e52
  • select * from directory where name = 'pub/fedora/linux/updates/16/x86_64/repodata';
  • select * from file_detail where directory_id = 21881;
  • bapp01 /var/lib/mirrormanager/mirrorlist_cache.pkl and metalink.xml had this data as well.
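The comparison above (md5sum of the file on disk versus the value stored in the db) could be scripted roughly as follows. This is only a sketch: `md5_of_file` and `is_stale` are hypothetical helper names, not part of mirrormanager, and the recorded checksum is assumed to have already been fetched from the `file_detail` table by some other means.

```python
import hashlib


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_stale(path, recorded_md5):
    """Return True when the recorded checksum no longer matches the file on disk.

    `recorded_md5` is whatever the database holds for this file (assumption:
    a hex string, possibly with stray whitespace or uppercase letters).
    """
    return md5_of_file(path) != recorded_md5.strip().lower()
```

Called with the repomd.xml path on the master mirror and the md5sum from the db, `is_stale` would have returned True for the state described in this ticket, since the two values differ.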

= bug analysis =

For some reason the cron job was not updating the db:

0,30 * * cd /usr/share/mirrormanager/server && ./update-master-directory-list /etc/mirrormanager/prod.cfg >> /var/log/mirrormanager/umdl.log 2>&1

I watched one run of the cron job complete (I don't know when it started) without updating the db.

Ran this by hand and the db was updated:
sudo -u mirrormanager ./update-master-directory-list /etc/mirrormanager/prod.cfg
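Since the cron run and the manual run behaved differently, it may help to record how each run ended and not just that it started. Below is a minimal sketch of a wrapper that appends the start time, the command's output, the finish time and the exit code to a log file. `run_logged` is a hypothetical helper, not something shipped with mirrormanager, and the command and log path passed to it are up to the caller.

```python
import datetime
import subprocess


def run_logged(cmd, log_path):
    """Run `cmd` (a list of arguments) and append a start line, the command's
    combined stdout/stderr, and a finish line with the exit code to `log_path`.

    Returns the command's exit code.
    """
    with open(log_path, "a") as log:
        started = datetime.datetime.now()
        log.write("Starting %s at %s\n" % (" ".join(cmd), started.isoformat()))
        # Flush before the child starts writing to the same file descriptor,
        # so the start line lands ahead of the child's output.
        log.flush()
        result = subprocess.run(cmd, stdout=log, stderr=subprocess.STDOUT)
        finished = datetime.datetime.now()
        log.write(
            "Finished at %s, exit code %d, elapsed %s\n"
            % (finished.isoformat(), result.returncode, finished - started)
        )
    return result.returncode
```

A non-zero exit code, or a start line with no matching finish line, would then show up in the log even if nobody was watching the run.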

= fix recommendation =

Need to check two things:

  1. Did the subsequent rebuilding of the pkl cache and the metalinks sync out the proper data?
  2. Why is this happening? At least check that the next time the repomd.xml is updated, the metalink.xml is also updated.
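Check #2 (is the metalink.xml updated when repomd.xml changes?) could be automated along these lines. This is a sketch under assumptions: it treats the metalink as XML containing `hash` elements with a `type` attribute, ignores XML namespaces, and accepts a match against any hash of the requested type in the document. The function names are hypothetical and not part of mirrormanager.

```python
import hashlib
import xml.etree.ElementTree as ET


def metalink_hashes(metalink_xml, hash_type="md5"):
    """Return every hash value of `hash_type` found in a metalink document.

    Matching is done on the local element name, so it works regardless of
    which XML namespace the document declares.
    """
    root = ET.fromstring(metalink_xml)
    values = []
    for elem in root.iter():
        local_name = elem.tag.rsplit("}", 1)[-1]
        if local_name == "hash" and elem.get("type") == hash_type and elem.text:
            values.append(elem.text.strip().lower())
    return values


def metalink_matches(metalink_xml, repomd_bytes, hash_type="md5"):
    """Return True when the current repomd.xml content hashes to a value
    that the metalink document lists."""
    current = hashlib.new(hash_type, repomd_bytes).hexdigest()
    return current in metalink_hashes(metalink_xml, hash_type)
```

Feeding it the metalink.xml as served and the bytes of repomd.xml from the master mirror gives a yes/no answer that could be checked after each updates push.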

The f16 updates metalinks seem to match the mirrors now. So check #1 is done. Still don't know the answer to #2.

There was a traceback/crash in the update log in /tmp when I looked at this last night. Is it possible that was keeping it from completing but we didn't know it was happening?

Where are we here? Any ideas on the root cause? Or should we just watch for this happening again and try to gather more data?

The only tracebacks now visible in bapp01's historical umdl logs are from MM being unable to connect to the database:

Starting umdl 2012-05-04T16:00:03.188291

Traceback (most recent call last):
  File "./update-master-directory-list", line 515, in ?
    sys.exit(main())
  File "./update-master-directory-list", line 507, in main
    sync_directories_from_directory(i['path'], i['category'], excludes)
  File "./update-master-directory-list", line 476, in sync_directories_from_directory
    sync_category_directories(category, category_directories)
  File "./update-master-directory-list", line 358, in sync_category_directories
    make_file_details_from_checksums(dir)
  File "./update-master-directory-list", line 136, in make_file_details_from_checksums
    sha256dict = _checksums_from_globs(dir.name, sha256_globs, 64)
  File "<string>", line 1, in <lambda>
  File "/usr/lib/python2.4/site-packages/sqlobject/main.py", line 1175, in _SO_getValue
    results = self._connection._SO_selectOne(self, [column.dbName])
  File "/usr/lib/python2.4/site-packages/sqlobject/dbconnection.py", line 662, in _SO_selectOne
    return self.queryOne(
  File "/usr/lib/python2.4/site-packages/sqlobject/dbconnection.py", line 395, in queryOne
    return self._runWithConnection(self._queryOne, s)
  File "/usr/lib/python2.4/site-packages/sqlobject/dbconnection.py", line 267, in _runWithConnection
    self.releaseConnection(conn)
  File "/usr/lib/python2.4/site-packages/sqlobject/dbconnection.py", line 309, in releaseConnection
    conn.commit()
  File "/usr/lib/python2.4/site-packages/cherrypy/_cpengine.py", line 24, in SIGTERM
    cherrypy.engine.stop()
AttributeError: 'module' object has no attribute 'engine'

If there were any other logs, they're gone now. :-(

I do see the traceback failure from 4/25/2012 in mirrorlist.log due to NULL values in the Host.bandwidth_int field, which has since been hotfixed.

The mirrorlist cache is up to date on bapp02 now. So let's watch for this happening again, but there's no action to take right now.
