#2790 Outage: Upgrades/Reboots - 2011-05-31 18:00 UTC
Closed: Fixed. Opened 12 years ago by kevin.

Outage: Upgrades/Reboots - 2011-05-31 18:00 UTC

There will be an outage starting at 18:00 UTC on 2011-05-31,
which will last approximately 2 hours. During this time there may be very short
outages of services as machines are updated and rebooted into new kernels.

Machines will be rebooted in an order that minimizes disruption to services.

In many cases, there will be no noticeable downtime due to redundancy and fail-over.

To convert UTC to your local time, take a look at
http://fedoraproject.org/wiki/Infrastructure/UTCHowto
or run:

date -d '2011-05-31 18:00 UTC'
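
As a rough sketch of the same idea, the following prints both the start and the approximate end of the outage window in your local time. It assumes GNU date, as the command above does. The 2 hour duration is taken from the estimate in this announcement, and the variable names are only illustrative.

```shell
# Start of the outage window, as announced.
start='2011-05-31 18:00 UTC'

# Convert to seconds since the epoch so the arithmetic is unambiguous.
start_epoch=$(date -d "$start" +%s)

# The announcement estimates about 2 hours (7200 seconds).
end_epoch=$((start_epoch + 2 * 60 * 60))

# Print both ends of the window in the local time zone.
echo "Outage starts: $(date -d "@$start_epoch")"
echo "Outage ends (approx.): $(date -d "@$end_epoch")"
```

Going through epoch seconds avoids relying on how date parses relative offsets placed after a time zone name.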

Reason for outage:

System updates/Reboots.

Affected Services:

BFO - http://boot.fedoraproject.org/
Bodhi - https://admin.fedoraproject.org/updates/
Buildsystem - http://koji.fedoraproject.org/
GIT / Source Control
DNS - ns1.fedoraproject.org, ns2.fedoraproject.org
Docs - http://docs.fedoraproject.org/
Email system
Fedora Account System - https://admin.fedoraproject.org/accounts/
Fedora Community - https://admin.fedoraproject.org/community/
Fedora Hosted - https://fedorahosted.org/
Fedora Insight - https://insight.fedoraproject.org/
Fedora People - http://fedorapeople.org/
Fedora Talk - http://talk.fedoraproject.org/
Main Website - http://fedoraproject.org/
Mirror List - https://mirrors.fedoraproject.org/
Mirror Manager - https://admin.fedoraproject.org/mirrormanager/
Package Database - https://admin.fedoraproject.org/pkgdb/
Smolt - http://smolts.org/
Spins - http://spins.fedoraproject.org/
Start - http://start.fedoraproject.org/
Torrent - http://torrent.fedoraproject.org/
Wiki - http://fedoraproject.org/wiki/

Unaffected Services:

Ticket Link: https://fedorahosted.org/fedora-infrastructure/ticket/2790

Contact Information:

Please join #fedora-admin on irc.freenode.net or add comments to the ticket for this outage, linked above.


Monday is a holiday. Switching to Tuesday.

This outage is finally over.

Several things did not go smoothly, so we will want to learn from this and do better next time.

Some points:

  • 2 hours was not enough time to get to all the machines.

  • We need to make sure guests are not still set to start on old hosts in addition to new ones.

  • Perhaps we should look at splitting these up and doing them over several days.
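
One possible way to check the second point before the next round is sketched below. It is only a sketch under stated assumptions: the function name is hypothetical, and it assumes you have already collected, for each host, a plain text file listing the guests set to start automatically there, one guest name per line. How those lists are gathered depends on the virtualization tooling in use, which this ticket does not specify.

```shell
# Hypothetical helper: each argument is a file listing one host's
# autostart guests, one guest name per line.
find_duplicate_autostart() {
    # De-duplicate within each host first, then report any guest
    # name that appears in the lists of more than one host.
    for f in "$@"; do
        sort -u "$f"
    done | sort | uniq -d
}
```

Running it as `find_duplicate_autostart old-host.txt new-host.txt` (file names are placeholders) would print each guest that is set to start on both hosts; empty output means no overlap was found in the lists given.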

Additionally, a new RHEL 5 kernel update came out today after we finished. ;(
So we will likely need to schedule another round next week or so.
