Ticket #334 (closed outage: fixed)

Opened 6 years ago

Last modified 6 years ago

xen1 + xen2

Reported by: mmcgrath
Owned by: mmcgrath
Priority: critical
Milestone:
Component: Systems
Version: Production
Severity: High
Keywords:
Cc:
Blocked By:
Blocking:
Sensitive:

Description

Xen1 has been crashing between 1 and 3 times a day. Its biggest component was app1, which for now is running on xen8. I've placed a ticket with the soc, though I now doubt there's anything they can do. I'll have to contact Dell.

Change History

comment:1 Changed 6 years ago by mmcgrath

  • Status changed from new to assigned

At present xen1 is running 2.6.18-53.1.4.el5xen. I'm going to try 2.6.18-8.1.14.el5xen.

comment:3 Changed 6 years ago by mmcgrath

Jan  9 03:10:56 xen1 iscsid: Nop-out timedout after 15 seconds on connection 1:0 state (3). Dropping session.
Jan  9 03:11:18 xen1 iscsid: connection1:0 is operational after recovery (3 attempts)
Jan  9 03:14:04 xen1 syslogd 1.4.1: restart.
Jan  9 03:14:04 xen1 kernel: klogd 1.4.1, log source = /proc/kmsg started.
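
The excerpt above shows an iscsid Nop-out timeout followed a few minutes later by a syslogd restart, which is what a reboot would look like in the log. A minimal sketch for scanning a syslog file for that pattern follows. The function names, the 10-minute window, and the placeholder year are assumptions for illustration (syslog lines carry no year), and a syslogd restart can also come from log rotation, so any hit is only a hint.

```python
import re
from datetime import datetime, timedelta

# Classic syslog prefix: "Jan  9 03:10:56 host message..."
LINE_RE = re.compile(r"^(\w{3}\s+\d+\s\d{2}:\d{2}:\d{2})\s(\S+)\s(.*)$")

SAMPLE = [
    "Jan  9 03:10:56 xen1 iscsid: Nop-out timedout after 15 seconds on connection 1:0 state (3). Dropping session.",
    "Jan  9 03:11:18 xen1 iscsid: connection1:0 is operational after recovery (3 attempts)",
    "Jan  9 03:14:04 xen1 syslogd 1.4.1: restart.",
    "Jan  9 03:14:04 xen1 kernel: klogd 1.4.1, log source = /proc/kmsg started.",
]


def parse(line, year=2000):
    """Return (timestamp, host, rest), or None if the line does not match.

    `year` is an arbitrary placeholder because syslog omits the year.
    """
    m = LINE_RE.match(line)
    if not m:
        return None
    stamp = " ".join(m.group(1).split())  # collapse the padded day ("Jan  9")
    when = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
    return when, m.group(2), m.group(3)


def restarts_after_timeouts(lines, window=timedelta(minutes=10), year=2000):
    """List (host, timeout_time, restart_time) for each syslogd restart
    seen within `window` of the most recent iscsid Nop-out timeout."""
    last_timeout = None
    hits = []
    for line in lines:
        parsed = parse(line, year)
        if parsed is None:
            continue
        when, host, rest = parsed
        if rest.startswith("iscsid:") and "Nop-out timedout" in rest:
            last_timeout = when
        elif rest.startswith("syslogd") and rest.rstrip().endswith("restart."):
            if last_timeout is not None and when - last_timeout <= window:
                hits.append((host, last_timeout, when))
    return hits


if __name__ == "__main__":
    for host, timeout_at, restart_at in restarts_after_timeouts(SAMPLE):
        gap = (restart_at - timeout_at).total_seconds()
        print(f"{host}: syslogd restart {gap:.0f}s after iSCSI timeout")
```

Run against the full /var/log/messages rather than the four sample lines, this would show whether each crash is preceded by an iSCSI session drop or whether the two are unrelated.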

comment:4 Changed 6 years ago by mmcgrath

Running 2.6.18-8.1.14.el5xen failed. Back to 2.6.18-53.1.4.el5xen now.

comment:5 Changed 6 years ago by mmcgrath

  • Summary changed from xen1 to xen1 + xen2

Interestingly, as soon as we upgraded xen2 to RHEL5, this started happening to it as well. The following are worth investigating; the first was filed by us:

https://bugzilla.redhat.com/show_bug.cgi?id=429469
https://bugzilla.redhat.com/show_bug.cgi?id=245823
http://rhn.redhat.com/errata/RHBA-2007-0791.html

Additionally, I've rolled back the xen rpms and kernel to their latest FC6 versions, as we weren't having issues there. Testing will be done on xen1.

comment:6 Changed 6 years ago by mmcgrath

  • Status changed from assigned to closed
  • Resolution set to fixed

The new RHEL5 kernel works, as does the FC6 kernel. No more reboots.