#6142 Has there been any kernel update on Fedora build boxes recently?
Closed: Fixed None Opened 9 years ago by jakub.

In the past 2 days or so, I've been seeing very weird failures during gcc rebuilds (both in f22 and f23, on x86_64 as well as i686).
Compare e.g. the logs of
http://koji.fedoraproject.org/koji/taskinfo?taskID=9399430
and
http://koji.fedoraproject.org/koji/taskinfo?taskID=9399427
build.log results (from a scratch rebuild of gcc-5.0.0-0.21.fc2{2,3}) with the logs in
https://kojipkgs.fedoraproject.org/packages/gcc/5.0.0/0.21.fc22/data/logs/
https://kojipkgs.fedoraproject.org/packages/gcc/5.0.0/0.21.fc23/data/logs/
which were built some time ago. At least in the f22 case tcl has not changed, and neither have dejagnu or expect.
And I can't reproduce it myself in mock running on an F21 kernel.
The logs look as if the compiler output (which is emitted to a pseudo terminal) is randomly trimmed when it is too large, as if the kernel had issues with flushing output to pseudo terminals or something similar.
Any help in trying to narrow this down would be appreciated (e.g. info on whether there have been kernel upgrades).
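
One way to narrow this down on a given box might be a small stand-alone check that pushes a large amount of diagnostic-like text through a pseudo terminal and verifies that every byte arrives on the other side. A minimal sketch, assuming Python 3 is available on the machine; the function name `pty_roundtrip` and the sample payload are made up for illustration, and this is not how dejagnu/expect actually drive the compiler:

```python
import os
import pty
import threading


def pty_roundtrip(payload: bytes) -> bytes:
    """Write payload to the slave side of a new pty; return what the master side reads."""
    master_fd, slave_fd = pty.openpty()

    def writer() -> None:
        # Plays the role of the compiler: writes everything, then closes its end.
        view = memoryview(payload)
        try:
            while view:
                written = os.write(slave_fd, view)
                view = view[written:]
        finally:
            os.close(slave_fd)

    thread = threading.Thread(target=writer)
    thread.start()

    chunks = []
    try:
        while True:
            try:
                data = os.read(master_fd, 65536)
            except OSError:
                break  # Linux reports EIO on the master once the slave is closed
            if not data:
                break  # other systems may signal end of output with an empty read
            chunks.append(data)
    finally:
        thread.join()
        os.close(master_fd)

    # The default line discipline turns "\n" into "\r\n" on output; undo that.
    return b"".join(chunks).replace(b"\r\n", b"\n")


if __name__ == "__main__":
    # Hypothetical payload: many diagnostic-looking lines, roughly 900 kB in total.
    sample = b"".join(
        b"testcase.c:%d:1: warning: sample diagnostic\n" % i for i in range(20000)
    )
    received = pty_roundtrip(sample)
    print("sent", len(sample), "bytes, received", len(received), "bytes")
```

If the received byte count is smaller than the sent one on the builders but not on an older kernel, that would point at the pty layer rather than at tcl, dejagnu or expect.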


There have been no updates in the last few days. The builders were rebuilt a couple of weeks ago and updated from 3.18 to 3.19 as part of that process.

Weird. All I know is that the March 20th build worked, and the April 1st rebuild of the identical package (as well as the April 2nd one) regresses a lot against that. And while it is random, I've now seen it in 6 builds already on the testcases with a larger amount of diagnostics, but not in the mock rebuild (which was on a 3.17.8 kernel).

So, I did change gcc to use the arm builders we put SSDs in. Is this issue arm specific?

Those arm builders were reinstalled on the SSDs on Mar 27th, and I put gcc in the channel to use them on the afternoon of Mar 29th.

I did apply a small number of updates on Monday, Mar 30th, but did not reboot them:

{{{
nss-tools-3.18.0-1.fc21.armv7hl Mon 30 Mar 2015 06:54:17 PM UTC
libedit-3.1-12.20150325cvs.fc21.armv7hl Mon 30 Mar 2015 06:54:17 PM UTC
librbd1-0.80.9-1.fc21.armv7hl Mon 30 Mar 2015 06:54:16 PM UTC
kernel-3.19.2-201.fc21.armv7hl Mon 30 Mar 2015 06:54:15 PM UTC
kernel-modules-3.19.2-201.fc21.armv7hl Mon 30 Mar 2015 06:54:14 PM UTC
librados2-0.80.9-1.fc21.armv7hl Mon 30 Mar 2015 06:54:04 PM UTC
nss-sysinit-3.18.0-1.fc21.armv7hl Mon 30 Mar 2015 06:54:03 PM UTC
nss-3.18.0-1.fc21.armv7hl Mon 30 Mar 2015 06:54:03 PM UTC
nss-softokn-3.18.0-1.fc21.armv7hl Mon 30 Mar 2015 06:54:02 PM UTC
python-requests-2.5.3-2.fc21.noarch Mon 30 Mar 2015 06:54:01 PM UTC
dracut-config-rescue-038-33.git20141216.fc21.armv7hl Mon 30 Mar 2015 06:54:01 PM UTC
python-urllib3-1.10.2-1.fc21.noarch Mon 30 Mar 2015 06:54:00 PM UTC
nss-softokn-freebl-3.18.0-1.fc21.armv7hl Mon 30 Mar 2015 06:53:59 PM UTC
kernel-core-3.19.2-201.fc21.armv7hl Mon 30 Mar 2015 06:53:58 PM UTC
dracut-038-33.git20141216.fc21.armv7hl Mon 30 Mar 2015 06:53:47 PM UTC
nss-util-3.18.0-1.fc21.armv7hl Mon 30 Mar 2015 06:53:45 PM UTC
}}}

All of them (arm, buildvm, buildhw) are booted on the 3.19.1-201.fc21 kernel.

I have not tried arm. I was doing a scratch build to test a patch that I expected to fix arm profiledbootstrap, but it didn't. To my surprise, though, I noticed regressions on x86_64 and i686 in the builds that failed on arm, and that is where I tried to reproduce it multiple times.
I haven't seen this kind of issue on ppc64{,le}, which are the only other arches I have done scratch builds on these days (since March 20th). But on ppc64{,le} we are seeing another weird issue, which looks different (sometimes a random sequence of characters in the pseudo terminal output is replaced with NUL characters). That has been happening for two months or so already, while this x86_64/i686 issue is recent.
If one uudecodes the build.log files, there is a tarball with the build logs and one can look at the random regressions in there.
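
For anyone without `uudecode` at hand, the extraction can be sketched in Python. This is only an illustration under assumptions: it expects the classic `begin <mode> <name>` ... `end` framing, takes the first such section in the log, and `extract_uuencoded` is a name made up here:

```python
import binascii


def extract_uuencoded(log_text: str) -> tuple[str, bytes]:
    """Return (embedded file name, decoded bytes) for the first uuencoded section."""
    lines = iter(log_text.splitlines())

    # Skip ordinary log output until the "begin <mode> <name>" header.
    for line in lines:
        parts = line.split(" ", 2)
        if len(parts) == 3 and parts[0] == "begin" and parts[1].isdigit():
            name = parts[2]
            break
    else:
        raise ValueError("no uuencoded section found")

    decoded = bytearray()
    for line in lines:
        if line == "end":
            return name, bytes(decoded)
        if line:
            # Each body line carries its own length byte; a2b_uu decodes one line.
            decoded += binascii.a2b_uu(line)
    raise ValueError("uuencoded section has no 'end' line")
```

The returned bytes can then be handed to `tarfile.open(fileobj=io.BytesIO(data), mode="r:*")`, which detects the compression on its own, to look at the individual test logs.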

Strange. The x86 ones haven't really changed since they were last installed, which was Mar 21.

Happy to gather any info from them you might think could be related...

We have had some koji hub issues, but I wouldn't think that would affect builds. It seems more stable now, so you could try another build and see if it's happier.

Replying to [comment:5 kevin]:
> Strange. The x86 ones haven't really changed since they were last installed, which was Mar 21.

Well, Mar 21 is in between Mar 20 and April 1st.
The problem happened again yesterday in the gcc builds:
http://koji.fedoraproject.org/packages/gcc/5.0.0/0.22.fc22/data/logs/
http://koji.fedoraproject.org/packages/gcc/5.0.0/0.22.fc23/data/logs/
This makes gcc regression testing hard to impossible.

Are there still some x86_64 or i686 build servers that weren't reinstalled on Mar 21st? What exact kernel change was made on that date? I'd say the kernel is the main suspect...

If the builders are running the 3.19 kernel, this might be this bug:

https://bugzilla.kernel.org/show_bug.cgi?id=96311

H.J. opened that this week and Peter is working on a fix upstream. He's posted a patch but I don't believe it has been tested yet.

The builders went from 3.17 to 3.19 when they were rebuilt.
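
To tell quickly which kernel series a given box is actually running, something like the following could be used. It is a rough sketch: treating exactly the 3.19 series as suspect is an assumption taken from this ticket (the failing builders run 3.19.x, the clean mock run was on 3.17.8), not an established list of affected versions, and both function names are made up:

```python
import platform
import re


def kernel_series(release: str) -> tuple[int, int]:
    """Extract (major, minor) from a release string such as '3.19.1-201.fc21'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def is_suspect(release: str) -> bool:
    # Assumption from this ticket only: the 3.19 series is the suspect one.
    return kernel_series(release) == (3, 19)


if __name__ == "__main__":
    running = platform.release()
    print(running, "suspect" if is_suspect(running) else "not the suspect series")
```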

I do not see what needs to be done here; the initial question has been answered. If there is anything specific that rel-eng needs to do, please re-open the ticket and describe it.

Metadata Update from @jakub:
- Issue set to the milestone: Fedora 22 Beta

7 years ago
