Ticket #2255 (closed outage: fixed)

Opened 4 years ago

Last modified 4 years ago

Very poor CVS+Koji connectivity

Reported by: monnerat Owned by: mmcgrath
Priority: critical Milestone:
Component: Systems Version:
Severity: High Keywords:
Cc: alexlan Blocked By:
Blocking: Sensitive:

Description

Phenomenon

Connection attempts to cvs.fedoraproject.org or koji.fedoraproject.org on the afternoon of July 5th, 2010 (CEST) fail about 20% of the time.

I've spent the afternoon trying to execute the common/cvs-import.sh script for a package that has ~15 CVS sources and patches, at last succeeding for devel and F-12, but not (yet) for F-13.

The error never occurs at the same place (i.e., it appears to be network-dependent), and the most common error is:

ssh: connect to host cvs.fedoraproject.org port 22: connection timed out
cvs [add aborted]: end-of-file from server (consult above messages if any)

Other connections work fine (I do not think the problem is on my side).
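An intermittent failure rate like this can be measured rather than estimated. The following is a minimal sketch (not part of any Fedora tooling; the host and port come from the report above, everything else is hypothetical) that repeatedly opens a TCP connection and reports the fraction of attempts that fail:

```python
import socket

def probe(host, port, attempts=20, timeout=10):
    """Repeatedly open a TCP connection and return the failure rate (0.0-1.0)."""
    failures = 0
    for _ in range(attempts):
        try:
            # create_connection does the full TCP handshake, so a timeout
            # here corresponds to the "connection timed out" ssh error.
            with socket.create_connection((host, port), timeout=timeout):
                pass
        except OSError:
            failures += 1
    return failures / attempts

# e.g. probe("cvs.fedoraproject.org", 22)
```

A result around 0.2 would match the reporter's subjective 20% estimate.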

Recommendation

See thread at http://lists.fedoraproject.org/pipermail/devel/2010-July/138355.html

Change History

comment:1 Changed 4 years ago by alexlan

  • Priority changed from major to critical
  • Cc alexlan added

I can confirm similar outages with both CVS and koji (both web and uploading), e.g.:

cvs up
ssh: connect to host cvs.fedoraproject.org port 22: Connection timed out
cvs [update aborted]: end of file from server (consult above messages if any)

This happens on roughly every other CVS connection attempt: I try once and get the error, try again and it works, then get the error again.
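When roughly every other attempt succeeds, a simple retry wrapper is a practical stopgap until the network issue is resolved. A sketch (a hypothetical helper, not part of the CVS or koji tooling) that re-runs a failing command a few times:

```python
import subprocess
import time

def run_with_retry(cmd, retries=5, delay=10):
    """Run cmd until it exits 0 or retries are exhausted; return last exit code."""
    rc = 1
    for attempt in range(retries):
        rc = subprocess.call(cmd)
        if rc == 0:
            return 0
        # Back off briefly before retrying, since failures are intermittent.
        time.sleep(delay)
    return rc

# e.g. run_with_retry(["cvs", "up"])
```

With a ~50% per-attempt failure rate, five independent retries would fail only about 3% of the time.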

Also, attempting to use "make srpm-scratch-build" quite frequently stalls or times out while uploading the SRPM to koji.

comment:2 Changed 4 years ago by till

You can try to spot the problem using one of:

sudo mtr cvs.fedoraproject.org
sudo tcptraceroute cvs.fedoraproject.org 22

These commands show no problem from here.

comment:3 Changed 4 years ago by monnerat

# mtr --report -c 100 cvs.fedoraproject.org
HOST: linuxdev.datasphere.ch      Loss%   Snt   Last   Avg  Best  Wrst StDev
  1. firewall.datasphere.ch        0.0%   100    0.1   0.1   0.1   4.5   0.4
  2. fa0-0.rt1.plo1.dfinet.net     0.0%   100    0.8   1.1   0.7   5.4   0.7
  3. gi1-2.rt-b2.cc.dfinet.net     0.0%   100    0.8   3.2   0.7 176.5  17.8
  4. ge-2-4.r00.gnvasw01.ch.bb.gi  0.0%   100    1.0  28.5   0.9 191.4  50.4
  5. xe-3-1.r01.gnvasw01.ch.bb.gi  0.0%   100    1.6  24.4   1.0 258.8  51.8
  6. ge-8-4.r00.frnkge03.de.bb.gi  0.0%   100   11.8  30.6  11.5 237.4  45.7
  7. xe-0.globalcrossing.frnkge03  0.0%   100   14.0  21.9  11.5 188.3  36.3
  8. internap-ken-schmid-phx.ge-3  0.0%   100  168.4 171.2 168.4 341.8  18.3
  9. border1.po1-bbnet1.phx004.pn  0.0%   100  169.4 177.8 169.0 338.6  28.8
 10. redhat-2.border1.phx004.pnap  0.0%   100  169.7 173.6 169.6 285.3  18.3
 11. ???                          100.0   100    0.0   0.0   0.0   0.0   0.0
 12. ???                          100.0   100    0.0   0.0   0.0   0.0   0.0
 13. ???                          100.0   100    0.0   0.0   0.0   0.0   0.0
 14. ???                          100.0   100    0.0   0.0   0.0   0.0   0.0
 15. ???                          100.0   100    0.0   0.0   0.0   0.0   0.0
 16. cvs.fedoraproject.org        21.0%   100  172.9 173.3 172.9 179.1   0.9

... it seems my subjective 20% estimate was quite accurate ;-)

comment:4 Changed 4 years ago by mmcgrath

  • Status changed from new to assigned
  • Owner changed from nobody to mmcgrath

This is a strange network issue; it's not impacting everyone, and for some people it is a total outage right now (like those in the Westford office). I'll keep everyone posted when I hear more. At the moment we just know that people are looking into it.

comment:5 Changed 4 years ago by mmcgrath

  • Component changed from General to Systems
  • Severity changed from Normal to High

comment:6 Changed 4 years ago by mmcgrath

This should be fixed now, but we haven't heard a root cause yet.

comment:7 Changed 4 years ago by monnerat

Works perfectly for me today. Thanks for taking action.

comment:8 Changed 4 years ago by mmcgrath

  • Resolution set to fixed
  • Status changed from assigned to closed

Now the bad news: it's fixed, but they have no idea what was going on or what fixed it. It's unlikely we're going to get a root cause for this :-/

Note: See TracTickets for help on using tickets.