#461 F14 blessing for systemd
Closed None Opened 13 years ago by kevin.

http://fedoraproject.org/wiki/Features/systemd

The F14 systemd feature is at 100% and would normally remain a Fedora 14 feature. However, since it's such a core (ha) part of our operating system, FESCo wanted to review its state before the Beta Change deadline. If it is deemed to require more time in development, we could move this feature to F15 and ship F14 with upstart again.

There was a test day on systemd yesterday:
https://fedoraproject.org/wiki/Test_Day:2010-09-07_Systemd

Which generated the following bugs:

630914 NEW - Single user mode reported as "unknown"
630915 NEW - Stopping rpcbind (via systemctl): /etc/init.d/killall: line 16: 1654 Terminated /etc/init.d/$subsys stop
631160 NEW - kdm is (re)started just before shutdown???
631584 NEW - getty and firstboot running on the same virtual console tty
631588 NEW - KDE couldn't shutdown
631590 NEW - I have a different prompt in single mode
631592 NEW - halt command leaves display at a black screen
624149 ASSIGNED - systemd's shutdown command should follow traditional semantics
627014 ASSIGNED - systemd provided telinit does not work as advertised

https://admin.fedoraproject.org/pkgdb/acls/bugs/systemd
shows currently 36 open bugs (8 in on_qa)

Lennart: Do you consider systemd ready for F14? Or would you like more time in rawhide and target F15?

QA folks: Do you think systemd is on target for F14?

If we wish to push this to F15, we must decide before tomorrow, as the first test compose for Beta is then and we need to know what's in it. FESCo folks with a strong opinion on this, please note so in this ticket. If it's helpful to have an IRC session to discuss this, we can schedule that.


From a critical path perspective, I'm pretty happy with systemd. Some of those bugs are important ones (and would be/are beta blockers), but it's a fairly short list and from my first quick glance through none of them look very difficult to fix; I'd be surprised if we can't get the important ones addressed for Beta.

The other big question is in terms of managing the switchover, which involves considering issues of documentation and tool compatibility which have been discussed on the -devel list. I'm less familiar with those questions, so I don't want to offer a definite yay/nay opinion there, but rather point it up as something that should be considered: what's a sufficient level of documentation and backward compatibility in various tools, and will we be able to achieve that sufficient level.

To the list of bugs found during the test day, I'd add bug 630952 "Does not enable the prefdm and rc-local services" and bug 630174 "RFE: systemd-unscrew-my-system". As noted in the bugs, the test day made me realize there may be a design issue in how scriptlets are supposed to enable services using systemctl.

Replying to [comment:1 adamwill]:

From a critical path perspective, I'm pretty happy with systemd. Some of those bugs are important ones (and would be/are beta blockers), but it's a fairly short list and from my first quick glance through none of them look very difficult to fix; I'd be surprised if we can't get the important ones addressed for Beta.

It would be good to get some initial estimate as to which of the listed bugs you consider beta blocker candidates, and get some input from lennart on the feasibility of fixing those.

My opinion is that I'm concerned. While I think it's possible we could make it work, I'd probably be more comfortable with doing it in F-15. That actually gives us time to address michich's concern about how packages should handle script enabling better, for instance.

Various bugs that concern me:

  • bug 631620 - HAL not started in certain circumstances
  • bug 625027, bug 627014 - telinit does not work correctly/reliably
  • bug 630490, bug 630225 - disabled services still end up getting started via bus/socket activation
  • bug 630401 - doesn't work with certain log daemons
  • bug 629040 - reporting of SysV service failures is incorrect

(There's also a firstboot/systemd beta blocker, but that one is a quick fix.)

Adam - how much of the compatibility matrix/list of requirements was exercised in the test day?

Replying to [comment:4 notting]:

  • bug 630490, bug 630225 - disabled services still end up getting started via bus/socket activation

From the description in the bugs, it sounds like this is a nmcli issue and will hit us with upstart, just the same.

It does not, as upstart's not proxying the dbus socket, nor the dbus service as the systemd .service file does.

Here are the bugs that I think really matter and should be considered MUSTFIX for the final release:

bug 631620 - this is a dep cycle loop, probably pretty easy to fix. Probably just a matter of removing a misplaced dependency from an LSB header of the SysV scripts of some of the involved services.
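When hunting a cycle like this, it can help to list what each SysV script declares in its LSB header. A minimal sketch, assuming the scripts carry standard `Required-Start`/`Should-Start` lines; the helper name `lsb_deps` is hypothetical and not part of any Fedora tool:

```shell
#!/bin/sh
# lsb_deps (hypothetical helper): print the start-time dependencies that a
# SysV init script declares in its LSB header, one header line per output line.
lsb_deps() {
    sed -n \
        -e 's/^#[[:space:]]*Required-Start:[[:space:]]*//p' \
        -e 's/^#[[:space:]]*Should-Start:[[:space:]]*//p' \
        "$1"
}
```

Running it over the scripts of the services involved (for example `lsb_deps /etc/init.d/haldaemon`, with the script name being an assumption) shows which facilities each one asks for, which is usually enough to spot a dependency that points the wrong way.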

bug 630490 - this one is trivial, a single line fix in systemd upstream

bug 625027 - should be fixed already by the dbus 1.4.0 upload

bug 630401 - mostly fixed already in systemd upstream (internal logging should now work with SOCK_STREAM loggers; this needs to be added in a second place as well)

bug 627014 - I am not entirely sure what's going on here. It doesn't look difficult, but I need to have a closer look before I can say something reliable about this.

bug 630225 - I think this is easy to fix too, i.e. nmcli invocations in the network scripts simply should be prefixed with "systemctl check NetworkManager.service &&". (Plus some mountpoint /sys/fs/cgroup/systemd checking).
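The guard described here could look roughly like the following. This is only a sketch under stated assumptions: the wrapper name `nm_query` and the overridable `SYSTEMCTL`, `NMCLI` and `CGROUP_DIR` variables are hypothetical, and it uses `systemctl is-active --quiet`, which current systemctl provides, in place of the `check` verb quoted above.

```shell
#!/bin/sh
# Hypothetical wrapper for nmcli calls in the network scripts.
# All names below are illustrative assumptions, not existing initscripts code.
SYSTEMCTL=${SYSTEMCTL:-systemctl}
NMCLI=${NMCLI:-nmcli}
CGROUP_DIR=${CGROUP_DIR:-/sys/fs/cgroup/systemd}

nm_query() {
    # If the systemd cgroup hierarchy is mounted, assume systemd is in charge
    # and only talk to NetworkManager when its unit is already active, so the
    # query itself cannot trigger bus activation of a disabled service.
    if mountpoint -q "$CGROUP_DIR" 2>/dev/null; then
        "$SYSTEMCTL" is-active --quiet NetworkManager.service || return 1
    fi
    # Otherwise (or when the unit is active) run nmcli with the given arguments.
    "$NMCLI" "$@"
}
```

A network script would then call `nm_query` with the same arguments it previously passed to nmcli, and treat a non-zero return as "NetworkManager is not running".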

bug 631160 - not really a clue what is going on here so far, will look into this tomorrow. From a first look it doesn't look too difficult to figure out.

So much for the MUSTFIXes; on the other bugs:

I know that Bill doesn't agree with me on this, but I don't think 629040 is a real issue. And it's definitely not a regression, since the SysV scripts do not forward the "status" command to systemctl right now, so the behaviour doesn't change here. I believe the right approach to "fix" this issue is to move the services in question to native unit files in F15. But I don't think this needs fixing in F14.

And then there's 631588 which I so far cannot really make much sense of (language barrier). I have asked for clarifications there now, we'll have to see what this problem is really about. For 630781 I asked for clarifications, too. I can't say much about this so far.

So much for the issues we have bugs about. Here's some stuff that's not in the bug list (but we probably should have some):

  • chkconfig hookup. There is no glue between chkconfig and "systemctl enable". A number of folks wanted to look into this, but nobody did. My personal opinion would be to run "systemctl daemon-reload" and not do much more, but I know that other people disagree. Adding the "systemctl daemon-reload" call should be trivial, and anything else should not be hard either.

  • The remaining upstart scripts need systemd unit files. I think these are mostly two packages: readahead and vpnc. I have a patch for the former and have discussed it with hhoyer a few times; it is not merged yet, but this should be easy. vpnc should be easy too, but I think it wouldn't even be a complete disaster if we didn't get this one package done in time, as all the upstart file did was undo some potential resolv.conf changes of vpnc on system reboot.

  • Packaging systemd unit file guidelines. My suggestion would be to simply add one sentence here for F14: "Please do not include systemd unit files in your package unless you talked to Lennart Poettering, Bill Nottingham, ... first".

  • The unscrew thing, if we want this. Probably should be sufficient to have a simple shell script for this, which runs a couple of ln -s and systemctl enable.
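As an illustration of the "unscrew" idea in the last item, a script along these lines might be enough. It is a sketch only: the choice of `graphical.target` as the default, the `/lib/systemd/system` unit directory, and the `unscrew` function with its optional root prefix are all assumptions; the two services are the ones named in bug 630952 earlier in this ticket.

```shell
#!/bin/sh
# Hypothetical "unscrew" sketch: restore the default target symlink and
# re-enable a few services. Paths and unit names are assumptions.
UNIT_DIR=${UNIT_DIR:-/lib/systemd/system}
SYSTEMCTL=${SYSTEMCTL:-systemctl}

unscrew() {
    root=${1:-}    # optional prefix, for trying the symlink step in a scratch directory
    mkdir -p "$root/etc/systemd/system" || return 1
    # Point default.target at the graphical target again.
    ln -sf "$UNIT_DIR/graphical.target" "$root/etc/systemd/system/default.target" || return 1
    # Re-enable the services named in bug 630952. Note that this call acts on
    # the running system; it does not honour the root prefix above.
    "$SYSTEMCTL" enable prefdm.service rc-local.service
}
```

Called without arguments as root, `unscrew` would rewrite `/etc/systemd/system/default.target` and enable the two services; which links and services a real tool should restore is still an open question in this ticket.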

And I think this is mostly what I have on my list.

So, summarizing this: of the MUSTFIXes, the difficulty of 631160 and 627014 is unclear; the rest appears "easy" to me. So from my perspective, given that I still have time till the beta freeze, I'd vote for a "Yes on systemd for F14". But that's just me who's always optimistic. ;-)

[If somebody wants to help: I'd be very thankful if somebody who's not me would be willing to look into the chkconfig, vpnc, 630225, unscrew issues, and maybe prep a patch or two?]

Replying to [comment:7 lennart]:

bug 630225 - I think this is easy to fix too, i.e. nmcli invocations in the network scripts simply should be prefixed with "systemctl check NetworkManager.service &&". (Plus some mountpoint /sys/fs/cgroup/systemd checking).

I don't think this is right... callers of nmcli shouldn't have to care how NM is started for the status to behave sanely. There's certainly the 'easy' fix of reverting how NM is activated, in any case.

So much for the MUSTFIXes; on the other bugs:

I know that Bill doesn't agree with me on this, but I don't think 629040 is a real issue. And it's definitely not a regression, since the SysV scripts do not forward the "status" command to systemctl right now, so the behaviour doesn't change here. I believe the right approach to "fix" this issue is to move the services in question to native unit files in F15. But I don't think this needs fixing in F14.

The sysv scripts are forwarding status to systemctl in many cases. This can be reverted, of course. But I'd prefer this to be supported much better than it is now.

I intend to look at chkconfig, but have first been examining whether there are ways to improve this. I should probably switch the order.

So, I'm sort of on the fence here.

PROS:
- systemd has already gotten a lot of press for f14
- We have already made changes in docs/etc for it.
- It's a cool idea.
- The bugs/issues are not a flood and look potentially solvable.

CONS:
- We still need more docs
- This may drive away users if it's not very stable by release.
- We need at least some packaging guidelines.
- readahead would be very good to get fixed asap.
- If we drop this for f14, our feature set looks very very dull.

I wish we had had just around another month of time here to do more
testing feedback and clear out the bugs and get things solid.

I guess I am a slight +1 to keeping it in f14 but could be talked into
letting it 'bake' more in rawhide for f15 if others see problems.

To mitigate issues, we should also then:

notting: sorry, I forgot to answer your question. With the proviso that we have no direct proof the people attending the test day went through the entire test cases, the answer is that the coverage is fairly good, because the test cases provided for the test day cover everything in your list that is directly testable. Obviously the stuff about documentation and process (like the provision of packaging guidelines) doesn't come under the test day, but all the items about how systemd actually works I put into the test cases.

I intend to go through and triage the bugs today. I wouldn't worry too much about bug 631620 - HAL not started in certain circumstances, since our HALectomy is pretty advanced these days, but I think the rest of your list is fairly sound.

CONS:
- If we drop this for f14, our feature set looks very very dull.

This looks like it is a PRO, rather than a CON. Do we want to ship a very dull feature set?

I personally think that the PROS outweigh the CONs, and Lennart has proven that he's very capable of fixing issues quickly, so I am confident that we can get the relatively short list of mustfixes emptied soon.

As for documentation: yes, more is always better. I'll volunteer to help out with improving docs.

This looks like it is a PRO, rather than a CON. Do we want to ship a very dull feature set?

Yeah, sorry, this was in the wrong place. ;)

So, I see +1 from mclasen, a weak +1 from me, and +0 from notting?
Any other fesco members wish to weigh in? Or should we just close this ticket as "no one has strong feelings, might as well just do it".

Based on what I have read here and on the lists, I am leaning towards +1. The issues do not seem to be show stoppers, and Lennart feels he can nail down the final few bugs. I would feel better if, as Kevin suggested, we had another test day and made sure we have testers who use more than just the default DE.

Steven

Those links are looking good to me johannbg.

I'm going to toss this on the meeting agenda for tomorrow in the event that any of the fesco folks who didn't provide input here would like to, but as far as I can see we are go for systemd in f14. :)

Lennart: If you need any resources to help make things a success, please just let us know (more testing, more maintainers fedora side, etc).

FWIW, I'd like to see some effort by lennart & mclasen (who volunteered to help beef up the docs) to work with the team preparing the F14 Deployment Guide to include systemd procedures. This includes but isn't limited to Chapter 6 on services:

According to https://fedoraproject.org/wiki/Docs_Project_meetings#Guides the lead for that doc is dsilas (cc'ing).

Replying to [comment:19 pfrields]:

FWIW, I'd like to see some effort by lennart & mclasen (who volunteered to help beef up the docs) to work with the team preparing the F14 Deployment Guide to include systemd procedures. This includes but isn't limited to Chapter 6 on services:

According to https://fedoraproject.org/wiki/Docs_Project_meetings#Guides the lead for that doc is dsilas (cc'ing).

Thanks for pointing that out. I wasn't even aware of that document.

The Fedora Deployment Guide has not received any changes to reflect systemd yet, though in my estimate, and those of co-authors Jaromir Hradile and Martin Prpic, doing so won't affect many areas. Making the DG reference systemd instead of init scripts, etc. will be the first stage of the changeover, and one we're aiming at for DG publication before F14 GA. It is also possible that we could integrate more specific systemd content into the F14 DG before GA, especially if it is mostly hammered out already.

Jarek updated the Controlling Access to Services chapter recently; I have CCed him and Martin.

I have also contacted Lennart and Matthias separately re: getting the docs into the Fedora DG.

After a long meeting today, FESCo decided to defer systemd to the f15 release to give it time to become rock solid. We will still need the docs and such, please keep working on them for f15.

Notting will look at changes needed for f14, and I hope systemd development will continue in rawhide at a rapid pace.

Would it help to clarify that systemd still remains in Fedora 14, it is just not enabled by default, but can be used if desired?

I am leery of supporting it in F14 if it's not the default (in that I don't know that it fits our new update strategy to push fixes to non-systemd-critical packages to F-14 to fix systemd integration issues). But it's an option.

What are the expected plans and changes for the systemd feature page?

Move to F-15, I presume.

Yeah, move to f15 and if the feature owner(s) could look it over and see if there is any different scope for f15, that would be great.

We can then take a look at it and get it approved for f15.

The feature page was already set to ReadyForWrangler, but it doesn't appear to have been fully updated to reflect Fedora 15. I've requested an update in the associated talk page. Once that is done, I'll queue it up for FESCo.
