#4796 RFR: machines for documentation development
Closed: Fixed. Opened 8 years ago by immanetize.

= phenomenon =
The Fedora Docs team would like new publishing tools. So far, I've been working on this using private virtual machines, but this limits the ability of others to participate. Development instances accessible to interested Fedora contributors would resolve this.

= reason =
The tooling under development is focused on continuous integration, using buildbot. It will allow fully automated publishing and translation integration, and enables programmatic documentation (think manpages, package readmes, python modules, API documentation, etc.). Because much of the work revolves around this process, standalone local development is ill-suited for the task.

= recommendation =
Provide ansible-managed virtual machines (Fedora Cloud or libvirt guests should work either way, AFAIK) for development use. I'd like one instance for a buildbot master and one for a buildbot slave; separating roles early on will enable us to add buildslaves if needed.

Initially, resources required look like:

  • 20GB storage
  • 2GB RAM
  • 2 cores

This is more than enough storage for what's in place now, though it could require more eventually. RAM and processing would scale out (by adding builders) if needed; the per-host requirements are not likely to change.

Initial configuration can be barebones, with FAS and sudo allowed for sysadmin-docs. The buildbot roles will probably be reused (although perhaps modified, with Tim's help) from those used with taskotron. Please advise if that doesn't sound right.

Besides straightforward buildbot configs, the primary development will be around a python project I started called 'anerist'. The project's goal is to facilitate programmatic generation of buildbot configuration for defined types of content, extract metadata from that content, perform html conversions, and assemble the site structure using generated content and extracted metadata. I lazily threw this on github but will likely use fedorahosted going forward; I can manage setting up the fedorahosted resources without consuming sysadmin time.
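
To illustrate the goal, here is a minimal sketch of the kind of configuration anerist would generate; the repo list and build steps are hypothetical, and the API shown is the buildbot 0.8.x series:

{{{
from buildbot.config import BuilderConfig
from buildbot.process.factory import BuildFactory
from buildbot.steps.shell import ShellCommand
from buildbot.steps.source.git import Git

# Hypothetical content definitions; anerist would derive these from
# metadata extracted from the repos themselves.
docs_repos = [
    {'name': 'install-guide',
     'url': 'git://git.fedorahosted.org/docs/install-guide.git'},
]

def make_builders(repos, slavenames):
    """Generate one builder per content repo."""
    builders = []
    for repo in repos:
        f = BuildFactory()
        f.addStep(Git(repourl=repo['url'], mode='incremental'))
        # Placeholder conversion step; the real job would run the html
        # conversion appropriate to the content type.
        f.addStep(ShellCommand(command=['make', 'html']))
        builders.append(BuilderConfig(name=repo['name'],
                                      slavenames=slavenames,
                                      factory=f))
    return builders

# In master.cfg, something like:
# c['builders'] = make_builders(docs_repos, slavenames=['docs-dev-builder01'])
}}}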

I'd also like a service account for handling translations. This would require someone to explicitly log into zanata to initialize the account, and to retrieve the API key for {{ private }}, for use in an ansible-generated user config file. With that in place, automated pushing of strings out to the publishing platform is relatively low-hanging fruit. Pulling translations and committing to the content repos could be done if su - $serviceuser is possible.
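
For context, the zanata client reads its credentials from a per-user zanata.ini, so the ansible-generated config for the service account might render out to something like the following (the account name is hypothetical, and the key would be filled in from {{ private }}):

{{{
# ~/.config/zanata.ini for the translation service account
# (hypothetical account name; key templated in from the private repo)
[servers]
fedora.url = https://fedora.zanata.org/
fedora.username = docs-translator
fedora.key = CHANGEME
}}}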


As a heads up, we're working to remove the last taskotron-specific bits from the buildmaster/buildslave roles and move them out of the taskotron/ subdir.

If you run into problems on that front - find me so I can help. We're tracking the role rework in a phabricator ticket on our end:

https://phab.qadevel.cloud.fedoraproject.org/T499

AFAIK, there are no buildbot packages for el7. It doesn't have the same problems with twisted versions that el6 did, but the last time I tried, it wasn't just a simple rebuild (I didn't spend the time to figure out exactly what was going on). We've been explicitly avoiding epel7 packaging because we plan to move to buildbot nine when it's released (nine beta 1 is due out this week or early next week) and don't want to deal with compat packages. If you all are planning to do some custom UI work, I'd suggest you plan to move over to nine as soon as it's ready.

That being said, we may still move over to el7 for our buildmasters - that decision is ongoing. If that happens, we're planning to do the packaging work for buildbot on el7 (in COPR until buildbot nine is released, probably) if it isn't already done at that point.

I've been using F21 so far, mostly because I had a local mirror for it. There's nothing too involved happening in the buildbot space so far; I'm sure we can follow Taskotron to whatever platform. That said, EL7 would be better eventually, because of potentially better handling of the master service when there are long startup times. (I've hit limits on the number of factories as well as startup timeouts; both improved with better code, but it did bring issues to light.)

ok, so a few questions from me:

  • Does this mean we are not needing the docs-backend01 instance? ie, this is going to replace that process?

  • I think it would make sense for these to be cloud instances for dev (we have been using cloud for all our dev instances).

  • For dev, I don't really care what OS/version you use. We have fedora 21/22 cloud images and also rhel/centos7. Whatever works best. For stg/prod it would be nice if we used rhel7.

  • Do you need a persistent volume for data? Or one for each host?

  • I assume the buildbot builds stuff and the master serves it out? Or could the machine that serves the content be some other machine?

Replying to [comment:3 kevin]:

> ok, so a few questions from me:

>   • Does this mean we are not needing the docs-backend01 instance? ie, this is going to replace that process?

The instance can probably go away, but let's please leave the koji tag in place for now, in case the unexpected happens.

>   • I think it would make sense for these to be cloud instances for dev (we have been using cloud for all our dev instances).

>   • For dev, I don't really care what OS/version you use. We have fedora 21/22 cloud images and also rhel/centos7. Whatever works best. For stg/prod it would be nice if we used rhel7.

RHEL7 would be ideal IMO. I should probably talk to tflink and see if I can help with packaging buildbot for it. I'd like to keep parity with the Taskotron deployment to avoid potentially redundant work, but since this is... less fully feature complete, docs-dev could probably go first with EL7.

>   • Do you need a persistent volume for data? Or one for each host?

The master at least should have a persistent volume, mostly because I'd be impatient if everything had to rebuild. The intended design would allow disposable instances to do the building, but it's not something we want to rebuild regularly. The slaves basically do a git pull from fedorahosted and then operate on the repo content; without local storage the only cost would be some extra network traffic, so we can get by if it's scarce.

>   • I assume the buildbot builds stuff and the master serves it out? Or could the machine that serves the content be some other machine?

The master is a job broker; it knows about the repos, the jobs to do with the repos, schedules, job triggers, etc. The slaves are assigned build jobs from the master. In this case, the 'publishing' build jobs will upload their built content (artifacts, in taskotron parlance) to the master, and another job will walk over the mass of content and drop the site navigation bits on top. The output of this last job is intended to be rsynced to the proxies.
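
As a sketch of that 'publishing' job shape, using the stock buildbot 0.8.x transfer step (the repo URL and paths here are hypothetical):

{{{
from buildbot.process.factory import BuildFactory
from buildbot.steps.shell import ShellCommand
from buildbot.steps.source.git import Git
from buildbot.steps.transfer import DirectoryUpload

publish = BuildFactory()
publish.addStep(Git(repourl='git://git.fedorahosted.org/docs/example.git',
                    mode='incremental'))
publish.addStep(ShellCommand(command=['make', 'html']))  # placeholder build step
# Upload the built content (the artifacts) from the slave to the master,
# where the site-assembly job can later walk over it and drop the
# navigation bits on top before the rsync to the proxies.
publish.addStep(DirectoryUpload(slavesrc='build/html',
                                masterdest='/srv/docs/artifacts/example'))
}}}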

Replying to [comment:4 immanetize]:

> Replying to [comment:3 kevin]:

> > ok, so a few questions from me:

> >   • Does this mean we are not needing the docs-backend01 instance? ie, this is going to replace that process?

> The instance can probably go away, but let's please leave the koji tag in place for now, in case the unexpected happens.

Yeah, that's fine. I didn't mean we should immediately delete everything, just more what the longer term plan is. ;)

> >   • I think it would make sense for these to be cloud instances for dev (we have been using cloud for all our dev instances).

> >   • For dev, I don't really care what OS/version you use. We have fedora 21/22 cloud images and also rhel/centos7. Whatever works best. For stg/prod it would be nice if we used rhel7.

> RHEL7 would be ideal IMO. I should probably talk to tflink and see if I can help with packaging buildbot for it. I'd like to keep parity with the Taskotron deployment to avoid potentially redundant work, but since this is... less fully feature complete, docs-dev could probably go first with EL7.

ok.

> >   • Do you need a persistent volume for data? Or one for each host?

> The master at least should have a persistent volume, mostly because I'd be impatient if everything had to rebuild. The intended design would allow disposable instances to do the building, but it's not something we want to rebuild regularly. The slaves basically do a git pull from fedorahosted and then operate on the repo content; without local storage the only cost would be some extra network traffic, so we can get by if it's scarce.

ok. What sized volume would you like for that persistent storage?

> >   • I assume the buildbot builds stuff and the master serves it out? Or could the machine that serves the content be some other machine?

> The master is a job broker; it knows about the repos, the jobs to do with the repos, schedules, job triggers, etc. The slaves are assigned build jobs from the master. In this case, the 'publishing' build jobs will upload their built content (artifacts, in taskotron parlance) to the master, and another job will walk over the mass of content and drop the site navigation bits on top. The output of this last job is intended to be rsynced to the proxies.

Ah, excellent. ;)

Would you then like a third instance as a frontend that you sync this content to, to confirm that it's all standalone, etc.?

Replying to [comment:5 kevin]:

> ok. What sized volume would you like for that persistent storage?

20GB should give us plenty of headroom. I had factored that in for the initial "20GB storage per host" figure; the master can get by with less given an additional persistent volume, say 10GB for the root filesystem.

> > >   • I assume the buildbot builds stuff and the master serves it out? Or could the machine that serves the content be some other machine?

> > The master is a job broker; it knows about the repos, the jobs to do with the repos, schedules, job triggers, etc. The slaves are assigned build jobs from the master. In this case, the 'publishing' build jobs will upload their built content (artifacts, in taskotron parlance) to the master, and another job will walk over the mass of content and drop the site navigation bits on top. The output of this last job is intended to be rsynced to the proxies.

> Ah, excellent. ;)

> Would you then like a third instance as a frontend that you sync this content to, to confirm that it's all standalone, etc.?

You're thinking ahead of me :) A third instance would better parallel the eventual configuration; it isn't strictly required, but running httpd on the master would mean paring those bits out of ansible later on.

Are these instances publicly accessible by default? The master does have a rudimentary web UI that would be nice to access, but if it would be a problem, an ssh tunnel would do.

Replying to [comment:6 immanetize]:

> Replying to [comment:5 kevin]:

> > ok. What sized volume would you like for that persistent storage?

> 20GB should give us plenty of headroom. I had factored that in for the initial "20GB storage per host" figure; the master can get by with less given an additional persistent volume, say 10GB for the root filesystem.

ok.

> > > >   • I assume the buildbot builds stuff and the master serves it out? Or could the machine that serves the content be some other machine?

> > > The master is a job broker; it knows about the repos, the jobs to do with the repos, schedules, job triggers, etc. The slaves are assigned build jobs from the master. In this case, the 'publishing' build jobs will upload their built content (artifacts, in taskotron parlance) to the master, and another job will walk over the mass of content and drop the site navigation bits on top. The output of this last job is intended to be rsynced to the proxies.

> > Ah, excellent. ;)

> > Would you then like a third instance as a frontend that you sync this content to, to confirm that it's all standalone, etc.?

> You're thinking ahead of me :) A third instance would better parallel the eventual configuration; it isn't strictly required, but running httpd on the master would mean paring those bits out of ansible later on.

> Are these instances publicly accessible by default? The master does have a rudimentary web UI that would be nice to access, but if it would be a problem, an ssh tunnel would do.

They don't have to be, but then you wouldn't be able to access them except from another machine with an external IP in the same account in OpenStack, so it's probably best to give them all external IPs.

What would you like these called?

docs-dev-frontend / docs-dev-master / docs-dev-builder ?

Works for me, except let's make it docs-dev-builder01; the future may want additional builders. I've been using an ansible group for the builders to generate that portion of the buildbot configuration template file, e.g.

{{{
{% for host in groups['docs-dev-builders'] %}
# define each build slave
{% endfor %}
}}}
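
For example, assuming the buildbot 0.8.x API and a hypothetical password variable, the body of that loop might render each host to a slave definition along these lines:

{{{
from buildbot.buildslave import BuildSlave

# One entry is emitted per host in the docs-dev-builders group.
c['slaves'] = [
    BuildSlave('docs-dev-builder01', 'SLAVE_PASSWORD'),
]
}}}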

ok. I think I have it all setup for you:

docs-dev-frontend.fedorainfracloud.org
docs-dev-backend.fedorainfracloud.org
docs-dev-builder01.fedorainfracloud.org

They should all have your ssh key on them.
The backend has a /dev/vdb persistent volume (you can format and mount as desired).

The playbook is in playbooks/groups/docs-dev.yml

Let us know if you need us to add any other users or groups there, or if we need to change anything.

Can you please rebuild these as F22 instances? Taskotron is running on Fedora at the moment and eyeing buildbot nine; I'm willing to start the packaging effort, but waiting on package updates for RHEL and working out deps in that space isn't the time scale we want for a development instance. All other aspects are fine.

The instances have been rebuilt as Fedora 22.
Note that Fedora 22 doesn't have yum installed (it uses dnf), so you'll have to change the playbook accordingly.
