#382 Go Packaging Guidelines Draft
Closed: nothingtodo 2 years ago by tibbs. Opened 10 years ago by mattdm.

http://fedoraproject.org/wiki/PackagingDrafts/Go

Of particular note:

  1. Go binaries are statically linked only, by design. In the future, gccgo may enable us to use dynamic linking as standard, but right now, the language is young enough that the upstream reference compiler is strongly preferred.
  2. For this reason, the guidelines specify that language library modules will be shipped as devel packages containing source only. This is in line with Debian practice.

I'd like to switch from the explicit list of arches in ExclusiveArch to an rpm macro, e.g. %go_arches, similar to %mono_arches and others (ghc, ocaml, ...). It would make it easier to enable (or drop) additional arches without touching every Go package.

Replying to [comment:1 sharkcz]:

I'd like to switch from the explicit list of arches in ExclusiveArch to a rpm macro eg. "%go_arches", similar to %mono_arches and others (ghc, ocaml, ...). It will allow to easier enable (or drop) additional arches without touching every go package.

Works for me. Would this go in a central package like redhat-rpm-config, or a specific one for go rpm macros and policy, or in golang itself?

Replying to [comment:2 mattdm]:

Replying to [comment:1 sharkcz]:

I'd like to switch from the explicit list of arches in ExclusiveArch to a rpm macro eg. "%go_arches", similar to %mono_arches and others (ghc, ocaml, ...). It will allow to easier enable (or drop) additional arches without touching every go package.

Works for me. Would this go in a central package like redhat-rpm-config, or a specific one for go rpm macros and policy, or in golang itself?

The mentioned macros are in redhat-rpm-config; see the macros.*-srpm files.
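For illustration, a minimal sketch of what such a macros file entry could look like (the macro name and arch list here are assumptions, not a decided value):

{{{
# hypothetical macros.go-srpm fragment, e.g. shipped by redhat-rpm-config
%go_arches %{ix86} x86_64 %{arm}

# and in a package spec:
ExclusiveArch: %{go_arches}
}}}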

IMHO, shipping libraries for a compiled language as source code is a very broken hack. We should not be packaging Go stuff until it can be compiled properly (e.g. with gccgo).

Replying to [comment:6 kkofler]:

IMHO, shipping libraries for a compiled language as source code is a very broken hack.

We do this for Python and Emacs packages as well. For Lua, we even ship source code exclusively and no pre-compiled bytecode.

We should not be packaging Go stuff until it can be compiled properly (e.g. with gccgo).

gccgo is a bit of a red herring. It can compile libraries as PIC, but without upstream support for ABI stability, shared libraries do not simplify upgrades.

Python and Emacs (and I believe Lua) are different. We ship code, but none of those are statically compiled; they're all loaded at runtime. The closest analog we found was OCaml. I'm CC'ing Richard WM Jones to see if he can give us any insight into how similar OCaml's compiling and linking actually is, and whether our strategy for OCaml packages would be a better fit for Go than shipping source.

One problem that FPC had was the scale of tracking this. With our C-library static linking guidelines, we have people CC themselves to the things that they statically link so that they can rebuild when an update occurs to the statically linked library. We don't anticipate that static libraries will themselves link statically -- the consuming application is forced to be aware of all the statically linked dependent libraries (if there's a way around this in C, we probably need to update the C-library static linking guidelines to deal with it).

With the current Go draft -- we'd have the issue of a library A that statically links library B. Then the program that links library A wouldn't know that it needed to be rebuilt if a major problem was found in library B. That's one reason we'd like to hear how OCaml does this, in case they have solved this issue at the expense of some other issue (more frequent rebuilds, for instance).

OCaml only links (some) binaries statically, and only the OCaml bits (not the C bits).

OCaml libraries in Fedora aren't statically linked. It only happens after you build a binary using them.

So really it only affects programs in OCaml that we write, e.g. /usr/bin/virt-resize, and I stress '''only''' the OCaml parts of virt-resize. virt-resize is still a dynamically linked binary that links dynamically to plenty of C libs:

{{{
$ ldd /usr/bin/virt-resize
linux-vdso.so.1 => (0x00007fffc27fa000)
libguestfs.so.0 => /lib64/libguestfs.so.0 (0x0000003516200000)
libxml2.so.2 => /lib64/libxml2.so.2 (0x00000035d7400000)
libncurses.so.5 => /lib64/libncurses.so.5 (0x0000003f7aa00000)
[etc etc]
}}}

If we found a problem in an OCaml library like ocaml-gettext-devel ^[example of an OCaml library which is used by virt-resize]^ then we would have to recompile every binary that uses ocaml-gettext-devel. Basically we would repoquery to find those build dependencies, and rebuild everything.

This has never actually happened.
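For concreteness, the repoquery step mentioned above would look roughly like this (exact options depend on the repoquery version in use):

{{{
$ repoquery --archlist=src --whatrequires ocaml-gettext-devel
}}}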

If there was a security bug in say libvirt-devel ^[example of a C library that virt-resize uses]^ then you would '''not''' have to recompile virt-resize.

OT: This is actually fixed upstream (so OCaml can link everything dynamically) but it's going to require a small mountain of work to use it. I've not found the weeks of free time yet ..

If we found a problem in an OCaml library like ocaml-gettext-devel [example of an OCaml library which is used by virt-resize] then we would have to recompile every binary that uses ocaml-gettext-devel. Basically we would repoquery to find those build dependencies, and rebuild everything.

This has never actually happened.

So this is basically the same situation, although the concern is that we might have -devel packages which require other -devel packages that the top-of-the-food-chain binaries don't require directly.

There's also Florian Weimer's concern that build requirements might change since the binary was built, leaving the repoquery approach to miss something. That looks like it affects both OCaml and this proposal equally.

At last week's meeting we discussed the draft quite extensively. The dependency issue didn't come to a definite resolution but we had some other comments that might be quicker to fix:

  • Have the version/release section simply point to the existing guidelines (propose updates to the existing guidelines if they should be clarified) (This arose partially because it was noted that the snapshot rules in the draft didn't follow the existing guidelines)
  • the link to "standard Fedora version conventions" points to the wrong page... it should go to https://fedoraproject.org/wiki/Packaging:NamingGuidelines#Package_Versioning
  • debuginfo/strip => Note that guidelines will change when tools catch up. Link to the bug reports for the tools doing the wrong thing.
  • we'd like more info on the strip issues, bug links might provide that.
  • create a %go_arches macro as mentioned in https://fedorahosted.org/fpc/ticket/382#comment:1
  • either move the naming of programs section to the naming section or move the current naming section into the libraries section.
  • In Packaging Libraries, prepend it with "At this time"... the tooling may eventually reach a point where we do encourage people to develop against the system packages.

On the dependency issue, we had a couple questions that will likely lead to more as we understand the problems more:
  • Can you point us to documentation verifying that prebuilt libraries are tied to the exact minor release of the golang compiler?
  • We're also interested in what happens if the go compiler gets updated in a released Fedora. Will people who have self-compiled libraries need to recompile those with the new compiler? Is that something that is done automatically (can the compiler/linker detect that)?

See the meeting logs for more details of what we were interested in: http://meetbot.fedoraproject.org/teams/fpc/fpc.2014-02-20-17.00.log.html

Hey vbatts, mattdm said that you might be picking this up since his time is limited at the moment.

The status right now is that FPC identified one major issue and a bunch of more minor things. https://fedorahosted.org/fpc/ticket/382#comment:12 should have the details.

The dependency issues and all that circle around it are the major issue.

Hey all,
Sorry for the long delay. This is no less important, though presently we are working through some changes that will likely affect the packaging guidelines for golang.
I will have a fuller response to the issues brought up on this vote, but would like to get the amendments in first.

Hey all,

I'll start addressing what I can, and tonight I've made a few amendments to the draft. There are more updates needed to the draft, like for gccgo usage, and some updates need more validation. I'm making a list for myself.

  • version/release section: I concur. We should be including the date stamp; I'm not sure how I missed that in the standard versioning guideline. I'll take a TODO to start fixing this. (Side question: what's the process for proposing updates to the existing guidelines?)
  • wrong link: I fixed the link
  • debuginfo: I clarified the draft a little around this, and also took a TODO to get a definitive position on it. Early Go (1.0 - 1.1.1) had failures when stripped; this is rarely the case anymore. Also, gdb works nicely with external debugging symbols for a stripped Go binary. One case that is still certain is that Go binaries built with gccgo cannot be stripped. There is an existing BZ for this, and I've referenced it now.
  • macros: next release of the golang package will provide rpm macros including %gopath and %go_arches
  • name section: Sorry, I'm not sure what this is meant to achieve. What would a reference example look like?
  • library section: I'll add that wording and additional clarification too

As for the native golang .a archives, I'm not immediately finding precise statements on that. http://golang.org/cmd/go/ does have information on the staleness of a package (like a library), and other behaviour around building.
These prebuilt .a archives are not required, similar to .pyc for Python: if the .pyc is not present, the code in the .py is still used.
I just attempted to build an application with go1.2.2 and then again with go1.3beta1, using just go build github.com/openshift/geard/cmd/gear; it fails with
{{{
/tmp/tmp.4D7rTFv9sz/src/github.com/openshift/geard/cmd/gear/commands.go:15: import /tmp/tmp.4D7rTFv9sz/pkg/linux_amd64/github.com/openshift/docker-source-to-images/go.a: object is [linux amd64 go1.2.2 X:none] expected [linux amd64 devel +f8b50ad4cac4 Mon Apr 21 17:00:27 2014 -0700 + X:precisestack]
}}}
If you run strings /path/to/a/file.a | grep "go object", you can see the compiler version used.
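For example (the path is illustrative; the output format matches the error above):

{{{
$ strings $GOPATH/pkg/linux_amd64/github.com/gorilla/context.a | grep "go object"
go object linux amd64 go1.2.2 X:none
}}}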

This is why we don't want the system libraries to carry these *.a archives, otherwise libraries would need to be rebuilt on every golang version too.

For a user-defined GOPATH, if the system compiler is updated, this would affect them too, but it is common enough for folks to respond "just rm -r $GOPATH/pkg", and there is no harm done. There are go build flags that can recompile all libraries used for a build target, though that includes the stdlib, which would require root privileges, so that is not encouraged.
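For reference, the relevant commands look roughly like this (-a forces rebuilding of packages that are already up to date):

{{{
# force recompilation of everything the build target needs
$ go build -a github.com/openshift/geard/cmd/gear
# or just discard the user's compiled packages; they will be rebuilt as needed
$ rm -rf $GOPATH/pkg
}}}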

We talked about the library linkage at last week's meeting. We're currently seeing this as a case of upfront pain versus deferred pain. In most cases, Fedora prefers upfront pain in these situations, as it is usually obvious what that pain is and therefore easier to plan work for, while deferred pain has hidden costs that people do not see immediately but then have to deal with when they are in a time crunch later.

A couple highlights from the meeting:

  • When a packager gets a bug report against their application, is one of their first steps going to have to be to build new packages that pick up all the updated deps and then ask the bug reporter to retest because they don't know how the state of the packages they have on their system now relates to the state of the system when the package was built?

  • Taken to extremes: requiring rebuilds will mean nobody ever updates a dep. Not requiring updates means that everything will be broken all the time, and nobody will update leaf packages. Since FESCo/Fedora updates policy is to move to a model where rawhide is full of updates but stable Fedora releases do not update without reason, and where things that are depended upon by more things should be updated less, that seems to fit the former model better than the latter.

  • It would be helpful if we could have two spec files to look at as well: a lib plus a binary using this lib. We usually have these sample spec files to experiment with when creating guidelines for a new programming language.

Replying to [comment:17 toshio]:

We talked about the library linkage at last week's meeting. We're currently seeing this as a case of upfront pain versus deferred pain. In most cases, Fedora prefers upfront pain in these situations as it usually obvious what that pain is and therefore easier to do work for while deferred pain has hidden costs that people do not see immediately but then have to deal with when you are in a time crunch later.

A couple highlights from the meeting:

  • When a packager gets a bug report against their application, is one of their first steps going to have to be to build new packages that pick up all the updated deps and then ask the bug reporter to retest because they don't know how the state of the packages they have on their system now relates to the state of the system when the package was built?

This will be something that requires a recommended process and setting a good
example. If a packager gets a bug report, it is likely that upstream is already
aware of the issue, or has a channel to get the fix into itself or the
dependent library.
Also, as is the case with docker, upstream bundles all the dependent libraries
in a directory and enforces the exact version of those dependencies. We
presently unbundle these dependencies and package them independently, but it
allows us to set explicit versions of the libraries that docker needs in
BuildRequires.
So just having a blind requirement like:

{{{
BuildRequires: golang(github.com/gorilla/context)
}}}

may be a bit dangerous over time. Better to set something like '>= 0-0.13' for
the initial version you use in your package, then constrain further as needed.
As this landscape grows and various packages end up needing conflicting versions
of dependent libraries, this may seem tricky, but it is still doable.
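For example, the constrained form would look like (the version here is illustrative):

{{{
BuildRequires: golang(github.com/gorilla/context) >= 0-0.13
}}}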

Would it be worth including a %doc file that lists the NVRs of the libraries
used at build time, since that information is inherently dropped from RPMs due
to the expectation of only using shared/dynamic linking?
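A minimal sketch of how that could be done, assuming the golang devel packages present in the buildroot are what you want to record (the file name and query pattern are illustrative only):

{{{
# in %install: record the NVRs of the golang devel packages present at build time
# (%% escapes the query tags so rpmbuild does not expand them as spec macros)
rpm -qa --qf '%%{NAME}-%%{VERSION}-%%{RELEASE}\n' 'golang-*-devel' | sort > BUILD-DEPS

# in %files:
%doc BUILD-DEPS
}}}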

  • Taken to extrema: requiring rebuilds will mean nobody ever updates a dep. Not requiring updates means that everything will be broken all the time, and nobody will update leaves. Since FESCo/Fedora updates policy is to move to a model where rawhide is full of updates but stable Fedora releases do not update without reason and that things that are depended upon by more things should be updated less, that seems to fit with the former model better than the latter.

Understandable, and an argument that could also be made for bundling. The source
would be present for licensing and debugging, it would allow the binary to match
exactly what the upstream supports and expects (making it easier to get fixes
into upstream), and for golang there would be no concern about rpm name
collisions or import path collisions.
Once there are multiple leaves that have common parent libraries, a lock on the
version develops, and the leaves atrophy because of it, that will be an issue.
I do not have a good answer for that worst-case scenario.

  • It would be helpful if we could have two spec files to look at as well: A lib + a binary using this lib). We usually have these sample spec files to experiment with when creating guidelines for a new programming language.

Added.

As a bump: I've been working on the rpm dependency arrangement to accommodate builds using gccgo, as well as with Richard Henderson (rth) on having some tooling available for architectures not supported by golang.
https://bugzilla.redhat.com/show_bug.cgi?id=1156129 has been opened and has progress.

Additionally I've made some updates to the guidelines themselves, mostly clarifying shared libraries and the repoquery command for dependencies.

Has this moved forward to the point that the packaging committee should look at it again?

Since many golang-based tools like etcd, kubernetes and docker bundle up their dependencies via Godeps or a vendoring script, I would like to suggest we prefer using these bundled libs whenever available and depend on golang rpms only if bundled libs aren't provided or aren't good enough.

Case for bundled libs:

- zero time spent in packaging up the deps

- use versions/commits known to build the package successfully / tested by upstream

- golang libs often end up having cyclic dependencies which are painful to solve via rpm updates.

- authors sometimes move their source code hosting (for example from Google Code to GitHub), causing changes in import paths and thus changes in package names; metaproviding new import paths or creating new packages is an added overhead.

Case for rpms:

- the bundled golang libs, though available in the tarballs released by the tools (docker/kube etc.), technically come from different upstreams.

- sometimes the vendors bundling these tools may not have accounted for bug fixes patched in the upstream libraries, which might be easier to fix via rpms.

Using bundled libraries means that we will discover crucial differences only in the most inconvenient moment possible, when we have to apply a critical bug fix (security or otherwise). I think we should really avoid that and do the required consolidation work upfront, otherwise we end up porting security fixes to all the different bundled versions, instead of getting away with just rebuilding the affected packages against the fixed library dependency.

Replying to [comment:24 fweimer]:

Hi Florian,

Using bundled libraries means that we will discover crucial differences only in the most inconvenient moment possible, when we have to apply a critical bug fix (security or otherwise).

In such cases we could of course introduce a builddep on the updated rpm until upstream bundles the patched lib. But such cases are few and far between, at least so far in golang-land, so I'm not certain having tools depend on updated rpms all the time is worth the effort.

I think we should really avoid that and do the required consolidation work upfront, otherwise we end up porting security fixes to all the different bundled versions, instead of getting away with just rebuilding the affected packages against the fixed library dependency.

Depending on rpms probably doesn't account for the case where library authors introduce changes which break existing stuff. I don't remember off-hand if this has ever occurred, but I guess it's not impossible.

Most of these golang packages are usually updated only when a tool using them needs them at some version/commit. These packages usually don't even have numbered versions (just commit hashes), so keeping them updated would be a never-ending task.

Also cyclic deps at times: for example in the case of docker and libcontainer, if we consider rpms, deps are as such: docker -> golang-github-docker-libcontainer-devel -> docker-pkg-devel (a subpackage of docker).

I am in favor of packaging up golang libraries as rpms. But IMHO it's preferable to depend on rpms only in the case of security/bug fixes, and even then preferably only if upstream isn't quick enough (defining 'quick enough' is up to the SRT people, I guess) to incorporate fixes in their bundle.

Please let me know if I'm missing something.

What's missing is good software engineering. Maybe Go doesn't encourage it (well, that's pretty obvious when you look at the language itself), but that doesn't mean Fedora has to put up with it.

Replying to [comment:25 lsm5]:

Replying to [comment:24 fweimer]:

Hi Florian,

Using bundled libraries means that we will discover crucial differences only in the most inconvenient moment possible, when we have to apply a critical bug fix (security or otherwise).

In such cases we could of course introduce a builddep on the updated rpm until upstream bundles the patched lib.

This assumes that the versions have not drifted, or that the bundled sources remain unpatched.

Past experience in this area suggests that both cases happen regularly, unless some downstreams regularly build with system libraries (which is what we and others do with C/C++). I'm not sure we can rely on other distributions to keep things in shape for us with Go.

I think we should really avoid that and do the required consolidation work upfront, otherwise we end up porting security fixes to all the different bundled versions, instead of getting away with just rebuilding the affected packages against the fixed library dependency.

Depending on rpms probably doesn't consider the case when library authors introduce changes which breaking existing stuff. Don't remember off-hand if this ever occurred, but I guess that's not impossible.

They happen, and we have to catch these cases early, not when we're trying to fix a critical bug.

Most of these golang packages are usually updated only when a tool using them needs them to be some version/commit. These packages usually don't even have numbered versions (just commit values), so keeping them updated as such would be a never ending task.

Yes, but that's a basic fact of life once you package something that isn't completely dead upstream. So is coordinating with your reverse dependencies, to avoid breaking them with your latest package update.

Also cyclic deps at times: for example in the case of docker and libcontainer, if we consider rpms, deps are as such: docker -> golang-github-docker-libcontainer-devel -> docker-pkg-devel (a subpackage of docker).

What kind of dependency is this? Build time or installation time?

I am in favor of packaging up golang libraries as rpms. But imho, it's preferable to depend on rpms only in case of security/bug fixes, preferably that too if upstream isn't quick enough (defining 'quick enough' is upto SRT people I guess) to incorporate fixes in their bundle.

We only want to apply minimal changes to fix critical bugs, not overhaul the build system to build against a completely different unbundled library version. That's far more risky than something we'd generally like to ship as a security update.

I understand that this needs more work up front, but I really don't see a way around that.

Replying to [comment:27 fweimer]:

Replying to [comment:25 lsm5]:

Also cyclic deps at times: for example in the case of docker and libcontainer, if we consider rpms, deps are as such: docker -> golang-github-docker-libcontainer-devel -> docker-pkg-devel (a subpackage of docker).

What kind of dependency is this? Build time or installation time?

Buildtime, most of these deps are almost always buildtime (sometimes installtime deps only for other golang libraries), don't think anybody bothers with using them for purposes other than rpm-ing. Also, there was at least one other cyclic dep case which I don't remember off hand but can dig out if need be.

Replying to [comment:28 lsm5]:

Also cyclic deps at times: for example in the case of docker and libcontainer, if we consider rpms, deps are as such: docker -> golang-github-docker-libcontainer-devel -> docker-pkg-devel (a subpackage of docker).

What kind of dependency is this? Build time or installation time?

Buildtime, most of these deps are almost always buildtime (sometimes installtime deps only for other golang libraries), don't think anybody bothers with using them for purposes other than rpm-ing. Also, there was at least one other cyclic dep case which I don't remember off hand but can dig out if need be.

What is the benefit of building golang-github-docker-libcontainer-devel from a different source package than Docker?

Building it along with Docker would at least eliminate the cyclic dependency. It's a hack, but it might work now and for the foreseeable future. It would certainly avoid the source code duplication, and the risk of accidental version drift.

Replying to [comment:29 fweimer]:

Replying to [comment:28 lsm5]:

Also cyclic deps at times: for example in the case of docker and libcontainer, if we consider rpms, deps are as such: docker -> golang-github-docker-libcontainer-devel -> docker-pkg-devel (a subpackage of docker).

What kind of dependency is this? Build time or installation time?

Buildtime, most of these deps are almost always buildtime (sometimes installtime deps only for other golang libraries), don't think anybody bothers with using them for purposes other than rpm-ing. Also, there was at least one other cyclic dep case which I don't remember off hand but can dig out if need be.

Buildtime in most cases, which implies an installation dependency if you want to install the source code with all its deps (during building). At the moment there are about 3 pairs of packages I have encountered (not counting docker-io) which have cyclic deps, and in order to Require them, some BuildRequires and Requires are missing. This can be solved only by a two-phase build: in the first phase only the source codes are updated (go build and go test are disabled for the moment); in the second phase, go build and go test are enabled and minimal versions of the cyclic dependencies are enforced. Between the phases you must wait about 30 minutes before the built source codes get into the buildroot override.

Kubernetes already presents cyclic deps on github.com/skynetservices/skydns. I am trying to make daily builds of kubernetes and the only reasonable way to do this is to remove skydns's deps on kubernetes. It is only a matter of time before someone wants to build skydns; then it will be impossible to build both kubernetes and skydns in a simple way. Plus kubernetes introduces new deps from time to time, like today => about 6 new packages into Fedora => create review request => wait for repository => scratch build => git push => build => update => buildroot override. Having more such tools requiring daily builds ...
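A sketch of how such a two-phase build can be expressed in a single spec, assuming a bootstrap bcond switch (macro names like %{import_path} and %{gopath} follow the draft; the rest is illustrative):

{{{
# phase 1: build with "rpmbuild --with bootstrap" to ship sources only;
# phase 2: rebuild normally once the other half of the cycle is in the buildroot
%bcond_with bootstrap

%build
%if %{without bootstrap}
go build -o bin/%{name} %{import_path}/cmd/%{name}
%endif

%check
%if %{without bootstrap}
GOPATH=%{buildroot}%{gopath}:%{gopath} go test %{import_path}/...
%endif
}}}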

What is the benefit of building golang-github-docker-libcontainer-devel from a different source package than Docker?

Packaging time is saved and updates can be automated, if we talk about Go applications in general. Using vendored sources, daily builds can be cron'd. Otherwise it is as I wrote above. Yes, being able to patch one rpm with a security fix and rebuild all packages depending on that rpm does not take too much time, and you do all the work in one place.

Building it along with Docker would at least eliminate the cyclic dependency. It's a hack, but it might work now and for the forseeable future.

You will not eliminate the cyclic deps unless you merge docker-io and golang-github-docker-libcontainer into one package.

It would certainly avoid the source code duplication, and the risk of accidental version drift.

It will solve the version drift of docker-io and libcontainer, but it introduces drift among other packages. When a new commit of some dependency is introduced, all projects relying on it should update their vendored code. But this is not a rule, and in reality some projects are still using older commits, some even old import paths which no longer exist in the latest release. In order to support all these projects in Fedora, you must duplicate source code in the rpm itself and sed the import paths (golang.org/x/net and golang.org/x/text are examples). So using rpms we don't duplicate code among packages, but we duplicate code in the rpm itself. Which is better for security fixes, but still...

Replying to [comment:26 rjones]:

What's missing is good software engineering. Maybe Go doesn't encourage it (well, that's pretty obvious when you look at the language itself), but that doesn't mean Fedora has to put up with it.

A lot of steps are automated right now, but there are still some that require an unspecified amount of time. With versioned golang devel source codes instead of commits, rebases to a newer version are not very frequent and the time spent on packaging is reasonable. But building one project from 40 other projects and trying to keep up with the latest changes is not a one-man show, not to mention f20 and f21 where you have to assure system stability. Go tools are still young and still evolving, but can we be sure that some commit between released versions of docker does not cause crashes? This is the reason why we usually rebase only in rawhide. But a user who wants to test stable kubernetes will not go for rawhide unless he knows what he is doing; he will take f21. In order to keep the f21 build of kubernetes up2date, you have to update docker. We cannot update docker every week if we don't have a stable version.

Having a separate package for source codes and a separate one for binaries would solve this problem. But this is not a solution from the packaging side: it increases the number of golang packages just to get system stability and a framework to keep all Go tools up2date on f20/f21 ...

Nor is using vendored source codes. What other option do we have? Quality vs. time.

Could someone summarize the current state of this ticket/draft so that the committee can either put it on its agenda or keep it in a holding pattern? Thanks.

Just a ping. It's been another month with no progress. Is anyone still working on this?

With gcc5 providing up-to-date Go support in F-22 and Rawhide, we should really consider supporting both compilers in some way.

Replying to [comment:34 sharkcz]:

With gcc5 with an up-to-date Go support in F-22 and Rawhide we should really consider supporting both compilers in some way.

And we are able to build docker for ppc64* and s390x with gcc-go, so it is real :-)

true true.

And now that gcc-go 5.0.0 is providing /usr/bin/go, we have a new problem: golang can use the gccgo compiler, but you presently cannot have golang and gcc-go installed at the same time.
We'll likely need to take an approach like other languages with multiple implementations, where the pieces are not mutually exclusive (i.e. meta-Provides, update-alternatives, etc.).

I am very interested in seeing this done and closed. and welcome help on it as well. Like everyone, this is not the only thing on my TODOs. :-)

Replying to [comment:33 tibbs]:

Just a ping. It's been another month with no progress. Is anyone still working on this?

Current issues:

1) Changing build requirements, list of commits of dependencies (the latest source codes can introduce dependencies on new packages)

At the time of building a binary (e.g. cadvisor), the package has certain build-time dependencies. Meanwhile the dependencies get updated (each update can and usually does break back-compatibility and introduces new dependencies). Once we want to apply a patch (e.g. security fix, release bump, bug fix, ...), the package cannot be built. Or, if we want to know which source codes were used to build it, we don't have that information (repoquery does not work as the dependencies have changed over time). If we want to debug the binary, we cannot, as the binaries have no debugging information (etcd is the first binary providing debug info).

Yes, shipping a Godeps.json file or any other reasonable list of all dependencies is a good idea. It is enough for all packages/tools providing binaries. This puts some responsibility on the developer to keep the list up2date. Kubernetes and docker already do this. This should be required if we want to keep all tools in shape.
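For reference, a Godeps.json file pins the exact revisions roughly like this (abbreviated; the import paths and revision are illustrative placeholders):

{{{
{
    "ImportPath": "github.com/openshift/geard",
    "GoVersion": "go1.2",
    "Deps": [
        { "ImportPath": "github.com/gorilla/context", "Rev": "<commit hash>" }
    ]
}
}}}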

2) Stripping debug info

Should be working now, etcd already strips debug info. There are still some issues [1].

3) Package source codes only for building, not development

Devel subpackages should not be used for development, as I cannot guarantee back-compatibility, not until there are stable and versioned source codes. Updates of devel subpackages break back-compatibility.

4) Bug reporting

How does one reproduce an issue? Which dependencies were used for building? Right now the only source of information about the used dependencies is root.log. As koji's GC has started to remove builds, some dependencies no longer exist. But by switching to the correct branch and commit of the corresponding package, all builds (dependencies) can be rebuilt and analyzed. Sadly it is possible that the fix is in the latest commit of a dependency, which can break back-compatibility, so it has to be backported. Is it worth backporting the change? As devel source codes are used only for building, not development, there have been only a few bugs so far. Once the number of packages providing binaries grows, we will have to deal with demand for different commits of the same dependency.

5) Packaging in f20/f21 (reasonable updates + bug/security fixes) and f22/f23 (up2date updates)

docker, kubernetes, etcd, cadvisor, flannel, fleet, asciinema are packages providing binaries. Influxdb and hugo are coming soon. Each has its own Godeps or vendored directory containing bundled dependencies. We cannot debundle all dependencies because of commit collisions (different commits of the same upstream project/dependency) unless we package all of them. What does that mean for f20-f23? A lot of pain. Kubernetes is still evolving, etcd as well. Recently I updated etcd from 0.4.6 to 2.0.3 in all Fedoras; all its dependencies were updated as well. What happens if I update kubernetes? 10 golang devel packages get updated, new packages come into Fedora, some packages are no longer necessary. It makes sense to debundle all dependencies "in some way" for future use once all tools get stable; then we can provide a stable ecosystem/framework for golang development on Fedora. But debundling is already complicated.

6) Bundled vs. debundled dependencies

There are already collisions between different commits of the same dependency. In general, each tool can depend on a different commit of the same dependency. Those commits do not have to be back-compatible with each other. If we want to debundle all dependencies, we have to create a subpackage for each such commit. What is the naming convention for these subpackages going to be? %{commit} prefixing/suffixing the import path? How many subpackages is a given package going to have? As many as there are tools? There are 5 branches so far (not counting epel7). With 10 subpackages, 50 subpackages need to be fixed if a security bug shows up. Do we really want this scenario? If we use bundled deps instead, we will have to patch and rebuild the same number of subpackages anyway. Bundled vs. debundled dependencies should definitely be discussed.

7) Api back-compatibility (updates are usually breaking back-compatibility)

We don't have complete information about the API. I can detect a change in API between two commits of the same project. However, for a given project (source codes) I cannot detect which symbols from which package are actually used. I have started to work on a script to list all used symbols, but this is not trivial. With every update of any package there is a high chance some package gets broken, some package cannot be built anymore, or some import paths go missing.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1184221

As long as upstream projects do not stabilize (they keep breaking back-compatibility), I don't think we are ready to close this draft. It is possible I missed some issues, but these are the most important from my packaging point of view.

Hello,

I have been trying to build docker using gcc-go (successfully :)), on ppc64(le) (just built, no testing yet) and on x86_64 (built/installed, it seems to work, but no extensive testing done). Also sharkcz built it on s390x. I'm definitely no Go(lang) expert, but this came to my mind while modifying specs to build docker (and its BRs) using gcc-go, reading the Go guidelines draft and this discussion.

  • %{gopath} should be defined outside the golang package
  • use go(...) for general dependencies (devel/source packages), and golang(...)/gccgo(...) for binary (compiler-based) dependencies
  • rename %{go_arches} to %{golang_arches} (used that way now) and maybe introduce %{gcc-go_arches}
  • introduce rpm macros for commonly used constructs like "go build" -> %{go_build} and "GOPATH=%{buildroot}/%{gopath} go test" -> %{go_test}, to simplify packaging and also to have some (limited, global) control over the arguments/commands used (more easily switch between compilers, specify compiler-specific flags, maybe even %{gcc-go_build}/%{golang_build}?); see the sketch below

(all assuming gcc-go and golang will coexist together)
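A rough sketch of what such macros could expand to (names and flags here are only illustrations for discussion, not an agreed interface):

{{{
# hypothetical macros fragment along the lines suggested above;
# a gccgo variant could swap in "-compiler gccgo"
%gopath      /usr/share/gocode
%go_build()  go build -compiler gc %{?**}
%go_test()   GOPATH=%{buildroot}/%{gopath} go test %{?**}
}}}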

These notes are just thoughts and I hope they are sane.

There are huge loads of Go packages going in at this point even though there are no guidelines at all. I have no idea how package reviews are even proceeding. This really is quite a mess; it would be great if someone could give FPC some idea of where we can go from here. We really are going to need an updated draft, and FPC can't really write it.

Replying to [comment:40 tibbs]:

There are huge loads of go packages going in at this point even though there are no guidelines at all. I have no idea how package reviews are even proceeding. This really is quite a mess, it would be great if someone could give FPC some idea of where we can go from here. We really are going to need am updated draft, and FPC can't really write it.

Spec files are generated by gofed [1]. The spec file conforms to the draft. The list of Provides, BuildRequires and Requires is generated by inspecting the project's tarball. Gofed is built on top of the Go parser (provided as a package in the Go language itself). The parser provides an AST (abstract syntax tree) which gofed reads, returning a list of all imported and provided packages. The entire hierarchy of each project is checked and compared with the language specification [2-6]. So basically the spec file is checked only for the correct license, description, and missing documentation. The list of Provides and [B]R is checked for nonsense paths.

The tool itself allows you to create a graph of dependencies among all golang packages. When there is a security issue you can then detect which packages need to be rebuilt.

However, there is not too much to add to the draft as long as there are back-compatibility issues breaking the API and the open bundling/debundling question (problems 6 and 7 in comment 38). I am updating golang packages and polishing spec files based on new information I am getting with new projects, new repositories and problems I discover.

[1] https://github.com/ingvagabund/gofed

[2] https://golang.org/cmd/go/#hdr-Import_path_syntax

[3] https://golang.org/cmd/go/#hdr-Remote_import_paths

[4] http://golang.org/ref/spec#Import_declarations

[5] http://golang.org/doc/effective_go.html#names

[6] https://golang.org/doc/code.html#PackageNames

It's been several more months with no forward movement here. Is anyone still interested in working on guidelines? We really shouldn't have any Go packages at all in the distribution without at least some guidelines.

I was just talking to Adam Miller about this. He'd like to package OpenShift v3 for Fedora; it makes significant (total? i dunno) use of Go, and he'd like there to be guidelines in place before going forward.

Despite having kicked this off, I'm really not the right person to work on this, as I'm not a Go subject area expert. (I have built several programs, and I have written Hello World.) I will, though, try to find an owner.

The thing is, if we could just get something simple:

  • Where to put the files, how to build, etc.
  • An example spec.
  • A reference to the tool you should generally use to generate a guidelines-compliant spec.

That would be so much better than having nothing. The horrible static linking thing doesn't have to be addressed initially.

OpenShift is a binary presumably, not a library? I guess that binaries would not need special packaging.

FWIW also libguestfs has dropped its golang bindings (which are a library), because they caused endless trouble. Go seems to specifically be designed to be hard to binary package.

Replying to [comment:45 tibbs]:

The thing is, if we could just get something simple:

Where to put the files, how to build, etc.
An example spec.
A reference to the tool you should generally use to generate a guidelines-compliant spec.

This is largely underway, despite the slow conversations. Jan Chaloup has put in most of the work here.

That would be so much better than having nothing. The horrible static linking thing doesn't have to be addressed initially.

The static linking situation is slowly changing. Go 1.5 (which had its beta1 release yesterday) introduces shared linking, but it will need a baking period before jumping wholesale into the packaging guidelines.

Replying to [comment:46 rjones]:

OpenShift is a binary presumably, not a library? I guess that binaries would not need special packaging.

However, it uses about 200 libraries. Scary times in the Swamp of Unbundling.

Although I don't have a full list of projects that cannot be built from debundled dependencies to objectively show that it really is impossible and why, I would like to summarize the current effort.

  • gofed is continuously being extended for new tools, scans, checks to provide the best experience with golang packaging (extending to secondary architectures, support for testing, minimization of size of subpackages, checks for missing dependencies, etc.)
  • implementation of gofed-web server (http://209.132.179.123/, currently down, working on to get it back online) which scans all golang projects and reports backward compatibility issues, provides a graph of changes among commits, graph of dependencies, REST api for retrieving some information about projects, ... (under development)
  • support for CI testing of all projects to discover issues with golang or gcc-go compiler among other things (currently starting)

As it is nearly impossible to remove problems with backward/forward incompatibilities, I would suggest focusing on minimizing all such incompatibilities or violations, minimizing dependencies among golang projects (in the form of partially generated spec files and an increased number of devel subpackages within one package), and more violation-oriented scans/checks. There is a lot to improve and a lot to implement [1] on both gofed and gofed-web.

So if you have any idea how to improve packaging, what else we could check for, what scan we could run, you are welcome to open an issue or pull request on [1] or [2].

[1] https://github.com/gofed/gofed/issues

[2] https://github.com/gofed/gofed-web/issues

Any ideas on moving this forward? I am trying to package a go program that has dependencies not in Fedora, but doesn't bundle them.

When I started this, I assumed this would be the perfect case: Fedora hates bundling. But it's proven to be quite confusing.

First, I started packaging all the dependencies, but was recommended not to package a dependency for epel7, since epel7 prefers bundling(!). But it just feels wrong to try to force bundling on a project that doesn't use it.

Is there any chance of reconciling Fedora with EPEL here? I would love to avoid special casing across these two systems.

As of the last stable golang release (the 1.6 series), the upstream go compiler seems to support building shared libraries [1] for most of Fedora's arches (i686, x86_64, armhfp, aarch64) [2].

It seems this addresses the issues of "particular note" mentioned in the ticket - so, is there any chance of getting shared libraries into fedora for a future release?

This would (AFAIK) completely remove the issue of "bundling" (and could even bring the security benefits of not having all golang* packages statically linked to - maybe old and insecure - versions of packages?).

[1] https://docs.google.com/document/d/1nr-TQHw_er6GOQRsF6T43GGhFDelrAP0NqSS_00RgZQ/edit?pli=1#heading=h.od7scvi4n6pe

[2] https://golang.org/doc/go1.6#compiler

Any ideas on moving this forward? I am trying to package a go program that has dependencies not in Fedora, but doesn't bundle them.

The guidelines are ready for review.

When I started this, I assumed this would be the perfect case: Fedora hates bundling. But it's proven to be quite confusing.
First, I started packaging all the dependences, but was recommended not to package a dependency for epel7, since epel7 prefers bundling(!). But it just feels wrong to try to force bundling on a project that doesn't use it.

The former reason was the RHEL7 vs. epel7 issue. I have already opened a bz (https://bugzilla.redhat.com/show_bug.cgi?id=1409553) to stop supporting all Go packages with source code in RHEL in favor of epel7. Once approved, we can de-bundle in epel7 as well. I have already started rebuilding rpms in epel7; just currently blocked on bz#1409553.

Is there any chance of reconciling Fedora with EPEL here? I would love to avoid special casing across these two systems.

The ultimate goal is to build all rpms across various distros/versions from a single spec file.

What's the current status here? I've just tried to package a CLI tool written in Go and I have to admit I do like the gofed github2spec file generator, though it seems to have some bugs.

For example, the generated spec doesn't seem to cover the case when I want to run the tests but not produce any -devel subpackages and it defines a %gobuild macro which is not explained in the draft.

In the interest of being public: I'm trying to package the blackbox and snmp prometheus exporters, which require the creation of dozens of dependency packages.

I don't like the proposed guidelines much; they contain too much boilerplate. I don't care whether it's cut and pasted manually or generated by gofed, the result is too long, cryptic, error-prone and heavy to maintain. I'd rather have short and succinct human-readable specs than some sort of bloated intermediary machine-generated rpm code.

So I'm slowly moving all the parts I see repeated in the specs I'm going through right now to rpm automation (macros and autoprovides). Every time I macroize part of the guidelines and refactor past specs to use the macros, it exposes packaging inconsistencies and problems that were induced by the awful amount of boilerplating.

When I'm reasonably sure I've stabilized those I will post them here. If people are interested I can post early versions sooner if they want to play with them.

If gofed is smart enough to deduce requires from Go source code, it should be turned into a Go autorequires engine, not used to lint rpms post-creation.

@nim I think that all boils down to not having enough manpower; if you can get in touch with jchaloupka he will definitely have lots of work/tasks that he would appreciate help with.

TBH, one (or a small set of) magical %gopackage macro will in the end be more fragile than boilerplate code that can be easily modified, without too many side effects (across the whole distribution), for any purpose/package.

@jcajka There is no inherent robustness in boilerplate. Boilerplate makes it easy to work around problems by adapting the boilerplate code manually and locally, which means:
- it's easy to forget or avoid fixing the templates
- any generated boilerplate spec will degrade fast over time as it drifts from the original intentions and is never brought back in line
- the resulting spec pool is completely heterogeneous and difficult to audit/fix, as different packagers choose different adaptations.

Nothing that can't be fixed by infinite amounts of manpower (packaging and review side) but that's not really available.

This guidelines proposal is pretty old and there is a pool of packages in Fedora that try to follow it, so it's now possible to evaluate how the result ages. From my semi-random sampling, those packages age terribly. The state of the Go specs I see in Fedora is pretty awful.

And that's not because the Fedora packagers wanted to do a bad job; it's a natural consequence of generating boilerplate spec code without consolidating it in common routines (the usual copylibs result). rpm has lots of facilities for automating repetitive tasks, and this proposal makes little use of them.

So:
- the technical parts of the proposal are way better than anything I could do alone (my level of Go knowledge is approaching zero)
- the spec interface proposed to packagers is awful; gofed is used to work around this, but that workaround has detrimental side effects on the maintainability of the result.

One concrete example of what one can achieve using rpm facilities instead of just pumping lines of generated spec code.

A 36-line Go spec file, most of it being a useful package description. You can bump it to a new commit or a new version just by changing two lines in the header and the rest of the spec will still work. All the created directories are properly owned by the package.

The original Fedora spec is here:
https://src.fedoraproject.org/rpms/golang-github-miekg-dns/blob/master/f/golang-github-miekg-dns.spec

It takes 157 lines to achieve about the same result and has a minimal 4-word description, but at least its Source URL is working (not the case for several specs I checked). And it's not even a complex Go package; I didn't choose it because it was particularly good or bad, it was just one of the refactored specs I had readily available.

golang-github-miekg-dns.spec

And while I'm at it the automation files this spec uses

go-automation.tar.xz

Probably a lot of things could be improved, from the macro naming to the macro content, but the whole point is that those can be improved in a single place over time, instead of needing to sift through hundreds of existing Go packages to separate past generated boilerplate from new generated boilerplate from package-specific adaptations that need to be preserved and ported or the result will break.

And I doubt all the inadequacies in my code are any worse than what the average packager will do to beat the proposed guidelines boilerplate into semi-working shape.

Can you start a repo with your macros somewhere, e.g. on pagure? It'd be easier to watch their evolution and contribute (although I'm not volunteering for that).

Your spec file is indeed very readable. I think you must wrap the description though, otherwise rpmlint will complain about long lines.

Can you start a repo with your macros somewhere, e.g. on pagure? It'd be easier to watch their
evolution and contribute (although I'm not volunteering to that).

I was sort of hoping to wrap those up quickly and get them included in existing packages, to avoid creating and managing a separate repo. I will do it if the "quickly" does not happen.

Your spec file is indeed very readable. I think you must wrap the description though, otherwise
rpmlint will complain about long lines.

Yes, I know (even if the rpmlint rule is getting ridiculous in a 4K-screen world where everyone uses terminal emulators and not prehistoric dumb hardware terminals). I posted the spec to showcase concrete results, as it existed on my system, not as a perfect example.

@nim in general, I like what you are suggesting. From the maintainer's point of view (and that of any living human being) it is important to provide transparent and easy-to-maintain spec files, so we don't spend our time designing a new universe. I would really like to see a working example for some spec file, e.g. md2man, etcd, flannel.

On the other hand I see some downsides of that approach:
- when updating, how are you going to figure out whether new build-time deps are in the distro without running a scratch build? I don't think you will be able to unless you evaluate the macros first
- how are you going to analyze the spec if most of it is automatically generated? I don't think you will, unless you evaluate the macros first. I have tooling that runs a periodic scan of all Go packages in Fedora rawhide. It reads spec files and collects information about the current commit, dependencies, provided packages, etc.
- the dependency analysis will be nondeterministic (impossible to get a list of direct dependencies -> harder to debug and determine where a dependency is missing)
- what about use cases where you want to provide more devel subpackages, e.g. devel-client, devel-server, devel-types? One would have to provide something more complex like %_go_meta_devel([list,of,relevant,go,packages]) that would generate a subset of Provides, BuildRequires and Requires. The same holds for the list of tests. Some tests fail on some architectures and need to be skipped; one would need a conditional macro that filters some tests on some architectures.

So no matter which way we go, whether it is a fully parametrized spec file, a boilerplate-generated spec file or something in between, it would be great to take into consideration not just the maintainer's point of view but the tooling point of view as well.

Hi @janchaloup

@nim in general, I like what you are suggesting. From a point of view of the maintainer (and of any
living human being) it is important to provide transparent and easy-to-maintain spec files so we
don't spend our time on designing a new universe. I would really like to see a working example over
some spec file, e.g. md2man, etcd, flannel.

Well, I am targeting prometheus and its snmp and blackbox exporters, and so far the macros are working (for golang-foo-devel packages, golang-bar packages with a subpackage for binary commands, app packages that should not use the golang namespace, and multi-language packages where golang-zoo is just one subpackage).

I thought I had finished, but a new round of automating revealed I had missed a few deeply hidden unit tests in prometheus common, and those need a whole new set of BuildRequires which are proving nasty to untangle (cyclic dependencies, and too-old bitrotten golang packages in rawhide when they exist at all).

If you're interested I can post the whole in-progress set; it's certainly big and varied enough for a first evaluation. The specs are actually more ambitious than the current golang packages since they don't sport multiple layers of ifdefing to avoid hard problems.

On the other hand I see some downsides of that approach:
- when updating how are you going to figure if new build-time deps are in the distro without running scratch-build?

Doing scratch builds is part of updating, and setting up mock is more or less mandatory if you want to maintain software packages in Fedora ;). That being said, I'm not opposed at all to language-specific helpers that can be pointed at an upstream version or commit and output its build and runtime dependencies. They exist for other Fedora ecosystems and reduce guesswork delays. When their interface is correctly designed, they get reused by rpm as a language-specific autoprovides/autorequires engine.

What I'm opposed to is something that outputs badly structured spec files that cannot really be maintained, and that either require regeneration from scratch at every update (losing the adaptations the packager added to work around upstream problems) or bitrot at a fast pace.

  • how are you going to analyze the spec if the most part of it are automatically generated?

They are not automatically generated; they are consolidated in common routines that can be audited, fixed, improved and extended in a single place, without rewriting all existing Go spec files all the time.

I don't think you will, unless you evaluate the macros first.

There are basically two approaches: either your tooling is able to interpret spec syntax in a way that is sufficiently complete and coherent with rpmbuild that you can work from spec files (that's what spectools and rpmdevtools do), or it cannot, and you're better off querying the packages produced by rpmbuild, either via rpm -qp or via dnf repoquery.

Regardless of the result of this issue, I strongly advise against working from spec files unless you want to learn a lot more about rpm internals than you ever wanted to. Leave spec file reading to rpmbuild; it's the reference implementation, and it's not worth your energy to try to decode the advanced spec syntax which is occasionally necessary to package complex projects.

I got a tooling that runs periodic scan of all go packages in Fedora rawhide. It reads specfiles and
collects information about current commit, dependencies, provided packages, etc.

{{{
rpm -qp --provides golang-github-alecthomas-assert-devel-0-0.123.git405dbfe.el7.llt.noarch.rpm
golang(github.com/alecthomas/assert) = 0-0.123.git405dbfe.el7.llt
golang(github.com/alecthomas/assert)(commit=405dbfeb8e38effee6e723317226e93fff912d06) = 0-0.123.git405dbfe.el7.llt
golang-github-alecthomas-assert-devel = 0-0.123.git405dbfe.el7.llt
}}}

It's easy to parse the commit hash without needing to parse the spec file (when the commit is absent, that's a full-version release). Repeat with rpm -qp --requires.

  • the dependency analysis will be nondeterministic (impossible to get a list of direct dependencies ->
    harder to debug and determine where a dependency is missing)

On the contrary, if you turn your tool in an rpm autodep engine, the dependency analysis will be more deterministic, because there will be an exact match between the deps that exist in rpm metadata, the files contained in the rpm package, and the output of your tool when processing those files.

Right now you are at the mercy of something changing between the spec generation by your tool and the actual rpm creation (manual changes in the spec file, patches, commit/version bump without regenerating the spec, neutralization of part of the spec by rpm conditionals).

  • what about use cases where you will want to provide more devel subpackages, e.g. devel-client,
    devel-server, devel-types?

Put everything in a devel subpackage (default common case)

gofiles=$(find . -iname "*.go" ! -iname "*_test.go" -print)
%_go_install $gofiles

%files devel -f devel.file-list

Separate client stuff

gofiles=$(find . -iname "*.go" ! -iname "*_test.go" ! -ipath "*/client/*" -print)
%_go_install $gofiles
goclientfiles=$(find . -iname "*.go" ! -iname "*_test.go" -ipath "*/client/*" -print)
%_go_install -f client.file-list $goclientfiles

%files devel -f devel.file-list
%files client-devel -f client.file-list

So, as long as you're able to write a command that outputs the files to put in a particular subpackage, things will just work (I did this originally for unit test subpackages, before deciding shipping unit tests in addition to running them was a bad idea).

You don't need to specify Provides; they are auto-computed.
BuildRequires are SRPM-wide; the subpackage layout does not change those.
You do need to specify the correct Requires in each subpackage, because I don't autocompute those. Turning your tool into an autorequires engine would fix this and make golang packages resilient to restructuring.
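
Until such an engine exists, that means listing them by hand in every subpackage, something like (the import paths are placeholders):

%package client-devel
Summary:   Client part of %{name}
Requires:  golang(github.com/example/some-client-dependency)
Requires:  golang(github.com/example/another-client-dependency)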

Some tests fail on some architectures and they need to be skipped. One would need to create a
conditional macro that would filter some tests on some architectures.

As it happens I already did this one. I started like you by writing a go test line per subdirectory, got sick of reading project trees to write the correct number of lines (that's where I missed a unit test in prometheus common BTW), added looping, got sick of providing complete import paths, switched to looping over a list of subdirs, got sick of providing the subdir list, made the loop process all subdirectories except a blacklist, got sick of blacklisting example unit tests one by one, added wildcard expansion to the blacklist today.

So now my macros provide the infrastructure to do

%if run_every_possible_test_in_every_existing_subdir_with_go_files
%_go_checks
%else
%_go_checks tests_wants_network 'examples/*' unit_test_is_broken_and_I_dont_care
%endif
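
To give an idea of what such a macro can expand to, here is a rough shell sketch (not the actual macro body, and the blacklist handling is simplified):

# walk every subdirectory containing Go test files, except vendored code
for dir in $(find . -type d ! -path '*vendor*') ; do
    ls "${dir}"/*_test.go >/dev/null 2>&1 || continue
    # skip the subdirectories passed as arguments to the macro
    case "${dir#./}" in
        tests_wants_network|examples/*|unit_test_is_broken_and_I_dont_care) continue ;;
    esac
    (cd "${dir}" && go test) || exit 1
done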

Will probably add auto-symlinking of vendor to %{gopath} since some evil uber unit tests stomp on GOPATH and expect deps in vendor.

So no matter which way we go, if it is going to be a fully parametrized spec file

Parametrizing is fine, that's why I propose go macros (plural) and not a macro (singular, one that does everything in one shot).

You don't need to output unstructured raw code to allow parametrizing, you only need to think hard about the macros' interfaces and API.

@janchaloup

Anyway since I finally managed to see the end of go-kit (though I suspect I'll find out on el-7 I still missed a few dependency loops)

Here is a full dump. The macros need to be built first, installed on the system for spectool to work, and installed in the mock buildroot for mock to work:
config_opts['chroot_setup_cmd'] = 'install @buildsys-build golang-srpm-macros'

(the forge macros are intended to be merged in a non-golang-specific place, and the copy of go-compilers macros is just there so I don't have to worry about EL7/rawhide differences)

That gives you lots of real-world examples to check how the macros can be used. The specs that do not use the macros are packages I needed to rebuild locally (not available in EL7, or too old in rawhide, or both). I didn't try to fix them apart from bumping and rebuilding.

go-specs.tar.xz

Well, I am targeting prometheus and its snmp and blackbox exporters

Please, keep in mind the Go packages in the distribution are not just about those. Some packages need customization. It would be unfortunate to make your macros prometheus-specific.

Doing scratch builds is part of updating and setting up mock is more or less mandatory if you want to maintain software packages in Fedora ;).

Are you saying anyone who does not use mock is not a package maintainer in Fedora? ;) I am not saying mock is a bad tool, not at all. In order to minimize the packaging time, I build and update all go packages locally and, once they are "good enough", I update the packages in Koji (I have the latest rpms installed, naturally). Mock is useful when you want to test various Fedora versions at once on the same machine. However, that is not done due to so many branches. So the trick is to update all branches at once when updating Rawhide (not always easy to do or possible).
So running the scratch-build is not what I want to do in order to see which dependencies are missing or need updating when there are faster alternatives. Plus, the scratch-build will only tell you whether the dependency is in Fedora or not. It will not tell you if it is outdated, up-to-date or newer.

What I'm opposed to is something that outputs badly structured spec files, that cannot really be maintained, that either require regeneration from scratch at every update (losing the adaptations the packager added to work around upstream problems), or bitrot at a fast pace.

That is an overstatement. What is badly structured depends on your point of view. I am not saying the spec files look like a new Ferrari. But I admit they are not easy to read the first time. With the proper tooling, updating a specfile does not take more time than making tea. One just needs to know what to look for. Just like with anything else. Any particular adaptations you have in mind?

They are not automatically generated; they are consolidated in common routines that can be audited, fixed, improved and extended in a single place, without rewriting all existing Go spec files all the time.

The question was not about the automatically generated pieces of the specfile (the word "generated" was arbitrary). The question was about the specfile analysis and remains the same.

There are basically two approaches: either your tooling is able to interpret spec syntax in a way that is sufficiently complete and coherent with rpmbuild, in which case you can work from spec files (that's what spectool and rpmdevtools do), or it can not, and you're better off querying the packages produced by rpmbuild, either via rpm -qp or via dnf repoquery.
Regardless of the result of this issue, I strongly advise not working from spec files unless you want to learn a lot more about rpm internals than you've ever wanted to. Leave spec file reading to rpmbuild, it's the reference implementation; it's not worth your energy to try to decode the advanced spec syntax which is occasionally necessary to package complex projects.

I disagree that the tooling needs to know a great deal about rpmbuild. To add a new subpackage, update a list of dependencies, tests, ... the tooling does not need to know much about rpmbuild. It just chooses the right place to put them. When doing that it can check if the dependencies are valid (are they in the distro, are they of the proper commit, any known issues with them, ...). The aim is not to implement tooling that will fully understand a specfile.
Querying a package (or an rpm, to be precise) requires a build. I don't see a way to get a list of build-time dependencies if you can not build a package. An SRPM can help here, but I'm not sure it will cover all use-cases.
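
For what it's worth, the SRPM side is just the same query against the source package (the file name below only mirrors the earlier example):

rpm -qp --requires golang-github-alecthomas-assert-0-0.123.git405dbfe.el7.llt.src.rpm

but that only lists the BuildRequires the packager already wrote into the spec; it does not analyze the upstream code.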

Easy to parse the commit hash without needing to parse the spec file (when the commit is absent, that's a full-version release). Repeat with rpm -qp --requires.

You can't encode all relevant information in the rpm. If you check the go spec file, there are other macros for import_path, provider, etc. that you will not get by querying the (s)rpm. Even the commit is
shortened, unless we use the full commit, which may decrease the ability to read the rpm properly.

Right now you are at the mercy of something changing between the spec generation by your tool and the actual rpm creation (manual changes in the spec file, patches, commit/version bump without regenerating the spec, neutralization of part of the spec by rpm conditionals).

I know the pain. However, in the end updating the spec file is negligible compared to the backward compatibility issues in the distribution, or building a new package in the distro and waiting until it gets overridden. Or when your build fails just because of a test that was not written to run on some architectures. The neutralization part has its importance. If everything goes fine, some of the if with_* conditions will go away. But again, on Fedora, when most of the with_ macros are set to true, it is the same as ignoring them.

Put everything in a devel subpackage (default common case)

Works for most of the packages so far. Does not work for medium-sized or larger projects. Imagine your project (among other things) defines a types package. Overall, the entire project requires 15 dependencies to be built, the types package only 2. Some other project (SOP in short) just needs the types package. If you provide two devel subpackages, one for the types and the other one for the rest, the SOP will just pull the types devel and two other deps. If you provide just one devel subpackage, the SOP will pull all 15 packages (or even more due to indirect deps). In a world where disk size does not matter, why not. But I don't want to install more deps than I need. Pulling fewer dependencies goes hand in hand with minimizing the dependency problems.

That gives you lots of real-world examples to check how the macros can be used.

What about auto-generated build-time dependencies? The last time I heard it was still not possible to do that.

Well, I am targeting prometheus and its snmp and blackbox exporters

Please, keep in mind the Go packages in the distribution are not just about those. Some packages
need customization. It would be unfortunate to make your macros prometheus-specific.

There is nothing Prometheus-specific in the macros. Do take a look at them and tell me if you find some specificity. Prometheus is interesting because it's a reasonably mature app with a deep and varied dependency tree (including cycles :() which is complex enough to test against.

Doing scratch builds is part of updating and setting up mock is more or less mandatory if you want
to maintain software packages in Fedora ;).
Are you saying anyone who does not use mock is not a package maintainer in Fedora? ;)

I'm saying that mock is pretty much mandatory when packaging gets above a certain complexity, and Go is certainly above this point right now. And not mock as a tool to build for every possible Fedora or EPEL variant, mock as a tool to do clean builds that enable detecting packaging problems. Using mock to build against rawhide only is fine. Feeding koji specs you never tested in mock (or proposing for inclusion specs that never built in mock) is wasting everyone's time, yours included.

So the trick is to update all branches at once when updating Rawhide (not always easy to do or
possible).

Allowing branches to drift from rawhide is a world of pain. However, it is natural to make more builds for rawhide and only sync branches when rawhide is solid and stabilized.

So running the scratch-build is not what I want to do in order to see which dependencies are missing
or need updating when there are faster alternatives.

The scratch build is the final test; that does not stop you from using other tools.

Plus, the scratch-build will only tell you whether the
dependency is in Fedora or not. It will not tell you if it is outdated, up-to-date or newer.

The scratch build will tell you if the resulting package works, which other tools won't do reliably. And frankly, I am not impressed with the up-to-dateness of Go packages in Fedora; if they're supposed to advertise other methods, they fail at it.

For upstreams that do releases Fedora already has lots of tools to detect updates that need packaging.

For upstreams that do not do releases it's failure by design: no one is going to look at every single commit to check if packaging it is worth it. People say they do; what actually happens is that they quickly get sick of it, and what ends up being shipped is an old, never refreshed commit. The only reasonable way to handle release-less upstreams is to refresh every package to the latest commit before Fedora branch time, do an ecosystem mass rebuild, use Fedora betas to shake out bugs, sync other branches when the Fedora release is gold, and not touch anything else till the next branching (that's, BTW, how upstreams that do app releases in Docker work).

You don't need advanced tools to do the mass commit refresh at branch time. Just specs that make it easy to change a hash or version number, spectool -g, mock, koji.
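
Per package, that workflow is only a handful of commands (a sketch; the spec name is illustrative and assumes the usual Fedora packager tools):

# bump the commit or version in the spec, then:
spectool -g golang-github-foo-bar.spec   # download the new upstream archive
fedpkg mockbuild                         # clean local rebuild in mock
fedpkg scratch-build --srpm              # final scratch build in koji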

What I'm opposed to is something that outputs badly structured spec files, that cannot really be
maintained, that either require regeneration from scratch at every update (losing the adaptations
the packager added to work around upstream problems), or bitrot at a fast pace.

That is an overstatement. What is badly structured depends on your point of view. I am not saying
the spec files look like a new Ferrari. But I admit they are not easy to read the first time.

It wouldn't be so bad if refreshing a Go package didn't involve refreshing lots of depending packages. As it is, there is way too much friction to attract new packagers (or retain existing ones). The current Go spec skeleton was good enough to check that everything needed was there. It was a necessary step. Now everything that is repeated time and time again without value-added changes needs to be factored away.

With the proper tooling, updating a specfile does not take more time than making tea. One just
needs to know what to look for.

That just means it's packager-hostile, since it takes a lot of time to learn the syntax and the additional tools. There is nothing in Go that makes it so special that it needs all those crutches for packaging. And I invested enough hours looking at existing specs to form this opinion.

Just like with anything else. Any particular adaptations you have in mind?

Just look at what I posted.

You can't encode all relevant information in the rpm. If you check the go spec file, there are other macros for import_path, provider, etc. that you will not get by querying the (s)rpm. Even the commit is
shortened, unless we use the full commit, which may decrease the ability to read the rpm properly.

Just export the relevant information in rpm metadata; it's not rocket science, I did it in my packages.
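
A minimal sketch of what that can look like in the spec, reusing the import_path and commit macros such specs already define (the import path is just an example):

%global import_path github.com/prometheus/common
Provides: golang(%{import_path}) = %{version}-%{release}
Provides: golang(%{import_path})(commit=%{commit}) = %{version}-%{release}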

I know the pain. However, in the end updating the spec file is negligible compared to the backward
compatibility issues in the distribution, or building a new package in the distro and waiting until it
gets overridden. Or when your build fails just because of a test that was not written to run on some
architectures. The neutralization part has its importance. If everything goes fine, some of the if with_*
conditions will go away. But again, on Fedora, when most of the with_ macros are set to true, it is the
same as ignoring them.

It's not the same at all: you're adding maximum spec complexity and awkwardness by default, just in case it may be needed somewhere, with the usual sclerosis as a result (not specific to Go or rpm specs, that's true for every project that thinks massive ifdef use is cool). Every conditional has a maintenance cost; do add them when they actually become necessary, but do not sprinkle them everywhere just in case.

Put everything in a devel subpackage (default common case)

Works for most of the packages so far. Does not work for medium-sized or larger projects

That's still the common case, and I've already explained my spec automation allows for multiple subpackages if that's what one needs. I just don't put this burden on every package. Make the common case easy and the complex case possible; do not make every case complex because some cases are complex.

In a world where disk size does not matter, why not. But I don't want to install more deps than I
need. Pulling fewer dependencies goes hand in hand with minimizing the dependency problems.

Do remember that mass splitting and exclusions make it really easy to miss parts. They also make it easy to procrastinate and relegate "complex" parts to options, till everything easy is done and you face a complexity cliff. Besides, subpackages are not free: they have a cost in spec complexity, and they have a cost in metadata that needs processing by package management tools.

I'm not against subpackaging, indeed I committed a few FPG rules that forced packagers to split parts, but there needs to be a balance between the complexity introduced by subpackages and the value splitting adds to the ecosystem.

What about auto-generated build-time dependencies? The last time I heard it was still not possible to
do that.

You can't auto-generate first-level BuildRequires inside rpm. That's a consequence of a security model where packaged code is not trusted to reach outside a build jail. The packager needs to explicitly whitelist every first-level element that will be made available in the build jail. That's where an outside tool can have some value: analyse code and list potential first-level BuildRequires, so the packager can check if they are sane (functionally, technically, legally, etc.), before putting those that pass in the spec.
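
A crude sketch of that kind of outside analysis, using nothing but the go tool on the unpacked sources (run from the source tree with GOPATH set up; the standard-library filter is a naive heuristic):

# list every imported path, keep only domain-qualified (non-stdlib) ones,
# and print them as BuildRequires candidates for the packager to review
go list -f '{{ join .Imports "\n" }}' ./... | sort -u \
    | grep '^[a-z0-9.-]*\.[a-z]*/' \
    | sed 's|.*|BuildRequires: golang(&)|'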

You can generate auto-provides (and I'm doing it).

You could generate auto-requires, if you had a tool that reads the packaged files (go sources and project metadata files) and outputs what building them will need, without requiring network access. That's not 100% as convenient as fully automated BuildRequires, but it takes care of second-level deps without breaking the security model. I understand you have such a tool. Do plug it into Go autoreqs; that will simplify Go spec files (yours and mine) considerably.

Latest draft is now a PR: https://pagure.io/packaging-committee/pull-request/883

Could this be discussed in a meeting please?

Metadata Update from @decathorpe:
- Issue tagged with: meeting

4 years ago

We talked about this at this week's meeting: https://meetbot-raw.fedoraproject.org/fedora-meeting-1/2019-04-18/fpc.2019-04-18-16.00.txt

Metadata Update from @ignatenkobrain:
- Issue assigned to ignatenkobrain

4 years ago

Metadata Update from @ignatenkobrain:
- Issue assigned to eclipseo (was: ignatenkobrain)

4 years ago

@ignatenkobrain I have addressed the comments made during the meeting.

We talked about this at this week's meeting (https://meetbot-raw.fedoraproject.org/fedora-meeting-1/2019-05-02/fpc.2019-05-02-15.59.txt):

  • #382 Go Packaging Guidelines Draft (geppetto, 16:04:33)
  • ACTION: Go Packaging Guidelines Draft (+1:6, 0:0, -1:0) (geppetto, 16:10:28)

Metadata Update from @james:
- Issue untagged with: meeting
- Issue tagged with: writeup

4 years ago

What needs to happen to get these into the packaging guide on docs.fp.o?

@bex From a technical POV, they need the tooling packages they rely on imported into devel.

And those in part depended on redhat-rpm-config changes that were merged by @ignatenkobrain in redhat-rpm-config 130 yesterday.

So now it's only a matter of reviewing the golist and go-rpm-macros packages, importing them without breaking things, and removing the tooling packages they replaced.

The import is not dead simple because we're cleaning a lot of technical debt, and part of this debt is the suboptimal layout of the previous tooling packages, which does not provide a clean state to upgrade from.

@nim perhaps we should get a PR to the docs at least so that it has more visibility and people can start adapting to this.

So, a lot has happened with the Go guidelines, and I believe the oldest update in git is newer than the newest comment in this ticket, so I think it's long past time to close this.

I'm sure something needs to change in the existing Go guidelines, but if that is the case then please open a new ticket or PR. I am planning to convert them to our preferred use of semantic breaks later today so patching them will be easy.

Metadata Update from @tibbs:
- Issue close_status updated to: nothingtodo
- Issue status updated to: Closed (was: Open)

2 years ago
