#3524 createAppliance not working in koji
Closed: Fixed None Opened 11 years ago by mattdm.

= bug description =

createAppliance is failing. See for example:

http://koji.fedoraproject.org/koji/taskinfo?taskID=4568545

= bug analysis =

The appliance log is:

{{{
Adding disk sda as /var/tmp/imgcreate-l4uR5m/tmp-mYGLa0/Fedora-18-Beta-ec2-20121007-x86_64-sda.raw
Extending sparse file /var/tmp/imgcreate-l4uR5m/tmp-mYGLa0/Fedora-18-Beta-ec2-20121007-x86_64-sda.raw to 10485760000
Losetup add /dev/loop0 mapping to /var/tmp/imgcreate-l4uR5m/tmp-mYGLa0/Fedora-18-Beta-ec2-20121007-x86_64-sda.raw
Formatting disks
Initializing partition table for /dev/loop0 with msdos layout
Unable to create appliance : Failed mount disks : Error writing partition table on /dev/loop0
Losetup remove /dev/loop0
}}}

The same appliance-creator command works when run on a bare RHEL 6.3 host.

= fix recommendation =

?


This ticket really belongs in the releng trac.

Working on RHEL 6.3 is not relevant; it needs to work in an F18 chroot. The testing needed is bare-metal F18 and F18 in a chroot on RHEL 6, to work out where exactly the failure is. Koji runs appliance-creator in a chroot, and that is what we need to get working. If it works on bare metal but fails in chroots, we need to work out what exactly causes the failure. Is it changes in F18? If so, we need to get it fixed in the packages. If it's something mock is doing, we need to get that fixed.

There is nowhere near enough data here, and what is here is not the right data.

I know it's not enough, but I wanted to get it started.

Is it the case that appliance-creator can really only properly create an image for the distribution that it's running on (hence, the chroot)?

We always compose everything on the target; we use the buildroot. We need to use the correct version of appliance-tools, which is the one in the target. Using RHEL 6 could result in differences due to different versions of the tools doing different things.

Okay, so: updated a running cloud image to F18 and re-ran appliance-creator in that environment. Not chrooted, but all native. Haven't tested the resulting image, but one comes out just fine (i.e. not the error above).

Okay, ''so'':

I used mock to create an F18 chroot on RHEL 6.3. /dev/loop0 does not exist in the chroot by default, so of course that fails. Just to see what would happen, I created it by hand, and then appliance-creator gives exactly the error above.

{{{
Adding disk sda as /var/tmp/imgcreate-6pDx5m/tmp-mz9T4a/F18test-sda.raw
Extending sparse file /var/tmp/imgcreate-6pDx5m/tmp-mz9T4a/F18test-sda.raw to 10485760000
Losetup add /dev/loop0 mapping to /var/tmp/imgcreate-6pDx5m/tmp-mz9T4a/F18test-sda.raw
Formatting disks
Initializing partition table for /dev/loop0 with msdos layout
Unable to create appliance : Failed mount disks : Error writing partition table on /dev/loop0
Losetup remove /dev/loop0

}}}

Outside of the chroot, {{{/sbin/parted -s /dev/loop0 mklabel msdos}}} returns 0. Inside the chroot, it returns 1 but prints no obvious error message.
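
Since the failing command prints nothing useful, it helps to capture the exit status and both output streams in one place when comparing the chroot against the host. A minimal sketch, assuming Python 3 is available in both environments; `run_and_report` is a hypothetical helper name, not part of any tool mentioned in this ticket:

```python
import subprocess

# The command from this ticket; running it needs root and an existing /dev/loop0.
PARTED_CMD = ["/sbin/parted", "-s", "/dev/loop0", "mklabel", "msdos"]


def run_and_report(argv):
    """Run argv and return (exit_status, stdout, stderr) with output as text."""
    proc = subprocess.run(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # decode output as text
    )
    return proc.returncode, proc.stdout, proc.stderr
```

Calling `run_and_report(PARTED_CMD)` once on the host and once inside the chroot gives two tuples that can be compared directly, which makes a silent non-zero exit easier to spot.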

Hrrm, OK. Well, koji makes sure that /dev/ is bind mounted when we make appliances. It seems we have either a parted bug or a mock bug.

Can you attach/upload a strace of the failing parted command?

Also, just to confirm - any SELinux issues in play?

Builders have SELinux disabled.

{{{
[root@adria ~]# /sbin/parted -s /dev/loop0 mklabel msdos
[root@adria ~]# echo $?
0
}}}

That's from my laptop running F18, so on an F18 host parted works just fine. When I have better internet access I'll try in a chroot to reproduce what Matt got and try to debug what is going on. Livecd creation works almost exactly the same way and is not failing like this, so perhaps it is just a bug in appliance-creator.

I made an F17 chroot in the EL6.3 environment. The parted command returns 0.

Then I built the F18 parted package against F17 and installed it into that chroot. Now the same command returns 1.

(Straces coming soon)

The difference is in ped_disk_commit_to_os in [http://git.savannah.gnu.org/cgit/parted.git/tree/libparted/disk.c?id=v3.1 libparted/disk.c]. On line 457, ped_architecture->disk_ops->disk_commit (disk) fails in the chroot but succeeds outside. Tracking that down now.

In the chroot, ped_disk_get_max_supported_partition_count is coming up zero, which causes _disk_sync_part_table (called by the function above) to exit with an error (and not do its thing). Outside of the chroot, it's "64".

Hmmm. /sys/block/loop0/ext_range is "1" on RHEL 6.3. It is 256 on F18. I think that might be our problem. (Note that it's only in a chroot on RHEL that we have a problem -- it works in a chroot on F18.)
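
For anyone wanting to compare hosts quickly, the value can be read straight out of sysfs. A rough sketch, assuming Python 3; the function name and the `sysfs_root` parameter are made up for illustration (the parameter only exists so the check can be pointed at a different tree):

```python
import os


def read_ext_range(device="loop0", sysfs_root="/sys"):
    """Return the integer stored in <sysfs_root>/block/<device>/ext_range.

    Returns None if the file is missing, unreadable, or not a number.
    """
    path = os.path.join(sysfs_root, "block", device, "ext_range")
    try:
        with open(path) as handle:
            return int(handle.read().strip())
    except (OSError, ValueError):
        return None


def describe_ext_range(value):
    """Summarise a value against the two data points reported in this ticket."""
    if value is None:
        return "ext_range could not be read"
    if value == 1:
        return "ext_range=1 (same as the failing RHEL 6.3 host)"
    return "ext_range=%d (the working F18 host reported 256)" % value
```

`print(describe_ext_range(read_ext_range()))` on each builder shows which case it falls into. The comparison against 1 and 256 reflects only the two hosts observed here, not a documented threshold.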

In parted 3.0, the logic for reading this is different, and a value of 0 is allowed.

Annnd, apparently there is a max_part option to the loop module. Setting this to some higher number will probably make it work. See https://bugzilla.redhat.com/show_bug.cgi?id=771641 for a tiny bit of background: setting loop.max_part to 256 is new in kernel 3.2.
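
To see whether a given machine already has this set, one option is to look for `loop.max_part=N` on the kernel command line. A sketch under assumptions: Python 3, and the `/sys/module/loop/parameters/max_part` path is only consulted if the running kernel happens to expose it (older kernels may not), so treat that part as a guess. The function names are hypothetical.

```python
def max_part_from_cmdline(cmdline):
    """Return N from the first 'loop.max_part=N' token in a kernel command line, else None."""
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == "loop.max_part" and sep:
            try:
                return int(value)
            except ValueError:
                return None
    return None


def _read_text(path):
    """Return the file contents, or None if the file cannot be read."""
    try:
        with open(path) as handle:
            return handle.read()
    except OSError:
        return None


def current_loop_max_part(cmdline_path="/proc/cmdline",
                          param_path="/sys/module/loop/parameters/max_part"):
    """Prefer the module parameter file (assumed path); fall back to the command line."""
    text = _read_text(param_path)
    if text is not None:
        try:
            return int(text.strip())
        except ValueError:
            pass
    cmdline = _read_text(cmdline_path)
    return max_part_from_cmdline(cmdline) if cmdline is not None else None
```

A result of `None` means neither source had a readable value, which does not by itself prove the option is unset when loop is built as a module and configured some other way.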

Builders are all running 2.6.32 from RHEL 6.

Seth set loop.max_part=256 on one of the systems temporarily and I manually logged in and created an F18 mock chroot. And it works -- not just parted, but an image is generated. Didn't test the image, but I think any problems there are Further Issues.

So, we set this option on the builders that do appliance creation... but they are still not working?

Anything more we can do, or is this now an issue in appliance-creator itself?

Dennis actually did a build last week and it was successful, although I'm not sure if it actually worked once built. (In any case, I think this issue is resolved.)

ok, cool.

So, let's close this then, and reopen if there's anything more we need to do later...
