Sigh, for VirtualBox the update build wrecks the boot MBR. After the updates are installed and the script reboots the VM, it hangs at “GRUB”. I was pretty sure that I had installed SuSE by path instead of by id, and yes, autoyast has:

    <device_map config:type="list">
      <device_map_entry>
        <firmware>hd0</firmware>
        <linux>/dev/sda</linux>
      </device_map_entry>
    </device_map>

Somehow the boot manager got installed to /dev/disk/by-id/ata-VBOX_HARDDISK_6VM23FX0 according to /boot/grub2/device.map. And of course, after packer clones the original VirtualBox VM the uuid changes and the clone does not boot. So I had to add

    echo "(hd0)   /dev/sda" > /boot/grub2/device.map  

to the start of the update script.
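A slightly more defensive variant of that one-liner, as a sketch only: it rewrites the map only when it actually references a `/dev/disk/by-id/` path and keeps a backup. The function name `fix_device_map` and its optional path argument are hypothetical additions for illustration; the default path is the one mentioned above.

```shell
#!/bin/bash
# Sketch only: fix_device_map is a hypothetical helper, not part of the original script.
# Rewrites a GRUB2 device map so that hd0 points at /dev/sda, but only if the
# current map references a by-id path (which no longer matches after cloning).
fix_device_map() {
    local map="${1:-/boot/grub2/device.map}"
    if grep -q '/dev/disk/by-id/' "$map" 2>/dev/null; then
        cp "$map" "$map.bak"              # keep a copy of the old map
        echo "(hd0)   /dev/sda" > "$map"
    fi
}
```

Calling `fix_device_map` with no argument at the top of the update script would do the same as the plain `echo` in the by-id case and leave any other map alone.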

Another problem that came up with VirtualBox is that the interface was renamed to eth1 after the reboot but the configurations were still pointing to eth0. The workaround for that was adding

    # remove the old udev rule before the reboot
    rm /etc/udev/rules.d/70-persistent-net.rules

This meant the old eth0 entry was removed before the reboot and eth0 was then set up again automatically at boot.

SuSE bug 821879 – udev update breaks network:

My workaround for the problem with udev breaking the network is this script:

    #!/bin/bash

    t=/tmp/update-script
    #$( mktemp )

    cat >${t} <<ENDL
    zypper --non-interactive modifyrepo --disable --no-refresh openSUSE-12.3-1.7
    zypper --non-interactive addrepo --check --refresh --name "Update Repository (Non-Oss)" http://download.opensuse.org/update/12.3-non-oss/ download.opensuse.org-12.3-non-oss
    zypper --non-interactive addrepo --check --refresh --name "Main Repository (NON-OSS)" http://download.opensuse.org/distribution/12.3/repo/non-oss/ download.opensuse.org-non-oss   
    zypper --non-interactive addrepo --check --refresh --name "Main Repository (OSS)" http://download.opensuse.org/distribution/12.3/repo/oss/ download.opensuse.org-oss     
    zypper --non-interactive addrepo --check --refresh --name "Main Update Repository" http://download.opensuse.org/update/12.3/ download.opensuse.org-update 
    zypper --non-interactive refresh
    zypper --non-interactive update
    sleep 10 
    reboot
    # to make sure we wait for the reboot 
    sleep 10 

    ENDL

    chmod u+x ${t}

    echo storing output under VM : ${t}.out
    nohup bash -vx ${t} > ${t}.out &

For packer I also added this:

    echo Initiating sshd shutdown and killing all ssh* processes.
    echo The install process will seem to hang for several minutes
    echo while the updates are installed.
    echo the reboot at the end will start sshd and the process will continue

    # prevent reconnect for next script until after the time out
    systemctl  stop sshd
    ps -ef | grep ssh | grep -v grep  | awk '{print $2}' | xargs kill
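The PID selection in that pipeline can also be done in a single awk step. The following is a sketch under assumptions, not what the packer script actually runs: `list_ssh_pids` is a hypothetical helper name, and it reads `ps -ef` style text on stdin so it can be tried on sample input without killing anything.

```shell
# Sketch: list_ssh_pids is a hypothetical helper name.
# Reads `ps -ef` style output on stdin and prints the PID column (field 2)
# of every line mentioning ssh, skipping the grep/awk helper processes.
list_ssh_pids() {
    awk '/ssh/ && !/grep|awk/ { print $2 }'
}

# Intended use inside the VM (left commented out so it is not run by accident):
# ps -ef | list_ssh_pids | xargs kill
```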

The OpenSuSE 12.3 issue with the slow network start-up time seems to be some timing issue. It might be because two processes start the network at the same time, one being auto plug and the other the regular network start-up. I was looking into it until it suddenly stopped happening, so I stopped looking into it.

The next problem was/is that packer does not always reconnect to the install after the reboot. That might be related to the network issue or not. But the odd thing is that it worked once with a two-hour period of successful sshd polling, and the next time it failed after one minute of failed polling. In between, I did change my script from running in the background with nohup (to work around the udev network issue), but I am not sure if that is related. I started to check the changes in to the https://bitbucket.org/markus_ebenhoeh/packer-opensuse.12.3 repository to track what I am doing.

shaving the yast

So I wanted to have a play with packer but decided to have a look at autoyast while I was at it.

OpenSuSE 12.3 has a bug [BNC#801878], though, that prevents the network and ssh from being started.

And it seems the does not work as expected either, since it always shut the network down or, with updates enabled, created an additional interface.

I ended up not caring about the package updates (since the vagrant box would be outdated anyway) and instead doing this:

    <scripts>
      <chroot-scripts config:type="list">
        <script>
          <!-- FIXING BNC#801878 -->
          <filename>chroot_bugfixing_801878.sh</filename>
          <source><![CDATA[#!/bin/bash
          #Fix ag_initscripts according to BNC#801878
          sed -i 's/.*\(\/sbin\/runlevel.*\)/SYSTEMCTL_OPTIONS="" \1/' /mnt/usr/lib/YaST2/servers_non_y2/ag_initscripts
          ]]>
          </source>
        </script>
      </chroot-scripts>
      <postpartitioning-scripts config:type="list">
        <script>
          <filename>ssh_key_installation.sh</filename>
          <source><![CDATA[#!/bin/bash
           # Evaluate the http server IP PORT to retrieve the ssh key and then set it up
           httpServer=$( grep AutoYaST /etc/install.inf | sed 's~.*\(http://.*/\).*~\1~' )
           httpUrl="${httpServer}/id_rsa_packer_images.pub"

           rootSsh=/mnt/root/.ssh
           mkdir -p ${rootSsh}
           wget "${httpUrl}" -O ${rootSsh}/authorized_keys
           chmod -R og-rwx ${rootSsh}

           verpackerSsh=/mnt/home/verpacker/.ssh
           mkdir -p ${verpackerSsh}
           wget "${httpUrl}" -O ${verpackerSsh}/authorized_keys
           chmod -R og-rwx ${verpackerSsh}
           chown -R 1000:100 /mnt/home/verpacker
           ]]>
          </source>
        </script>
      </postpartitioning-scripts>
      <post-scripts config:type="list">
        <script>
          <filename>post_script_no_network_shutdown.sh</filename>
          <source><![CDATA[#!/bin/bash
          # preventing the shut down of the network at the end of the installation
          sed -i 's~\(.*rcnetwork .*stop\)~# removed by post_script_no_network_shutdown \1~' /usr/lib/YaST2/startup/Second-Stage/S09-cleanup
          ]]>
          </source>
        </script>
      </post-scripts>
    </scripts>
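To see what the URL extraction in ssh_key_installation.sh does, the sed expression can be tried on a sample line. This is only a sketch: the contents of `sample` are an assumption about what the AutoYaST line in /etc/install.inf looks like, and the IP, port and file name are made up.

```shell
# Sketch: the sample line is an assumption, not copied from a real install.inf.
sample='AutoYaST: http://10.0.2.2:8081/autoyast.xml'

# Keep everything from "http://" up to and including the last "/".
httpServer=$(printf '%s\n' "$sample" | sed 's~.*\(http://.*/\).*~\1~')
echo "$httpServer"    # prints http://10.0.2.2:8081/
```

Because the trailing slash is kept, `${httpServer}/id_rsa_packer_images.pub` ends up with a double slash in the path.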

ok, the ssh-key is for packer and not the network issues.

Packer: ~1 evening; autoyast: ~5 evenings

I also decided to become a more sharing person:

https://bitbucket.org/markus_ebenhoeh/packer-opensuse.12.3

Upgraded OpenSuSE 12.1 RC1 to what I believe is 12.1 Beta 1 via yast2 online_update.

There were 1600 or so packages waiting to be updated. It seems that after a thousand, Yast2 shut down, but it was able to continue afterwards (i.e. everything was downloaded, but about 600 still had to be installed). The second batch included the kernel update from 3.0.something to 3.1.0-rc7-3 and the rest of the KDE 4.6 to KDE 4.7 update.

Some things were or are broken, so I had to:

Change one of my mandatory MDs to mount via UUID because it used to be md3 but now is md127, to be able to boot.
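For reference, a by-UUID fstab entry has the following shape. This is a sketch with made-up values: `fstab_line_by_uuid` is a hypothetical helper, and the UUID, the mount point `/data` and the `ext4` type are placeholders, not my real setup.

```shell
# Sketch: fstab_line_by_uuid is a hypothetical helper; all values are placeholders.
# Prints an /etc/fstab entry that mounts by filesystem UUID instead of a
# device name such as /dev/md3, which can change between boots.
fstab_line_by_uuid() {
    # $1 = filesystem UUID, $2 = mount point, $3 = filesystem type
    printf 'UUID=%s  %s  %s  defaults  1 2\n' "$1" "$2" "$3"
}

# On the real system the UUID would come from blkid, for example:
#   uuid=$(blkid -s UUID -o value /dev/md127)
fstab_line_by_uuid "00000000-0000-0000-0000-000000000000" /data ext4
```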

Reinstall the Nvidia driver (nouveau did not work in the 30 minutes that I was willing to play with it, and then it took me another 30 minutes to keep nouveau from being loaded in the first place).

Postgres did not start:

    Oct  9 09:13:57 meserv postgres: 2011-10-09 09:13:57 EST   FATAL:  could not create shared memory segment: Invalid argument
    Oct  9 09:13:57 meserv postgres: 2011-10-09 09:13:57 EST   DETAIL:  Failed system call was shmget(key=5995001, size=37978112, 03600).
    Oct  9 09:13:57 meserv postgres: 2011-10-09 09:13:57 EST   HINT:  This error usually means that PostgreSQL's request for a shared memory segment exceeded your kernel's SHMMAX parameter.  You can either reduce the request size or reconfigure the kernel with larger SHMMAX.  To reduce the request size (currently 37978112 bytes), reduce PostgreSQL's shared_buffers parameter (currently 4096) and/or its max_connections parameter (currently 104).
    Oct  9 09:13:57 meserv postgres: [1-4] #011If the request size is already small, it's possible that it is less than your kernel's SHMMIN parameter, in which case raising the request size or reconfiguring SHMMIN is called for.

The funny thing was that I first tried decreasing the default shared_buffers size of 32MB to 30MB in postgresql.conf, but that resulted in the message:

    DETAIL:  Failed system call was shmget(key=5995001, size=35807232, 03600).
    Oct  9 09:47:53 meserv postgres[:  2011-10-09 09:47:53 EST   HINT:  This error usually means that PostgreSQL's request for a shared memory segment exceeded your kernel's SHMMAX parameter.  You can either reduce the request size or reconfigure the kernel with larger SHMMAX.  To reduce the request size (currently 35807232 bytes), reduce PostgreSQL's shared_buffers parameter (currently 3840) and/or its max_connections parameter (currently 104).

So I reduced it to 20MB for now. Also, the message states 104 max_connections, but that is set to 100 (I think by default as well). So the values seem to get messed up somewhere (e.g. 32MB is not 37978112 bytes and 30MB is not 35807232 bytes), but I guess it could be that the 32MB are added to some base value …
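The numbers from the two hints can be checked with a bit of shell arithmetic. This sketch assumes the default PostgreSQL block size of 8192 bytes, so that the shared_buffers values in the log (4096 and 3840) are counts of 8 kB blocks; `shm_overhead` is a hypothetical helper name.

```shell
# Sketch: assumes the default PostgreSQL block size of 8192 bytes.
block=8192

# shm_overhead BUFFERS REQUEST_BYTES
# Prints how many bytes of the shmget request are not shared_buffers.
shm_overhead() {
    echo $(( $2 - $1 * block ))
}

echo "32MB setting: $(( 4096 * block )) bytes of buffers, $(shm_overhead 4096 37978112) bytes on top"
echo "30MB setting: $(( 3840 * block )) bytes of buffers, $(shm_overhead 3840 35807232) bytes on top"
```

Under that assumption the buffers come to exactly 32MB and 30MB, and in both cases roughly 4.2MB is requested on top, which would fit the guess about a base value.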

“NFS” was not working from clients

Thought it was my fault, because I was changing “PAM” settings and thought I might have messed up the “LDAP” authentication, but it seems that it has something to do with the firewall on the server. I tried turning it off, but it did not let itself be shut down via command-line rcSuSEfirewall2 or Yast. Anyway, things worked again after I turned off the firewall and stopped working when I turned the firewall back on. So now I have “MOUNTD_PORT” set to “12345” in /etc/sysconfig/nfs and I opened the ports 111, 991, 2049 and 12345 in the firewall. Not sure why this is now necessary, but I assume it is about the order of the boot sequence, and possibly I had already changed this before (I kept the start-up from running parallel tasks).
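Written out as sysconfig settings, this could look roughly like the sketch below. `MOUNTD_PORT` and the port numbers are the ones mentioned above; putting the ports into `FW_SERVICES_EXT_TCP` and `FW_SERVICES_EXT_UDP` (the external zone, both protocols) is an assumption on my part about how they would be opened in SuSEfirewall2.

```shell
# Sketch only: sysconfig files use shell variable syntax.

# /etc/sysconfig/nfs
MOUNTD_PORT="12345"

# /etc/sysconfig/SuSEfirewall2
# Assumption: the NFS clients come in through the external zone.
FW_SERVICES_EXT_TCP="111 991 2049 12345"
FW_SERVICES_EXT_UDP="111 991 2049 12345"
```

Both the NFS server and the firewall have to be restarted for the new values to be picked up.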

log settings

I have a few log settings which I adapted for rsyslog after updating to 12.1 “RC1”. They stopped working after updating to 12.1 B1 and now “named” is spamming my syslog again. I temporarily added $IncludeConfig /etc/rsyslog.d/*.conf to /etc/rsyslog.early.conf (it used to be only in /etc/rsyslog.conf) until I check why rsyslog is not being started.

Apache web server

rcapache2 (or /etc/init.d/apache2 start) does not start (no log, no nothing); it just sits there for a minute or more. I found some indicators but no solution, so I decided to start it via apache2ctl start instead, for now.

bash -vx /etc/init.d/apache2 start ends with:

    + + exec /bin/systemctl start apache2.service
    Job failed. See system logs and “systemctl status” for details.

kmail2 (kde4.7) no more mails, no migration, many errors:

I downgraded kmail to 4.4 (with whatever dependencies Yast wanted me to downgrade) and uninstalled kaddressbook because it had a conflict and we don’t use it. Since we want to use NX, I went for a quick solution, which was downgrading. That worked without a hitch, which impressed me.