ReleaseNotes/080125

Release Notes - 25 Jan 2008

Summary

The following RPMs have been released to:

  • improve the gridpulse monitoring output provided to the GOC
  • provide scripts for upgrading to VDT1.8.1
  • update the MIP

APAC-gateway-gridpulse-0.2-12.noarch.rpm
Gbuild-1.8-3.noarch.rpm
APAC-lxml-1.3.6-1.i386.rpm
APAC-glue-schema-0.1-4.noarch.rpm
APAC-mip-module-py-1.0.444-5.noarch.rpm
APAC-mip-0.2.7-5.noarch.rpm
APAC-mip-globus-0.1-5.noarch.rpm

Review

Reviewed by:

  • Gerson 25/1/08

Reviewed and tested by:

  • Darran 30/1/08 (on a VDT1.6 VM with running MIP) - took less than 20min

Upgrading gridpulse

This should be done on all machines reporting to the GOC.

At least: grid-gateway, ng1, ng2, ngdata, nggums, ngportal

In the case of grid-gateway where gridpulse.sh was manually installed, first:

  1. Delete the old script: rm /usr/local/bin/gridpulse.sh
  2. Remove the cron entry: crontab -e
  3. Continue with Step 2 below

Instructions (for machines with the previously named Gpulse installed):

  1. Delete the old package: yum remove Gpulse
  2. Install the new one: yum install APAC-gateway-gridpulse
  3. Test the output: /usr/local/bin/gridpulse
  4. Check the GOC page after cron has had a chance to run the script (30min)
  5. Make sure no extra output is being sent to /var/mail/root by cron
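
Step 5 can be checked with a small helper like the one below. This is only a sketch: the function name gridpulse_mail_lines is hypothetical, and it assumes that cron mail caused by the script carries the word "gridpulse" in its Subject line (cron normally puts the command there).

```shell
#!/bin/sh
# Count Subject lines that mention gridpulse in a mail spool file.
# Assumption: unwanted cron output shows up as mail whose Subject
# contains the command name. Defaults to root's spool.
gridpulse_mail_lines() {
    spool=${1:-/var/mail/root}
    if [ ! -r "$spool" ]; then
        # No readable spool means no mail was delivered.
        echo 0
        return 0
    fi
    # grep -c prints the count; "|| true" hides its non-zero exit on 0 matches.
    grep -c -i '^Subject:.*gridpulse' "$spool" || true
}
```

Run gridpulse_mail_lines before the upgrade and again after cron has run; the count should not increase.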

Upgrading NG2 VDT 1.6 to 1.8 and MIP

These instructions serve as a guide to quickly upgrade an existing NG2 machine, probably running CentOS4 with VDT1.6 installed using the InstallNg2 instructions.

This process should take approximately 30min - the longest part is waiting for VDT 1.8.1 downloads.

Check there are currently no grid jobs running (qstat, …) and possibly plan downtime in advance.
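
The check for running jobs can be scripted roughly as follows. This is a sketch that assumes the default PBS qstat layout (two header lines, job state in the fifth column); count_running_jobs is a hypothetical name, and other schedulers or qstat options will need a different filter.

```shell
#!/bin/sh
# Read qstat output on stdin and print the number of jobs in state R.
# Assumption: default PBS layout "Job id  Name  User  Time Use  S  Queue",
# so after the two header lines the state is field 5.
count_running_jobs() {
    awk 'NR > 2 && $5 == "R" { n++ } END { print n + 0 }'
}

# Usage on the gateway:
#   qstat | count_running_jobs
```

A result of 0 means no job is currently in the running state; queued jobs are not counted.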

Instructions:

  1. OPTIONAL - Create a snapshot, following the instructions under Safety with a Snapshot below
  2. Upgrade gridpulse first (above)
  3. Remove /etc/yum.repos.d/jcu-apac.repo
  4. Update: yum update
    • this will also upgrade the MIP
  5. Disable services: vdt-control --force --off
  6. Rename existing installation: mv /opt/vdt /opt/vdt16
  7. Disable PRIMA configuration: rm -f /etc/grid-security/prima-authz.conf
  8. Run the 1.8.1 build script (set http_proxy first if necessary): /usr/local/bin/BuildNg2Vdt181.sh
  9. Restore your original pbs.pm: cp /opt/vdt16/globus/lib/perl/Globus/GRAM/JobManager/pbs.pm /opt/vdt/globus/lib/perl/Globus/GRAM/JobManager/pbs.pm
  10. If you did not delete the old PRIMA configuration in Step 7
    • Enable PRIMA manually: /opt/vdt/vdt/setup/configure_prima_gt4 --enable --gums-server nggums.DOMAIN
  11. Enable auditing: /usr/local/bin/AddAuditNg2Vdt181.sh
  12. Secure the MIP and prepare hierarchy.xml: /usr/local/mip/config/globus/mip-globus-config -l /opt/vdt/globus install
    • this is normally handled by the APAC-mip-globus post-installation script (including Step 4), but since the package is probably already installed and working with VDT1.6, the script will not get a chance to run again
  13. Make sure Globus is enabled and started: chkconfig globus-ws on && service globus-ws stop; service globus-ws start
    • check for errors in /opt/vdt/globus/var/container.log
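
Steps 3 to 13 can be collected into one script, sketched below. The commands are copied from the steps above; the run helper, the DRY_RUN variable and the upgrade_ng2 function are hypothetical names added for illustration. By default the script only prints each command. The conditional Step 10 is left out because Step 7 is performed here.

```shell
#!/bin/sh
# Dry-run by default: set DRY_RUN=0 (as root) to really execute the commands.
DRY_RUN=${DRY_RUN:-1}
VDT=/opt/vdt
OLD=/opt/vdt16
PBS_PM=globus/lib/perl/Globus/GRAM/JobManager/pbs.pm

run() {
    # Print the command; execute it only when DRY_RUN=0.
    echo "+ $*"
    [ "$DRY_RUN" = "0" ] || return 0
    "$@"
}

upgrade_ng2() {
    run rm /etc/yum.repos.d/jcu-apac.repo || return 1
    run yum update || return 1                        # also upgrades the MIP
    run vdt-control --force --off || return 1
    run mv "$VDT" "$OLD" || return 1
    run rm -f /etc/grid-security/prima-authz.conf || return 1
    # Set http_proxy before this step if the downloads need a proxy.
    run /usr/local/bin/BuildNg2Vdt181.sh || return 1
    run cp "$OLD/$PBS_PM" "$VDT/$PBS_PM" || return 1
    run /usr/local/bin/AddAuditNg2Vdt181.sh || return 1
    run /usr/local/mip/config/globus/mip-globus-config -l "$VDT/globus" install || return 1
    run chkconfig globus-ws on || return 1
    run service globus-ws stop                        # may fail if not running
    run service globus-ws start
}
```

Calling upgrade_ng2 with DRY_RUN unset prints the twelve commands for review. The function stops at the first failing command, apart from the service stop, which is allowed to fail.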

Tests:

  1. Test gridpulse output: /usr/local/bin/gridpulse
  2. Submit a small job
  3. Check for a valid jobid entry in /var/log/messages. Force the hourly script to run first: /etc/cron.hourly/auditquery
  4. Check MIP output: /usr/local/mip/mip
  5. Check published MIP data: WebMDS site resources
    • This may take up to 15min to be updated properly

Safety with a Snapshot

Using an LVM snapshot means that you can easily reboot the machine in its old state if something goes wrong and there is not enough time to resolve the issue.

Shut down the virtual machine first.

You may need to reduce the existing volume size (e.g. 16G to 8G) to make enough space. BEWARE: this in itself can be risky. Double-check the commands carefully, and/or mount the file system first and create a tar file as a backup.

e2fsck -f /dev/VolumeGroup00/ng2-root
resize2fs -p /dev/VolumeGroup00/ng2-root 8G
lvreduce -L 8G /dev/VolumeGroup00/ng2-root
e2fsck -f /dev/VolumeGroup00/ng2-root
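
Before shrinking, it is worth confirming that the data actually fits in the smaller size. The helper below is a sketch only: fits_after_shrink is a hypothetical name, and the 10 percent safety margin is an arbitrary default, not a value from these notes.

```shell
#!/bin/sh
# Succeed if USED_KB (1K blocks in use) fits in TARGET_GB gigabytes while
# leaving MARGIN_PCT percent of the target free (default 10, an assumption).
fits_after_shrink() {
    used_kb=$1
    target_gb=$2
    margin_pct=${3:-10}
    target_kb=$((target_gb * 1024 * 1024))
    limit_kb=$((target_kb * (100 - margin_pct) / 100))
    [ "$used_kb" -le "$limit_kb" ]
}

# Usage, with the file system mounted at a hypothetical /mnt/ng2:
#   used=$(df -P -k /mnt/ng2 | awk 'NR == 2 { print $3 }')
#   fits_after_shrink "$used" 8 && echo "safe to shrink to 8G"
```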

Create the snapshot. BEWARE: where possible, the snapshot should be the same size as the original volume, which guarantees that it will never fill. A full snapshot immediately becomes unavailable, cannot be repaired, and must be deleted before the original volume can continue to operate.

modprobe dm-snapshot
lvcreate -s -n ng2-080126 -L 8G /dev/VolumeGroup00/ng2-root
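
If the snapshot has to be smaller than the original volume, its fill level can be watched so it is removed before it fills. The following is a sketch under assumptions: snapshot_over_threshold is a hypothetical helper, the 80 percent default is arbitrary, and the usage line assumes the installed LVM2 lvs supports the snap_percent report field (check lvs -o help on your system).

```shell
#!/bin/sh
# Succeed if the snapshot usage PCT (as printed by lvs, e.g. " 85.30")
# is at or above THRESHOLD percent (default 80, an arbitrary choice).
snapshot_over_threshold() {
    # Strip padding and the decimal part so the shell can compare integers.
    pct=$(echo "$1" | tr -d ' ' | cut -d. -f1)
    [ "${pct:-0}" -ge "${2:-80}" ]
}

# Usage (assumes the snap_percent field is available):
#   pct=$(lvs --noheadings -o snap_percent VolumeGroup00/ng2-080126)
#   snapshot_over_threshold "$pct" && echo "snapshot nearly full"
```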

Restart the machine. You can edit /etc/xen/ng2 to boot the machine from the snapshot, or make the snapshot volume available read-only so it can be mounted inside the virtual machine.

It can also be very useful to boot an extra VM from the snapshot, with a different IP address, for testing changes. The hostname would then not match the IP address, so secure Globus connections will fail, but everything else should still work.