Crucial Paradigm Australia Official Blog
March 2011
If you ever find your server won’t exit maintenance mode after a power failure, or even during normal operation, you may have a corrupt state.db file.
The error message you may get while trying to make a change on the node is:
Cannot forward messages because the host cannot be contacted. The host may be switched off or there may be network connectivity problems.
If this is in fact a state.db corruption in a pool, you can try the following steps to fix the issue:
- cd /var/xapi
- ls -al -> check the results of this; usually state.db will have an old timestamp – the time at which it became corrupted.
- mv state.db state.db_bak
- service xapi restart
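The steps above can be sketched as a small shell helper. This is only a sketch: backup_state_db is a hypothetical function name, and the timestamped backup suffix is my own addition (the steps above use a plain state.db_bak) so that repeated runs do not overwrite an earlier backup.

```shell
#!/bin/sh
# Hypothetical helper: move state.db aside inside the given directory
# (defaults to /var/xapi, as in the steps above) and print the backup path.
backup_state_db() {
    dir="${1:-/var/xapi}"
    db="$dir/state.db"
    if [ ! -f "$db" ]; then
        echo "no state.db found in $dir" >&2
        return 1
    fi
    # Timestamped name (an assumption, not from the original steps)
    bak="$db.bak.$(date +%Y%m%d%H%M%S)"
    mv "$db" "$bak" && echo "$bak"
}

# Usage on the affected host, followed by the xapi restart from the steps:
#   backup_state_db && service xapi restart
```

Keeping the old file around means it can still be inspected, or moved back, if the restart does not help.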
If this does not fix the issue, I’ve heard of users opening state.db with an XML editor (it’s just an XML file) and fixing the errors by hand. Other users have mentioned that the cause could be an incorrect /etc/xensource/pool.conf on the node that is stuck in maintenance mode. The format should be “slave:[masterip]”.
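A quick way to sanity-check the pool.conf format is a sketch like the following. check_pool_conf is a hypothetical helper name. It accepts “slave:” followed by something IPv4-shaped, and also the single word “master”, which as far as I know is what the pool master’s own pool.conf contains (that part is not from the reports above). It does not verify that the IP is the correct master.

```shell
#!/bin/sh
# Hypothetical helper: check that pool.conf looks like "master" or "slave:<ip>".
check_pool_conf() {
    file="${1:-/etc/xensource/pool.conf}"
    # Strip whitespace and newlines; bail out if the file cannot be read
    content=$(tr -d '[:space:]' < "$file") || return 2
    case "$content" in
        master)
            echo "master"
            return 0
            ;;
        slave:*)
            ip="${content#slave:}"
            # Only checks the dotted-quad shape, not that each octet is 0-255
            if printf '%s\n' "$ip" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$'; then
                echo "slave of $ip"
                return 0
            fi
            ;;
    esac
    echo "unexpected pool.conf content: $content" >&2
    return 1
}

# Usage:
#   check_pool_conf /etc/xensource/pool.conf
```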
So today I was tasked with creating an entire system that does automated Ubuntu installations. The first hurdle was actually getting automated Ubuntu deployment working; unfortunately there was a lot of misguidance lying around the net on the best way to approach this, so this is my attempt… it worked and I think it does the job…
You can use the following commands to change the Citrix XenServer pool’s master server:
First you need to disable high availability:
xe pool-ha-disable
Then you need to find out which host you want to change the master to:
xe host-list
Then change the master:
xe pool-designate-new-master host-uuid=[uuid of new master host]
Turn HA back on:
xe pool-ha-enable
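The sequence above could be wrapped up roughly as follows. This is a sketch under a few assumptions: change_pool_master is a hypothetical helper name, HA is currently enabled on the pool, and the new master is identified by its name-label (looked up with xe host-list and the --minimal flag, which prints just the UUID). Re-enabling HA straight after the master change may need a short wait in practice while the pool settles.

```shell
#!/bin/sh
# Hypothetical helper: switch the pool master to the host with the given name-label.
change_pool_master() {
    new_master_name="$1"
    # Resolve the host's UUID from its name-label
    uuid=$(xe host-list name-label="$new_master_name" --minimal)
    case "$uuid" in
        "")
            echo "no host found with name-label '$new_master_name'" >&2
            return 1
            ;;
        *,*)
            # --minimal returns a comma-separated list when several hosts match
            echo "more than one host matches '$new_master_name'" >&2
            return 1
            ;;
    esac
    xe pool-ha-disable || return 1
    xe pool-designate-new-master host-uuid="$uuid" || return 1
    xe pool-ha-enable
}

# Usage:
#   change_pool_master xenserver-02
```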
Sometimes when shutting down a VM via the standard XenCenter interface or the command line (xe) on a Citrix XenServer machine, the shutdown will not complete. The first thing to try is a force shutdown on the VM:
xe vm-shutdown --force vm=[vm name]
If this still doesn’t work you can try taking a look in the XenServer pending task queue:
xe task-list
Then cancel the tasks that seem to be holding up the system:
xe task-cancel uuid=[task uuid]
If this still fails you can try the following:
xe-toolstack-restart
If you have noticed your Citrix XenServer machine grind to a halt due to a process called “cdrommon”, this is a known bug with a particular CD-ROM drive in XenServer. A fix for the bug can be found here:
http://support.citrix.com/article/CTX126919
In short, to check whether you are affected by the bug, run the following:
dmesg | grep DV-28
This bug applies to “TEAC DV-28E-V” CD-ROM drives.
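If you want to run the same check across several hosts, it can be wrapped in a tiny function. This is only a sketch; has_teac_dv28 is a hypothetical name, and it simply looks for the “DV-28” string in whatever kernel log text it is given on stdin.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the text on stdin mentions a DV-28 drive.
has_teac_dv28() {
    grep -q 'DV-28'
}

# Usage:
#   if dmesg | has_teac_dv28; then
#       echo "TEAC DV-28 drive detected, see CTX126919"
#   fi
```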
Adding a local storage repository to a Citrix XenServer 5.x server can be done as follows:
NOTE: Extra precaution should be taken while using the following commands, as it could result in data loss. Only perform these steps if you know what you are doing:
- Locate the disk ID by using the following command:
# ls -al /dev/disk/by-id
- Create the local storage repository:
# xe sr-create content-type=user device-config:device=/dev/disk/by-id/<scsi-xxxxxxxxxxxxxxxxxxxxxxxxx> name-label="Local Storage X" shared=false type=lvm
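Given the data-loss warning, one cautious approach is to build the command and print it for review before running anything. The sketch below does only that; build_sr_create_cmd is a hypothetical helper name, and it never calls xe itself. It checks that the device path exists and then prints the sr-create line from the step above with your values filled in.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) the sr-create command for a device.
build_sr_create_cmd() {
    device="$1"
    label="$2"
    # Refuse to build a command for a path that does not exist
    if [ ! -e "$device" ]; then
        echo "device not found: $device" >&2
        return 1
    fi
    printf 'xe sr-create content-type=user device-config:device=%s name-label="%s" shared=false type=lvm\n' \
        "$device" "$label"
}

# Usage (the by-id path is whatever ls -al /dev/disk/by-id showed for your disk):
#   build_sr_create_cmd /dev/disk/by-id/<scsi-id> "Local Storage 2"
```

Once the printed line has been double-checked against the disk ID, it can be pasted and run by hand.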
This is more of an informational post in case anybody comes across a similar issue… While doing some extensive testing on our new HP C7000 blade setup running 10GbE Flex10 modules, we came across an issue where the Flex10 modules started playing up and would not load their config, causing all network connections on the affected switches to go down.
When logging into the HP Virtual Connect Manager, it reported a few issues with the domain, including:
- The external links from the chassis were showing no link
- A “No Communication” error on the Domain Status page, stating: The Virtual Connect Manager is unable to communicate with the Onboard Administrator. Please ensure the module has an IP address.
- Cross Link/Stacking links showing as down in some instances
In some instances it was not even possible to bring up the Virtual Connect Manager; it just sat on a loading page for an extended period of time. Running the vcutil health check would show random errors such as the following:
vcutil.exe -a healthcheck -i XXXXXX -u Administrator -p XXXXXX
Domain Configuration: Not In Sync
Module Configuration: Not In Sync
Module Configuration: Invalid
Taking a supportdump using the vcutil package would show errors such as the following:
Saving Virtual Connect Support Information…Error
Could not get Primary VCM IP Address from Bay 1
This problem is solved by removing the DNS servers listed in “Enclosure Bay IP Addressing” for the Flex10 modules, and then rebooting them.
After solving this problem with the help of HP support, I was pointed to this article which seems to describe this exact problem: http://h20000.www2.hp.com/bizsupport/TechSupport/Document.jsp?objectID=c02720395&lang=en&cc=us&taskId=101&prodSeriesId=3794423&prodTypeId=3709945
I hope this saves someone facing the issue a bit of time resolving it!

