Recently I ran into some problems with VMware vCloud Director 1.5.1 (although I guess these issues might also occur on vCD 5.x). When a user tried to connect to the console of a virtual machine, the status remained on “Connecting…” and in the end no console connection was established. This issue occurred in a multi-cell vCloud Director environment.
The issue seemed to occur randomly, and sometimes it did not occur at all. Sometimes it could be resolved by rebooting a single vCD cell; at other times it was only resolved after rebooting all the vCloud Director cells.
Because we’re talking about a multi-cell environment, it is not always clear whether the issue occurs on one failing cell only or on all the cells. After some investigation I could pinpoint one of the cells that had trouble. Analyzing vcloud-debug-container.log (available in /opt/vmware/vcloud-director/logs) showed an interesting message:
2012-11-30 10:58:40,469 | DEBUG | consoleproxy | LoginByCookieHandler | Logging in via a COOKIE: COOKIE-ID
2012-11-30 10:58:40,470 | WARN | consoleproxy | LoginByCookieHandler | Cannot parse the cookie: COOKIE-ID
* COOKIE-ID = The ID of the cookie
There seems to be a problem with a login based on a cookie. These cookies are used when your session is load balanced from one cell to the other. The message “Cannot parse the cookie” was only shown when the console connection remained in the “Connecting…” status.
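To check each cell for this warning without reading the whole log by hand, a small script can filter for it. The following is only a minimal sketch: the regular expression is derived from the two log lines quoted above and may need adjusting for other vCD versions, and the function name find_cookie_warnings is my own, not part of any VMware tooling.

```python
import re
import sys
from datetime import datetime

# Layout assumed from the sample lines above:
# timestamp | LEVEL | thread | source | message
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s*\|\s*"
    r"(?P<level>\w+)\s*\|\s*(?P<thread>[^|]*?)\s*\|\s*"
    r"(?P<source>[^|]*?)\s*\|\s*(?P<msg>.*)$"
)


def find_cookie_warnings(lines):
    """Return (timestamp, message) pairs for 'Cannot parse the cookie' warnings."""
    hits = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match is None:
            continue  # skip stack traces and lines in another layout
        message = match.group("msg").strip()
        if match.group("level") == "WARN" and message.startswith("Cannot parse the cookie"):
            timestamp = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S,%f")
            hits.append((timestamp, message))
    return hits


if __name__ == "__main__":
    # Pass the path to vcloud-debug-container.log as the first argument.
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log_file:
        for timestamp, message in find_cookie_warnings(log_file):
            print(timestamp.isoformat(), message)
```

Running this against the log of every cell shows which cells log the warning and when, which helps to narrow the problem down to a single cell.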
The a-ha moment – vCloud Director timing issues
The question is: what was the reason this cookie could not be parsed? The answer can be found in the date and time on the vCloud Director cells. In my case one of the cells was not in sync with the other cells, and guess what… exactly this cell showed the console connection issue and the warnings in vcloud-debug-container.log.
A short investigation revealed that NTP was not functioning correctly on the cells (for reasons not relevant to this article). After solving the NTP issue, console connections could be created successfully without any problem.
So why was the console connection sometimes working correctly? Well, everything depends on the date and time on every vCloud Director cell. As long as there’s not much load on the cells (they’re virtual machines in this scenario), time will remain in sync. Once you put load on the cells, time will drift and connection problems might occur in the end…
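To spot a drifting cell before users notice, the clocks of the cells can be compared with each other. The sketch below assumes you have already collected the current time of every cell as epoch seconds at roughly the same moment (for example by running date +%s on each cell). How you collect them is up to you, and the cell names, the function name find_drifting_cells and the 2 second tolerance are purely illustrative assumptions on my part.

```python
from statistics import median


def find_drifting_cells(cell_epochs, tolerance_s=2.0):
    """Return {cell: offset_in_seconds} for cells whose clock deviates from
    the median of all cells by more than tolerance_s.

    cell_epochs maps a cell name to the epoch seconds that cell reported.
    tolerance_s is a hypothetical threshold, not a documented vCD limit.
    """
    reference = median(cell_epochs.values())
    return {
        name: reported - reference
        for name, reported in cell_epochs.items()
        if abs(reported - reference) > tolerance_s
    }


if __name__ == "__main__":
    # Hypothetical readings: the third cell runs about 47 seconds ahead.
    readings = {"cell1": 1354273120.0, "cell2": 1354273120.4, "cell3": 1354273167.0}
    for cell, offset in find_drifting_cells(readings).items():
        print(f"{cell} is off by {offset:+.1f} s")
```

The median is used as the reference so that a single bad clock does not shift the baseline. With only two cells the median sits halfway between them, so in that case both cells are reported with half of the real difference each.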