Hi everyone,
I'm here as a last resort, literally. Here is my problem: the monitors cut in and out periodically throughout the day for all my users. Sometimes it happens once an hour, sometimes every 10 minutes, and sometimes it won't happen for a day or two. A user will be working and both screens just go black for a second or two, then they get their desktops back.
We have a View 5.3 instance running (we've been using it since 5.1) in a 50-person office with a full gigabit network that can easily handle 200 users' worth of traffic. It was overbuilt on purpose. I have 20 users that I've moved to View. Our goal from day 1 was to get as close to a real desktop experience for my users as possible.
I purchased a 4-node Super Micro with 2 x 6-core Intel Xeons, 128GB RAM per node, 2 x 1Gb NICs per node, and 8 Intel 520 series SSDs in RAID 10 (2.8TB) per node. The system is a beast; we hammered it to hell and back before we installed View on it.
We purchased HP T310 Zero Clients.
We are using GPOs to manage the settings (all per recommendations from VMware and Teradici).
Base Image (and desktops)
Windows 7 x64
2 Cores
6GB RAM
60GB HDD (not using persona or persistent disks)
Linked Clone dedicated pools
Enable 3D rendering
Set Video RAM to 512MB
Force PCoIP as connection type
Aero Enabled Desktops
Office 2013 primary set of apps used
Initially the pilot went very well, but I quickly started getting complaints that it was slow to move windows around and to scroll in IE and Excel, and management asked me to fix this.
So we bought some APEX 2800 cards from Teradici, and so began my nightmare. As soon as we put these cards in and got the drivers installed (2.3.1 at the time, now up to 2.3.3), users started complaining that the monitors were flickering and cutting in and out periodically. View was upgraded to 5.2 and APEX came out with 2.3.2, but the problems persisted. In January we upgraded to View 5.3 and APEX 2.3.3, and moved our infrastructure from vSphere 5.1 to 5.5, including upgrading all the ESXi hosts. The problem persists.
So I opened a case with Teradici in the first week of January. After a bit of runaround and not getting called back, I escalated the case and got a call. I was asked to enable full debug logging and start submitting logs. I sent two weeks' worth of logs to Teradici, and initially I got this reply back:
"I went through the logs you submitted and its the same issue in each log. There is an "Imaging Timer Expiry" at each of the times you indicated the issue was occurring. This entry signifies that imaging on the VM has not received a response from the client in the last 5 seconds, and imaging tries to restart the imaging session (causing screen flicker)"
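For anyone who wants to check their own logs for the same entry, here is a rough Python sketch of how the occurrences could be pulled out. It only assumes the logs are plain text and contain the phrase "Imaging Timer Expiry" as quoted above. The function names, the `*.log` file pattern and the folder layout are my own placeholders, not anything from Teradici or VMware.

```python
from pathlib import Path

# Phrase quoted in the support reply; everything else here is an assumption.
MARKER = "Imaging Timer Expiry"


def find_marker_lines(log_text, marker=MARKER):
    """Return (line_number, line) pairs for each line containing the marker.

    The match is a case-insensitive substring search, so it does not depend
    on any particular timestamp or field layout in the log.
    """
    needle = marker.lower()
    return [
        (number, line.rstrip())
        for number, line in enumerate(log_text.splitlines(), start=1)
        if needle in line.lower()
    ]


def scan_directory(log_dir, pattern="*.log"):
    """Scan files matching `pattern` in `log_dir`; return {path: hits}.

    `pattern` is a hypothetical default; adjust it to the real file names.
    Files without any hit are left out of the result.
    """
    results = {}
    for path in sorted(Path(log_dir).glob(pattern)):
        # errors="replace" keeps the scan going if a file has odd bytes.
        text = path.read_text(errors="replace")
        hits = find_marker_lines(text)
        if hits:
            results[str(path)] = hits
    return results
```

Pointing `scan_directory` at whatever folder holds the exported logs gives a per-file list of matching lines, which can then be compared by hand against the times users reported the screens going black.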
Things I've done on my own while waiting for Teradici:
Rebuilt View from scratch
Rebuilt the base image from scratch
Optimized per VMware and Teradici
Moved the VLAN my users are on to the same switch as the ESXi server
Recomposed all pools and desktops
Firmware upgrade on all Cisco switches
Firmware upgrade to 4.2.0 on all Zero Clients
Latest Teradici APEX ESXi and VM drivers
Nothing has helped. Management is really putting pressure on me to fix this.
We've googled every possible combination of this error and I've found nothing. Nothing here on these forums, on Teradici's, or anywhere on Google, yet they tell me others are having the issue and engineering is looking into it.
I'm at a total and complete loss. Can anyone offer any suggestions? Has this happened to any of you?
Attached are some logs I've sent to Teradici (one set of the hundreds).
The flicker in this set of logs happened at 2:29pm. I don't know what else to do. I've tried setting Aero to best performance, letting Windows manage it, etc. I can't crack this nut.
I cannot disable Aero, as that was one of the requirements management wouldn't flex on: the end-user experience had to be the same as on their laptops. For the most part it is, except for this bloody screen flickering that's driving my staff nuts, upsetting management, and making me want to give up and try something else.
My Cisco guy has had traces running on the ports of some affected users, and we cannot find anything wrong with the network. All priorities for PCoIP are set, and he has assured me the traffic for the 20 users is not even coming close to a gigabit, so we can't be saturating the link.
EDIT: Would adding some Nvidia cards and returning these APEX cards to Teradici solve this problem? We're prepared to get a GRID card for eval, even though no one really uses any 3D apps. I really need to get this resolved once and for all.