1. #1

    Maximising 3DMark11 score on a Hyper-V VM

    Good day,

    I am running 3DMark11 benchmark tests on a guest-partition VM on Hyper-V 2012 (Windows Server 2012 Datacenter edition b9200 with the Hyper-V role) and finding large differences compared with the results obtained on the root partition. Apart from the difference in score, the user experience is poorer on the guest: the videos are perceptibly jerky, whereas on the root partition they are fluid.

    Can anyone suggest optimisations? Specs and results are shown below.

    Cheers!


    ###START SPECS AND RESULTS

    The host has the following hardware specs:
    1. Intel Xeon E5645 (6-core CPU, 12 logical processors)
    2. 48GB RAM
    3. Nvidia GeForce GTX560 Ti

    The root partition of this installation is running Windows Server 2012 Datacenter edition b9200 and has driver version 306.23 installed.

    Benchmark results for the root partition are:
    (a) Graphics score: 4300
    (b) Physics score: 6612
    (c) Combined score: 4330
    (d) 3DMark score: 4541

    On the other hand, the VM has the following hardware specs:
    1. 4 vCPUs
    2. 4GB RAM
    3. RemoteFX 3D Video Adapter


    Benchmark results for the VM (guest partition) are:
    (a) Graphics score: 3856
    (b) Physics score: 3414
    (c) Combined score: 2084
    (d) 3DMark score: 3491

    ###END SPECS AND RESULTS

    --- Post Update ---

    Root partition: 4541
    Guest partition: 3299
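
    To put the gap in proportion, the sub-scores listed in the specs and results above can be turned into percentage drops. This is only a minimal sketch: the dictionary and function names are illustrative, and it uses the four sub-scores from the results block, not the figure in the post update.

```python
# Scores copied from the specs and results listed in this post.
ROOT = {"graphics": 4300, "physics": 6612, "combined": 4330, "3dmark": 4541}
GUEST = {"graphics": 3856, "physics": 3414, "combined": 2084, "3dmark": 3491}


def relative_drop(root_score, guest_score):
    """Return the guest's shortfall as a percentage of the root score."""
    return 100.0 * (root_score - guest_score) / root_score


def drops(root, guest):
    """Map each score name to its percentage drop, rounded to one decimal."""
    return {name: round(relative_drop(root[name], guest[name]), 1) for name in root}


if __name__ == "__main__":
    for name, pct in drops(ROOT, GUEST).items():
        print(f"{name:>8}: {pct:.1f}% lower on the guest")
```

    On these figures the Graphics score falls by roughly a tenth, while the Physics and Combined scores fall by about half.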

  2. #2
    Futuremark Staff | Joined: Jun 2000 | Location: Finland | Posts: 9,123
    Most likely the virtualization adds overhead to the graphics processing. There are no "videos" in the benchmark - the scenes are rendered in real time and push the graphics hardware and the CPU (in the Physics and Combined tests) pretty hard. A difference like this is not at all unexpected. Virtual machines always carry a performance penalty.

    Also note that RemoteFX is currently not supported by Futuremark. While our benchmarks may work with it, we have not validated them against this configuration. It is a very new technology and it pulls off some fairly advanced feats with cutting edge graphics code. The fact that 3DMark 11 runs for you over it (even with a noticeable performance penalty) is a small miracle in itself.

    The major difference in score - the drop in Physics score - is pretty clearly because you cut the processors from 6 physical (12 logical) down to 4 logical processors. Unsurprisingly, this causes the Physics and Combined scores to take a major hit. 3DMark 11 utilizes all available cores (physical and logical), up to, if I recall right, 24 cores, when running the Physics and Combined tests.

    (Game Tests, which produce the graphics score, are almost 100% dependent on the video card)
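
    As a rough check on the core-count explanation, the share of logical processors the guest keeps can be set against the share of the Physics score it keeps. This is a back-of-the-envelope sketch using only the figures from the first post. The names are illustrative, and it assumes nothing about how Hyper-V schedules vCPUs onto logical processors.

```python
# Figures taken from the first post in this thread.
ROOT_LOGICAL_PROCESSORS = 12
GUEST_VCPUS = 4
ROOT_PHYSICS = 6612
GUEST_PHYSICS = 3414


def retained_fraction(root_value, guest_value):
    """Fraction of the root partition's value that the guest retains."""
    return guest_value / root_value


cpu_fraction = retained_fraction(ROOT_LOGICAL_PROCESSORS, GUEST_VCPUS)
physics_fraction = retained_fraction(ROOT_PHYSICS, GUEST_PHYSICS)

if __name__ == "__main__":
    print(f"Logical processors retained: {cpu_fraction:.1%}")
    print(f"Physics score retained:      {physics_fraction:.1%}")
```

    The guest keeps a third of the logical processors but about half of the Physics score, so the relationship is not strictly proportional. That would be consistent with the extra Hyper-Threading logical processors contributing less than a full core each, though these figures alone cannot confirm it.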

  3. #3
    Thank you for the feedback, Jarnis.

    I agree that the difference in the physics test scores is at least partially due to the difference in the number of cores between the root and the guest partition. I write "partially" because, although the score increases monotonically with the number of vCPUs in the guest, it still does not reach the root partition's score. I have not varied other parameters to identify the root cause of the difference because, at this point, I am more concerned with improving the user experience during the graphics tests, where the playback is too jerky.

    The graphics score has not been substantially affected by changing the RAM or the number of vCPUs. I would like to find an influential factor, and so far the only candidate I have identified is the amount of "graphics memory" (128MB dedicated video RAM, 256MB shared RAM) on the RemoteFX 3D video adapter.
    Last edited by lightseeker; October 15, 2012 at 08:52. Reason: Original text was inaccurate.

  4. #4
    Futuremark Staff
    It could be. 3DMark 11 in the Performance preset expects you to have 512MB of video RAM. The Entry preset is designed for low-performance systems with less than that. If you have the Advanced Edition, you could experiment with how the Entry score varies between the two scenarios described.

    Beyond that, I would point out that RemoteFX is very new technology and most likely isn't anywhere close to optimized yet, and that virtualization will always induce some overhead. If anything, the benchmark gives a rough idea of how much performance you are losing when running high-end 3D graphics workloads under virtualization. Since we have not tested or validated the benchmark for this scenario, I cannot say which factors play a part in the results you are seeing. Most likely the only people who could give more than a guess are the video driver engineers at NVIDIA and AMD, who have the tools to do detailed driver performance profiling.

  5. #5
    Since my last post, I have used Performance Monitor (Perfmon) on the parent partition to improve my understanding.

    I would like to propose that what 3DMark11 measures in the four graphics tests and the combined test corresponds to Perfmon's RemoteFX Graphics:Input Frames/s counter. This rate does not correspond to the user's experience. I propose instead that the user's experience is positively correlated with RemoteFX Graphics:Output Frames/s, which closely matches RemoteFX Software:Capture Rate for monitor 1 (I have only one monitor set up on my VM).

    This is consistent with the description of the RemoteFX for VDI architecture published by Microsoft, which states that the DirectX calls made to the VM's vGPU are intercepted by the RemoteFX server and, after inspection and optimisation, rendered to a frame buffer on the parent partition. Further processing ("capture" and "compress") follows before the RemoteFX server releases the frames to the RDP client. This processing, which I understand to be carried out solely by rdvgm.exe (and any DLLs that it calls), introduces frame rate loss. This loss distorts the relationship between 3DMark11's P-score and the user experience.

    This said, I have to date been unable to match 3DMark11's physics test score (frame rate) with Perfmon's RemoteFX Graphics:Input Frames/s.
    Last edited by lightseeker; October 25, 2012 at 09:54. Reason: Inaccurate text
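
    One way to quantify the frame rate loss described above is to export the Perfmon log to CSV and compare the average input and output frame rates. The following is a sketch under stated assumptions, not a validated tool: the column headers in SAMPLE are hypothetical stand-ins for whatever the exported counter paths look like on a given host, the numbers are placeholders and not measurements from this system, and the function names are illustrative.

```python
import csv
import io
from statistics import mean

# Hypothetical excerpt of a Perfmon CSV export. Counter paths and values are
# placeholders for illustration only, not data recorded on the system above.
SAMPLE = r'''"(PDH-CSV 4.0)","\\HOST\RemoteFX Graphics\Input Frames/s","\\HOST\RemoteFX Graphics\Output Frames/s"
"10/25/2012 09:00:00.000","40","20"
"10/25/2012 09:00:01.000","42","22"
"10/25/2012 09:00:02.000","38","18"'''


def load_counters(csv_text):
    """Split CSV text into a header row and a list of data rows."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    return header, list(reader)


def column_mean(header, rows, needle):
    """Average the first column whose header contains `needle` (case-insensitive)."""
    idx = next(i for i, name in enumerate(header) if needle.lower() in name.lower())
    values = []
    for row in rows:
        if len(row) <= idx:
            continue  # skip short or empty rows
        cell = row[idx].strip()
        if cell:  # skip blank cells left for missing samples
            values.append(float(cell))
    return mean(values)


def frame_loss(csv_text, input_name="Input Frames", output_name="Output Frames"):
    """Return (mean input fps, mean output fps, fraction of frames lost)."""
    header, rows = load_counters(csv_text)
    fps_in = column_mean(header, rows, input_name)
    fps_out = column_mean(header, rows, output_name)
    return fps_in, fps_out, 1.0 - fps_out / fps_in


if __name__ == "__main__":
    fps_in, fps_out, loss = frame_loss(SAMPLE)
    print(f"input {fps_in:.1f} fps, output {fps_out:.1f} fps, loss {loss:.0%}")
```

    Columns are matched by substring, so the same function could be pointed at the capture rate counter by passing a different `output_name`, provided the exported header contains that text.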
