#98 closed (fixed)
Poor performance for VM to VM connections within an InstaGENI rack
Reported by: | lnevers@bbn.com | Owned by: | somebody |
---|---|---|---|
Priority: | major | Milestone: | |
Component: | Experiment | Version: | SPIRAL5 |
Keywords: | confirmation tests | Cc: | |
Dependencies: |
Description
Iperf has been used in the Site Confirmation tests to capture performance for each of the experimenter test cases.
The confirmation test (IG-CT-1 - Access to New Site VM resources) has consistently found poor TCP performance results for all initial racks (GPO, Northwestern, Utah).
For IG-CT-1, measurements were captured from VM to VM within one rack over TCP, for a duration of 60 seconds, with 1, 5 and 10 iperf client connections. In all cases the assigned VMs were on the same node.
End points | One Client | 5 Clients | 10 Clients | Ping rtt min/avg/max/mdev |
---|---|---|---|---|
GPO | 294 Kbits/sec | 1.68 Mbits/sec | 3.45 Mbits/sec | 0.025/0.030/0.042/0.006 ms |
Northwestern | 294 Kbits/sec | 1.65 Mbits/sec | 3.52 Mbits/sec | 0.025/0.029/0.048/0.007 ms |
Utah | 294 Kbits/sec | 1.52 Mbits/sec | 3.48 Mbits/sec | 0.019/0.029/0.045/0.006 ms |
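For reference, measurements of this kind are collected with iperf and ping commands of the following general form (a sketch only; the exact invocations used in these runs are listed in comment:6 below, and the server address and ping count shown here are illustrative):
# on the VM acting as iperf server
/usr/bin/iperf -s &
# on the VM acting as iperf client (192.168.1.2 stands in for the server's data-plane address)
/usr/bin/iperf -c 192.168.1.2 -t 60          # 1 client connection
/usr/bin/iperf -c 192.168.1.2 -t 60 -P 5     # 5 parallel client connections
/usr/bin/iperf -c 192.168.1.2 -t 60 -P 10    # 10 parallel client connections
ping -c 60 192.168.1.2                       # rtt min/avg/max/mdev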
The results as captured for the test runs at each rack:
- http://groups.geni.net/geni/wiki/GENIRacksHome/InstageniRacks/ConfirmationTestStatus/GPO#Measurements
- http://groups.geni.net/geni/wiki/GENIRacksHome/InstageniRacks/ConfirmationTestStatus/Northwestern#Measurements
- http://groups.geni.net/geni/wiki/GENIRacksHome/InstageniRacks/ConfirmationTestStatus/Utah#Measurements
Change History (11)
comment:1 Changed 10 years ago by
I have a guess as to what might be going on here, but it involves some assumptions, and I very well may be wrong...
Did the IG folks end up using something like custom drivers to force traffic between two VMs on the same node to go through the HW switch (at least for the OF VLAN case)? If that is the case, then that would imply that traffic is going in and out on the same switch port.
I am pretty sure that if that happens on an HP, then the packet goes through the slow path on the switch. This might explain the performance that is being seen for VMs on the same node.
It might be worth testing with VMs on different nodes to see if anything changes, if that is easy to do.
comment:2 Changed 10 years ago by
It looks like the test case that Luisa saw the issues on was a non-OF test case, so one of my assumptions is already wrong.
comment:3 Changed 10 years ago by
I don't believe this is the case. Certainly there *could* be hairpinning going on here, but I doubt that the slow path on the switch is screwing this up (of course, who knows...). The case that we were looking at before was not just OpenFlow (which isn't this case), but also was the result of a *bad* flowmod from the controller - the HP supports an output port of INPUT_PORT in the fast path, but the example we had before was where the output port was set to the *value* of the input port (and not the special INPUT_PORT constant).
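For illustration, the difference looks like this in OVS ovs-ofctl flow syntax (a hypothetical sketch only; these racks use HP switches programmed by a controller, not this CLI, and br0 and port 5 are made-up names):
# correct: output to the special IN_PORT port, which hardware fast paths handle
ovs-ofctl add-flow br0 "in_port=5,actions=in_port"
# the bad flowmod described above: output port set to the literal value (5) of the ingress port
ovs-ofctl add-flow br0 "in_port=5,actions=output:5"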
comment:4 Changed 10 years ago by
This seems to be a problem only for non-OpenFlow scenarios.
Re-ran with OpenFlow (IG-CT-4) using two VMs within one rack, running iperf over TCP for 60 seconds with 1, 5 and 10 iperf client connections; all VMs assigned were on the same node:
End points | One Client | 5 Clients | 10 Clients | 1 client | Ping rtt min/avg/max/mdev |
---|---|---|---|---|---|
GPO IG to GPO IG | 939 Mbits/sec | 941 Mbits/sec | 940 Mbits/sec | 1.04 Mbits/sec | 0.072/0.185/6.738/0.853 ms |
comment:5 Changed 10 years ago by
On 2/20/13 4:28 PM, Jonathon Duerig wrote:
You should double check that you haven't constrained the bandwidth in your rspec without realizing it and that there aren't any defaults (100mbps) that are constraining you.
I am not specifying any bandwidth in any of the RSpecs (OF & non-OF) used for confirmation testing. I checked what is allocated; the following is the speed assigned to the VMs in the OpenFlow scenario:
$ sudo /sbin/ethtool mv4.4 | grep -i speed
Speed: 1000Mb/s
The speed assigned to the VM in the non-OpenFlow scenario:
$ sudo /sbin/ethtool mv8.11 | grep -i speed
Speed: 1000Mb/s
comment:6 follow-up: 7 Changed 10 years ago by
On 2/20/13 4:56 PM, Leigh Stoller wrote:
Can you put the exact iperf command lines into the confirmation tests page please so that they can be duplicated by someone else?
The iperf command used on server:
/usr/bin/iperf -s &
The iperf commands used on the client node:
/sbin/iperf -c 192.168.1.2 -t 60
/usr/bin/iperf -c 192.168.1.2 -t 60
/usr/bin/iperf -c 192.168.1.2 -t 60 -P 5
/usr/bin/iperf -c 192.168.1.2 -t 60 -P 10
comment:7 Changed 10 years ago by
Replying to lnevers@bbn.com:
/sbin/iperf -c 192.168.1.2 -t 60
Please ignore this "/sbin/iperf"; it obviously does not work (wrong path). The other three are the commands that were actually used to collect the measurements.
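For completeness, a generic way to confirm the installed path on a node (not part of the original procedure; the output shown is what one would expect given the working commands above):
$ which iperf
/usr/bin/iperf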
comment:8 Changed 10 years ago by
Resolution: | → fixed |
---|---|
Status: | new → closed |
On 2/22/13 6:09 PM, Leigh Stoller wrote:
Okay, I think I got this figured out. I've updated the shared pool on all of the racks.
Luisa, can you please rerun your tests. You will need to destroy and recreate your test slivers in order to get the fix.
Re-ran tests between VMs in the same rack at both the GPO and Utah racks. The issue is resolved. Current results:
End points | One Client | 5 Clients | 10 Clients | Ping rtt min/avg/max/mdev |
---|---|---|---|---|
GPO to GPO | 96.9 Mbits/sec | 97.8 Mbits/sec | 99.5 Mbits/sec | 0.020/0.030/0.047/0.005 ms |
Utah to Utah | 96.9 Mbits/sec | 98.1 Mbits/sec | 99.1 Mbits/sec | 0.029/0.034/0.053/0.009 ms |
comment:9 Changed 10 years ago by
Re-opening the ticket: there is still a performance issue between a PC (iperf client) and a VM (iperf server) in the GPO rack (slice IG-CT-2):
End points | One Client | 5 Clients | 10 Clients | Ping rtt min/avg/max/mdev |
---|---|---|---|---|
GPO | 101 Mbits/sec | 101 Mbits/sec | 101 Mbits/sec | 0.093/0.172/0.198/0.022 ms |
I also re-ran test IG-CT-2 (1 PC + 1 VM) in the Utah rack and found the results there to be as expected:
End points | One Client | 5 Clients | 10 Clients | Ping rtt min/avg/max/mdev |
---|---|---|---|---|
Utah | 939 Mbits/sec | 940 Mbits/sec | 940 Mbits/sec | 0.116/0.167/0.233/0.027 ms |
Following is the GPO iperf output:
+ /usr/bin/iperf -c 192.168.1.1 -t 60
------------------------------------------------------------
Client connecting to 192.168.1.1, TCP port 5001
TCP window size: 16.0 KByte (default)
------------------------------------------------------------
[  3] local 192.168.1.2 port 49410 connected with 192.168.1.1 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-60.0 sec   719 MBytes   101 Mbits/sec
+ /usr/bin/iperf -c 192.168.1.1 -t 60 -P 5
------------------------------------------------------------
Client connecting to 192.168.1.1, TCP port 5001
TCP window size: 16.0 KByte (default)
------------------------------------------------------------
[  3] local 192.168.1.2 port 49411 connected with 192.168.1.1 port 5001
[  4] local 192.168.1.2 port 49414 connected with 192.168.1.1 port 5001
[  7] local 192.168.1.2 port 49412 connected with 192.168.1.1 port 5001
[  5] local 192.168.1.2 port 49413 connected with 192.168.1.1 port 5001
[  6] local 192.168.1.2 port 49415 connected with 192.168.1.1 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-60.0 sec   145 MBytes  20.3 Mbits/sec
[  7]  0.0-60.0 sec   143 MBytes  20.0 Mbits/sec
[  5]  0.0-60.1 sec   146 MBytes  20.4 Mbits/sec
[  6]  0.0-60.1 sec   143 MBytes  19.9 Mbits/sec
[  4]  0.0-60.2 sec   144 MBytes  20.1 Mbits/sec
[SUM]  0.0-60.2 sec   722 MBytes   101 Mbits/sec
+ /usr/bin/iperf -c 192.168.1.1 -t 60 -P 10
------------------------------------------------------------
Client connecting to 192.168.1.1, TCP port 5001
TCP window size: 16.0 KByte (default)
------------------------------------------------------------
[  3] local 192.168.1.2 port 49424 connected with 192.168.1.1 port 5001
[  7] local 192.168.1.2 port 49417 connected with 192.168.1.1 port 5001
[ 12] local 192.168.1.2 port 49420 connected with 192.168.1.1 port 5001
[ 10] local 192.168.1.2 port 49419 connected with 192.168.1.1 port 5001
[  9] local 192.168.1.2 port 49418 connected with 192.168.1.1 port 5001
[  6] local 192.168.1.2 port 49425 connected with 192.168.1.1 port 5001
[  4] local 192.168.1.2 port 49422 connected with 192.168.1.1 port 5001
[  5] local 192.168.1.2 port 49423 connected with 192.168.1.1 port 5001
[ 11] local 192.168.1.2 port 49421 connected with 192.168.1.1 port 5001
[  8] local 192.168.1.2 port 49416 connected with 192.168.1.1 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-60.0 sec  46.9 MBytes  6.55 Mbits/sec
[  7]  0.0-60.1 sec  86.1 MBytes  12.0 Mbits/sec
[  8]  0.0-60.1 sec  81.5 MBytes  11.4 Mbits/sec
[ 11]  0.0-60.1 sec  83.5 MBytes  11.7 Mbits/sec
[ 12]  0.0-60.1 sec  84.0 MBytes  11.7 Mbits/sec
[ 10]  0.0-60.2 sec  48.4 MBytes  6.74 Mbits/sec
[  9]  0.0-60.3 sec  81.8 MBytes  11.4 Mbits/sec
[  6]  0.0-60.3 sec  83.8 MBytes  11.6 Mbits/sec
[  4]  0.0-60.4 sec  47.4 MBytes  6.58 Mbits/sec
[  5]  0.0-60.5 sec  82.8 MBytes  11.5 Mbits/sec
[SUM]  0.0-60.5 sec   726 MBytes   101 Mbits/sec
Slice IG-CT-2 is still running in the GPO rack.
comment:10 Changed 10 years ago by
An overall summary of TCP throughput measured this morning in one rack with iperf:
End Points | One Client | Five Clients | Ten Clients |
---|---|---|---|
VM to VM (1 VM server) | 96.5 Mbits/sec | 96.5 Mbits/sec | 96.5 Mbits/sec |
VM (server1) to VM (server2) | 101 Mbits/sec | 101 Mbits/sec | 100 Mbits/sec |
PC to VM | 101 Mbits/sec | 101 Mbits/sec | 101 Mbits/sec |
PC to PC | 941 Mbits/sec | 942 Mbits/sec | 942 Mbits/sec |
Are these the expected values?
comment:11 Changed 10 years ago by
The previous set of measurements was collected without any bandwidth specification in the RSpec. When no bandwidth is specified, the interfaces on the allocated devices report a link speed of 1 Gb/sec, but the bandwidth is shaped to 100 Mb/sec.
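As an aside, if the shaping is implemented with Linux tc on the node hosting the VM interface (an assumption for illustration; this ticket does not identify the actual mechanism), the 100 Mb/sec cap would be visible in the qdisc/class configuration rather than in the ethtool-reported link speed:
# hypothetical check; mv8.11 is the non-OpenFlow VM interface from comment:5
$ tc qdisc show dev mv8.11
$ tc -s class show dev mv8.11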
Re-ran the TCP performance test cases in one rack (GPO), but this time explicitly requested 1 Gb/sec links between the nodes in each scenario. Results below:
End Points | One Client | Five Clients | Ten Clients |
---|---|---|---|
VM to VM (1 VM server) | 931 Mbits/sec | 936 Mbits/sec | 933 Mbits/sec |
VM (server1) to VM (server2) | 930 Mbits/sec | 934 Mbits/sec | 935 Mbits/sec |
PC to VM | 939 Mbits/sec | 940 Mbits/sec | 940 Mbits/sec |
PC to PC | 942 Mbits/sec | 942 Mbits/sec | 942 Mbits/sec |
So, if an experimenter requests 1 Gb/sec links, they will get the bandwidth that they requested. Well, at least in one rack :-).