Opened 12 years ago
Closed 12 years ago
#33 closed (fixed)
Failure to allocate resource while attempting to create sliver
Reported by: lnevers@bbn.com
Owned by: somebody
Priority: major
Milestone: IG-EXP-3
Component: AM
Version: SPIRAL4
Keywords: vm support
Cc:
Dependencies:
Description
Background: ListResources showed a large number of pcvm slots available before the test:
- pc5 had 97 slots
- pc3 had 100 slots
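(For reference, a rough sketch of how the advertised slot counts can be checked with omni; the AM URL is the one used throughout this ticket, but the flags, output filename, and grep pattern are illustrative only, not the exact commands run for this test.)
# Pull the advertisement RSpec from the rack AM and look at the per-node pcvm entries;
# the exact elements carrying the slot counts vary by RSpec version.
$ omni.py -a https://boss.utah.geniracks.net/protogeni/xmlrpc/am/2.0 listresources -o
$ grep -i -B2 -A2 'pcvm' rspec-*.xml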
Test sequence:
- Created one sliver named 25vmslice1 with 25 VMs without problems, with the following allocation: 10 VMs on pc5, 10 VMs on pc3, and 5 VMs on pc4.
- Created a second sliver named 25vmslice2 with 25 VMs, which caused the following error:
Result Summary: Slice urn:publicid:IDN+pgeni.gpolab.bbn.com+slice+25vmslice2 expires on 2012-05-26 14:05:32 UTC Asked https://boss.utah.geniracks.net/protogeni/xmlrpc/am/2.0 to reserve resources. No manifest Rspec returned. *** ERROR: mapper: Reached run limit. Giving up. seed = 1338050345 Physical Graph: 6 Calculating shortest paths on switch fabric. Virtual Graph: 25 Generating physical equivalence classes:6 Type precheck: Type precheck passed. Node mapping precheck: Node mapping precheck succeeded Policy precheck: Policy precheck succeeded Annealing. Doing melting run Reverting: forced Reverting to best solution Done BEST SCORE: 18.71 in 49000 iters and 0.307011 seconds With 1 violations Iters to find best score: 48288 Violations: 1 unassigned: 0 pnode_load: 0 no_connect: 0 link_users: 0 bandwidth: 1 desires: 0 vclass: 0 delay: 0 trivial mix: 0 subnodes: 0 max_types: 0 endpoints: 0 Nodes: VM pc5 VM-0 pc5 VM-1 pc1 VM-10 pc5 VM-11 pc2 VM-12 pc1 VM-13 pc2 VM-14 pc3 VM-15 pc3 VM-16 pc3 VM-19 pc2 VM-2 pc1 VM-20 pc5 VM-21 pc5 VM-22 pc1 VM-23 pc2 VM-24 pc3 VM-26 pc3 VM-27 pc3 VM-3 pc2 VM-4 pc5 VM-5 pc3 VM-6 pc3 VM-7 pc3 VM-9 pc5 End Nodes Edges: linksimple/lan0/VM:0,VM-0:0 trivial pc5:loopback (pc5/null,(null)) pc5:loopback (pc5/null,(null)) linksimple/lan1/VM-0:1,VM-1:0 intraswitch link-pc5:eth2-procurve2:(null) (pc5/eth2,(null)) link-pc1:eth1-procurve2:(null) (pc1/eth1,(null)) linksimple/lan2/VM-1:1,VM-2:0 trivial pc1:loopback (pc1/null,(null)) pc1:loopback (pc1/null,(null)) linksimple/lan24/VM-9:0,VM-20:0 trivial pc5:loopback (pc5/null,(null)) pc5:loopback (pc5/null,(null)) linksimple/lan25/VM:1,VM-9:1 trivial pc5:loopback (pc5/null,(null)) pc5:loopback (pc5/null,(null)) linksimple/lan26/VM-9:2,VM-10:0 trivial pc5:loopback (pc5/null,(null)) pc5:loopback (pc5/null,(null)) linksimple/lan27/VM-10:1,VM-11:0 intraswitch link-pc5:eth2-procurve2:(null) (pc5/eth2,(null)) link-pc2:eth1-procurve2:(null) (pc2/eth1,(null)) linksimple/lan28/VM-11:1,VM-12:0 intraswitch link-pc2:eth1-procurve2:(null) (pc2/eth1,(null)) link-pc1:eth1-procurve2:(null) (pc1/eth1,(null)) linksimple/lan29/VM-12:1,VM-13:0 intraswitch link-pc1:eth1-procurve2:(null) (pc1/eth1,(null)) link-pc2:eth1-procurve2:(null) (pc2/eth1,(null)) linksimple/lan3/VM-2:1,VM-3:0 intraswitch link-pc1:eth1-procurve2:(null) (pc1/eth1,(null)) link-pc2:eth1-procurve2:(null) (pc2/eth1,(null)) linksimple/lan30/VM-13:1,VM-14:0 intraswitch link-pc2:eth1-procurve2:(null) (pc2/eth1,(null)) link-pc3:eth2-procurve2:(null) (pc3/eth2,(null)) linksimple/lan31/VM-14:1,VM-15:0 trivial pc3:loopback (pc3/null,(null)) pc3:loopback (pc3/null,(null)) linksimple/lan32/VM-15:1,VM-16:0 trivial pc3:loopback (pc3/null,(null)) pc3:loopback (pc3/null,(null)) linksimple/lan4/VM-3:1,VM-4:0 intraswitch link-pc2:eth1-procurve2:(null) (pc2/eth1,(null)) link-pc5:eth2-procurve2:(null) (pc5/eth2,(null)) linksimple/lan5/VM-4:1,VM-5:0 intraswitch link-pc5:eth2-procurve2:(null) (pc5/eth2,(null)) link-pc3:eth2-procurve2:(null) (pc3/eth2,(null)) linksimple/lan54/VM-16:1,VM-26:0 trivial pc3:loopback (pc3/null,(null)) pc3:loopback (pc3/null,(null)) linksimple/lan55/VM-27:1,VM-26:1 trivial pc3:loopback (pc3/null,(null)) pc3:loopback (pc3/null,(null)) linksimple/lan58/VM-24:0,VM-27:0 trivial pc3:loopback (pc3/null,(null)) pc3:loopback (pc3/null,(null)) linksimple/lan59/VM-23:0,VM-24:1 intraswitch link-pc2:eth1-procurve2:(null) (pc2/eth1,(null)) link-pc3:eth2-procurve2:(null) (pc3/eth2,(null)) linksimple/lan6/VM-5:1,VM-6:0 trivial pc3:loopback (pc3/null,(null)) pc3:loopback (pc3/null,(null)) 
linksimple/lan65/VM-20:1,VM-21:0 trivial pc5:loopback (pc5/null,(null)) pc5:loopback (pc5/null,(null)) linksimple/lan66/VM-21:1,VM-19:0 intraswitch link-pc5:eth2-procurve2:(null) (pc5/eth2,(null)) link-pc2:eth1-procurve2:(null) (pc2/eth1,(null)) linksimple/lan67/VM-19:1,VM-22:0 intraswitch link-pc2:eth1-procurve2:(null) (pc2/eth1,(null)) link-pc1:eth1-procurve2:(null) (pc1/eth1,(null)) linksimple/lan68/VM-22:1,VM-23:1 intraswitch link-pc1:eth1-procurve2:(null) (pc1/eth1,(null)) link-pc2:eth1-procurve2:(null) (pc2/eth1,(null)) linksimple/lan7/VM-6:1,VM-7:0 trivial pc3:loopback (pc3/null,(null)) pc3:loopback (pc3/null,(null)) linksimple/lan72/VM-6:2,VM-16:2 trivial pc3:loopback (pc3/null,(null)) pc3:loopback (pc3/null,(null)) linksimple/lan73/VM-5:2,VM-15:2 trivial pc3:loopback (pc3/null,(null)) pc3:loopback (pc3/null,(null)) linksimple/lan74/VM-4:2,VM-14:2 intraswitch link-pc5:eth2-procurve2:(null) (pc5/eth2,(null)) link-pc3:eth2-procurve2:(null) (pc3/eth2,(null)) linksimple/lan75/VM-3:2,VM-13:2 trivial pc2:loopback (pc2/null,(null)) pc2:loopback (pc2/null,(null)) linksimple/lan76/VM-2:2,VM-12:2 trivial pc1:loopback (pc1/null,(null)) pc1:loopback (pc1/null,(null)) linksimple/lan77/VM-1:2,VM-11:2 intraswitch link-pc1:eth1-procurve2:(null) (pc1/eth1,(null)) link-pc2:eth3-procurve2:(null) (pc2/eth3,(null)) linksimple/lan78/VM-0:2,VM-10:2 trivial pc5:loopback (pc5/null,(null)) pc5:loopback (pc5/null,(null)) linksimple/lan79/VM-10:3,VM-21:2 trivial pc5:loopback (pc5/null,(null)) pc5:loopback (pc5/null,(null)) linksimple/lan80/VM-11:3,VM-19:2 trivial pc2:loopback (pc2/null,(null)) pc2:loopback (pc2/null,(null)) linksimple/lan81/VM-12:3,VM-22:2 trivial pc1:loopback (pc1/null,(null)) pc1:loopback (pc1/null,(null)) linksimple/lan82/VM-13:3,VM-23:2 trivial pc2:loopback (pc2/null,(null)) pc2:loopback (pc2/null,(null)) linksimple/lan83/VM-14:3,VM-24:2 trivial pc3:loopback (pc3/null,(null)) pc3:loopback (pc3/null,(null)) linksimple/lan84/VM-15:3,VM-27:2 trivial pc3:loopback (pc3/null,(null)) pc3:loopback (pc3/null,(null)) End Edges End solution Summary: procurve2 0 vnodes, 2800000 nontrivial BW, 0 trivial BW, type=(null) pc3 9 vnodes, 400000 nontrivial BW, 1100000 trivial BW, type=pcvm 400000 link-pc3:eth2-procurve2:(null) pc5 7 vnodes, 600000 nontrivial BW, 700000 trivial BW, type=pcvm 600000 link-pc5:eth2-procurve2:(null) pc1 4 vnodes, 700000 nontrivial BW, 300000 trivial BW, type=pcvm 700000 link-pc1:eth1-procurve2:(null) ?+virtpercent: used=0 total=100 ?+cpu: used=0 total=2666 ?+ram: used=0 total=3574 ?+cpupercent: used=0 total=92 ?+rampercent: used=0 total=80 pc2 5 vnodes, 1100000 nontrivial BW, 300000 trivial BW, type=pcvm 1000000 link-pc2:eth1-procurve2:(null) 100000 link-pc2:eth3-procurve2:(null) ?+virtpercent: used=0 total=100 ?+cpu: used=0 total=2666 ?+ram: used=0 total=3574 ?+cpupercent: used=0 total=92 ?+rampercent: used=0 total=80 Total physical nodes used: 4 End summary ASSIGN FAILED: Type precheck passed. Node mapping precheck succeeded Policy precheck succeeded Annealing. Doing melting run Reverting: forced Reverting to best solution Done BEST SCORE: 18.71 in 49000 iters and 0.307011 seconds unassigned: 0 pnode_load: 0 no_connect: 0 link_users: 0 bandwidth: 1 desires: 0 vclass: 0 delay: 0 trivial mix: 0 subnodes: 0 max_types: 0 endpoints: 0
Change History (12)
comment:1 Changed 12 years ago by
comment:2 Changed 12 years ago by
On 5/25/12 11:40 AM, Leigh Stoller wrote:
bandwidth: 1
So here is the issue; creating a 10-node mesh of 100Mb links requires an aggregate bandwidth of 1Gb. Not because you are actually going to use that, but the resource mapper cannot make any assumptions in the absence of other information.
This is why you get to provide a bandwidth in your rspec, to inform the mapper what you really want to do.
Diving deeper for those who are interested: this is a LAN of containers on the same physical node, and there is a limit to the amount of traffic that can be sent over the loopback device between containers. At some point the physical node will no longer be able to keep up, so we set a limit on what you can ask for. At the moment that number is set lower than it probably should be (at 400Mb).
Bottom line: I bumped that to 1Gb, which should allow your rspec to map. These nodes are pretty beefy, so I imagine they can keep up.
Lbs
The message reported to the experimenter is somewhat cryptic, and I missed the bandwidth violation, which was on line 25 of 145 lines (not counting the omni output).
I realize we are still developing/debugging, but I am going to ask anyway: are there plans to modify the results to provide more intuitive output?
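For reference, the per-link bandwidth Leigh mentions is stated in the request RSpec itself. A minimal sketch of a GENI v3 link element follows; the client_id values are hypothetical and the capacity value is only an example:
<link client_id="lan0">
  <interface_ref client_id="VM-0:if0"/>
  <interface_ref client_id="VM-1:if0"/>
  <!-- illustrative values: capacity is in Kbps (100000 = 100Mb) -->
  <property source_id="VM-0:if0" dest_id="VM-1:if0" capacity="100000"/>
  <property source_id="VM-1:if0" dest_id="VM-0:if0" capacity="100000"/>
</link>
Stating the capacity explicitly is how the mapper learns what the experiment actually needs, per Leigh's note above.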
comment:3 Changed 12 years ago by
Re-ran the 10 experiments with 10 VMs each, assuming that the configuration changes from last Friday would handle the bandwidth requirements. Before starting, verified that both shared nodes had 99 slots available.
Set up the first 8 experiments without problem. The createsliver for the 9th experiment (10vmslice9) failed with this error:
Asked https://boss.utah.geniracks.net/protogeni/xmlrpc/am/2.0 to reserve resources. No manifest Rspec returned. *** ERROR: mapper: Reached run limit. Giving up. seed = 1338437282 Physical Graph: 4 Calculating shortest paths on switch fabric. Virtual Graph: 11 Generating physical equivalence classes:4 Type precheck: Type precheck passed. Node mapping precheck: Node mapping precheck succeeded Policy precheck: Policy precheck succeeded Annealing. Doing melting run Reverting: forced Reverting to best solution Done BEST SCORE: 3 in 17000 iters and 0.10192 seconds With 1 violations Iters to find best score: 338 Violations: 1 unassigned: 0 pnode_load: 0 no_connect: 0 link_users: 0 bandwidth: 0 desires: 1 vclass: 0 delay: 0 trivial mix: 0 subnodes: 0 max_types: 0 endpoints: 0 Nodes: VM-1 pc3 VM-10 pc3 VM-2 pc3 VM-3 pc3 VM-4 pc3 VM-5 pc3 VM-6 pc3 VM-7 pc3 VM-8 pc3 VM-9 pc3 lan/Lan pc3 End Nodes Edges: linklan/Lan/VM-1:0 trivial pc3:loopback (pc3/null,(null)) pc3:loopback (pc3/null,(null)) linklan/Lan/VM-2:0 trivial pc3:loopback (pc3/null,(null)) pc3:loopback (pc3/null,(null)) linklan/Lan/VM-3:0 trivial pc3:loopback (pc3/null,(null)) pc3:loopback (pc3/null,(null)) linklan/Lan/VM-4:0 trivial pc3:loopback (pc3/null,(null)) pc3:loopback (pc3/null,(null)) linklan/Lan/VM-5:0 trivial pc3:loopback (pc3/null,(null)) pc3:loopback (pc3/null,(null)) linklan/Lan/VM-6:0 trivial pc3:loopback (pc3/null,(null)) pc3:loopback (pc3/null,(null)) linklan/Lan/VM-7:0 trivial pc3:loopback (pc3/null,(null)) pc3:loopback (pc3/null,(null)) linklan/Lan/VM-8:0 trivial pc3:loopback (pc3/null,(null)) pc3:loopback (pc3/null,(null)) linklan/Lan/VM-9:0 trivial pc3:loopback (pc3/null,(null)) pc3:loopback (pc3/null,(null)) linklan/Lan/VM-10:0 trivial pc3:loopback (pc3/null,(null)) pc3:loopback (pc3/null,(null)) End Edges End solution Summary: pc3 11 vnodes, 0 nontrivial BW, 1000000 trivial BW, type=pcvm Total physical nodes used: 1 End summary ASSIGN FAILED: Type precheck passed. Node mapping precheck succeeded Policy precheck succeeded Annealing. Doing melting run Reverting: forced Reverting to best solution Done BEST SCORE: 3 in 17000 iters and 0.10192 seconds unassigned: 0 pnode_load: 0 no_connect: 0 link_users: 0 bandwidth: 0 desires: 1 vclass: 0 delay: 0 trivial mix: 0 subnodes: 0 max_types: 0 endpoints: 0
comment:4 Changed 12 years ago by
Here is the VM allocation for the experiments with 10 VMs each, in case it is of interest to anyone:
- 10vmslice1 - 10 VMs on pc5
- 10vmslice2 - 10 VMs on pc3
- 10vmslice3 - 10 VMs on pc5
- 10vmslice4 - 10 VMs on pc3
- 10vmslice5 - 7 VMs on pc5 + 3 VMs on pc3
- 10vmslice6 - 2 VMs on pc5 + 6 VMs on pc3 + 2 VMs on pc1
- 10vmslice7 - 10 VMs on pc2
- 10vmslice8 - 10 VMs on pc4
comment:5 Changed 12 years ago by
Unable to create one experiment with 25 VMs when starting with the following available resources:
- 100 pcvm slots available on shared node pc3
- 100 pcvm slots available on shared node pc5
- pc1, pc2, and pc4 not available
The attempt to create a sliver (25vmslice1) failed with the following error:
*** Type precheck failed!*** Type precheck failed!*** ERROR: mapper: Unretriable error. Giving up. seed = 1338507038 Physical Graph: 4 Calculating shortest paths on switch fabric. Virtual Graph: 25 Generating physical equivalence classes:4 Type precheck: *** 25 nodes of type pcvm requested, but only 20 available nodes of type pcvm found *** Type precheck failed! ASSIGN FAILED: *** 25 nodes of type pcvm requested, but only 20 available nodes of type pcvm found *** Type precheck failed!
I expected to be able to get 25 nodes across the two shared nodes (pc3 and pc5).
comment:6 Changed 12 years ago by
Description: modified (diff)
Backing off from testing!
With the following resources available:
- 100 pcvm slots available on shared node pc3
- 98 pcvm slots available on shared node pc5
- pc1, pc2, and pc4 not available
Tried to create one sliver with 20 nodes (20vmslice1), which resulted in the same failure reported yesterday afternoon (desires: 1).
*** ERROR: mapper: Reached run limit. Giving up. seed = 1338510636 Physical Graph: 4 Calculating shortest paths on switch fabric. Virtual Graph: 21 Generating physical equivalence classes:4 Type precheck: Type precheck passed. Node mapping precheck: Node mapping precheck succeeded Policy precheck: Policy precheck succeeded Annealing. Adjusting dificulty estimate for fixed nodes, 1 remain. Doing melting run Reverting: forced Reverting to best solution Done BEST SCORE: 5.42 in 17000 iters and 0.470627 seconds With 1 violations Iters to find best score: 1 Violations: 1 unassigned: 0 pnode_load: 0 no_connect: 0 link_users: 0 bandwidth: 0 desires: 1 vclass: 0 delay: 0 trivial mix: 0 subnodes: 0 max_types: 0 endpoints: 0 Nodes: VM-1 pc3 VM-10 pc5 VM-11 pc5 VM-12 pc3 VM-13 pc5 VM-14 pc5 VM-15 pc3 VM-16 pc5 VM-17 pc3 VM-18 pc5 VM-19 pc5 VM-2 pc5 VM-20 pc3 VM-3 pc3 VM-4 pc5 VM-5 pc3 VM-6 pc5 VM-7 pc3 VM-8 pc3 VM-9 pc3 lan/Lan pc5 End Nodes Edges: linklan/Lan/VM-1:0 intraswitch link-pc3:eth1-procurve2:(null) (pc3/eth1,(null)) link-pc5:eth1-procurve2:(null) (pc5/eth1,(null)) linklan/Lan/VM-2:0 trivial pc5:loopback (pc5/null,(null)) pc5:loopback (pc5/null,(null)) linklan/Lan/VM-3:0 intraswitch link-pc3:eth1-procurve2:(null) (pc3/eth1,(null)) link-pc5:eth1-procurve2:(null) (pc5/eth1,(null)) linklan/Lan/VM-4:0 trivial pc5:loopback (pc5/null,(null)) pc5:loopback (pc5/null,(null)) linklan/Lan/VM-5:0 intraswitch link-pc3:eth1-procurve2:(null) (pc3/eth1,(null)) link-pc5:eth1-procurve2:(null) (pc5/eth1,(null)) linklan/Lan/VM-6:0 trivial pc5:loopback (pc5/null,(null)) pc5:loopback (pc5/null,(null)) linklan/Lan/VM-7:0 intraswitch link-pc3:eth1-procurve2:(null) (pc3/eth1,(null)) link-pc5:eth1-procurve2:(null) (pc5/eth1,(null)) linklan/Lan/VM-8:0 intraswitch link-pc3:eth1-procurve2:(null) (pc3/eth1,(null)) link-pc5:eth1-procurve2:(null) (pc5/eth1,(null)) linklan/Lan/VM-9:0 intraswitch link-pc3:eth1-procurve2:(null) (pc3/eth1,(null)) link-pc5:eth1-procurve2:(null) (pc5/eth1,(null)) linklan/Lan/VM-10:0 trivial pc5:loopback (pc5/null,(null)) pc5:loopback (pc5/null,(null)) linklan/Lan/VM-13:0 trivial pc5:loopback (pc5/null,(null)) pc5:loopback (pc5/null,(null)) linklan/Lan/VM-14:0 trivial pc5:loopback (pc5/null,(null)) pc5:loopback (pc5/null,(null)) linklan/Lan/VM-15:0 intraswitch link-pc3:eth1-procurve2:(null) (pc3/eth1,(null)) link-pc5:eth1-procurve2:(null) (pc5/eth1,(null)) linklan/Lan/VM-16:0 trivial pc5:loopback (pc5/null,(null)) pc5:loopback (pc5/null,(null)) linklan/Lan/VM-17:0 intraswitch link-pc3:eth1-procurve2:(null) (pc3/eth1,(null)) link-pc5:eth1-procurve2:(null) (pc5/eth1,(null)) linklan/Lan/VM-18:0 trivial pc5:loopback (pc5/null,(null)) pc5:loopback (pc5/null,(null)) linklan/Lan/VM-19:0 trivial pc5:loopback (pc5/null,(null)) pc5:loopback (pc5/null,(null)) linklan/Lan/VM-20:0 intraswitch link-pc3:eth1-procurve2:(null) (pc3/eth1,(null)) link-pc5:eth1-procurve2:(null) (pc5/eth1,(null)) linklan/Lan/VM-12:0 intraswitch link-pc3:eth1-procurve2:(null) (pc3/eth1,(null)) link-pc5:eth1-procurve2:(null) (pc5/eth1,(null)) linklan/Lan/VM-11:0 trivial pc5:loopback (pc5/null,(null)) pc5:loopback (pc5/null,(null)) End Edges End solution Summary: procurve2 0 vnodes, 2000000 nontrivial BW, 0 trivial BW, type= pc3 10 vnodes, 1000000 nontrivial BW, 0 trivial BW, type=pcvm 1000000 link-pc3:eth1-procurve2:(null) pc5 11 vnodes, 1000000 nontrivial BW, 1000000 trivial BW, type=pcvm 1000000 link-pc5:eth1-procurve2:(null) Total physical nodes used: 2 End summary ASSIGN FAILED: Type precheck passed. 
Node mapping precheck succeeded Policy precheck succeeded Annealing. Adjusting dificulty estimate for fixed nodes, 1 remain. Doing melting run Reverting: forced Reverting to best solution Done BEST SCORE: 5.42 in 17000 iters and 0.470627 seconds unassigned: 0 pnode_load: 0 no_connect: 0 link_users: 0 bandwidth: 0 desires: 1 vclass: 0 delay: 0 trivial mix: 0 subnodes: 0 max_types: 0 endpoints: 0
comment:7 Changed 12 years ago by
Really backing off now. :-)
Requesting one sliver with 15 VMs also generates the same failure when 198 slots are available.
comment:8 Changed 12 years ago by
Starting resources: 100 pcvm slots on pc3, 98 pcvm slots on pc5, and no available dedicated nodes.
=> Ran scenario with 10 experiments with 10 VMs each:
Results: All slivers were successfully created.
Allocation: 90 VMs on pc3, 10 VMs on pc5 (see the manifest-RSpec sketch after this list):
- sliver 10vmslice-1 = 10 VMs on pc3
- sliver 10vmslice-2 = 10 VMs on pc3
- sliver 10vmslice-3 = 10 VMs on pc3
- sliver 10vmslice-4 = 10 VMs on pc3
- sliver 10vmslice-5 = 10 VMs on pc5
- sliver 10vmslice-6 = 10 VMs on pc3
- sliver 10vmslice-7 = 10 VMs on pc3
- sliver 10vmslice-8 = 10 VMs on pc3
- sliver 10vmslice-9 = 10 VMs on pc3
- sliver 10vmslice-10 = 10 VMs on pc3
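One way to confirm the per-sliver placement above is from the manifest RSpec the rack returns; a rough sketch with omni, where the slice name and filenames are illustrative rather than the exact commands used in this test:
# The manifest RSpec for a slice records the component_id of the physical host each
# VM was bound to (look for +node+pc3 / +node+pc5 in the node elements).
$ omni.py -a https://boss.utah.geniracks.net/protogeni/xmlrpc/am/2.0 listresources 10vmslice-1 -o
$ grep 'component_id' rspec-*.xml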
Running some commands on allocated nodes.
comment:9 Changed 12 years ago by
Luisa suggested I run some of the various monitoring commands I've been looking at while her test is running. Doing so now. No particular questions, just recording this for reference:
- look for VLANs on the management switch:
ProCurve Switch 2610-24# show vlans
...
 VLAN ID Name                             | Status     Voice Jumbo
 ------- -------------------------------- + ---------- ----- -----
 1       DEFAULT_VLAN                     | Port-based No    No
 10      control-hardware                 | Port-based No    No
 11      control-alternate                | Port-based No    No
So there are no per-experiment VLANs on the control switch, as expected.
- IG-MON-2 Step 1: see what state the nodes are in:
- Three nodes are in reloading, which is suspicious; I'll investigate in a bit if it's still true
- Two nodes, pc3 and pc5, are in the shared-nodes experiment
- IG-MON-3 Step 4: get information about running VMs:
- https://boss.utah.geniracks.net/showpool.php also shows two shared nodes
- https://boss.utah.geniracks.net/shownode.php3?node_id=pc3 shows that the following VMs are on pc3:
pcvm3-87 pgeni-gpolab-bbn-com/10vmslice-10 pcvm3-85 pgeni-gpolab-bbn-com/10vmslice-10 pcvm3-84 pgeni-gpolab-bbn-com/10vmslice-10 pcvm3-83 pgeni-gpolab-bbn-com/10vmslice-10 pcvm3-71 pgeni-gpolab-bbn-com/10vmslice-9 pcvm3-72 pgeni-gpolab-bbn-com/10vmslice-9 pcvm3-73 pgeni-gpolab-bbn-com/10vmslice-9 pcvm3-74 pgeni-gpolab-bbn-com/10vmslice-9 pcvm3-75 pgeni-gpolab-bbn-com/10vmslice-9 pcvm3-76 pgeni-gpolab-bbn-com/10vmslice-9 pcvm3-77 pgeni-gpolab-bbn-com/10vmslice-9 pcvm3-78 pgeni-gpolab-bbn-com/10vmslice-9 pcvm3-79 pgeni-gpolab-bbn-com/10vmslice-9 pcvm3-80 pgeni-gpolab-bbn-com/10vmslice-9 pcvm3-81 pgeni-gpolab-bbn-com/10vmslice-10 pcvm3-89 pgeni-gpolab-bbn-com/10vmslice-10 pcvm3-88 pgeni-gpolab-bbn-com/10vmslice-10 pcvm3-86 pgeni-gpolab-bbn-com/10vmslice-10 pcvm3-82 pgeni-gpolab-bbn-com/10vmslice-10 pcvm3-70 pgeni-gpolab-bbn-com/10vmslice-8 pcvm3-61 pgeni-gpolab-bbn-com/10vmslice-8 pcvm3-62 pgeni-gpolab-bbn-com/10vmslice-8 pcvm3-63 pgeni-gpolab-bbn-com/10vmslice-8 pcvm3-64 pgeni-gpolab-bbn-com/10vmslice-8 pcvm3-65 pgeni-gpolab-bbn-com/10vmslice-8 pcvm3-66 pgeni-gpolab-bbn-com/10vmslice-8 pcvm3-67 pgeni-gpolab-bbn-com/10vmslice-8 pcvm3-68 pgeni-gpolab-bbn-com/10vmslice-8 pcvm3-69 pgeni-gpolab-bbn-com/10vmslice-8 pcvm3-90 pgeni-gpolab-bbn-com/10vmslice-10 pcvm3-54 pgeni-gpolab-bbn-com/10vmslice-7 pcvm3-55 pgeni-gpolab-bbn-com/10vmslice-7 pcvm3-56 pgeni-gpolab-bbn-com/10vmslice-7 pcvm3-57 pgeni-gpolab-bbn-com/10vmslice-7 pcvm3-58 pgeni-gpolab-bbn-com/10vmslice-7 pcvm3-59 pgeni-gpolab-bbn-com/10vmslice-7 pcvm3-60 pgeni-gpolab-bbn-com/10vmslice-7 pcvm3-53 pgeni-gpolab-bbn-com/10vmslice-7 pcvm3-52 pgeni-gpolab-bbn-com/10vmslice-7 pcvm3-51 pgeni-gpolab-bbn-com/10vmslice-7 pcvm3-46 pgeni-gpolab-bbn-com/10vmslice-6 pcvm3-47 pgeni-gpolab-bbn-com/10vmslice-6 pcvm3-48 pgeni-gpolab-bbn-com/10vmslice-6 pcvm3-49 pgeni-gpolab-bbn-com/10vmslice-6 pcvm3-50 pgeni-gpolab-bbn-com/10vmslice-6 pcvm3-45 pgeni-gpolab-bbn-com/10vmslice-6 pcvm3-44 pgeni-gpolab-bbn-com/10vmslice-6 pcvm3-43 pgeni-gpolab-bbn-com/10vmslice-6 pcvm3-42 pgeni-gpolab-bbn-com/10vmslice-6 pcvm3-41 pgeni-gpolab-bbn-com/10vmslice-6 pcvm3-1 pgeni-gpolab-bbn-com/10vmslice-1 pcvm3-2 pgeni-gpolab-bbn-com/10vmslice-1 pcvm3-3 pgeni-gpolab-bbn-com/10vmslice-1 pcvm3-4 pgeni-gpolab-bbn-com/10vmslice-1 pcvm3-5 pgeni-gpolab-bbn-com/10vmslice-1 pcvm3-6 pgeni-gpolab-bbn-com/10vmslice-1 pcvm3-7 pgeni-gpolab-bbn-com/10vmslice-1 pcvm3-8 pgeni-gpolab-bbn-com/10vmslice-1 pcvm3-9 pgeni-gpolab-bbn-com/10vmslice-1 pcvm3-10 pgeni-gpolab-bbn-com/10vmslice-1 pcvm3-11 pgeni-gpolab-bbn-com/10vmslice-2 pcvm3-12 pgeni-gpolab-bbn-com/10vmslice-2 pcvm3-13 pgeni-gpolab-bbn-com/10vmslice-2 pcvm3-14 pgeni-gpolab-bbn-com/10vmslice-2 pcvm3-15 pgeni-gpolab-bbn-com/10vmslice-2 pcvm3-16 pgeni-gpolab-bbn-com/10vmslice-2 pcvm3-17 pgeni-gpolab-bbn-com/10vmslice-2 pcvm3-18 pgeni-gpolab-bbn-com/10vmslice-2 pcvm3-19 pgeni-gpolab-bbn-com/10vmslice-2 pcvm3-20 pgeni-gpolab-bbn-com/10vmslice-2 pcvm3-21 pgeni-gpolab-bbn-com/10vmslice-3 pcvm3-22 pgeni-gpolab-bbn-com/10vmslice-3 pcvm3-23 pgeni-gpolab-bbn-com/10vmslice-3 pcvm3-24 pgeni-gpolab-bbn-com/10vmslice-3 pcvm3-25 pgeni-gpolab-bbn-com/10vmslice-3 pcvm3-26 pgeni-gpolab-bbn-com/10vmslice-3 pcvm3-27 pgeni-gpolab-bbn-com/10vmslice-3 pcvm3-28 pgeni-gpolab-bbn-com/10vmslice-3 pcvm3-29 pgeni-gpolab-bbn-com/10vmslice-3 pcvm3-30 pgeni-gpolab-bbn-com/10vmslice-3 pcvm3-31 pgeni-gpolab-bbn-com/10vmslice-4 pcvm3-32 pgeni-gpolab-bbn-com/10vmslice-4 pcvm3-33 pgeni-gpolab-bbn-com/10vmslice-4 pcvm3-34 pgeni-gpolab-bbn-com/10vmslice-4 pcvm3-35 
pgeni-gpolab-bbn-com/10vmslice-4 pcvm3-36 pgeni-gpolab-bbn-com/10vmslice-4 pcvm3-37 pgeni-gpolab-bbn-com/10vmslice-4 pcvm3-38 pgeni-gpolab-bbn-com/10vmslice-4 pcvm3-39 pgeni-gpolab-bbn-com/10vmslice-4 pcvm3-40 pgeni-gpolab-bbn-com/10vmslice-4
- https://boss.utah.geniracks.net/shownode.php3?node_id=pc5 shows that the following VMs are on pc5:
pcvm5-2 pgeni-gpolab-bbn-com/ecgtest pcvm5-1 pgeni-gpolab-bbn-com/ecgtest pcvm5-12 pgeni-gpolab-bbn-com/10vmslice-5 pcvm5-11 pgeni-gpolab-bbn-com/10vmslice-5 pcvm5-3 pgeni-gpolab-bbn-com/10vmslice-5 pcvm5-4 pgeni-gpolab-bbn-com/10vmslice-5 pcvm5-5 pgeni-gpolab-bbn-com/10vmslice-5 pcvm5-6 pgeni-gpolab-bbn-com/10vmslice-5 pcvm5-7 pgeni-gpolab-bbn-com/10vmslice-5 pcvm5-8 pgeni-gpolab-bbn-com/10vmslice-5 pcvm5-9 pgeni-gpolab-bbn-com/10vmslice-5 pcvm5-10 pgeni-gpolab-bbn-com/10vmslice-5
- So 90 VMs were placed on pc3, and 10 on pc5
- Looking at an individual experiment, say 10vmslice-1:
- Corroborates that all VMs are on pc3:
Virtual Node Info:
ID              Type         OS              Qualified Name
--------------- ------------ --------------- --------------------
VM-1 (pc3)      pcvm         OPENVZ-STD      VM-1.10vmslice-1.pgeni-gpolab-bbn-com.utah.geniracks.net
VM-10 (pc3)     pcvm         OPENVZ-STD      VM-10.10vmslice-1.pgeni-gpolab-bbn-com.utah.geniracks.net
VM-2 (pc3)      pcvm         OPENVZ-STD      VM-2.10vmslice-1.pgeni-gpolab-bbn-com.utah.geniracks.net
VM-3 (pc3)      pcvm         OPENVZ-STD      VM-3.10vmslice-1.pgeni-gpolab-bbn-com.utah.geniracks.net
VM-4 (pc3)      pcvm         OPENVZ-STD      VM-4.10vmslice-1.pgeni-gpolab-bbn-com.utah.geniracks.net
VM-5 (pc3)      pcvm         OPENVZ-STD      VM-5.10vmslice-1.pgeni-gpolab-bbn-com.utah.geniracks.net
VM-6 (pc3)      pcvm         OPENVZ-STD      VM-6.10vmslice-1.pgeni-gpolab-bbn-com.utah.geniracks.net
VM-7 (pc3)      pcvm         OPENVZ-STD      VM-7.10vmslice-1.pgeni-gpolab-bbn-com.utah.geniracks.net
VM-8 (pc3)      pcvm         OPENVZ-STD      VM-8.10vmslice-1.pgeni-gpolab-bbn-com.utah.geniracks.net
VM-9 (pc3)      pcvm         OPENVZ-STD      VM-9.10vmslice-1.pgeni-gpolab-bbn-com.utah.geniracks.net
- Note that, even on boss, those hostnames are not defined. I'll follow up with Leigh about that (it's an outstanding question I had):
boss,[~],12:45(127)$ host VM-8.10vmslice-1.pgeni-gpolab-bbn-com.utah.geniracks.net
Host VM-8.10vmslice-1.pgeni-gpolab-bbn-com.utah.geniracks.net not found: 3(NXDOMAIN)
- I note that MAC addresses are not listed here:
Physical Lan/Link Mapping:
ID              Member          IP              MAC                  NodeID
--------------- --------------- --------------- -------------------- ---------
Lan             VM-10:0         10.10.1.10                           pcvm3-2
Lan             VM-1:0          10.10.1.1                            pcvm3-1
Lan             VM-2:0          10.10.1.2                            pcvm3-3
Lan             VM-3:0          10.10.1.3                            pcvm3-4
Lan             VM-4:0          10.10.1.4                            pcvm3-5
Lan             VM-5:0          10.10.1.5                            pcvm3-6
Lan             VM-6:0          10.10.1.6                            pcvm3-7
Lan             VM-7:0          10.10.1.7                            pcvm3-8
Lan             VM-8:0          10.10.1.8                            pcvm3-9
Lan             VM-9:0          10.10.1.9                            pcvm3-10
- Looking at the vzhosts themselves:
vhost2,[~],12:50(0)$ sudo vzlist -a CTID NPROC STATUS IP_ADDR HOSTNAME 1 15 running - VM-1.10vmslice-1.pgeni-gpolab-bbn-com.utah.geniracks.net 2 15 running - VM-10.10vmslice-1.pgeni-gpolab-bbn-com.utah.geniracks.net 3 15 running - VM-2.10vmslice-1.pgeni-gpolab-bbn-com.utah.geniracks.net 4 15 running - VM-3.10vmslice-1.pgeni-gpolab-bbn-com.utah.geniracks.net 5 15 running - VM-4.10vmslice-1.pgeni-gpolab-bbn-com.utah.geniracks.net 6 18 running - VM-5.10vmslice-1.pgeni-gpolab-bbn-com.utah.geniracks.net 7 15 running - VM-6.10vmslice-1.pgeni-gpolab-bbn-com.utah.geniracks.net 8 15 running - VM-7.10vmslice-1.pgeni-gpolab-bbn-com.utah.geniracks.net 9 15 running - VM-8.10vmslice-1.pgeni-gpolab-bbn-com.utah.geniracks.net 10 15 running - VM-9.10vmslice-1.pgeni-gpolab-bbn-com.utah.geniracks.net 11 15 running - VM-1.10vmslice-2.pgeni-gpolab-bbn-com.utah.geniracks.net 12 15 running - VM-10.10vmslice-2.pgeni-gpolab-bbn-com.utah.geniracks.net 13 15 running - VM-2.10vmslice-2.pgeni-gpolab-bbn-com.utah.geniracks.net 14 15 running - VM-3.10vmslice-2.pgeni-gpolab-bbn-com.utah.geniracks.net 15 15 running - VM-4.10vmslice-2.pgeni-gpolab-bbn-com.utah.geniracks.net 16 15 running - VM-5.10vmslice-2.pgeni-gpolab-bbn-com.utah.geniracks.net 17 15 running - VM-6.10vmslice-2.pgeni-gpolab-bbn-com.utah.geniracks.net 18 15 running - VM-7.10vmslice-2.pgeni-gpolab-bbn-com.utah.geniracks.net 19 15 running - VM-8.10vmslice-2.pgeni-gpolab-bbn-com.utah.geniracks.net 20 15 running - VM-9.10vmslice-2.pgeni-gpolab-bbn-com.utah.geniracks.net 21 15 running - VM-1.10vmslice-3.pgeni-gpolab-bbn-com.utah.geniracks.net 22 15 running - VM-10.10vmslice-3.pgeni-gpolab-bbn-com.utah.geniracks.net 23 15 running - VM-2.10vmslice-3.pgeni-gpolab-bbn-com.utah.geniracks.net 24 15 running - VM-3.10vmslice-3.pgeni-gpolab-bbn-com.utah.geniracks.net 25 15 running - VM-4.10vmslice-3.pgeni-gpolab-bbn-com.utah.geniracks.net 26 15 running - VM-5.10vmslice-3.pgeni-gpolab-bbn-com.utah.geniracks.net 27 15 running - VM-6.10vmslice-3.pgeni-gpolab-bbn-com.utah.geniracks.net 28 15 running - VM-7.10vmslice-3.pgeni-gpolab-bbn-com.utah.geniracks.net 29 15 running - VM-8.10vmslice-3.pgeni-gpolab-bbn-com.utah.geniracks.net 30 15 running - VM-9.10vmslice-3.pgeni-gpolab-bbn-com.utah.geniracks.net 31 15 running - VM-1.10vmslice-4.pgeni-gpolab-bbn-com.utah.geniracks.net 32 15 running - VM-10.10vmslice-4.pgeni-gpolab-bbn-com.utah.geniracks.net 33 15 running - VM-2.10vmslice-4.pgeni-gpolab-bbn-com.utah.geniracks.net 34 15 running - VM-3.10vmslice-4.pgeni-gpolab-bbn-com.utah.geniracks.net 35 15 running - VM-4.10vmslice-4.pgeni-gpolab-bbn-com.utah.geniracks.net 36 15 running - VM-5.10vmslice-4.pgeni-gpolab-bbn-com.utah.geniracks.net 37 15 running - VM-6.10vmslice-4.pgeni-gpolab-bbn-com.utah.geniracks.net 38 15 running - VM-7.10vmslice-4.pgeni-gpolab-bbn-com.utah.geniracks.net 39 15 running - VM-8.10vmslice-4.pgeni-gpolab-bbn-com.utah.geniracks.net 40 15 running - VM-9.10vmslice-4.pgeni-gpolab-bbn-com.utah.geniracks.net 41 15 running - VM-1.10vmslice-6.pgeni-gpolab-bbn-com.utah.geniracks.net 42 15 running - VM-10.10vmslice-6.pgeni-gpolab-bbn-com.utah.geniracks.net 43 15 running - VM-2.10vmslice-6.pgeni-gpolab-bbn-com.utah.geniracks.net 44 15 running - VM-3.10vmslice-6.pgeni-gpolab-bbn-com.utah.geniracks.net 45 15 running - VM-4.10vmslice-6.pgeni-gpolab-bbn-com.utah.geniracks.net 46 15 running - VM-5.10vmslice-6.pgeni-gpolab-bbn-com.utah.geniracks.net 47 15 running - VM-6.10vmslice-6.pgeni-gpolab-bbn-com.utah.geniracks.net 48 15 running - 
VM-7.10vmslice-6.pgeni-gpolab-bbn-com.utah.geniracks.net 49 15 running - VM-8.10vmslice-6.pgeni-gpolab-bbn-com.utah.geniracks.net 50 15 running - VM-9.10vmslice-6.pgeni-gpolab-bbn-com.utah.geniracks.net 51 15 running - VM-1.10vmslice-7.pgeni-gpolab-bbn-com.utah.geniracks.net 52 15 running - VM-10.10vmslice-7.pgeni-gpolab-bbn-com.utah.geniracks.net 53 15 running - VM-2.10vmslice-7.pgeni-gpolab-bbn-com.utah.geniracks.net 54 15 running - VM-3.10vmslice-7.pgeni-gpolab-bbn-com.utah.geniracks.net 55 15 running - VM-4.10vmslice-7.pgeni-gpolab-bbn-com.utah.geniracks.net 56 15 running - VM-5.10vmslice-7.pgeni-gpolab-bbn-com.utah.geniracks.net 57 15 running - VM-6.10vmslice-7.pgeni-gpolab-bbn-com.utah.geniracks.net 58 15 running - VM-7.10vmslice-7.pgeni-gpolab-bbn-com.utah.geniracks.net 59 15 running - VM-8.10vmslice-7.pgeni-gpolab-bbn-com.utah.geniracks.net 60 15 running - VM-9.10vmslice-7.pgeni-gpolab-bbn-com.utah.geniracks.net 61 15 running - VM-1.10vmslice-8.pgeni-gpolab-bbn-com.utah.geniracks.net 62 15 running - VM-10.10vmslice-8.pgeni-gpolab-bbn-com.utah.geniracks.net 63 15 running - VM-2.10vmslice-8.pgeni-gpolab-bbn-com.utah.geniracks.net 64 17 running - VM-3.10vmslice-8.pgeni-gpolab-bbn-com.utah.geniracks.net 65 15 running - VM-4.10vmslice-8.pgeni-gpolab-bbn-com.utah.geniracks.net 66 15 running - VM-5.10vmslice-8.pgeni-gpolab-bbn-com.utah.geniracks.net 67 15 running - VM-6.10vmslice-8.pgeni-gpolab-bbn-com.utah.geniracks.net 68 15 running - VM-7.10vmslice-8.pgeni-gpolab-bbn-com.utah.geniracks.net 69 15 running - VM-8.10vmslice-8.pgeni-gpolab-bbn-com.utah.geniracks.net 70 15 running - VM-9.10vmslice-8.pgeni-gpolab-bbn-com.utah.geniracks.net 71 15 running - VM-1.10vmslice-9.pgeni-gpolab-bbn-com.utah.geniracks.net 72 15 running - VM-10.10vmslice-9.pgeni-gpolab-bbn-com.utah.geniracks.net 73 15 running - VM-2.10vmslice-9.pgeni-gpolab-bbn-com.utah.geniracks.net 74 15 running - VM-3.10vmslice-9.pgeni-gpolab-bbn-com.utah.geniracks.net 75 15 running - VM-4.10vmslice-9.pgeni-gpolab-bbn-com.utah.geniracks.net 76 15 running - VM-5.10vmslice-9.pgeni-gpolab-bbn-com.utah.geniracks.net 78 15 running - VM-7.10vmslice-9.pgeni-gpolab-bbn-com.utah.geniracks.net 79 15 running - VM-8.10vmslice-9.pgeni-gpolab-bbn-com.utah.geniracks.net 80 15 running - VM-9.10vmslice-9.pgeni-gpolab-bbn-com.utah.geniracks.net 81 15 running - VM-1.10vmslice-10.pgeni-gpolab-bbn-com.utah.geniracks.net 82 15 running - VM-10.10vmslice-10.pgeni-gpolab-bbn-com.utah.geniracks.net 83 15 running - VM-2.10vmslice-10.pgeni-gpolab-bbn-com.utah.geniracks.net 84 15 running - VM-3.10vmslice-10.pgeni-gpolab-bbn-com.utah.geniracks.net 85 15 running - VM-4.10vmslice-10.pgeni-gpolab-bbn-com.utah.geniracks.net 86 15 running - VM-5.10vmslice-10.pgeni-gpolab-bbn-com.utah.geniracks.net 87 15 running - VM-6.10vmslice-10.pgeni-gpolab-bbn-com.utah.geniracks.net 88 15 running - VM-7.10vmslice-10.pgeni-gpolab-bbn-com.utah.geniracks.net 89 15 running - VM-8.10vmslice-10.pgeni-gpolab-bbn-com.utah.geniracks.net 90 15 running - VM-9.10vmslice-10.pgeni-gpolab-bbn-com.utah.geniracks.net vhost1,[~],12:51(0)$ sudo vzlist -a CTID NPROC STATUS IP_ADDR HOSTNAME 1 15 running - virt1.ecgtest.pgeni-gpolab-bbn-com.utah.geniracks.net 2 15 running - virt2.ecgtest.pgeni-gpolab-bbn-com.utah.geniracks.net 3 15 running - VM-1.10vmslice-5.pgeni-gpolab-bbn-com.utah.geniracks.net 4 15 running - VM-10.10vmslice-5.pgeni-gpolab-bbn-com.utah.geniracks.net 5 15 running - VM-2.10vmslice-5.pgeni-gpolab-bbn-com.utah.geniracks.net 6 15 running - 
VM-3.10vmslice-5.pgeni-gpolab-bbn-com.utah.geniracks.net 7 15 running - VM-4.10vmslice-5.pgeni-gpolab-bbn-com.utah.geniracks.net 8 15 running - VM-5.10vmslice-5.pgeni-gpolab-bbn-com.utah.geniracks.net 9 15 running - VM-6.10vmslice-5.pgeni-gpolab-bbn-com.utah.geniracks.net 10 15 running - VM-7.10vmslice-5.pgeni-gpolab-bbn-com.utah.geniracks.net 11 15 running - VM-8.10vmslice-5.pgeni-gpolab-bbn-com.utah.geniracks.net 12 17 running - VM-9.10vmslice-5.pgeni-gpolab-bbn-com.utah.geniracks.net
- I am interested in investigating a process which is running on pc3:
vhost2,[~],12:59(0)$ pg ping
20001     656004 468218  0 12:56 ?        00:00:00 csh -c ping 10.10.0.1
20001     656012 656004  0 12:56 ?        00:00:00 ping 10.10.0.1
chaos     659649 647640  0 12:59 pts/0    00:00:00 grep --color=auto ping
vhost2,[~],12:59(0)$ sudo vzpid 656012
Pid     CTID    Name
656012  41      ping
- From above, CTID 41 is VM-1.10vmslice-6.pgeni-gpolab-bbn-com.utah.geniracks.net
- Use top to see if the two hosts seem to be performing well:
- pc3:
top - 13:01:14 up 1 day, 46 min, 1 user, load average: 1.28, 2.24, 1.75
Tasks: 2111 total, 3 running, 2108 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.9%us, 1.5%sy, 0.0%ni, 94.5%id, 3.1%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 49311612k total, 7402468k used, 41909144k free, 446780k buffers
Swap: 1050168k total, 0k used, 1050168k free, 2819928k cached
- pc5:
top - 13:02:00 up 1 day, 47 min, 1 user, load average: 0.06, 0.22, 0.21
Tasks: 492 total, 1 running, 491 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.1%us, 0.2%sy, 0.0%ni, 99.6%id, 0.1%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 49311612k total, 1904952k used, 47406660k free, 151880k buffers
Swap: 1050168k total, 0k used, 1050168k free, 906872k cached
- So neither machine is all that busy, and pc3 having a large number of inactive processes seems to be no big deal performance-wise.
comment:10 Changed 12 years ago by
=> Ran a scenario with 5 experiments with 20 VMs each, which failed on the 4th sliver with a "no_connect" violation.
Starting resources before the scenario:
- 100 pcvm slots on pc3
- 98 pcvm slots on pc5
- no available dedicated nodes.
Test details:
- 1st sliver 20vmslice-1 = OK = 10 VMs on pc3, 10 VMs on pc5
- 2nd sliver 20vmslice-2 = OK = 10 VMs on pc3, 10 VMs on pc5
- 3rd sliver 20vmslice-3 = OK = 10 VMs on pc3, 10 VMs on pc5
- 4th sliver 20vmslice-4 = failed with this error:
*** ERROR: mapper: Reached run limit. Giving up. seed = 1338542374 Physical Graph: 4 Calculating shortest paths on switch fabric. Virtual Graph: 21 Generating physical equivalence classes:4 Type precheck: Type precheck passed. Node mapping precheck: Node mapping precheck succeeded Policy precheck: Policy precheck succeeded Annealing. Doing melting run Reverting: forced Reverting to best solution Done BEST SCORE: 8.5 in 17000 iters and 0.050399 seconds With 10 violations Iters to find best score: 23 Violations: 10 unassigned: 0 pnode_load: 0 no_connect: 10 link_users: 0 bandwidth: 0 desires: 0 vclass: 0 delay: 0 trivial mix: 0 subnodes: 0 max_types: 0 endpoints: 0 Nodes: VM-1 pc3 VM-10 pc3 VM-11 pc3 VM-12 pc5 VM-13 pc5 VM-14 pc5 VM-15 pc5 VM-16 pc5 VM-17 pc5 VM-18 pc3 VM-19 pc3 VM-2 pc3 VM-20 pc3 VM-3 pc3 VM-4 pc3 VM-5 pc5 VM-6 pc5 VM-7 pc5 VM-8 pc3 VM-9 pc5 lan/Lan pc3 End Nodes Edges: linklan/Lan/VM-1:0 trivial pc3:loopback (pc3/null,(null)) pc3:loopback (pc3/null,(null)) linklan/Lan/VM-2:0 trivial pc3:loopback (pc3/null,(null)) pc3:loopback (pc3/null,(null)) linklan/Lan/VM-3:0 trivial pc3:loopback (pc3/null,(null)) pc3:loopback (pc3/null,(null)) linklan/Lan/VM-4:0 trivial pc3:loopback (pc3/null,(null)) pc3:loopback (pc3/null,(null)) linklan/Lan/VM-5:0 Mapping Failed linklan/Lan/VM-6:0 Mapping Failed linklan/Lan/VM-7:0 Mapping Failed linklan/Lan/VM-8:0 trivial pc3:loopback (pc3/null,(null)) pc3:loopback (pc3/null,(null)) linklan/Lan/VM-9:0 Mapping Failed linklan/Lan/VM-10:0 trivial pc3:loopback (pc3/null,(null)) pc3:loopback (pc3/null,(null)) linklan/Lan/VM-13:0 Mapping Failed linklan/Lan/VM-14:0 Mapping Failed linklan/Lan/VM-15:0 Mapping Failed linklan/Lan/VM-16:0 Mapping Failed linklan/Lan/VM-17:0 Mapping Failed linklan/Lan/VM-18:0 trivial pc3:loopback (pc3/null,(null)) pc3:loopback (pc3/null,(null)) linklan/Lan/VM-19:0 trivial pc3:loopback (pc3/null,(null)) pc3:loopback (pc3/null,(null)) linklan/Lan/VM-20:0 trivial pc3:loopback (pc3/null,(null)) pc3:loopback (pc3/null,(null)) linklan/Lan/VM-12:0 Mapping Failed linklan/Lan/VM-11:0 trivial pc3:loopback (pc3/null,(null)) pc3:loopback (pc3/null,(null)) End Edges End solution Summary: pc3 11 vnodes, 0 nontrivial BW, 1000000 trivial BW, type=pcvm pc5 10 vnodes, 0 nontrivial BW, 0 trivial BW, type=pcvm Total physical nodes used: 2 End summary ASSIGN FAILED: Type precheck passed. Node mapping precheck succeeded Policy precheck succeeded Annealing. Doing melting run Reverting: forced Reverting to best solution Done BEST SCORE: 8.5 in 17000 iters and 0.050399 seconds unassigned: 0 pnode_load: 0 no_connect: 10 link_users: 0 bandwidth: 0 desires: 0 vclass: 0 delay: 0 trivial mix: 0 subnodes: 0 max_types: 0 endpoints: 0
comment:11 Changed 12 years ago by
After the dedicated nodes were restored, I re-ran the test that had just failed (the scenario with 5 experiments with 20 VMs each), and it now works. The following were allocated:
- 1st sliver 20vmslice-1 = OK = 10 VMs on pc3, 10 VMs on pc5
- 2nd sliver 20vmslice-2 = OK = 10 VMs on pc3, 10 VMs on pc5
- 3rd sliver 20vmslice-3 = OK = 10 VMs on pc3, 10 VMs on pc5
- 4th sliver 20vmslice-4 = OK = 10 VMs on pc3, 10 VMs on pc5
- 5th sliver 20vmslice-5 = OK = 10 VMs on pc3, 10 VMs on pc1
comment:12 Changed 12 years ago by
Resolution: → fixed
Status: new → closed
All test scenarios have been completed for 100 VMs. This ticket is being closed.
The 100 VM scenarios will be re-run if the default rack configuration that ships is different from the current one (2 shared nodes and 3 exclusive nodes).