Opened 12 years ago
Closed 12 years ago
#32 closed (fixed)
Track 100 VM scenario findings
Reported by: | lnevers@bbn.com | Owned by: | somebody |
---|---|---|---|
Priority: | major | Milestone: | IG-EXP-3 |
Component: | Experiment | Version: | SPIRAL4 |
Keywords: | vm support | Cc: | |
Dependencies: |
Description
This ticket is being written to capture the findings for the planned IG-EXP-3 InstaGENI Single Site 100 VM Test. All issues captured are known and require no action at this time.
These are the 100 VM scenarios planned for testing:
Scenario 1: 1 Slice with 100 VMs
Scenario 2: 2 Slices with 50 VMs each
Scenario 3: 4 Slices with 25 VMs each
Scenario 4: 50 Slices with 2 VMs each
Scenario 5: 100 Slices with 1 VM each
Scenario 6: 10 Slices with 10 VMs each (note 1)
(note 1) This scenario was not in the original test plan, but it is being added based on input from Leigh Stroller.
=> Scenario 1: 1 Slice with 100 VMs in a grid topology
Results: FAILED due to:
100 nodes of type pcvm requested, but only 50 available nodes of type pcvm found
=> Scenario 2: 2 Slices with 50 VMs each, each sliver uses grid topology
Results: FAILED
Two test runs were completed.
Results of 1st test run:
- Created the first 50-node experiment; no error reported.
- Sliver status reported "resource is busy; try again later" for about 5 minutes,
- then sliverstatus reported the error:
Failed to get SliverStatus on 2exp-50vm at AM https://boss.utah.geniracks.net/protogeni/xmlrpc/am/2.0: <Fault -32600: 'Internal Error executing SliverStatus'>
Results of 2nd test run:
- Createsliver for the first 50-node sliver completed without error.
- Sliver status reported "resource is busy; try again later" for approximately 5 minutes.
- Sliverstatus eventually failed with "<Fault -32400: 'XMLRPC Server Error'>"
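The busy-then-poll pattern above can be sketched as a small retry loop. This is a hypothetical illustration, not the tooling used in the test: `poll_sliver_status` and its `get_status` callable are stand-ins for a real GENI AM API SliverStatus call.

```python
import time

def poll_sliver_status(get_status, timeout=600, interval=30):
    """Poll until the sliver leaves the 'busy' state or the timeout expires.

    get_status is a callable returning a status string; it stands in for a
    real AM API SliverStatus call (hypothetical helper, for illustration).
    """
    waited = 0
    while waited < timeout:
        status = get_status()
        if "busy" not in status:
            return status
        time.sleep(interval)
        waited += interval
    raise TimeoutError("sliver still busy after %d seconds" % timeout)

# Simulated AM responses: busy twice, then ready (interval=0 to skip sleeping).
responses = iter(["resource is busy; try again later",
                  "resource is busy; try again later",
                  "ready"])
print(poll_sliver_status(lambda: next(responses), interval=0))  # -> ready
```

In the failed runs above, such a loop would have surfaced the XMLRPC fault after the roughly 5-minute busy period rather than returning a final status.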
Additional test results will also be captured as they are completed.
Change History (7)
comment:1 Changed 12 years ago by
comment:2 Changed 12 years ago by
Ran Scenario 6: 10 slices with 10 VM each.
- Created 1st sliver in slice 10exp-10vm-1 -> no problem
- Created 2nd sliver in slice 10exp-10vm-2 -> no problem
- Created 3rd sliver in slice 10exp-10vm-3 -> no problem
- Created 4th sliver in slice 10exp-10vm-4 -> no problem
- Created 5th sliver in slice 10exp-10vm-5 -> failure below:
Asked https://boss.utah.geniracks.net/protogeni/xmlrpc/am/2.0 to reserve resources. No manifest Rspec returned.
*** ERROR: mapper: Reached run limit. Giving up.
seed = 1337780864
Physical Graph: 3
Calculating shortest paths on switch fabric.
Virtual Graph: 11
Generating physical equivalence classes: 3
Type precheck: Type precheck passed.
Node mapping precheck: Node mapping precheck succeeded
Policy precheck: Policy precheck succeeded
Annealing.
Adjusting dificulty estimate for fixed nodes, 1 remain.
Doing melting run
Reverting: forced
Reverting to best solution
Done
BEST SCORE: 11.75 in 17000 iters and 0.135319 seconds
With 1 violations
Iters to find best score: 1
Violations: 1
unassigned: 0
pnode_load: 0
no_connect: 0
link_users: 0
bandwidth: 1
desires: 0
vclass: 0
delay: 0
trivial mix: 0
subnodes: 0
max_types: 0
endpoints: 0
Nodes:
VM-1 pc1
VM-10 pc1
VM-2 pc1
VM-3 pc1
VM-4 pc1
VM-5 pc1
VM-6 pc1
VM-7 pc1
VM-8 pc1
VM-9 pc1
lan/Lan pc1
End Nodes
Edges:
linklan/Lan/VM-1:0 trivial pc1:loopback (pc1/null,(null)) pc1:loopback (pc1/null,(null))
linklan/Lan/VM-2:0 trivial pc1:loopback (pc1/null,(null)) pc1:loopback (pc1/null,(null))
linklan/Lan/VM-3:0 trivial pc1:loopback (pc1/null,(null)) pc1:loopback (pc1/null,(null))
linklan/Lan/VM-4:0 trivial pc1:loopback (pc1/null,(null)) pc1:loopback (pc1/null,(null))
linklan/Lan/VM-5:0 trivial pc1:loopback (pc1/null,(null)) pc1:loopback (pc1/null,(null))
linklan/Lan/VM-6:0 trivial pc1:loopback (pc1/null,(null)) pc1:loopback (pc1/null,(null))
linklan/Lan/VM-7:0 trivial pc1:loopback (pc1/null,(null)) pc1:loopback (pc1/null,(null))
linklan/Lan/VM-8:0 trivial pc1:loopback (pc1/null,(null)) pc1:loopback (pc1/null,(null))
linklan/Lan/VM-9:0 trivial pc1:loopback (pc1/null,(null)) pc1:loopback (pc1/null,(null))
linklan/Lan/VM-10:0 trivial pc1:loopback (pc1/null,(null)) pc1:loopback (pc1/null,(null))
End Edges
End solution
Summary:
pc1 11 vnodes, 0 nontrivial BW, 1000000 trivial BW, type=pcvm
?+virtpercent: used=0 total=100
?+cpu: used=0 total=2666
?+ram: used=0 total=3574
?+cpupercent: used=0 total=92
?+rampercent: used=0 total=80
Total physical nodes used: 1
End summary
ASSIGN FAILED:
Type precheck passed.
Node mapping precheck succeeded
Policy precheck succeeded
Annealing.
Adjusting dificulty estimate for fixed nodes, 1 remain.
Doing melting run
Reverting: forced
Reverting to best solution
Done
BEST SCORE: 11.75 in 17000 iters and 0.135319 seconds
unassigned: 0
pnode_load: 0
no_connect: 0
link_users: 0
bandwidth: 1
desires: 0
vclass: 0
delay: 0
trivial mix: 0
subnodes: 0
max_types: 0
endpoints: 0
comment:3 Changed 12 years ago by
Ran Scenario 5: 100 Slices with 1 VM each
At the start of the test, available resources were determined from the following listresources details:
- pc3 sliver_type "emulab-openvz" for "pcvm" had type_slots="100" = 100 VMs possible
- pc5 sliver_type "emulab-openvz" for "pcvm" had type_slots="97" = 97 VMs possible
As each 1 VM sliver was created, both pc3 and pc5 counters for available slots decreased.
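Reading the available slot counts out of a listresources advertisement can be sketched as below. The XML fragment is a simplified, hypothetical stand-in for a real GENI advertisement RSpec (which carries namespaces and many more attributes); the slot numbers mirror the pc3/pc5 values above.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical advertisement fragment for illustration only.
ADVERTISEMENT = """
<rspec>
  <node component_id="pc3">
    <sliver_type name="emulab-openvz"/>
    <hardware_type name="pcvm">
      <node_type type_slots="100"/>
    </hardware_type>
  </node>
  <node component_id="pc5">
    <sliver_type name="emulab-openvz"/>
    <hardware_type name="pcvm">
      <node_type type_slots="97"/>
    </hardware_type>
  </node>
</rspec>
"""

def pcvm_slots(rspec_xml):
    """Return {component_id: free pcvm slots} from an advertisement RSpec."""
    slots = {}
    root = ET.fromstring(rspec_xml)
    for node in root.findall("node"):
        hw = node.find("hardware_type[@name='pcvm']")
        if hw is not None:
            nt = hw.find("node_type")
            slots[node.get("component_id")] = int(nt.get("type_slots"))
    return slots

print(pcvm_slots(ADVERTISEMENT))  # -> {'pc3': 100, 'pc5': 97}
```

Re-running such a check between createsliver calls would show the per-node counters decreasing, as observed in this scenario.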
=> Results: Success!!
Each of the 100 experiments had a node assigned, and it was possible to log in to several of them.
Final allocation distribution:
- 61 VMs on pc3
- 39 VMs on pc5
comment:4 Changed 12 years ago by
Note: the allocations listed in the previous update are based on sliver manifests. The remaining slot count from listresources was as follows for each shared node:
- pc5 = 58
- pc3 = 39
This is as expected based on the sliver manifest counts.
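The consistency check in this comment is simple bookkeeping: remaining slots reported by listresources should equal the starting slots minus the manifest allocations. A minimal sketch, using the numbers from this test:

```python
# Starting slots from listresources before the test, and per-node
# allocations summed from the 100 sliver manifests.
starting = {"pc3": 100, "pc5": 97}
allocated = {"pc3": 61, "pc5": 39}

# Remaining slots each shared node should now advertise.
remaining = {pc: starting[pc] - allocated[pc] for pc in starting}
print(remaining)  # -> {'pc3': 39, 'pc5': 58}
```

The computed values match the post-test listresources counts reported above (pc3 = 39, pc5 = 58).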
comment:5 Changed 12 years ago by
Ran Scenario 4: 50 Slices with 2 VMs each
=> Results: Success - Each of the 50 experiments had 2 nodes assigned, and it was possible to log in to several of them. Final allocation distribution:
- 58 VMs on pc3
- 42 VMs on pc5
comment:6 Changed 12 years ago by
=> Re-ran Scenario 2: 2 Slices with 50 VMs each.
In previous run, this test case had failed with a "<Fault -32400: 'XMLRPC Server Error'>".
Verified this is no longer the case: the scenario now fails as expected due to the configured rack resource allocation.
comment:7 Changed 12 years ago by
Resolution: | → fixed |
---|---|
Status: | new → closed |
Summary of the 100 VMs scenarios that have been completed with the InstaGENI rack configured to have 2 pcshared nodes:
=> Scenario 1: 1 Slice with 100 VMs
Results: - FAIL - Not allowed with current rack configuration.
Sliver failure reported: 100 nodes, but only 20 available.
=> Scenario 2: 2 Slices with 50 VMs each
Results: - FAIL - Not allowed with current rack configuration.
Sliver failure reported on 1st sliver: 50 nodes requested, but only 30 available.
=> Scenario 3: 4 Slices with 25 VMs each
Results: - FAIL - Not allowed with current rack configuration; only 3 slices were set up.
Allocation: pc3=30 VMs, pc5=30 VMs, pc1, pc4, pc2=5 VMs each
Sliver failure reported on 4th sliver: 25 nodes requested, but only 20 available.
=> Scenario 7: 5 slices with 20 VMs each
Results: - PASS - Allocation: pc3=50 VMs, pc5=40 VMs, pc1=10 VMs
=> Scenario 6: 10 Slices with 10 VMs each
Results: - PASS - Allocation: pc3=90 VMs, pc5=10 VMs
=> Scenario 4: 50 Slices with 2 VMs each
Results: - PASS - Allocation: pc3=59 VMs, pc5=42 VMs
=> Scenario 5: 100 Slices with 1 VM each
Results: - PASS - Allocation: pc3=61 VMs, pc5=39 VMs
=> Scenario 3: 4 Slices with 25 VMs each
Results: