Opened 12 years ago
Closed 12 years ago
#57 closed (fixed)
sliver listresources only shows login for first node when requesting nodes on shared vlan only
Reported by: | lnevers@bbn.com | Owned by: | ibaldin@renci.org |
---|---|---|---|
Priority: | major | Milestone: | EG-EXP-5 |
Component: | Experiment | Version: | SPIRAL4 |
Keywords: | sliver creation | Cc: | |
Dependencies: |
Description
This problem only seems to occur when requesting nodes on a shared VLAN.
Created a sliver "EG-EXP-5-scenario1" with 2 VMs on shared VLAN 1750 via BBN SM.
The sliver is created, but the listresources for the sliver only lists one set of ssh login details, for the first node in the list.
The create sliver operation showed 2 nodes being allocated:
$ ./src/omni.py -a exobbn createsliver EG-EXP-5-scenario1 exorspec/exo-2vm-shared-vlan.rspec
INFO:omni:Loading config file omni_config
INFO:omni:Using control framework pgeni
INFO:omni:Slice urn:publicid:IDN+pgeni.gpolab.bbn.com+slice+EG-EXP-5-scenario1 expires on 2012-06-30 00:00:00 UTC
INFO:omni:Substituting AM nickname exobbn with URL https://bbn-hn.exogeni.net:11443/orca/xmlrpc, URN unspecified_AM_URN
INFO:omni:Substituting AM nickname exobbn with URL https://bbn-hn.exogeni.net:11443/orca/xmlrpc, URN unspecified_AM_URN
INFO:omni:Creating sliver(s) from rspec file exorspec/exo-2vm-shared-vlan.rspec for slice urn:publicid:IDN+pgeni.gpolab.bbn.com+slice+EG-EXP-5-scenario1
INFO:omni:Asked https://bbn-hn.exogeni.net:11443/orca/xmlrpc to reserve resources. Result:
INFO:omni:<?xml version="1.0" ?>
INFO:omni:<!-- Reserved resources for: Slice: EG-EXP-5-scenario1 At AM: URL: https://bbn-hn.exogeni.net:11443/orca/xmlrpc -->
INFO:omni:<rspec type="manifest" xmlns="http://www.geni.net/resources/rspec/3" xmlns:ns2="http://hpn.east.isi.edu/rspec/ext/stitch/0.1/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.protogeni.net/resources/rspec/2 http://www.protogeni.net/resources/rspec/2/manifest.xsd http://hpn.east.isi.edu/rspec/ext/stitch/0.1/ http://hpn.east.isi.edu/rspec/ext/stitch/0.1/stitch-schema.xsd">
  <node client_id="1edfa914-2a29-42d6-91e1-1c3454c66ee8#VM" component_id="urn:publicid:IDN+exogeni.net:bbnvmsite+authority+cm" exclusive="true" sliver_id="urn:publicid:IDN+exogeni.net:bbnvmsite+sliver+1edfa914-2a29-42d6-91e1-1c3454c66ee8#VM">
    <sliver_type name="m1.small">
      <disk_image name="http://geni-images.renci.org/images/standard/debian/debian-squeeze-amd64-neuca-2g.zfilesystem.sparse.v0.2.xml" version="397c431cb9249e1f361484b08674bc3381455bb9"/>
    </sliver_type>
    <services/>
    <interface client_id="1edfa914-2a29-42d6-91e1-1c3454c66ee8#VM:if1"/>
  </node>
  <node client_id="1edfa914-2a29-42d6-91e1-1c3454c66ee8#VM-0" component_id="urn:publicid:IDN+exogeni.net:bbnvmsite+authority+cm" exclusive="true" sliver_id="urn:publicid:IDN+exogeni.net:bbnvmsite+sliver+1edfa914-2a29-42d6-91e1-1c3454c66ee8#VM-0">
    <sliver_type name="m1.small">
      <disk_image name="http://geni-images.renci.org/images/standard/debian/debian-squeeze-amd64-neuca-2g.zfilesystem.sparse.v0.2.xml" version="397c431cb9249e1f361484b08674bc3381455bb9"/>
    </sliver_type>
    <services/>
    <interface client_id="1edfa914-2a29-42d6-91e1-1c3454c66ee8#VM-0:if1"/>
  </node>
  <link client_id="1edfa914-2a29-42d6-91e1-1c3454c66ee8#lan0" sliver_id="urn:publicid:IDN+exogeni.net:bbnvmsite+sliver+1edfa914-2a29-42d6-91e1-1c3454c66ee8#lan0" vlantag="1750.0">
    <interface_ref client_id="1edfa914-2a29-42d6-91e1-1c3454c66ee8#VM:if1"/>
    <interface_ref client_id="1edfa914-2a29-42d6-91e1-1c3454c66ee8#VM-0:if1"/>
  </link>
</rspec>
The results of the listresources for the sliver were captured around 20 minutes after the sliver creation and only show one set of ssh login details for the first node:
<login authentication="ssh-keys" hostname="192.1.242.8" port="22" username="root"/>
The second node did not include the <login> details.
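A quick way to spot this condition in a saved manifest is sketched below. It is only an illustration, not part of omni: the function name `nodes_missing_login` and the shortened `SAMPLE` manifest are made up for this example, and it assumes the `INFO:omni:` log prefixes have already been stripped from the saved XML.

```python
import xml.etree.ElementTree as ET

# Namespace of the manifest, as declared by xmlns in the output above.
RSPEC_NS = {"r": "http://www.geni.net/resources/rspec/3"}


def nodes_missing_login(manifest_xml):
    """Return the client_id of every <node> without a <services>/<login> child."""
    root = ET.fromstring(manifest_xml)
    missing = []
    for node in root.findall("r:node", RSPEC_NS):
        if node.find("r:services/r:login", RSPEC_NS) is None:
            missing.append(node.get("client_id"))
    return missing


# Shortened, hypothetical manifest with the same shape as the one in this ticket.
SAMPLE = """<rspec type="manifest" xmlns="http://www.geni.net/resources/rspec/3">
  <node client_id="VM">
    <services>
      <login authentication="ssh-keys" hostname="192.1.242.8" port="22" username="root"/>
    </services>
  </node>
  <node client_id="VM-0">
    <services/>
  </node>
</rspec>"""

if __name__ == "__main__":
    print(nodes_missing_login(SAMPLE))
```

An empty list means every node in the manifest carries login details.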
Re-ran the same experiment via the ExoSM and got the same result: one set of login information, only for the first of the two nodes.
Attaching Sliver Manifest and Request RSpec.
Attachments (2)
Change History (10)
Changed 12 years ago by
Attachment: | EG-EXP-5-scenario1-rspec-bbn-hn-exogeni-net-11443-orca.xml added |
---|
Changed 12 years ago by
Attachment: | exo-2vm-shared-vlan.rspec added |
---|
comment:1 Changed 12 years ago by
Owner: | changed from somebody to ibaldin@renci.org |
---|---|
Status: | new → assigned |
comment:2 Changed 12 years ago by
comment:3 Changed 12 years ago by
I wonder if one of the VMs got stuck? In that case you would have information on one of the nodes, but not the other (and the other one would eventually be either retried or killed).
I think as soon as I get sliverStatus working properly you should be able to see if individual slivers are actually operational.
comment:4 Changed 12 years ago by
I thought that one of the VMs may have been stuck. I also ran a 10 VM scenario and waited over 30 minutes, but only got the ssh information for 9 of the 10 VMs in the sliver. Having sliverstatus will help!
I will re-run the scenarios in the original description and report back on whether I can duplicate the failure.
comment:5 Changed 12 years ago by
Not sure if this is helpful, but the sliver EG-EXP-5-scenario1 is still running and still shows only one set of ssh login details:
$ ./src/omni.py -a exobbn listresources EG-EXP-5-scenario1
INFO:omni:Loading config file omni_config
INFO:omni:Using control framework pgeni
INFO:omni:Gathering resources reserved for slice EG-EXP-5-scenario1.
INFO:omni:Substituting AM nickname exobbn with URL https://bbn-hn.exogeni.net:11443/orca/xmlrpc, URN unspecified_AM_URN
INFO:omni:Listed resources on 1 out of 1 possible aggregates.
INFO:omni:<?xml version="1.0" ?>
INFO:omni:<!-- Resources for: Slice: EG-EXP-5-scenario1 at AM: URN: unspecified_AM_URN URL: https://bbn-hn.exogeni.net:11443/orca/xmlrpc -->
INFO:omni:<rspec type="manifest" xmlns="http://www.geni.net/resources/rspec/3" xmlns:ns2="http://hpn.east.isi.edu/rspec/ext/stitch/0.1/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.geni.net/resources/rspec/3 http://www.geni.net/resources/rspec/3/manifest.xsd http://hpn.east.isi.edu/rspec/ext/stitch/0.1/ http://hpn.east.isi.edu/rspec/ext/stitch/0.1/stitch-schema.xsd">
  <node client_id="1edfa914-2a29-42d6-91e1-1c3454c66ee8#VM" component_id="urn:publicid:IDN+exogeni.net:bbnvmsite+authority+cm" exclusive="true" sliver_id="urn:publicid:IDN+exogeni.net:bbnvmsite+sliver+1edfa914-2a29-42d6-91e1-1c3454c66ee8#VM">
    <sliver_type name="m1.small">
      <disk_image name="http://geni-images.renci.org/images/standard/debian/debian-squeeze-amd64-neuca-2g.zfilesystem.sparse.v0.2.xml" version="397c431cb9249e1f361484b08674bc3381455bb9"/>
    </sliver_type>
    <services>
      <login authentication="ssh-keys" hostname="192.1.242.8" port="22" username="root"/>
    </services>
    <interface client_id="1edfa914-2a29-42d6-91e1-1c3454c66ee8#VM:if1"/>
  </node>
  <node client_id="1edfa914-2a29-42d6-91e1-1c3454c66ee8#VM-0" component_id="urn:publicid:IDN+exogeni.net:bbnvmsite+authority+cm" exclusive="true" sliver_id="urn:publicid:IDN+exogeni.net:bbnvmsite+sliver+1edfa914-2a29-42d6-91e1-1c3454c66ee8#VM-0">
    <sliver_type name="m1.small">
      <disk_image name="http://geni-images.renci.org/images/standard/debian/debian-squeeze-amd64-neuca-2g.zfilesystem.sparse.v0.2.xml" version="397c431cb9249e1f361484b08674bc3381455bb9"/>
    </sliver_type>
    <services/>
    <interface client_id="1edfa914-2a29-42d6-91e1-1c3454c66ee8#VM-0:if1"/>
  </node>
  <link client_id="1edfa914-2a29-42d6-91e1-1c3454c66ee8#lan0" sliver_id="urn:publicid:IDN+exogeni.net:bbnvmsite+sliver+1edfa914-2a29-42d6-91e1-1c3454c66ee8#lan0" vlantag="1750.0">
    <interface_ref client_id="1edfa914-2a29-42d6-91e1-1c3454c66ee8#VM:if1"/>
    <interface_ref client_id="1edfa914-2a29-42d6-91e1-1c3454c66ee8#VM-0:if1"/>
  </link>
</rspec>
I am going to leave this sliver running and create a new one with the same RSpec (and different IP addresses for the interfaces).
comment:6 Changed 12 years ago by
One of them failed. That's why you don't see it. The overall slice status should be Failed though...
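Once sliverStatus is available, a failed VM like this could be picked out of the status struct along the following lines. This is a rough sketch: the field names (`geni_status`, `geni_resources`, `geni_urn`, `geni_error`) follow the GENI AM API v2 SliverStatus return value as I understand it, the helper name `failed_resources` is invented, and the example struct, including which VM failed and its error text, is made up for illustration. None of it is captured from the BBN SM.

```python
def failed_resources(status):
    """Return (urn, error) pairs for resources whose geni_status is 'failed'."""
    return [
        (res.get("geni_urn"), res.get("geni_error", ""))
        for res in status.get("geni_resources", [])
        if res.get("geni_status") == "failed"
    ]


# Hypothetical status struct; values are illustrative, not real aggregate output.
EXAMPLE_STATUS = {
    "geni_urn": "urn:publicid:IDN+pgeni.gpolab.bbn.com+slice+EG-EXP-5-scenario1",
    "geni_status": "failed",
    "geni_resources": [
        {
            "geni_urn": "urn:publicid:IDN+exogeni.net:bbnvmsite+sliver+1edfa914-2a29-42d6-91e1-1c3454c66ee8#VM",
            "geni_status": "ready",
            "geni_error": "",
        },
        {
            "geni_urn": "urn:publicid:IDN+exogeni.net:bbnvmsite+sliver+1edfa914-2a29-42d6-91e1-1c3454c66ee8#VM-0",
            "geni_status": "failed",
            "geni_error": "placeholder error text",
        },
    ],
}

if __name__ == "__main__":
    for urn, error in failed_resources(EXAMPLE_STATUS):
        print(urn, error)
```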
comment:7 Changed 12 years ago by
Ok, then I will delete EG-EXP-5-scenario1 to release the resources.
I have issued a new createsliver for a slice named "2shared" and it seems unusually slow. The omni command seems to be waiting on a response from the BBN SM and it has been showing this for the past 5 minutes:
$ ./src/omni.py -a exobbn createsliver 2shared ./exo-2vm-shared-vlan.rspec
INFO:omni:Loading config file omni_config
INFO:omni:Using control framework pgeni
INFO:omni:Slice urn:publicid:IDN+pgeni.gpolab.bbn.com+slice+2shared expires on 2012-06-29 17:37:15 UTC
INFO:omni:Substituting AM nickname exobbn with URL https://bbn-hn.exogeni.net:11443/orca/xmlrpc, URN unspecified_AM_URN
INFO:omni:Substituting AM nickname exobbn with URL https://bbn-hn.exogeni.net:11443/orca/xmlrpc, URN unspecified_AM_URN
INFO:omni:Creating sliver(s) from rspec file ./exo-2vm-shared-vlan.rspec for slice urn:publicid:IDN+pgeni.gpolab.bbn.com+slice+2shared
The next line should be the results of the request to the BBN SM:
INFO:omni:Asked https://bbn-hn.exogeni.net:11443/orca/xmlrpc to reserve resources. Result:
comment:8 Changed 12 years ago by
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
Was just able to create a sliver with the shared VLAN RSpec that failed in the original problem description. Was able to use both nodes in the sliver to exchange traffic with other nodes on VLAN 1750, closing ticket.
Luisa, are you sure about this? Maybe you tried too soon (sliverStatus)? I just tried this case - everything seems to be reported properly.
Please remember that management IP information is only available after the slivers are fully stood up.
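Since the login details only appear once the slivers are up, a test script could wait for the sliver to settle before running listresources. The loop below is a minimal sketch under stated assumptions: `wait_until_settled` and the `get_status` callable are hypothetical (for example, a wrapper around an omni sliverstatus call), and the status strings "ready" and "failed" are the GENI AM API v2 values, assumed here to be what the aggregate reports.

```python
import time


def wait_until_settled(get_status, timeout_s=1800, interval_s=30, sleep=time.sleep):
    """Poll get_status() until it returns 'ready' or 'failed', or the timeout passes.

    get_status is a caller-supplied callable returning the overall sliver
    status as a string. Returns 'ready', 'failed', or 'timeout'.
    """
    waited = 0
    while True:
        status = get_status()
        if status in ("ready", "failed"):
            return status
        if waited >= timeout_s:
            return "timeout"
        sleep(interval_s)
        waited += interval_s


if __name__ == "__main__":
    # Fake status source standing in for a real sliverstatus query.
    fake = iter(["configuring", "configuring", "ready"])
    print(wait_until_settled(lambda: next(fake), sleep=lambda seconds: None))
```

Only after this returns "ready" would a missing login entry in the manifest point to a real reporting problem rather than a VM that is still booting or has failed.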