Opened 12 years ago

Closed 12 years ago

#57 closed (fixed)

sliver listresources shows login for only the first node when requesting nodes on a shared VLAN

Reported by: lnevers@bbn.com Owned by: ibaldin@renci.org
Priority: major Milestone: EG-EXP-5
Component: Experiment Version: SPIRAL4
Keywords: sliver creation Cc:
Dependencies:

Description

This problem only seems to occur when requesting nodes on a shared VLAN.

Created a sliver "EG-EXP-5-scenario1" with 2 VMs on shared VLAN 1750 via BBN SM.

The sliver is created, but the listresources for the sliver only lists one set of ssh login details, for the first node in the list.

The create sliver operation showed 2 nodes being allocated:

$ ./src/omni.py -a exobbn createsliver EG-EXP-5-scenario1 exorspec/exo-2vm-shared-vlan.rspec
INFO:omni:Loading config file omni_config
INFO:omni:Using control framework pgeni
INFO:omni:Slice urn:publicid:IDN+pgeni.gpolab.bbn.com+slice+EG-EXP-5-scenario1 expires on 2012-06-30 00:00:00 UTC
INFO:omni:Substituting AM nickname exobbn with URL https://bbn-hn.exogeni.net:11443/orca/xmlrpc, URN unspecified_AM_URN
INFO:omni:Substituting AM nickname exobbn with URL https://bbn-hn.exogeni.net:11443/orca/xmlrpc, URN unspecified_AM_URN
INFO:omni:Creating sliver(s) from rspec file exorspec/exo-2vm-shared-vlan.rspec for slice urn:publicid:IDN+pgeni.gpolab.bbn.com+slice+EG-EXP-5-scenario1
INFO:omni:Asked https://bbn-hn.exogeni.net:11443/orca/xmlrpc to reserve resources. Result:
INFO:omni:<?xml version="1.0" ?>
INFO:omni:<!-- Reserved resources for:
	Slice: EG-EXP-5-scenario1
	At AM:
	URL: https://bbn-hn.exogeni.net:11443/orca/xmlrpc
 -->
INFO:omni:<rspec type="manifest" xmlns="http://www.geni.net/resources/rspec/3" xmlns:ns2="http://hpn.east.isi.edu/rspec/ext/stitch/0.1/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.protogeni.net/resources/rspec/2 http://www.protogeni.net/resources/rspec/2/manifest.xsd http://hpn.east.isi.edu/rspec/ext/stitch/0.1/ http://hpn.east.isi.edu/rspec/ext/stitch/0.1/stitch-schema.xsd">  
      <node client_id="1edfa914-2a29-42d6-91e1-1c3454c66ee8#VM" component_id="urn:publicid:IDN+exogeni.net:bbnvmsite+authority+cm" exclusive="true" sliver_id="urn:publicid:IDN+exogeni.net:bbnvmsite+sliver+1edfa914-2a29-42d6-91e1-1c3454c66ee8#VM">    
            <sliver_type name="m1.small">      
                  <disk_image name="http://geni-images.renci.org/images/standard/debian/debian-squeeze-amd64-neuca-2g.zfilesystem.sparse.v0.2.xml" version="397c431cb9249e1f361484b08674bc3381455bb9"/>      
            </sliver_type>    
            <services/>    
            <interface client_id="1edfa914-2a29-42d6-91e1-1c3454c66ee8#VM:if1"/>    
      </node>  
      <node client_id="1edfa914-2a29-42d6-91e1-1c3454c66ee8#VM-0" component_id="urn:publicid:IDN+exogeni.net:bbnvmsite+authority+cm" exclusive="true" sliver_id="urn:publicid:IDN+exogeni.net:bbnvmsite+sliver+1edfa914-2a29-42d6-91e1-1c3454c66ee8#VM-0">    
            <sliver_type name="m1.small">      
                  <disk_image name="http://geni-images.renci.org/images/standard/debian/debian-squeeze-amd64-neuca-2g.zfilesystem.sparse.v0.2.xml" version="397c431cb9249e1f361484b08674bc3381455bb9"/>      
            </sliver_type>    
            <services/>    
            <interface client_id="1edfa914-2a29-42d6-91e1-1c3454c66ee8#VM-0:if1"/>    
      </node>  
      <link client_id="1edfa914-2a29-42d6-91e1-1c3454c66ee8#lan0" sliver_id="urn:publicid:IDN+exogeni.net:bbnvmsite+sliver+1edfa914-2a29-42d6-91e1-1c3454c66ee8#lan0" vlantag="1750.0">    
            <interface_ref client_id="1edfa914-2a29-42d6-91e1-1c3454c66ee8#VM:if1"/>    
            <interface_ref client_id="1edfa914-2a29-42d6-91e1-1c3454c66ee8#VM-0:if1"/>    
      </link>  
</rspec>

The results of the listresources for the sliver were captured around 20 minutes after the sliver creation and only show one set of ssh login details for the first node: <login authentication="ssh-keys" hostname="192.1.242.8" port="22" username="root"/>

The second node did not include the <login> details.

Re-ran the same experiment via the ExoSM and got the same result: one set of login information, only for the first of the two nodes.

Attaching Sliver Manifest and Request RSpec.

Attachments (2)

EG-EXP-5-scenario1-rspec-bbn-hn-exogeni-net-11443-orca.xml (2.4 KB) - added by lnevers@bbn.com 12 years ago.
exo-2vm-shared-vlan.rspec (1.6 KB) - added by lnevers@bbn.com 12 years ago.

Change History (10)

Changed 12 years ago by lnevers@bbn.com

Attachment: exo-2vm-shared-vlan.rspec added

comment:1 Changed 12 years ago by ibaldin@renci.org

Owner: changed from somebody to ibaldin@renci.org
Status: changed from new to assigned

comment:2 Changed 12 years ago by ibaldin@renci.org

Luisa, are you sure about this? Maybe you tried too soon (sliverStatus)? I just tried this case - everything seems to be reported properly.

Please remember that management IP information is only available after the slivers are fully stood up.

comment:3 Changed 12 years ago by ibaldin@renci.org

I wonder if one of the VMs got stuck? In that case you would have information on one of the nodes, but not the other (and the other one would eventually be either retried or killed).

I think as soon as I get sliverStatus working properly you should be able to see if individual slivers are actually operational.

comment:4 Changed 12 years ago by lnevers@bbn.com

I thought that one of the VMs may have been stuck. I also ran a 10 VM scenario and waited over 30 minutes, but only got the ssh information for 9 of the 10 VMs in the sliver. Having sliverstatus will help!

I will re-run the scenarios in the original description and report back on whether I can reproduce the failure.

comment:5 Changed 12 years ago by lnevers@bbn.com

Not sure if this is helpful, but the sliver EG-EXP-5-scenario1 is still running and still shows only one set of ssh login details:

$ ./src/omni.py -a exobbn listresources  EG-EXP-5-scenario1 
INFO:omni:Loading config file omni_config
INFO:omni:Using control framework pgeni
INFO:omni:Gathering resources reserved for slice EG-EXP-5-scenario1.
INFO:omni:Substituting AM nickname exobbn with URL https://bbn-hn.exogeni.net:11443/orca/xmlrpc, URN unspecified_AM_URN
INFO:omni:Listed resources on 1 out of 1 possible aggregates.
INFO:omni:<?xml version="1.0" ?>
INFO:omni:<!-- Resources for:
	Slice: EG-EXP-5-scenario1
	at AM:
	URN: unspecified_AM_URN
	URL: https://bbn-hn.exogeni.net:11443/orca/xmlrpc
 -->
INFO:omni:<rspec type="manifest" xmlns="http://www.geni.net/resources/rspec/3" xmlns:ns2="http://hpn.east.isi.edu/rspec/ext/stitch/0.1/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.geni.net/resources/rspec/3 http://www.geni.net/resources/rspec/3/manifest.xsd http://hpn.east.isi.edu/rspec/ext/stitch/0.1/ http://hpn.east.isi.edu/rspec/ext/stitch/0.1/stitch-schema.xsd">  
      <node client_id="1edfa914-2a29-42d6-91e1-1c3454c66ee8#VM" component_id="urn:publicid:IDN+exogeni.net:bbnvmsite+authority+cm" exclusive="true" sliver_id="urn:publicid:IDN+exogeni.net:bbnvmsite+sliver+1edfa914-2a29-42d6-91e1-1c3454c66ee8#VM">    
            <sliver_type name="m1.small">      
                  <disk_image name="http://geni-images.renci.org/images/standard/debian/debian-squeeze-amd64-neuca-2g.zfilesystem.sparse.v0.2.xml" version="397c431cb9249e1f361484b08674bc3381455bb9"/>      
            </sliver_type>    
            <services>      
                  <login authentication="ssh-keys" hostname="192.1.242.8" port="22" username="root"/>      
            </services>    
            <interface client_id="1edfa914-2a29-42d6-91e1-1c3454c66ee8#VM:if1"/>    
      </node>  
      <node client_id="1edfa914-2a29-42d6-91e1-1c3454c66ee8#VM-0" component_id="urn:publicid:IDN+exogeni.net:bbnvmsite+authority+cm" exclusive="true" sliver_id="urn:publicid:IDN+exogeni.net:bbnvmsite+sliver+1edfa914-2a29-42d6-91e1-1c3454c66ee8#VM-0">    
            <sliver_type name="m1.small">      
                  <disk_image name="http://geni-images.renci.org/images/standard/debian/debian-squeeze-amd64-neuca-2g.zfilesystem.sparse.v0.2.xml" version="397c431cb9249e1f361484b08674bc3381455bb9"/>      
            </sliver_type>    
            <services/>    
            <interface client_id="1edfa914-2a29-42d6-91e1-1c3454c66ee8#VM-0:if1"/>    
      </node>  
      <link client_id="1edfa914-2a29-42d6-91e1-1c3454c66ee8#lan0" sliver_id="urn:publicid:IDN+exogeni.net:bbnvmsite+sliver+1edfa914-2a29-42d6-91e1-1c3454c66ee8#lan0" vlantag="1750.0">    
            <interface_ref client_id="1edfa914-2a29-42d6-91e1-1c3454c66ee8#VM:if1"/>    
            <interface_ref client_id="1edfa914-2a29-42d6-91e1-1c3454c66ee8#VM-0:if1"/>    
      </link>  
</rspec>
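For anyone checking a larger manifest by hand (such as the 10 VM scenario), here is a minimal sketch that lists the nodes whose <services> element carries no <login>. It assumes the manifest has been saved as plain XML without the INFO:omni: log prefixes. The function name nodes_missing_login and the shortened client_id values in SAMPLE are illustrative only and are not part of omni or ORCA.

```python
import xml.etree.ElementTree as ET

# Default namespace used by the GENI v3 manifest shown in this ticket.
RSPEC_NS = {"r": "http://www.geni.net/resources/rspec/3"}


def nodes_missing_login(manifest_xml):
    """Return the client_id of every <node> that has no <services>/<login>."""
    root = ET.fromstring(manifest_xml)
    missing = []
    for node in root.findall("r:node", RSPEC_NS):
        # A node that is not (yet) reachable has an empty <services/> element.
        if node.find("r:services/r:login", RSPEC_NS) is None:
            missing.append(node.get("client_id"))
    return missing


# Abbreviated stand-in for the manifest above (client_ids shortened for readability).
SAMPLE = """<rspec type="manifest" xmlns="http://www.geni.net/resources/rspec/3">
  <node client_id="VM">
    <services>
      <login authentication="ssh-keys" hostname="192.1.242.8" port="22" username="root"/>
    </services>
  </node>
  <node client_id="VM-0">
    <services/>
  </node>
</rspec>"""

print(nodes_missing_login(SAMPLE))
```

An empty list would mean every node reports login details. A non-empty list only shows which nodes lack them; it cannot tell a VM that is still booting from one that has failed.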

I am going to leave this sliver running and create a new one with the same RSpec (and different IP addresses for the interfaces).

comment:6 Changed 12 years ago by ibaldin@renci.org

One of them failed. That's why you don't see it. The overall slice status should be Failed though...

comment:7 Changed 12 years ago by lnevers@bbn.com

Ok, then I will delete EG-EXP-5-scenario1 to release the resources.

I have issued a new createsliver for a slice named "2shared" and it seems unusually slow. The omni command appears to be waiting on a response from the BBN SM and has been showing this for the past 5 minutes:

$ ./src/omni.py -a exobbn createsliver 2shared ./exo-2vm-shared-vlan.rspec 
INFO:omni:Loading config file omni_config
INFO:omni:Using control framework pgeni
INFO:omni:Slice urn:publicid:IDN+pgeni.gpolab.bbn.com+slice+2shared expires on 2012-06-29 17:37:15 UTC
INFO:omni:Substituting AM nickname exobbn with URL https://bbn-hn.exogeni.net:11443/orca/xmlrpc, URN unspecified_AM_URN
INFO:omni:Substituting AM nickname exobbn with URL https://bbn-hn.exogeni.net:11443/orca/xmlrpc, URN unspecified_AM_URN
INFO:omni:Creating sliver(s) from rspec file ./exo-2vm-shared-vlan.rspec for slice urn:publicid:IDN+pgeni.gpolab.bbn.com+slice+2shared

The next line should be the results of the request to the BBN SM:

INFO:omni:Asked https://bbn-hn.exogeni.net:11443/orca/xmlrpc to reserve resources. Result:

comment:8 Changed 12 years ago by lnevers@bbn.com

Resolution: fixed
Status: changed from assigned to closed

Was just able to create a sliver with the shared VLAN RSpec that failed in the original problem description. Was able to use both nodes in the sliver to exchange traffic with other nodes on VLAN 1750. Closing ticket.
