Opened 12 years ago

Closed 12 years ago

#56 closed (fixed)

Sliver with 4 VMs remains in "geni_status": "notready"

Reported by:
Owned by: somebody
Priority: major
Milestone:
Component: Experiment
Version: SPIRAL5
Keywords: sliver creation
Cc:


Created a 4-VM sliver named IG-CT-1 at the Utah InstaGENI rack. No errors were reported, but 30 minutes after sliver creation, all 4 VMs still show "geni_status": "notready".

RSpec is attached.
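For reference, the readiness check at the heart of this ticket can be sketched as a few lines of Python. The structure below mirrors the GENI AM API v2 SliverStatus return (a struct whose `geni_resources` entries each carry their own `geni_status`); the sample values are hypothetical, not taken from this ticket:

```python
# Sketch: decide whether every resource in a SliverStatus result is ready.
# The dict shape mirrors the GENI AM API v2 SliverStatus struct; the sample
# data below is hypothetical.

def all_ready(status):
    """Return True only when the sliver and all its resources report 'ready'."""
    if status.get("geni_status") != "ready":
        return False
    return all(r.get("geni_status") == "ready"
               for r in status.get("geni_resources", []))

sample = {
    "geni_status": "notready",
    "geni_resources": [
        {"geni_urn": "VM-1", "geni_status": "notready"},
        {"geni_urn": "VM-2", "geni_status": "notready"},
    ],
}
print(all_ready(sample))  # prints False
```

In practice one would feed this the parsed result of `sliverstatus` and poll until it returns True or a timeout expires.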

Attachments (1)

IG-CT-1-utah.rspec (1.8 KB) - added by 12 years ago.


Change History (5)

Changed 12 years ago by

Attachment: IG-CT-1-utah.rspec added

comment:1 Changed 12 years ago by

On 11/1/12 1:11 PM, Leigh Stoller wrote:

Critical daemon (tmcd) died. I restarted it ... I think they are booting up but if not ready in a little while, terminate and try again.


Nodes are still not ready, restarting the experiment.

Thank you Leigh!

comment:2 Changed 12 years ago by

Re-created sliver IG-CT-1 with the same 4-VM RSpec; again, 20 minutes after createsliver, all nodes are still not ready.

comment:3 Changed 12 years ago by

On 11/1/12 2:30 PM, Jonathon Duerig wrote:

This CreateSliver failed and should have shown up as such in the return code. Did this not happen?

I did not get an error when I created the sliver; here is the createsliver output:

$ createsliver -a ig-utah IG-CT-1 IG-CT-1-utah.rspec 
INFO:omni:Loading config file /home/lnevers/.gcf/omni_config
INFO:omni:Using control framework pg
INFO:omni:Slice expires within 1 day on 2012-11-02 16:37:46 UTC
INFO:omni:Substituting AM nickname ig-utah with URL, URN unspecified_AM_URN
INFO:omni:Substituting AM nickname ig-utah with URL, URN unspecified_AM_URN
INFO:omni:Creating sliver(s) from rspec file IG-CT-1-utah.rspec for slice
INFO:omni:Got return from CreateSliver for slice IG-CT-1 at
INFO:omni:<?xml version="1.0" ?>
INFO:omni:  <!-- Reserved resources for:
	Slice: IG-CT-1
	at AM:
	URN: unspecified_AM_URN -->
INFO:omni:  <rspec type="manifest" xmlns="" xmlns:flack="" xmlns:planetlab="" xmlns:xsi="" xsi:schemaLocation="">    

    <node client_id="VM-1" component_id="" component_manager_id="" exclusive="false" sliver_id="">    
        <sliver_type name="emulab-openvz"/>    
        <interface client_id="VM-1:if0" component_id="" mac_address="021c38df33d5" sliver_id="">
            <ip address="" netmask="" type="ipv4"/>
        </interface>
        <rs:vnode name="pcvm5-1" xmlns:rs=""/>
        <host name=""/>
        <services>
            <login authentication="ssh-keys" hostname="" port="31034" username="lnevers"/>
        </services>
    </node>
    <node client_id="VM-2" component_id="" component_manager_id="" exclusive="false" sliver_id="">    
        <sliver_type name="emulab-openvz"/>    
        <interface client_id="VM-2:if0" component_id="" mac_address="024671762dc9" sliver_id="">
            <ip address="" netmask="" type="ipv4"/>
        </interface>
        <rs:vnode name="pcvm5-2" xmlns:rs=""/>
        <host name=""/>
        <services>
            <login authentication="ssh-keys" hostname="" port="31035" username="lnevers"/>
        </services>
    </node>
    <node client_id="VM-3" component_id="" component_manager_id="" exclusive="false" sliver_id="">    
        <sliver_type name="emulab-openvz"/>    
        <interface client_id="VM-3:if0" component_id="" mac_address="02b51e1a911b" sliver_id="">
            <ip address="" netmask="" type="ipv4"/>
        </interface>
        <rs:vnode name="pcvm5-3" xmlns:rs=""/>
        <host name=""/>
        <services>
            <login authentication="ssh-keys" hostname="" port="31036" username="lnevers"/>
        </services>
    </node>
    <node client_id="VM-4" component_id="" component_manager_id="" exclusive="false" sliver_id="">    
        <sliver_type name="emulab-openvz"/>    
        <interface client_id="VM-4:if0" component_id="" mac_address="026430586edb" sliver_id="">
            <ip address="" netmask="" type="ipv4"/>
        </interface>
        <rs:vnode name="pcvm5-4" xmlns:rs=""/>
        <host name=""/>
        <services>
            <login authentication="ssh-keys" hostname="" port="31037" username="lnevers"/>
        </services>
    </node>
    <link client_id="lan0" sliver_id="" vlantag="259">    
        <component_manager name=""/>    
        <interface_ref client_id="VM-1:if0" component_id="" sliver_id=""/>    
        <interface_ref client_id="VM-2:if0" component_id="" sliver_id=""/>    
        <interface_ref client_id="VM-3:if0" component_id="" sliver_id=""/>    
        <interface_ref client_id="VM-4:if0" component_id="" sliver_id=""/>    
        <link_type name="lan"/>    
INFO:omni: ------------------------------------------------------------
INFO:omni: Completed createsliver:

  Options as run:
		aggregate: ['ig-utah']
		framework: pg

  Args: createsliver IG-CT-1 IG-CT-1-utah.rspec

  Result Summary: Got Reserved resources RSpec from utah-geniracks-net-protogeniv2 
INFO:omni: ============================================================
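As an aside, the node-to-login-port mapping buried in a manifest like the one above can be pulled out with a few lines of ElementTree. The snippet parses a trimmed, hypothetical stand-in for the manifest (namespace URIs are stripped here, as they are in the pasted output above; a real GENI manifest would need namespace-qualified tag lookups):

```python
# Sketch: list each node's client_id and SSH login port from a manifest RSpec.
# MANIFEST is a trimmed, hypothetical stand-in with namespaces stripped.
import xml.etree.ElementTree as ET

MANIFEST = """<rspec type="manifest">
  <node client_id="VM-1">
    <services><login authentication="ssh-keys" port="31034" username="lnevers"/></services>
  </node>
  <node client_id="VM-2">
    <services><login authentication="ssh-keys" port="31035" username="lnevers"/></services>
  </node>
</rspec>"""

def login_ports(manifest_xml):
    """Map each node's client_id to its SSH login port."""
    root = ET.fromstring(manifest_xml)
    return {node.get("client_id"): login.get("port")
            for node in root.findall("node")
            for login in node.findall("services/login")}

print(login_ports(MANIFEST))  # prints {'VM-1': '31034', 'VM-2': '31035'}
```

This is handy when verifying that every VM in a multi-node sliver actually received a login service before trying to SSH in.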

comment:4 Changed 12 years ago by

Resolution: fixed
Status: new → closed

Thank you Jonathon, I was able to create the 4-VM sliver without any problem; closing the ticket.

Also, capturing the email exchanges that led to the resolution of this ticket:

On 11/1/12 2:38 PM, Jonathon Duerig wrote:

Try to create your sliver again now. I will keep an eye on email. 

On 11/1/12 2:48 PM, Luisa Nevers wrote:

Deleting existing sliver now and recreating. 

On 11/1/12 2:53 PM, Leigh Stoller wrote:

Jon. Is tmcd running? It threw a sigsegv and it might have died again

On 11/1/12 2:54 PM, Luisa Nevers wrote:

Just created the sliver successfully again..... Monitoring  sliver status. 

On 11/1/12 2:57 PM, Jonathon Duerig wrote:

I restarted all of the daemons (testbed-control restart). mfrisbeed and pubsub 
also seemed to be misbehaving so this seemed best.

Luisa, this time there were no errors in the mail so you should have your status
 changed pretty soon. Ping me if it doesn't. 

On 11/1/12 2:57 PM, Luisa Nevers wrote:

Different results,  1 node is failed and 3 are ready. 

On 11/1/12 2:59 PM, Jonathon Duerig wrote:

Ah. Spoke too soon. This is a problem on the client side. Poking around. 

On 11/1/12 3:57 PM, Jonathon Duerig wrote:

Ok. I've found a bug in our code and fixed a borked container. Please delete and 
swap in your experiment again. 