Opened 9 years ago

Closed 9 years ago

#8 closed (fixed)

Creating a sliver with one VM at Utah InstaGENI and one VM at another site results in incomplete manifest

Reported by: lnevers@bbn.com
Owned by: somebody
Priority: major
Milestone:
Component: AM
Version: SPIRAL4
Keywords:
Cc:
Dependencies:

Description

This experiment was run twice with the following resources:

Run 1 = 1 VM @ InstaGENI rack + 1 VM @ GPO PG
Run 2 = 1 VM @ InstaGENI rack + 1 VM @ Utah PG

In both scenarios the createsliver operation showed login information only for the InstaGENI rack. Subsequent attempts to get sliver status failed with "resource is busy; try again later" for approximately 10 minutes. Once I was able to collect sliverstatus, neither run showed login information for the remote site (GPO PG or Utah PG). I waited up to 30 minutes and sliverstatus did not change to include the remote details.

Following are details for each of the two runs.

Run 1:

Using an RSpec that includes 2 nodes (1 @ Utah InstaGENI, 1 @ GPO PG), I was able to create a sliver without any errors. The createsliver generated an incomplete manifest RSpec, which was missing login details for the GPO node. A follow-up check of the manifest RSpec for the sliver, up to 30 minutes later, showed no node for the GPO VM.

The request RSpec and the manifest are attached at run1-*.

Run 2:

Using an RSpec that includes 2 nodes (1 @ Utah InstaGENI, 1 @ Utah PG), I was able to create a sliver without any errors. The initial sliver creation showed only InstaGENI rack login information. After approximately 10 minutes of "resource is busy" errors, I was able to get a sliver status to run, but no login information was available for the Utah PG VM.

The request RSpec and the manifest are attached at run2-*.
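For anyone reproducing this, the "resource is busy; try again later" window can be handled with a simple retry loop rather than by hand. This is only a sketch: `fetch_status` is a hypothetical callable wrapping whatever client is used to request sliverstatus, and `ResourceBusy` is a made-up exception standing in for the AM's busy reply. The 30-second interval is an arbitrary choice; the 30-minute timeout mirrors the wait described above.

```python
import time


class ResourceBusy(Exception):
    """Hypothetical error standing in for the AM's 'resource is busy' reply."""


def poll_until_ready(fetch_status, interval_s=30.0, timeout_s=1800.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Call fetch_status() until it stops raising ResourceBusy.

    Returns (status, attempts). Raises TimeoutError if another wait of
    interval_s would run past timeout_s.
    """
    deadline = clock() + timeout_s
    attempts = 0
    while True:
        attempts += 1
        try:
            return fetch_status(), attempts
        except ResourceBusy:
            # Give up rather than sleep past the deadline.
            if clock() + interval_s > deadline:
                raise TimeoutError(
                    f"still busy after {attempts} attempts") from None
            sleep(interval_s)
```

The `sleep` and `clock` parameters exist only so the loop can be exercised without real waiting.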

Attachments (4)

run1-1-insta-1-pgeni-vm.rspec (1.3 KB) - added by lnevers@bbn.com 9 years ago.
run1-instadoh-sliverstatus-boss-utah-geniracks-net-protogeni-xmlrpc-am-2-0.json (12.1 KB) - added by lnevers@bbn.com 9 years ago.
run2-1-insta-1-utahpg.rspec (1.4 KB) - added by lnevers@bbn.com 9 years ago.
run2-rack-pgutah-sliverstatus-boss-utah-geniracks-net-protogeni-xmlrpc-am-2-0.json (12.3 KB) - added by lnevers@bbn.com 9 years ago.

Change History (7)

Changed 9 years ago by lnevers@bbn.com

Changed 9 years ago by lnevers@bbn.com

Attachment: run2-1-insta-1-utahpg.rspec added

comment:1 Changed 9 years ago by lnevers@bbn.com

On 5/2/12 11:36 AM, Leigh Stoller wrote:

So, are you launching CreateSliver() at both AMs at the same time? If not, there is no way this can work since you cannot create the tunnels until the other side has been created. The idea is to launch both at the same time, and then they can handshake with each other, eventually sync up, and finish the setup.

As for login information, I do not understand yet what the problem is. Note that the AM at InstaGENI will show login details for its nodes, not any of the other nodes at other AMs. Ditto for Utah, etc.

Is the GPO running the latest stable snapshot? If not, then it will not have the login until it updates.

Lbs

I was not running the createsliver at both AMs. I will continue with createslivers at the PG Utah and InstaGENI rack AMs.

Question:

I now have 1 RSpec for the two sites connected by the GRE tunnel, shown in the attachment (http://groups.geni.net/instageni/attachment/ticket/8/run2-1-insta-1-utahpg.rspec). If I need to create one sliver at each aggregate, how do I generate an RSpec for each aggregate that defines the GRE tunnel between the two sites? Do I split up the link definition? Or do I use the same RSpec at both aggregates? (I tried this last option, which does not seem to work.)

comment:2 Changed 9 years ago by lnevers@bbn.com

On 5/2/12 5:06 PM, Jonathon Duerig wrote:

You send the same rspec to both AMs. Each one will annotate just the nodes that they manage. You are left with two manifests, each of which has different annotations...
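To see which nodes each AM would treat as its own, the request RSpec can be grouped by the `component_manager_id` attribute on each node. The sketch below assumes the GENI RSpec v3 namespace; the sample document, its client IDs, and its URNs are illustrative and are not copied from the attached RSpecs.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Assumption: the request uses the GENI RSpec v3 namespace.
RSPEC_NS = "http://www.geni.net/resources/rspec/3"

# Illustrative request only; not the contents of the attached files.
SAMPLE_REQUEST = """<rspec xmlns="http://www.geni.net/resources/rspec/3" type="request">
  <node client_id="VM-0" component_manager_id="urn:publicid:IDN+utah.geniracks.net+authority+cm"/>
  <node client_id="VM" component_manager_id="urn:publicid:IDN+emulab.net+authority+cm"/>
</rspec>"""


def nodes_by_manager(rspec_xml):
    """Map each component_manager_id to the client_ids of its nodes."""
    root = ET.fromstring(rspec_xml)
    groups = defaultdict(list)
    for node in root.iter(f"{{{RSPEC_NS}}}node"):
        groups[node.get("component_manager_id")].append(node.get("client_id"))
    return dict(groups)


if __name__ == "__main__":
    for manager, clients in nodes_by_manager(SAMPLE_REQUEST).items():
        print(manager, clients)
```

Each key in the result corresponds to one AM that should receive the full, unmodified RSpec.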

On 5/2/12 5:09 PM, Leigh Stoller wrote:

Your rspec looks good. You submit the same rspec to both sites at the same time. Each AM will process the parts that are relevant to it (as indicated by the component_manager_id tags). You get back a manifest from each AM, with the relevant parts filled in by that AM.

You said this rspec did not work? Can you tell me when you did this so that I can find it in the logs. Note that using a different slice name for each test would be very helpful; makes it a lot easier to find in the logs, which are very big.

Lbs

I think I ran the create sliver between 1:00 and 1:30 pm eastern. The slivers were created at the InstaGENI and PG Utah Aggregates, the slice was named "rack-pgutah".

I can re-run it if it helps.
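The "same RSpec, both sites, same time" advice above could be scripted roughly as follows. This is a sketch, not the tooling used in this ticket: `submit` is a hypothetical callable that performs one createsliver request against one AM and returns its manifest, and the AM identifiers passed in are whatever that callable expects.

```python
from concurrent.futures import ThreadPoolExecutor


def create_slivers_together(submit, ams, slice_name, rspec_path):
    """Send the same RSpec to every AM concurrently.

    submit(am, slice_name, rspec_path) is a hypothetical helper that issues
    one createsliver call and returns that AM's manifest.
    Returns a dict mapping each AM to the manifest it produced.
    """
    with ThreadPoolExecutor(max_workers=len(ams)) as pool:
        # Start every request before waiting on any of them, so the AMs
        # can handshake with each other while setup is in progress.
        futures = {am: pool.submit(submit, am, slice_name, rspec_path)
                   for am in ams}
        return {am: future.result() for am, future in futures.items()}
```

The result holds one manifest per AM, each annotated only for the nodes that AM manages, so the full picture comes from reading them together.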

comment:3 Changed 9 years ago by lnevers@bbn.com

Resolution: fixed
Status: new → closed

Re-ran the "1 VM @ InstaGENI rack + 1 VM @ Utah PG" experiment this morning at 09:17 EDT. The slice was named rack-pg2. Both slivers were created without problem.

It took about 15 minutes for the InstaGENI rack VM to become available. Once I had logged into each VM, it took about 5 more minutes until ping between the hosts worked. The ping also showed a high rate of packet loss.

On the pc5.utah.geniracks.net VM:

[lnevers@VM-0 ~]$ ping 192.168.3.2
<<<lines deleted>>>
--- 192.168.3.2 ping statistics ---
121 packets transmitted, 33 received, 72% packet loss, time 120000ms
rtt min/avg/max/mdev = 0.145/4.252/135.015/23.115 ms

On the pc533.emulab.net VM:

[lnevers@VM ~]$ ping 192.168.3.1
<<<lines deleted>>>
556 packets transmitted, 35 received, 93% packet loss, time 555028ms
rtt min/avg/max/mdev = 0.182/59.171/779.995/193.549 ms

But the packet loss is really not a rack issue, so I am closing this particular ticket.
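For comparing loss across reruns, the percentage can be recomputed from the ping summary line. A minimal sketch, assuming the summary wording matches the Linux ping output shown above; the function name is made up for illustration.

```python
import re

# Matches the counts in a summary such as
# "121 packets transmitted, 33 received, 72% packet loss, time 120000ms".
SUMMARY = re.compile(r"(\d+) packets transmitted, (\d+) received")


def packet_loss(summary_line):
    """Return packet loss as a percentage (float) from a ping summary line."""
    match = SUMMARY.search(summary_line)
    if match is None:
        raise ValueError("not a ping summary line")
    sent, received = int(match.group(1)), int(match.group(2))
    if sent == 0:
        raise ValueError("no packets transmitted")
    return 100.0 * (sent - received) / sent
```

Truncating the results for the two summaries above to whole numbers gives the 72% and 93% that ping reported.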