Opened 12 years ago

Closed 12 years ago

#8 closed (fixed)

Attempts to create slivers result in "unknown slice error"

Reported by: lnevers@bbn.com Owned by: somebody
Priority: critical Milestone: EG-EXP-3
Component: AM Version: SPIRAL4
Keywords: Cc:
Dependencies:

Description

As of this morning, all sliver creation attempts have been failing with an exception reporting:

Other Exception: java.lang.Exception: ERROR: Unknown slice urn urn:publicid:IDN+pgeni.gpolab.bbn.com+slice+acclne-123456 

Attachments (1)

exo-3vm-1lan.rspec (2.2 KB) - added by lnevers@bbn.com 12 years ago.


Change History (25)

comment:1 Changed 12 years ago by ibaldin@renci.org

Please report which controller/AM you're speaking to when reporting issues.

comment:2 Changed 12 years ago by lnevers@bbn.com

Sorry, working on the BBN rack https://bbn-hn.exogeni.net:11443/orca/xmlrpc

comment:3 Changed 12 years ago by ibaldin@renci.org

We restarted broker on the rack. Something got torqued. Working now

comment:4 Changed 12 years ago by lnevers@bbn.com

Ilia, I had seen this problem before. I ran into it last week but postponed documenting it, and when I tried to reproduce it the next day I was not able to. I think this may occur again with use. I will see if I can get it to happen again.

comment:5 Changed 12 years ago by lnevers@bbn.com

Just saw the error again while running AM API acceptance tests:

 Slice urn:publicid:IDN+pgeni.gpolab.bbn.com+slice+acclne-1834592 expires within 1 day(s) on 2012-05-02 00:35:05 UTC
Asked https://bbn-hn.exogeni.net:11443/orca/xmlrpc to reserve resources. No manifest Rspec returned. 
Other Exception: java.lang.Exception: ERROR: Unknown slice urn urn:publicid:IDN+pgeni.gpolab.bbn.com+slice+acclne-1834592

Also seen from the command line:

Asked https://bbn-hn.exogeni.net:11443/orca/xmlrpc to reserve resources. No manifest Rspec returned. 
Other Exception: java.lang.Exception: ERROR: Unknown slice urn urn:publicid:IDN+pgeni.gpolab.bbn.com+slice+acclne-1234 

comment:6 Changed 12 years ago by ibaldin@renci.org

Please attach the RSpec request

comment:7 Changed 12 years ago by lnevers@bbn.com

The rspec used:

<?xml version="1.0" encoding="UTF-8"?>
<rspec type="request"
xsi:schemaLocation="http://www.protogeni.net/resources/rspec/2 http://www.protogeni.net/resources/rspec/2/request.xsd"
    xmlns:flack="http://www.protogeni.net/resources/rspec/ext/flack/1"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns="http://www.protogeni.net/resources/rspec/2">
<node client_id="geni1" component_manager_id="urn:publicid:IDN+bbnvmsite+authority+cm">
 <sliver_type name="m1.small">
   <disk_image name="http://geni-images.renci.org/images/standard/debian/debian-squeeze-amd64-neuca-2g.zfilesystem.sparse.v0.2.xml" version="397c431cb9249e1f361484b08674bc3381455bb9" />
 </sliver_type>
 <interface client_id="geni1:0">
    <ip address="172.16.1.1" netmask="255.255.255.0" />
 </interface>
</node>
</rspec>

comment:8 Changed 12 years ago by ibaldin@renci.org

This request doesn't make a lot of sense as you're asking to create a node with a dataplane interface but no indication of what should be on the other end of that interface (typically a vlan). I think you're trying to do an end-run around the missing functionality to name a specific VLAN so you can do an OpenFlow test.

That said, this is a bug on our part in not rejecting it properly.
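The kind of check discussed here can be sketched as follows. This is only an illustration, not ORCA code: it assumes the request uses the RSpec v2 namespace shown in comment 7 and that links name their endpoints with `<interface_ref client_id="...">` elements. The function name `dangling_interfaces` and the `EXAMPLE` string are made up for the sketch.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the request RSpec in comment 7.
NS = {"r": "http://www.protogeni.net/resources/rspec/2"}


def dangling_interfaces(rspec_xml):
    """Return client_ids of node interfaces that no <link> references."""
    root = ET.fromstring(rspec_xml)
    declared = [i.get("client_id") for i in root.findall("r:node/r:interface", NS)]
    referenced = {
        ref.get("client_id") for ref in root.findall("r:link/r:interface_ref", NS)
    }
    return [cid for cid in declared if cid not in referenced]


# Hypothetical, trimmed-down version of the request: one node, one
# dataplane interface, and no link that uses it.
EXAMPLE = """<rspec type="request" xmlns="http://www.protogeni.net/resources/rspec/2">
  <node client_id="geni1">
    <interface client_id="geni1:0"/>
  </node>
</rspec>"""

print(dangling_interfaces(EXAMPLE))
```

A non-empty result would be a reason to reject the request with a specific message before it reaches any later processing step.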

comment:9 Changed 12 years ago by lnevers@bbn.com

True, removing the interface from the RSpec bypasses the problem. That said, agreed, we need better handling of this error scenario.

comment:10 Changed 12 years ago by ibaldin@renci.org

We are looking into it. Simply submitting an NDL-OWL equivalent of your request does not cause a problem. The issue may be in the GENI AM API adapter in ORCA.

comment:11 Changed 12 years ago by lnevers@bbn.com

Have another experiment that shows the same failure:

Result Summary: Slice urn:publicid:IDN+pgeni.gpolab.bbn.com+slice+exoa expires within 1 day(s) on 2012-05-02 21:17:21 UTC
Asked https://bbn-hn.exogeni.net:11443/orca/xmlrpc to reserve resources. No manifest Rspec returned. 
Other Exception: java.lang.Exception: ERROR: Unknown slice urn urn:publicid:IDN+pgeni.gpolab.bbn.com+slice+exoa 

The request RSpec used is attached and it defines a 4 VM ring topology.

comment:12 in reply to:  11 Changed 12 years ago by lnevers@bbn.com

Replying to lnevers@bbn.com:

The request RSpec used is attached and it defines a 4 VM ring topology.

Please ignore; it looks like I had the wrong component_id. With the component ID fixed, the 4-node ring VM slice is created without problem.

comment:13 Changed 12 years ago by ibaldin@renci.org

The unknown slice error is fixed in the code - you will get a more detailed message after we redeploy the code next week.

However I'm still bothered by the fact that somehow you were able to break the broker (it may have been a specific rspec or a particular sequence of actions). We have been unable to recreate this so far, but will work on it next week.

comment:14 Changed 12 years ago by ibaldin@renci.org

This is now ticket Orca #247 https://geni-orca.renci.org/trac/ticket/247

The issue is credential validation in Orca AM API adapter.

comment:15 Changed 12 years ago by lnevers@bbn.com

Have set up a lot of experiments this afternoon and have not seen this error.

However, the rspec (1 node with 1 interface on data plane and no link defined) is still being accepted rather than rejected.

comment:16 Changed 12 years ago by ibaldin@renci.org

Orca #247 has been fixed. Code should be deployed into ExoGENI for operations the week of June 4.

comment:17 Changed 12 years ago by lnevers@bbn.com

Using an ExoGENI version which should have the fix for ORCA #247. Ran into the "Unknown Slice" Error again:

$ ./src/omni.py createslice 5vmslice
INFO:omni:Loading config file omni_config
INFO:omni:Using control framework pgeni
INFO:omni:Created slice with Name 5vmslice, URN urn:publicid:IDN+pgeni.gpolab.bbn.com+slice+5vmslice, Expiration 2012-06-05 16:57:22+00:00
INFO:omni: ------------------------------------------------------------
INFO:omni: Completed createslice:

  Options as run:
		framework: pgeni
		native: True

  Args: createslice 5vmslice

  Result Summary: Created slice with Name 5vmslice, URN urn:publicid:IDN+pgeni.gpolab.bbn.com+slice+5vmslice, Expiration 2012-06-05 16:57:22+00:00
 
INFO:omni: ============================================================

$ ./src/omni.py -a exobbn createsliver 5vmslice --api-version 2 ./exorspec/exo-5vm-1lan.rspec 
INFO:omni:Loading config file omni_config
INFO:omni:Using control framework pgeni
INFO:omni:Slice urn:publicid:IDN+pgeni.gpolab.bbn.com+slice+5vmslice expires on 2012-06-05 16:57:22 UTC
INFO:omni:Substituting AM nickname exobbn with URL https://bbn-hn.exogeni.net:11443/orca/xmlrpc, URN unspecified_AM_URN
INFO:omni:Substituting AM nickname exobbn with URL https://bbn-hn.exogeni.net:11443/orca/xmlrpc, URN unspecified_AM_URN
INFO:omni:Creating sliver(s) from rspec file ./exorspec/exo-5vm-1lan.rspec for slice urn:publicid:IDN+pgeni.gpolab.bbn.com+slice+5vmslice
INFO:omni:Asked https://bbn-hn.exogeni.net:11443/orca/xmlrpc to reserve resources. Result:
INFO:omni:<!-- Reserved resources for:
	Slice: 5vmslice
	At AM:
	URL: https://bbn-hn.exogeni.net:11443/orca/xmlrpc
 -->
INFO:omni: ------------------------------------------------------------
INFO:omni: Completed createsliver:

  Options as run:
		aggregate: exobbn
		api_version: 2
		framework: pgeni
		native: True

  Args: createsliver 5vmslice ./exorspec/exo-5vm-1lan.rspec

  Result Summary: Slice urn:publicid:IDN+pgeni.gpolab.bbn.com+slice+5vmslice expires on 2012-06-05 16:57:22 UTC
Asked https://bbn-hn.exogeni.net:11443/orca/xmlrpc to reserve resources. No manifest Rspec returned. Other Exception: java.lang.Exception: ERROR: Unknown slice urn urn:publicid:IDN+pgeni.gpolab.bbn.com+slice+5vmslice 
INFO:omni: ============================================================
$

comment:18 Changed 12 years ago by lnevers@bbn.com

Milestone: EG-EXP-3

comment:19 Changed 12 years ago by lnevers@bbn.com

Several occurrences of this "Unknown slice urn" error took place this afternoon. This failure is seen with sliver requests to the BBN SM and to the ExoSM.

comment:20 Changed 12 years ago by lnevers@bbn.com

Capturing comments from email exchange previous to 6/6 meeting:

On 6/5/12 7:56 PM, Luisa Nevers wrote:

EG-EXP-3:

  • Ticket 8 - A fix to the "unknown slice urn" error was expected to be part of the current version, but the problem has continued to happen in testing, on Monday and Tuesday.

This problem was caused by ORCA ticket #247, which was closed on June 1. Is the fix part of the version running in the BBN rack?

On 6/6/12 10:28 AM, Ilia Baldine wrote:

Regarding ticket 8, there is some confusion. Orca ticket 247 was for handling premature closing of slivers that caused racks to run out of resources. It has been fixed and deployed.

As an aside we have also fixed the IP address assignment problem (when you do not assign an IP address to an interface, it will be created and remain unconfigured, rather than having some random IP address).

Regarding the unknown slice error - this is a problem in orca's GENI AM API adapter that relates to error reporting. Instead of this error you should have seen something more informative explaining what has failed (usually it is the embedding code refusing the request). Not having available resources is a kind of embedding error and for that reason ticket 247 was lumped into this. I thought I had fixed the error reporting, but I guess not. I will have to look at it when I get back from vacation.
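The error-reporting behaviour described in this email (surface the underlying reason rather than a generic message) can be illustrated with a small sketch. This is not the ORCA adapter, which is Java; every name here (`EmbeddingError`, `AdapterError`, `embed`, `create_sliver`) is hypothetical and exists only to show the pattern of keeping the original cause attached to the error returned to the client.

```python
class EmbeddingError(Exception):
    """Hypothetical: raised when a request cannot be mapped onto resources."""


class AdapterError(Exception):
    """Hypothetical: the error an API adapter returns to the client."""


def embed(request_name):
    # Hypothetical stand-in for the embedding step; it always refuses.
    raise EmbeddingError(f"cannot embed {request_name}: not enough resources")


def create_sliver(slice_urn, request_name):
    try:
        embed(request_name)
    except EmbeddingError as exc:
        # Keep the original reason in the message and chain the cause,
        # instead of replacing it with an unrelated generic error.
        raise AdapterError(f"createsliver failed for {slice_urn}: {exc}") from exc


try:
    create_sliver("urn:publicid:IDN+example+slice+demo", "5vm-1lan")
except AdapterError as err:
    print(err)
    print(type(err.__cause__).__name__)
```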

comment:21 in reply to:  20 Changed 12 years ago by lnevers@bbn.com

On 6/6/12 10:28 AM, Ilia Baldine wrote:

Regarding the unknown slice error - this is a problem in orca's GENI AM API adapter that relates to error reporting. Instead of this error you should have seen something more informative explaining what has failed (usually it is the embedding code refusing the request). Not having available resources is a kind of embedding error and for that reason ticket 247 was lumped into this. I thought I had fixed the error reporting, but I guess not. I will have to look at it when I get back from vacation.

I have run some tests to try to characterize this "not-enough-resources" problem and I have found that the topology matters. Here are the slivers created and the results:

 1 VM - OK
 2 VMs - OK
 3 VMs linear topology - OK
 3 VMs on 1 lan (grid) - FAIL (ERROR: Unknown slice urn)
 4 VMs linear topology - OK
 4 VMs on 1 lan (grid) - FAIL (ERROR: Unknown slice urn)
 5 VMs linear topology - OK
 5 VMs on 1 lan (grid) - FAIL (ERROR: Unknown slice urn)

comment:22 Changed 12 years ago by ibaldin@renci.org

There are two problems here:

1) RSpec-NDL converter does not properly handle the indicated cases (more than two VMs on a lan or just one VM on a lan). This capability is available through Flukes only for now.

2) There is a problem with error reporting for this case.
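As a rough way to spot requests that fall into the cases listed in 1), the number of endpoints on each link can be counted before submitting. The sketch below is an assumption-laden illustration, not part of the RSpec-NDL converter: it assumes RSpec v2 requests in which a LAN is written as a single `<link>` with one `<interface_ref>` per attached VM, and the helper names (`link_endpoint_counts`, `links_needing_attention`, `lan_request`) are invented for the example.

```python
import xml.etree.ElementTree as ET

# Namespace of the request RSpec shown earlier in this ticket.
NS = {"r": "http://www.protogeni.net/resources/rspec/2"}


def link_endpoint_counts(rspec_xml):
    """Map each <link> client_id to the number of <interface_ref> children."""
    root = ET.fromstring(rspec_xml)
    return {
        link.get("client_id"): len(link.findall("r:interface_ref", NS))
        for link in root.findall("r:link", NS)
    }


def links_needing_attention(rspec_xml):
    """Return links whose endpoint count is anything other than exactly two."""
    counts = link_endpoint_counts(rspec_xml)
    return sorted(name for name, n in counts.items() if n != 2)


def lan_request(n_vms):
    """Build a hypothetical request with n_vms nodes sharing one link, lan0."""
    nodes = "".join(
        f'<node client_id="vm{i}"><interface client_id="vm{i}:0"/></node>'
        for i in range(n_vms)
    )
    refs = "".join(f'<interface_ref client_id="vm{i}:0"/>' for i in range(n_vms))
    return (
        f'<rspec type="request" xmlns="{NS["r"]}">'
        f'{nodes}<link client_id="lan0">{refs}</link></rspec>'
    )


for n in (2, 3, 5):
    print(n, links_needing_attention(lan_request(n)))
```

Under these assumptions, the two-VM request reports no links, while the three- and five-VM LAN requests report `lan0`, which lines up with the OK/FAIL pattern in comment 21.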

Changed 12 years ago by lnevers@bbn.com

Attachment: exo-3vm-1lan.rspec added

comment:23 Changed 12 years ago by lnevers@bbn.com

Software update includes changes for this issue, so I re-ran the 5 VM on 1 lan scenario.

The sliver named "5vm" was created without problem, but the geni_status has been "configuring" for the past hour.

comment:24 Changed 12 years ago by lnevers@bbn.com

Resolution: fixed
Status: new → closed

This problem has not been seen in the past two weeks of testing. This problem is deemed solved. Closing ticket.
