Opened 6 years ago

Closed 6 years ago

#1059 closed (invalid)

No VLANs available on link (PCE_CREATE_FAILED)

Reported by: lnevers@bbn.com
Owned by: xyang@maxgigapop.net
Priority: major
Milestone:
Component: I2AM
Version: SPIRAL5
Keywords: Network Stitching
Cc: ckotil@grnoc.iu.edu, lnevers@bbn.com
Dependencies:

Description

While creating a sliver (lnstitchc) between PG Utah and PG KY, an error reported that no VLANs are available. This problem has been seen in the past and was explained as an SCS issue with randomizing VLAN tags. Reporting it in case the issue was not fully resolved by the last fix. Here is the failure reported:

07:26:58 INFO     stitch.Aggregate: Writing to '/tmp/lnstitchb-createsliver-request-11-ion-internet2-edu.xml'
07:26:58 INFO     stitch.Aggregate: 
	Stitcher doing createsliver at http://geni-am.net.internet2.edu:12346
07:27:45 INFO     stitch.Aggregate: DCN AM <Aggregate urn:publicid:IDN+ion.internet2.edu+authority+cm>: must wait for status ready....
07:27:45 INFO     stitch.Aggregate: Pause to let circuit become ready...
07:28:27 INFO     stitch.Aggregate: Pause to let circuit become ready...
07:29:05 WARNING  stitch.Aggregate: sliverstatus 21421 is (still) failed at 
<Aggregate urn:publicid:IDN+ion.internet2.edu+authority+cm>. Delete and retry.
07:29:05 WARNING  stitch.Aggregate:   Status had error message: VLAN PCE(PCE_CREATE_FAILED): 
'There are no VLANs available on link ion.internet2.edu:rtr.atla:ge-10/3/2:protogeni  
on reservation ion.internet2.edu-21421 in VLAN PCE'
07:29:11 INFO     stitch.launcher: Will put <Aggregate urn:publicid:IDN+ion.internet2.edu+authority+cm> 
back in the pool to allocate. 
Got Sliver status for circuit 21421 was (still): failed: VLAN PCE(PCE_CREATE_FAILED): 
'There are no VLANs available on link ion.internet2.edu:rtr.atla:ge-10/3/2:protogeni 
 on reservation ion.internet2.edu-21421 in VLAN PCE'


Attachments (2)

log-lnstitcha.txt (10.7 KB) - added by lnevers@bbn.com 6 years ago.
log-lnstitchb.txt (12.8 KB) - added by lnevers@bbn.com 6 years ago.


Change History (12)

comment:1 Changed 6 years ago by lnevers@bbn.com

Cc: ckotil@grnoc.iu.edu added; c removed

comment:2 Changed 6 years ago by ckotil@grnoc.iu.edu

Owner: changed from xyang@maxgigapop.net to ckotil@grnoc.iu.edu
Status: changed from new to assigned

In the OSCARS UI I was able to recreate (clone) this circuit without any problem. The cloned circuit went to active. We have seen this PCE error appear sporadically. I actually have a new release of the OSCARS API and coordinator packages in hand that is supposed to resolve this issue. The upgrade is scheduled for Friday June 21 at 1200 UTC.

Since the error appears to be random, and the VLAN IS actually available, the test should work if you try it again. According to the OSCARS developers, this issue should be resolved for good after the upgrade.

comment:3 Changed 6 years ago by ckotil@grnoc.iu.edu

Cc: lnevers@bbn.com added

Luisa, Oscars has been updated. As you go through your testing, please let me know if this error resurfaces.

Thanks, --Chad

comment:4 Changed 6 years ago by lnevers@bbn.com

Will re-run the test that showed this problem now.

comment:5 Changed 6 years ago by lnevers@bbn.com

Just saw the error again:

10:48:03 WARNING  stitch.Aggregate: sliverstatus 21681 is (still) failed at <Aggregate urn:publicid:IDN+ion.internet2.edu+authority+cm>. Delete and retry.
10:48:03 WARNING  stitch.Aggregate:   Status had error message: VLAN PCE(PCE_CREATE_FAILED): 'There are no VLANs available on link ion.internet2.edu:rtr.atla:ge-10/3/2:protogeni  on reservation ion.internet2.edu-21681 in VLAN PCE'
10:48:09 INFO     stitch.launcher: Will put <Aggregate urn:publicid:IDN+ion.internet2.edu+authority+cm> back in the pool to allocate. Got Sliver status for circuit 21681 was (still): failed: VLAN PCE(PCE_CREATE_FAILED): 'There are no VLANs available on link ion.internet2.edu:rtr.atla:ge-10/3/2:protogeni  on reservation ion.internet2.edu-21681 in VLAN PCE'

comment:6 Changed 6 years ago by ckotil@grnoc.iu.edu

What happened here is that the circuit ion.internet2.edu-21671 was being configured, when your client attempted to create ion.internet2.edu-21681 exactly 1 minute later. This caused Oscars to throw the PCE error because the vlan was already in use by ion.internet2.edu-21671. I'm not sure how you want to proceed, but this doesn't appear to be an issue with Oscars, rather an issue with the client attempting to submit too quickly to oscars.
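
For illustration only, one way a client could avoid resubmitting too quickly is to wait with increasing, jittered delays after each VLAN PCE failure. This is a minimal sketch, not the stitcher's actual retry logic: `VlanUnavailable`, `retry_with_backoff`, and the 60-second base delay are all hypothetical names and values chosen for the example.

```python
import random
import time


class VlanUnavailable(Exception):
    """Hypothetical error standing in for a PCE_CREATE_FAILED status."""


def retry_with_backoff(create, attempts=5, base_delay=60.0,
                       sleep=time.sleep, rng=random.random):
    """Call create() until it succeeds or the attempts run out.

    create is any zero-argument callable that raises VlanUnavailable
    when the circuit fails for lack of a VLAN (an assumption of this
    sketch). The last failure is re-raised to the caller.
    """
    for attempt in range(attempts):
        try:
            return create()
        except VlanUnavailable:
            if attempt == attempts - 1:
                raise
            # Double the wait on every failure and scale it by a random
            # factor in [0.5, 1.5) so that two clients that collided once
            # do not retry in lockstep and collide again.
            delay = base_delay * (2 ** attempt) * (0.5 + rng())
            sleep(delay)
```

The `sleep` and `rng` parameters exist only so the delays can be replaced in a test; a real client would leave them at their defaults.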

comment:7 Changed 6 years ago by lnevers@bbn.com

Looking at the logs at the time, there were two slivers being set up concurrently:

  • Sliver "lnstitcha" requesting a sliver from GPO IG to Utah PG (circuit 21681)
  • Sliver "lnstitchb" requesting a sliver from GPO IG to Utah PG (circuit 21671)

The VLAN PCE failure for "lnstitcha" attempting to get circuit 21681 is reported as failing because circuit 21671 was being configured for "lnstitchb".

I am attaching the logs for the two sliver creations.

Changed 6 years ago by lnevers@bbn.com

Attachment: log-lnstitcha.txt added

Changed 6 years ago by lnevers@bbn.com

Attachment: log-lnstitchb.txt added

comment:8 Changed 6 years ago by ckotil@grnoc.iu.edu

Owner: changed from ckotil@grnoc.iu.edu to xyang@maxgigapop.net
Status: changed from assigned to new

comment:9 Changed 6 years ago by lnevers@bbn.com

Type: changed from task to defect

Incorrectly categorized as a task; changing the type to defect.

comment:10 Changed 6 years ago by xyang@maxgigapop.net

Resolution: invalid
Status: changed from new to closed

This is expected behavior, not a bug. VLAN contention is inevitable until we have stitching negotiation. Randomization only reduces the chance of contention; it cannot eliminate it.
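
To illustrate why randomization reduces contention without eliminating it, here is a rough estimate that assumes each concurrent request independently picks one tag uniformly at random from the same pool. That selection model and the function name `contention_probability` are assumptions made for this sketch, not a description of how the SCS actually chooses tags.

```python
def contention_probability(num_tags, num_requests):
    """Estimate the chance that at least two requests pick the same tag.

    Assumes every request draws independently and uniformly from
    num_tags available VLAN tags (a simplifying assumption).
    """
    if num_tags <= 0:
        raise ValueError("num_tags must be positive")
    if num_requests > num_tags:
        # More requests than tags: a collision is certain.
        return 1.0
    p_all_distinct = 1.0
    for i in range(num_requests):
        # The i-th request must avoid the i tags already taken.
        p_all_distinct *= (num_tags - i) / num_tags
    return 1.0 - p_all_distinct
```

Under these assumptions, a pool of 10 tags with 2 concurrent requests gives an estimate of about 0.1. A larger pool lowers the estimate, but it reaches zero only when there is a single request.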
