Opened 11 years ago

Closed 11 years ago

#1211 closed (fixed)

InstaGENI OpenFlow stops working over stitched link

Reported by: lnevers@bbn.com
Owned by: somebody
Priority: major
Milestone:
Component: GPO
Version: SPIRAL6
Keywords: acceptance
Cc: nick.bastin@gmail.com
Dependencies:

Description

Numerous experiments that use the emulab OpenFlow extension over stitched links worked until Tuesday, April 22nd, but they no longer do. Sliver creation succeeds, but there are no attempts from the rack switches to connect to the user-specified OpenFlow controller.

For example, in a slice that connects two InstaGENI racks, the sliver has VLAN 3711, but neither of the IG switches attempted to connect to the specified OF controller. One of the IG switches shows:

 HP-E5406zl# show openflow 3711

 Openflow Configuration - VLAN 3711

  Openflow state [Disabled] : Enabled
  Controller pseudo-URL : tcp:192.1.249.185:33020
  Listener pseudo-URL : ptcp:5018
  Openflow software rate limit [100] : 100
  Openflow connecting max backoff [60] : 60
  Openflow hardware acceleration [Enabled] : Enabled
  Openflow hardware rate limit [0] : 0
  Openflow hardware stats max refresh rate [0] : 0
  Openflow fail-secure [Disabled] : Enabled
  Second Controller pseudo-URL :
  Third Controller pseudo-URL :

 Openflow Status - VLAN 3711

  Switch MAC address : 84:34:97:C6:C9:00
  Openflow datapath ID : 0E7F843497C6C900
  Controller connection status (1/1) : disconnected ; state: CONNECTING
    Last error: Operation timed out ; timeout: 33/60 ; duration: 694
  Listening connection status : listening (1 connections)
  SW Dpif n_flows: 0 ; cur_capacity:1 ; n_lost: 0
          n_hit: 0 ; n_missed: 23 ; n_frags: 0
  Number of hardware rules: 0

I verified that I can connect from the racks back to the controller port (192.1.249.185 port 33019). Also, numerous VLANs are listed for OpenFlow on the GPO IG switch, all from my testing except for those using 10.3.1.7:6633.
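A reachability check like the one described above can be sketched in Python. This is a minimal sketch, not the procedure actually used for this ticket: the helper name `can_connect` is hypothetical, and the address and port in the usage comment are simply the ones quoted in this ticket. It only shows that the host running it can open a TCP connection to the controller; it says nothing about whether the switch's own control plane has a path out of the rack.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port completes within timeout."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False


# Example usage (address and port as quoted in the ticket; substitute your own):
#   can_connect("192.1.249.185", 33019)
```

Running this from a rack node and getting True while the switch still reports "Operation timed out" would suggest that the switch's path to the controller, not the controller itself, is the place to look.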

According to Niky, there is an existing issue where OF VLANs are not cleaned up on the IG switches (I am not sure of the current state of that cleanup issue). Below is the list of VLANs on the GPO IG switch; I am not sure whether it plays a role in the current state of the switch:

HP-E5406zl# show openflow

 Openflow Configuration

  Openflow aggregate VLANs [Disabled] :
  Openflow aggregate management VlanId [0] : 0
  Openflow second aggregate management VlanId [0] : 0
  Openflow aggregate configuration VlanId [0] : 0

  VID  State HW  Active controller Pseudo-URL Conn
  ---- ----- --- -------------------------------------------------- ----
  263  On    On  tcp:192.1.249.135:1718 No
  1750 On    On  tcp:10.3.1.7:6633 Yes
  1755 On    On  tcp:10.3.1.7:6633 Yes
  1756 On    On  tcp:10.3.1.7:6633 Yes
  1757 On    On  tcp:10.3.1.7:6633 Yes
  1758 On    On  tcp:10.3.1.7:6633 Yes
  1759 On    On  tcp:10.3.1.7:6633 Yes
  3706 Off   On  tcp:192.1.249.185:33020 No
  3707 Off   On  tcp:192.1.249.185:33020 No
  3708 Off   On  tcp:130.127.88.98:9000 No
  3709 Off   On  tcp:192.1.249.185:33019 No
  3710 Off   On  tcp:130.127.88.98:33020 No
  3711 On    On  tcp:192.1.249.185:33020 No
  3713 Off   On  tcp:192.1.249.185:33020 No
  3714 On    On  tcp:192.1.249.135:1716 No
  3715 Off   On  tcp:192.1.249.185:3015 No
  3716 Off   On  tcp:mallorea.gpolab.bbn.com:33020 No
  3717 Off   On  tcp:130.127.88.98:33020 No
  3718 Off   On  tcp:192.1.249.185:33020 No
  3719 Off   On  tcp:192.1.249.185:33019 No
  3721 On    On  tcp:130.127.88.98:9000 No
  3722 Off   On  tcp:130.127.88.98:9000 No
  3723 Off   On  tcp:192.1.249.185:33019 No
  3724 Off   On  tcp:130.127.88.98:9000 No
  3725 Off   On  tcp:192.1.249.185:3015 No
  3726 Off   On  tcp:192.1.249.185:33020 No
  3727 Off   On  tcp:192.1.249.185:33020 No
  3729 Off   On  tcp:192.1.249.185:33020 No
  3730 Off   On  tcp:130.127.88.98:9000 No
  3731 Off   On  tcp:192.1.249.185:33019 No
  3732 On    On  tcp:130.127.88.98:9000 No
  3746 On    On  tcp:192.1.249.135:1716 No
  3747 Off   On  tcp:192.1.249.185:33019 No
  3749 On    On  tcp:192.1.249.185:33019 No 
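To pick out the entries of interest from a listing like the one above (OpenFlow state On, but no controller connection), the table can be parsed with a few lines of Python. This is a rough sketch that assumes the five-column layout shown in this ticket; the names `OpenflowVlan`, `parse_show_openflow`, and `enabled_but_disconnected` are hypothetical, and the sample text is a small excerpt of the output pasted above.

```python
from typing import List, NamedTuple


class OpenflowVlan(NamedTuple):
    vid: int
    state: str
    hw: str
    controller: str
    connected: bool


def parse_show_openflow(text: str) -> List[OpenflowVlan]:
    """Parse the per-VLAN rows of a `show openflow` summary table."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have exactly five fields and start with a numeric VLAN ID;
        # this skips the header, the dashed rule, and the configuration lines.
        if len(fields) != 5 or not fields[0].isdigit():
            continue
        vid, state, hw, controller, conn = fields
        rows.append(OpenflowVlan(int(vid), state, hw, controller, conn == "Yes"))
    return rows


def enabled_but_disconnected(rows: List[OpenflowVlan]) -> List[OpenflowVlan]:
    """Return VLANs whose OpenFlow state is On but that have no controller connection."""
    return [r for r in rows if r.state == "On" and not r.connected]


SAMPLE = """\
  VID  State HW  Active controller Pseudo-URL Conn
  ---- ----- --- -------------------------------------------------- ----
  263  On    On  tcp:192.1.249.135:1718 No
  1750 On    On  tcp:10.3.1.7:6633 Yes
  3706 Off   On  tcp:192.1.249.185:33020 No
  3711 On    On  tcp:192.1.249.185:33020 No
"""

for vlan in enabled_but_disconnected(parse_show_openflow(SAMPLE)):
    print(vlan.vid, vlan.controller)
```

On the excerpt, this reports VLANs 263 and 3711; the rows with state Off are skipped because they look like leftovers from earlier slivers rather than active ones.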

Change History (1)

comment:1 Changed 11 years ago by lnevers@bbn.com

Resolution: fixed
Status: new → closed

It turns out this problem was due to the FlowVisor VM:

On 4/25/14, 10:34 AM, Nicholas Bastin wrote:

To reiterate from this thread, and from the threads that enabled this functionality in the first place: There is no configuration missing on the *switch* to make this happen.

The switches in the IG racks do not have internet connectivity over their control plane (as intended and a security feature). However, unfortunately, to enable the requested functionality (direct experimenter control of openflow vlans), we need to allow the switch to contact the internet (because this is where the experimenter controller lives, as opposed to FlowVisor, which lives on a private network in the rack). To do this, we allow the FlowVisor VM on the rack to act as a connection-tracking NAT gateway for outgoing connections from the rack switch.

This configuration *was* enabled on the GPO rack, but it was not persistent (as it was a dev-rack feature for testing purposes) so the lab move caused it to be disabled. I simply had to re-enable this feature on the GPO rack. There is nothing on the switch that has anything to do with this "problem", as it was just something that failed to come back up properly on the flowvisor VM after the lab was moved.

-- Nick

Closing ticket.
