Version 1 (modified by Josh Smift, 7 years ago)

Following OpenFlow control paths

It can be tricky to follow the control path between an OpenFlow switch and its eventual controller, especially if that path goes through one or more FlowVisors, and, when something goes wrong, to figure out exactly which part of the path is at fault. Here are some tips on how to do that, using a scenario like "why isn't traffic from this host reaching its destination?" as a typical example.

In general, there are two types of paths that you'll need to explore. One is the actual traffic path from the host to the destination, which will probably go through multiple switches/datapaths; call these "hops" for now. The other is the control path at each OpenFlow-controlled hop: the OpenFlow control traffic from the switch may pass through multiple intermediate controllers (e.g. FlowVisors) before it finally reaches the controller that actually controls the host's traffic, and each hop may have a totally different control path. Following the traffic path usually isn't hard; among other things, you can often ask the switch whether it's seen the MAC address you're looking for, and if so, on which port. So the rest of this page focuses on following control paths.

You can try to follow either of those paths in either order, and you'll probably eventually need to do some of each, but it's probably better to start with the control path, i.e. at each hop, figure out what controls the traffic for the host at that hop, and then figure out where the next hop will be. Among other things, the controller can of course influence what the next hop even is, so you may not be able to trace a traffic path through all the hops before you start digging into the control path for each hop. Sometimes, you may be pretty sure that you know what the traffic path will look like, so you can sketch it out quickly and see if anything jumps out at you, like "oh, right, traffic from this host to the Internet will eventually have to go through poblano, which is down", and that can save you time digging into earlier control paths. But if you need to take a methodical approach, start with the hop closest to the host, trace the control path for that hop, find the next hop, and repeat.

So, the first thing you'll usually want to do is find the first hop from that host. Figure out which interface on the host it's using, e.g. with ifconfig -- in the GPO Lab, it's generally a dataplane interface like eth1, rather than the control interface like eth0. Different nodes use different numbers for their interfaces, so make sure to check if you're not sure. Then check what that interface on the host is connected to.
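If ifconfig isn't handy, the interface names and MAC addresses on a Linux host can also be read from sysfs. The following is a minimal sketch, assuming a Linux host that exposes /sys/class/net; the function name list_interfaces is made up for this example. The MAC address it reports is the one you'd then look for on the first-hop switch.

```python
from pathlib import Path


def list_interfaces(sys_net="/sys/class/net"):
    """Return {interface name: MAC address} as exposed by Linux sysfs.

    Returns an empty dict if the directory doesn't exist (e.g. not Linux).
    """
    base = Path(sys_net)
    if not base.is_dir():
        return {}
    result = {}
    for entry in sorted(base.iterdir()):
        addr_file = entry / "address"
        # Skip entries that aren't interfaces (no "address" file).
        if addr_file.is_file():
            result[entry.name] = addr_file.read_text().strip()
    return result


for name, mac in list_interfaces().items():
    print(f"{name}\t{mac}")
```

This only tells you which interfaces exist and what their MAC addresses are; you still need to check cabling or switch tables to see what each one is connected to.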

Once you've found the hop, to follow its control path, start by looking at the switch's configuration to confirm that the relevant port (and/or VLAN) is OpenFlow-controlled, and what its DPID is. Then, figure out who controls that datapath (the details of which will vary from switch to switch).

If that controller is a FlowVisor, you need to figure out where it's going to send control traffic for your host, e.g. with fvctl. In general, you're trying to find flowspace rules that match traffic from this host (view the flowspace with listFlowSpace). A matching rule will point to a FlowVisor slice, and the slice will have a controller (find that with getSliceInfo), which shows you the next place to look on the control path. If the flowspace is small enough, you can go through each rule, and figure out if it applies to the traffic from this host, until you find one that does.
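The rule-to-slice-to-controller lookup described above can be sketched as follows. This is only a model for reasoning, not a parser for fvctl output: it assumes you've transcribed the relevant rules and slice controllers by hand, and the rule layout, the "any" DPID wildcard, the DPID, the slice names, and the controller addresses are all hypothetical. It also only handles exact-value matches, not masks or prefixes.

```python
def find_controller(flowspace, slice_controllers, dpid, packet):
    """Return (slice name, controller) for the first rule matching the packet.

    flowspace: list of dicts with "dpid", "match" and "slice" keys.
    slice_controllers: dict mapping slice name -> controller address.
    Returns None if no rule matches.
    """
    for rule in flowspace:
        # In this sketch, "any" means the rule applies to every datapath.
        if rule["dpid"] not in ("any", dpid):
            continue
        # Every field named in the rule must equal the packet's value.
        if all(packet.get(f) == v for f, v in rule["match"].items()):
            slice_name = rule["slice"]
            return slice_name, slice_controllers[slice_name]
    return None


# Hypothetical data, transcribed by hand from listFlowSpace / getSliceInfo.
flowspace = [
    {"dpid": "00:00:00:00:00:00:00:01", "match": {"in_port": 7}, "slice": "slice-b"},
    {"dpid": "any", "match": {"dl_src": "00:00:00:00:00:0a"}, "slice": "slice-a"},
]
slice_controllers = {
    "slice-a": "controller-a.example.net:6633",
    "slice-b": "controller-b.example.net:6633",
}
packet = {"in_port": 3, "dl_src": "00:00:00:00:00:0a"}

print(find_controller(flowspace, slice_controllers, "00:00:00:00:00:00:00:01", packet))
```

Here the first rule doesn't apply (wrong port), so the second rule sends the host's control traffic to slice-a's controller, which is the next place to look.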

Once you get down to the controller at the bottom of this path, figure out what it does. If it's just a learning-switch controller, "what it does" is generally "flood unknown packets with a packet-out, learn MAC addresses, add flows for known MAC addresses with a flowmod". You can use tcpdump on the controller host to see if the packet-out and flowmod operations are happening as expected. Wireshark has an OpenFlow plug-in, although if the control traffic uses a TCP port other than 6633, you'll need to tell Wireshark explicitly to interpret traffic on that port as OFP. (Select an example packet, then choose Analyze -> Decode As from the menu, then select "OFP"; do this for both source and destination.) If the controller is something else, you'll need to learn more about it, probably by talking to whoever's running it.
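To make the learning-switch behavior concrete, here is a toy model of the decision such a controller makes for each packet-in. It isn't taken from any particular controller; the class name, method name, and message dictionaries are invented for illustration, and a real controller would also deal with timeouts, buffers, and multiple datapaths.

```python
class LearningSwitch:
    """Toy model of a learning-switch controller for a single datapath."""

    def __init__(self):
        self.mac_to_port = {}  # learned MAC address -> switch port

    def handle_packet_in(self, in_port, dl_src, dl_dst):
        """Return the control messages sent in response to one packet-in."""
        # Learn (or refresh) which port the source MAC lives on.
        self.mac_to_port[dl_src] = in_port

        out_port = self.mac_to_port.get(dl_dst)
        if out_port is None:
            # Unknown destination: flood this one packet, install nothing.
            return [("packet_out", {"in_port": in_port, "output": "FLOOD"})]

        # Known destination: install a flow, then forward the packet itself.
        return [
            ("flow_mod", {"dl_dst": dl_dst, "output": out_port}),
            ("packet_out", {"in_port": in_port, "output": out_port}),
        ]


sw = LearningSwitch()
print(sw.handle_packet_in(1, "00:00:00:00:00:0a", "00:00:00:00:00:0b"))
print(sw.handle_packet_in(2, "00:00:00:00:00:0b", "00:00:00:00:00:0a"))
```

With a controller like this, you'd expect a capture to show only a flooding packet-out for the first packet between two hosts, and flowmods appearing once the destination MAC has been learned.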

Once you've confirmed that the controller is receiving and sending OpenFlow control traffic successfully, you can move on to the next hop, and repeat this process.
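If Wireshark isn't available, one rough way to check which OpenFlow messages are flowing is to decode the fixed 8-byte header at the start of each message (version, type, length, transaction ID). The sketch below assumes OpenFlow 1.0 type numbers and assumes you've already extracted a reassembled TCP payload as bytes by some other means; the function name split_messages is made up for this example.

```python
import struct

# OpenFlow header: version (1 byte), type (1 byte), length (2), xid (4).
OFP_HEADER = struct.Struct("!BBHI")

# A few OpenFlow 1.0 message types relevant to this kind of debugging.
OFPT_NAMES = {
    0: "HELLO",
    1: "ERROR",
    2: "ECHO_REQUEST",
    3: "ECHO_REPLY",
    10: "PACKET_IN",
    13: "PACKET_OUT",
    14: "FLOW_MOD",
}


def split_messages(payload):
    """Return [(type name, length, xid)] for each complete message in payload."""
    messages = []
    offset = 0
    while offset + OFP_HEADER.size <= len(payload):
        _version, msg_type, length, xid = OFP_HEADER.unpack_from(payload, offset)
        # Stop on a bogus length or a message that is cut off.
        if length < OFP_HEADER.size or offset + length > len(payload):
            break
        name = OFPT_NAMES.get(msg_type, f"type {msg_type}")
        messages.append((name, length, xid))
        offset += length
    return messages


# Two hand-built messages: a PACKET_IN with 4 bytes of body, then a bare FLOW_MOD header.
sample = OFP_HEADER.pack(1, 10, 12, 7) + b"\x00" * 4 + OFP_HEADER.pack(1, 14, 8, 8)
print(split_messages(sample))
```

Seeing PACKET_IN in one direction and PACKET_OUT or FLOW_MOD in the other is a reasonable sign that the control path is working; ERROR messages coming back deserve a closer look.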

If the controller isn't receiving the OpenFlow control traffic that you expect, and there was a FlowVisor between it and the switch, go back and check the flowspace on the FlowVisor to make sure that traffic from this host is actually getting routed to this controller. A common reason why it might not be is that multiple rules match the traffic at different layers, e.g. a rule that matches the port the host is on, a rule that matches the MAC source or destination, and a rule that matches the IP source or destination, so the traffic may be handed to a different slice than the one you had in mind. You can also try running tcpdump on the FlowVisor, which can sometimes reveal what controller it's actually sending the traffic to -- but if you can't figure out why it's doing that, this will only help you so much.
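When you suspect overlapping rules, it can help to list every rule that matches the host's traffic rather than stopping at the first one. The sketch below does that over hand-transcribed rules. It assumes each rule carries a numeric priority and that the highest-priority match wins; treat that as an assumption to verify against your FlowVisor. The rule layout, priorities, and slice names are hypothetical.

```python
def matching_rules(rules, packet):
    """Return every rule whose match fields all equal the packet's values,
    sorted with the highest priority first."""
    hits = [
        r for r in rules
        if all(packet.get(f) == v for f, v in r["match"].items())
    ]
    return sorted(hits, key=lambda r: r["priority"], reverse=True)


# Hypothetical rules matching the same host at three different layers.
rules = [
    {"priority": 100, "match": {"in_port": 3}, "slice": "slice-a"},
    {"priority": 200, "match": {"dl_src": "00:00:00:00:00:0a"}, "slice": "slice-b"},
    {"priority": 50, "match": {"nw_src": "10.42.69.10"}, "slice": "slice-c"},
]
packet = {"in_port": 3, "dl_src": "00:00:00:00:00:0a", "nw_src": "10.42.69.10"}

slices = [r["slice"] for r in matching_rules(rules, packet)]
if len(set(slices)) > 1:
    print("overlapping rules, highest priority first:", slices)
```

In this made-up example, someone expecting the port-based rule to apply would look at slice-a's controller, while the MAC-based rule with the higher priority points at slice-b.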

If you've done that, and still can't figure out why the controller isn't receiving the traffic you expect, here are some other possibilities:

  • A network firewall, or iptables on the controller host, could be blocking the traffic.
  • There could be a typo in something that points to the controller -- double-check that everything really does point where you think it does.
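A quick way to test for the firewall case is to try opening a TCP connection to the controller's OpenFlow port from the machine that's supposed to connect to it (the FlowVisor host, for instance). This is a minimal sketch using only the Python standard library; the function name can_connect and the host name in the usage comment are placeholders.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name lookup failures.
        return False


# Example (placeholder host name):
#   can_connect("controller.example.net", 6633)
```

A True result only shows that the TCP path is open from where you ran the check; it doesn't prove that an OpenFlow session is being established. A False result from the FlowVisor host, combined with True when run on the controller host itself, points toward a firewall or iptables rule in between.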

If the controller is receiving the traffic, and sending a reply, but the reply doesn't seem to be having the effect you expect, one possible cause is a FlowVisor between the controller and the switch. As control traffic comes from the switch to the controller, FlowVisor uses its flowspace table to figure out which upstream controller to send it to; but when the controller sends control traffic back to the switch, FlowVisor also examines that traffic, and will block anything that it doesn't think is part of that controller's flowspace. For example, if the flowspace for a slice includes only traffic for which nw_src and nw_dst are both in 10.42.69.0/24, and the controller sends back a flowmod saying to forward traffic from 10.42.69.10 to 10.42.47.1, FlowVisor will block that, because it doesn't think that slice is allowed to control that traffic. The controller should get an error when this happens, and there should be logs on the FlowVisor about this too.
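The example above boils down to a containment check, which you can reproduce by hand when comparing a controller's flowmods against a slice's flowspace. This sketch uses Python's ipaddress module and models only the nw_src/nw_dst part of the check as described here; it's a simplification, not FlowVisor's actual logic, and the function name flowmod_allowed is made up.

```python
import ipaddress

# The slice's flowspace from the example: nw_src and nw_dst both in this network.
SLICE_NET = ipaddress.ip_network("10.42.69.0/24")


def flowmod_allowed(nw_src, nw_dst, allowed=SLICE_NET):
    """Return True if both addresses fall inside the slice's network."""
    return (
        ipaddress.ip_address(nw_src) in allowed
        and ipaddress.ip_address(nw_dst) in allowed
    )


print(flowmod_allowed("10.42.69.10", "10.42.69.20"))  # both inside the /24
print(flowmod_allowed("10.42.69.10", "10.42.47.1"))   # destination is outside
```

The second call returns False because 10.42.47.1 isn't in 10.42.69.0/24, which is the situation where you'd expect the controller to see an error and the FlowVisor to log the rejection.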