Changes between Initial Version and Version 1 of GEC17Agenda/AdvancedOpenFlow/Procedure/Execute


Timestamp:
07/19/13 15:34:13 (11 years ago)
Author:
shuang@bbn.com
  • GEC17Agenda/AdvancedOpenFlow/Procedure/Execute

    v1 v1  
     1
     2= [wiki:GENIEducation/SampleAssignments/OpenFlowLoadBalancerTutorial/ExerciseLayout OpenFlow Load Balancer Tutorial] =
     3{{{
     4#!html
     5
     6<div style="text-align:center; width:495px; margin-left:auto; margin-right:auto;">
     7<img id="Image-Maps_5201305222028436" src="http://groups.geni.net/geni/attachment/wiki/GENIExperimenter/Tutorials/Graphics/Execute.jpg?format=raw" usemap="#Image-Maps_5201305222028436" border="0" width="495" height="138" alt="" />
     8<map id="_Image-Maps_5201305222028436" name="Image-Maps_5201305222028436">
     9<area shape="rect" coords="18,18,135,110" href="http://groups.geni.net/geni/wiki/GENIEducation/SampleAssignments/OpenFlowLoadBalancerTutorial/ExerciseLayout/DesignSetup" alt="" title=""    />
     10<area shape="rect" coords="180,18,297,111" href="http://groups.geni.net/geni/wiki/GENIEducation/SampleAssignments/OpenFlowLoadBalancerTutorial/ExerciseLayout/Execute" alt="" title=""    />
     11<area shape="rect" coords="344,17,460,110" href="http://groups.geni.net/geni/wiki/GENIEducation/SampleAssignments/OpenFlowLoadBalancerTutorial/ExerciseLayout/Finish" alt="" title=""    />
     12<area shape="rect" coords="493,136,495,138" href="http://www.image-maps.com/index.php?aff=mapped_users_5201305222028436" alt="Image Map" title="Image Map" />
     13</map>
     14<!-- Image map text links - End - -->
     15
     16</div>
     17}}}
     18
     19
     20= 2. Implement a Load Balancing OpenFlow Controller =
     21
     22== 2.1 Run an example Load Balancing OpenFlow Controller ==
     23  An example OpenFlow Controller that assigns incoming TCP connections to alternating paths '''based on total number of flows''' is already downloaded for you. You can find it (load-balancer.rb) in the home directory on node "Switch". [[BR]]
     24  - 2.1.1 Log on to node "Switch" and run "ifconfig" until all interfaces are up; make sure eth1, eth2, and eth3 are up and have valid IP addresses. [[BR]]
     25  Start the example Load Balancer by executing the following:
     26  {{{
     27  /opt/trema-trema-f995284/trema run /root/load-balancer.rb
     28  }}}
     29  - 2.1.2 After you start your Load Balancer, you should see the following (Switch id may vary):
     30  {{{
     31  OpenFlow Load Balancer Conltroller Started!
     32  Switch is Ready! Switch id: 196242264273477
     33  }}}
     34  This means the OpenFlow Switch is connected to your controller and you can start testing your OpenFlow Load Balancer now.
     35 
     36== 2.2 Use GIMI Portal to run the experiment and monitor the load balancer ==
     37   - 2.2.1 Log on to your !LabWiki account at http://emmy9.casa.umass.edu:3005 . In the `Prepare` column, type "OpenFlow"; a list of .rb files will pop up. Choose any one and replace its entire content with the Ruby script [http://www.gpolab.bbn.com/experiment-support/OpenFlowExampleExperiment/ExoGENI/loadbalancer_monitor.rb HERE].
     38   - 2.2.2 Log on to node "Switch" and run "ifconfig" to see the IP address on each interface.
     39    - '''Note''': You may not be able to see all interfaces up immediately when node "Switch" is ready; wait for some more time (about 1 min) then try "ifconfig" again.
     40    - Identify the two interfaces that you want to monitor: the interfaces with IP addresses 192.168.2.1 and 192.168.3.1, respectively. On the !LabWiki page, in your Ruby script, find the following lines:
     41{{{
     42###### Change the following to the correct interfaces ######
     43left = 'eth1'
     44right = 'eth3'
     45###### Change the above to the correct interfaces ######
     46}}}
     47   - 2.2.3 Change eth1 and eth3 to the corresponding two interfaces you found with IP address 192.168.2.1 (the interface that connects to the left path) and 192.168.3.1 (the interface that connects to the right path) and press the "save" icon on your !LabWiki page.
     48   - 2.2.4 Drag the `file icon` at the top-left corner of your !LabWiki page from the `Prepare` column and drop it onto the `Execute` column. Fill in the name of your !LabWiki experiment (this can be anything) and the name of your slice (this has to be your slice name), and type "true" in the graph box to enable graphs. Then press the "Start Experiment" button.
     49   - 2.2.5 When your experiment is finished, turn off your controller and disconnect the switch from your controller:
     50      - On node "Switch", press "Ctrl" and "c" to kill your Load Balancer process
     51      - On node "Switch", use the following command to disconnect the OpenFlow Switch from the controller:
     52     {{{
     53     ovs-vsctl del-controller br0
     54     }}}
     55   - '''Note''': Do not start another experiment (i.e., drag and drop the file icon in !LabWiki and press "Start Experiment") before your current experiment is finished.
     56   
     57== (Optional) 2.3 Fetch experimental results from your iRods account ==
     58 - 2.3.1 Log in to your iRods account at https://www.irods.org/web/index.php, using "emmy9.casa.umass.edu" as Host/IP and "1247" as Port.
     59 - 2.3.2 Download your experimental results from your user directory under /geniRenci/home/
     60
     61== 2.4 Change link parameters of left path using "ovs-vsctl" and repeat the experiment ==
     62 - 2.4.1 Log on to node "left" and change the link capacity for the interface with IP address "192.168.2.2" (use "ifconfig" to find the correct interface, here we assume eth1 is the interface connecting to node "Switch"):
     63{{{
     64ovs-vsctl set Interface eth1 ingress_policing_rate=10000
     65}}}
     66 The above rate-limits traffic arriving at node "left" on eth1 (that is, traffic from node "Switch") to 10 Mbps; the ingress_policing_rate value is given in kbps.
     67 - Section 4 describes other ways to shape the link, e.g., changing delay and loss rate with "tc qdisc netem".
     68
     69== 2.5 Repeat Experiment with limited bandwidth on Left path ==
     70 - 2.5.1 On node "Switch", start your Load Balancer using the following command:
     71 {{{
     72 /opt/trema-trema-f995284/trema run /root/load-balancer.rb
     73 }}}
     74 - 2.5.2 Open a new command line window, log on to node "Switch", and use the following command to connect the OpenFlow Switch to the controller (the console window that runs your controller should display "Switch is Ready!" when the switch is connected):
     75 {{{
     76 ovs-vsctl set-controller br0 tcp:127.0.0.1 ptcp:6634:127.0.0.1
     77 }}}
     78 - 2.5.3 Go back to your !LabWiki web page, drag and drop the `file icon` and repeat the experiment, as described in section 2.2.4, using a different experiment name (the slice name should stay the same).
     79 - 2.5.4 When your experiment is finished, turn off your controller and disconnect the switch from your controller:
     80      - On node "Switch", press "Ctrl" and "c" to kill your Load Balancer process
     81      - On node "Switch", use the following command to disconnect the OpenFlow Switch from the controller:
     82     {{{
     83     ovs-vsctl del-controller br0
     84     }}}
     85
     86 Questions:
     87 - Do the graphs plotted on !LabWiki differ from the graphs plotted in the first experiment? Why?
     88 - Check the output of the Load Balancer on node "Switch": how many flows are directed to the left path and how many to the right path? Why?
     89 - To answer the above question, you need to understand the Load Balancing controller. Look at the "load-balancer.rb" file in your home directory on node "Switch", and see Section 3.1 for hints and explanations about this OpenFlow Controller.
     90
     91== 2.6 Modify the OpenFlow Controller to balance throughput among all the TCP flows ==
     92 - You need to calculate the average per-flow throughput observed from both left and right path in function "stats_reply" in your load-balancer.rb
     93 - In function "decide_path", change the path decision based on the calculated average per-flow throughput: forward the flow onto the path with more average per-flow throughput. (Why? TCP tries its best to consume the whole bandwidth so more throughput means network is not congested)
     94 - If you do not know where to start, check the hints in Section 3.1.
     95  - If you really do not know where to start after reading the hints, the answer can be found on node "Switch", at /tmp/load-balancer/load-balancer-solution.rb
     96  - Copy the above solution into your home directory then re-do the experiment on !LabWiki. '''Note:''' you need to change your script to use the correct Load Balancing controller (e.g., if your controller is "load-balancer-solution.rb", you should run "/opt/trema-trema-f995284/trema run /root/load-balancer-solution.rb")
     97 - Redo the experiment using your new OpenFlow Controller following steps in Section 2.5, check the graphs plotted on !LabWiki as well as the controller's log on node "Switch" and see the difference.
     98 - When your experiment is done, you need to stop the Load Balancer:
     99  - On node "Switch", use the following command to disconnect the OpenFlow Switch from the controller:
     100  {{{
     101  ovs-vsctl del-controller br0
     102  }}}
     103  - On node "Switch", press "Ctrl" and "c" to kill your Load Balancer process
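
 The calculation asked for in Section 2.6 can be sketched in plain Ruby, without Trema, as shown here. This is only an illustration under stated assumptions: the names `PathStats`, `average_per_flow_throughput`, and `pick_path` are hypothetical and do not come from load-balancer.rb, and a real controller would fill in the byte counts from the flow stats reply. Preferring an idle path outright is also an assumption of this sketch.

```ruby
# Hypothetical stand-in for the per-path counters a controller might keep.
# bytes: bytes seen on the path during the last polling interval
# flows: number of TCP flows currently assigned to the path
PathStats = Struct.new(:bytes, :flows)

# Average per-flow throughput in bytes per second over one polling interval.
# A path with no flows has no measurement, so report 0.0 for it.
def average_per_flow_throughput(stats, interval_sec)
  return 0.0 if stats.flows.zero?

  stats.bytes.to_f / interval_sec / stats.flows
end

# Pick the path whose flows currently achieve more throughput each.
# An idle path is preferred outright (assumption of this sketch);
# ties go to :left.
def pick_path(left, right, interval_sec)
  return :left if left.flows.zero?
  return :right if right.flows.zero?

  l = average_per_flow_throughput(left, interval_sec)
  r = average_per_flow_throughput(right, interval_sec)
  l >= r ? :left : :right
end

left  = PathStats.new(2_500_000, 2)   # 2 flows sharing a rate-limited path
right = PathStats.new(20_000_000, 4)  # 4 flows on the unrestricted path
puts pick_path(left, right, 2)        # polling interval of 2 seconds
```

 With these made-up numbers, each flow on the right path gets more throughput than each flow on the left path, so the next flow would go right.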
     104
     105== 2.7 Automate your experiment using !LabWiki ==
     106 - 2.7.1 Add code in your !LabWiki script to automate starting and stopping your OpenFlow Controller:
     107  - Go back to your !LabWiki page and uncomment lines 184 to 189 of the script to start your OpenFlow Controller automatically from !LabWiki
     108   - Note: You might need to change line 185 to use the correct load balancer controller
     109  - Uncomment lines 205 to 209 of the script to stop your OpenFlow Controller automatically from !LabWiki
     110 - 2.7.2 On your !LabWiki web page, drag and drop the `file icon` and repeat the experiment, as described in section 2.2.4, using a different experiment name (the slice name should stay the same).
     111 - If you have more time or are interested in trying out more things, go ahead and try section 2.8. The tutorial is over now; feel free to ask questions :-)
     112
     113== 2.8 Try more experiments using different kinds of OpenFlow Load Balancers ==
     114 - You can find more load balancers under /tmp/load-balancer/ on node "Switch"
     115 - To try out any one of them, follow the steps:
     116  - Copy the load balancer you want to try out into the home directory on node "Switch", e.g.,
     117  {{{
     118  cp /tmp/load-balancer/load-balancer-random.rb /root/
     119  }}}
     120  - Change your !LabWiki code at line 185 to use the correct OpenFlow controller.
     121  - On !LabWiki, drag and drop the "File" icon and redo the experiment as described in section 2.2.4
     122 - Some explanations about the different load balancers:
     123  - "load-balancer-random.rb" picks a path '''randomly''': each path has a 50% chance of being picked
     124  - "load-balancer-roundrobin.rb" picks paths in a '''round robin''' fashion: the right path is picked first, then the left path, and so on
     125  - Load balancers whose names begin with "load-balancer-bytes" pick a path based on the total number of bytes sent out to each path: the one with '''fewer bytes''' sent out is picked
     126   - "load-balancer-bytes-thread.rb" sends a flow stats request in function "packet_in" upon the arrival of a new TCP flow and waits until the flow stats reply is received in function "stats_reply" before making a decision. As a result, this balancer uses '''the most up-to-date flow stats''' to make a decision. However, it must wait for at least one round-trip time between the controller and the switch (for the flow stats reply) before a decision can be made.
     127   - "load-balancer-bytes-auto-thread.rb" sends a flow stats request once every 5 seconds in a separate thread and makes path decisions based on the most recently received flow stats reply. As a result, this balancer makes path decisions based on '''statistics that may be up to 5 seconds old''' but reacts quickly to the arrival of a new TCP flow (i.e., it does not need to wait for a flow stats reply)
     128  - Load balancers whose names begin with "load-balancer-flows" pick a path based on the total number of flows sent out to each path: the one with '''fewer flows''' sent out is picked
     129  - Load balancers whose names begin with "load-balancer-throughput" pick a path based on the total throughput sent out to each path: the one with '''more throughput''' is picked
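
 The selection rules listed in Section 2.8 can be compared side by side in a small Ruby sketch. It is a simplified illustration, not the code of the files under /tmp/load-balancer/: the `Counters` struct, the `STRATEGIES` table, the `round_robin` helper, the sample numbers, and the tie-breaking toward the left path are all assumptions made for this example.

```ruby
# Hypothetical per-path counters; none of these names come from the tutorial's files.
Counters = Struct.new(:flows, :bytes, :throughput)

# Each strategy maps the two paths' counters to :left or :right.
# Ties go to :left (an arbitrary choice for this sketch).
STRATEGIES = {
  random:     ->(_l, _r) { [:left, :right].sample },
  flows:      ->(l, r) { l.flows <= r.flows ? :left : :right },
  bytes:      ->(l, r) { l.bytes <= r.bytes ? :left : :right },
  throughput: ->(l, r) { l.throughput >= r.throughput ? :left : :right }
}.freeze

# Round robin needs state between calls: right first, then left, and so on.
def round_robin
  order = [:right, :left].cycle
  -> { order.next }
end

left  = Counters.new(3, 9_000_000, 1_200_000)
right = Counters.new(5, 4_000_000, 2_000_000)

puts "fewer flows:     #{STRATEGIES[:flows].call(left, right)}"
puts "fewer bytes:     #{STRATEGIES[:bytes].call(left, right)}"
puts "more throughput: #{STRATEGIES[:throughput].call(left, right)}"

rr = round_robin
puts "round robin:     #{4.times.map { rr.call }.join(', ')}"
```

 Keeping each rule as a separate lambda makes it easy to see that the strategies differ only in which counter they compare and in which direction.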
     130
     131= 3. Hints / Explanation =
     132== 3.1 About the OpenFlow controller [http://www.gpolab.bbn.com/experiment-support/OpenFlowExampleExperiment/ExoGENI/load-balancer.rb load-balancer.rb] ==
     133  - Trema web site: http://trema.github.io/trema/
     134  - Trema Ruby API documentation: http://rubydoc.info/github/trema/trema/master/frames
     135  - '''Functions used in our tutorial:'''
     136    - '''start()''': is the function that will be called when the OpenFlow Controller is started. Here in our case, we read the file /tmp/portmap and figure out which OpenFlow port points to which path.
     137    - '''switch_ready()''': is the function that will be called each time a switch connects to the OpenFlow Controller. Here in our case, we allow all non-TCP flows to pass (including ARP and ICMP packets) and ask the switch to send new inbound TCP flows to the controller. We also start a "timer" function that calls "query_stats()" once every 2 seconds.
     138    - '''query_stats()''': is the function that sends out a flow_stats_request to get the current statistics about each flow.
     139    - '''packet_in()''': is the function that will be called each time a packet arrives at the controller. Here in our case, we call "decide_path()" to get path decisions, then send flow entry back to the OpenFlow Switch to instruct the switch which path to take for this new TCP flow.
     140    - '''stats_reply()''': is the function that will be called when the OpenFlow Controller receives a flow_stats_reply message from the OpenFlow Switch. Here in our case, we update the flow statistics so that "decide_path()" can make the right decision.
     141    - '''send_flow_mod_add()''': is the function that you should use to add a flow entry into an OpenFlow Switch.
     142    - '''decide_path()''': is the function that makes path decisions. It returns the path choices based on flow statistics.
     143  - '''The Whole Process: '''
     144    - When the OpenFlow switch is ready, our controller starts a function that asks for flow stats once every 2 seconds.
     145    - The OpenFlow switch will reply with statistics information about all flows in its flow table.
     146    - This flow statistics message will be fetched by the "stats_reply" function in the OpenFlow controller implemented by the user on node "Switch".
     147    - As a result, our controller updates its knowledge about both left and right path once every 2 seconds.
     148    - Upon the arrival of a new TCP flow, the OpenFlow controller decides which path to send the new flow to, based on the updated flow statistics.
     149
     150  The !FlowStatsReply message is in the following format:
     151{{{
     152FlowStatsReply.new(
     153  :length => 96,
     154  :table_id => 0,
     155  :match => Match.new,
     156  :duration_sec => 10,
     157  :duration_nsec => 106000000,
     158  :priority => 0,
     159  :idle_timeout => 0,
     160  :hard_timeout => 0,
     161  :cookie => 0xabcd,
     162  :packet_count => 1,
     163  :byte_count => 1,
     164  :actions => [ ActionOutput.new ]
     165)
     166}}}
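
  To illustrate how a "stats_reply" handler might turn such entries into per-path totals, here is a small self-contained Ruby sketch. It does not use Trema: `FlowStat` is a hypothetical stand-in that keeps only `byte_count` from the format above plus an `out_port` shortcut for the port named in the entry's output action, and the port numbers in `PORT_TO_PATH` are made up (the tutorial's controller reads the real mapping from /tmp/portmap).

```ruby
# Simplified, hypothetical stand-in for one flow stats entry.
# byte_count mirrors the field shown above; out_port is an assumed
# shortcut for the port named in the entry's output action.
FlowStat = Struct.new(:out_port, :byte_count)

# Made-up port numbers for illustration only.
PORT_TO_PATH = { 2 => :left, 3 => :right }.freeze

# Sum the number of flows and bytes per path from a list of entries.
def summarize(stats)
  totals = { left: { flows: 0, bytes: 0 }, right: { flows: 0, bytes: 0 } }
  stats.each do |s|
    path = PORT_TO_PATH[s.out_port]
    next unless path # ignore flows that leave through other ports

    totals[path][:flows] += 1
    totals[path][:bytes] += s.byte_count
  end
  totals
end

stats = [
  FlowStat.new(2, 1_000),
  FlowStat.new(3, 4_000),
  FlowStat.new(3, 6_000),
  FlowStat.new(1, 500) # not on either path, skipped
]
p summarize(stats)
```

  A "decide_path" function could then compare the two totals, for example by flow count or by bytes.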
     167
     168== 3.2 About The Rspec file [http://www.gpolab.bbn.com/experiment-support/OpenFlowExampleExperiment/openflow-loadbalancer-kvm.rspec OpenFlowLBExo.rspec] ==
     169  - The Rspec file describes the topology we showed earlier: each node is assigned a certain number of interfaces with pre-defined IP addresses
     170  - Some of the nodes are loaded with software and post-boot scripts. We will take node "Switch" as an example since it is the most complicated one.
     171   - The following section in the Rspec file for node "Switch":
     172   {{{
     173     <install url="http://www.gpolab.bbn.com/experiment-support/OpenFlowExampleExperiment/software/of-switch-exo.tar.gz" install_path="/"/>
     174   }}}
     175   means ExoGENI will download that tarball from the specified URL and extract it to directory "/"
     176   - The following section in the Rspec file for node "Switch":
     177   {{{
     178     <execute shell="bash" command="/tmp/postboot_script_exo.sh $sliceName $self.Name() ; /tmp/of-topo-setup/lb-setup"/>
     179   }}}
     180   names the post-boot script that ExoGENI is going to run for you after the nodes are booted. 
     181  - More information about "/tmp/postboot_script_exo.sh":
     182   It is a "hook" to the !LabWiki interface. Experimenters run this so that !LabWiki knows the name of the slice and the hostname of the particular node that the OML/OMF toolkits are running on.
     183  - More information about "/tmp/of-topo-setup/lb-setup":
     184   "lb-setup" sets up the load balancing switch. The source code, with explanatory comments, is as follows:
     185   {{{
     186   #!/bin/sh
     187
     188   /tmp/of-topo-setup/prep-trema       # install all libraries for trema
     189   /tmp/of-topo-setup/ovs-start           # create ovs bridge
     190
     191   cp /usr/bin/trace-oml2 /usr/bin/trace        # a hack to the current LabWiki --> needs to be fixed
     192   cp /usr/bin/nmetrics-oml2 /usr/bin/nmetrics       # a hack to the current LabWiki --> needs to be fixed
     193   # download the load balancing openflow controller source code to user directory
     194   wget http://www.gpolab.bbn.com/experiment-support/OpenFlowExampleExperiment/ExoGENI/load-balancer.rb -O /root/load-balancer.rb
     195
     196   INTERFACES="192.168.1.1 192.168.2.1 192.168.3.1"
     197
     198   # wait until all interfaces are up, then fetch the mapping from interface name to its ip/MAC address and save this info in a file /tmp/ifmap
     199   /tmp/of-topo-setup/writeifmap3
     200
     201   # add port to the ovs bridge
     202   /tmp/of-topo-setup/find-interfaces $INTERFACES | while read iface; do
     203       ovs-vsctl add-port br0 $iface < /dev/null
     204   done
     205
     206   # create port map save it to /tmp/portmap
     207   ovs-ofctl show tcp:127.0.0.1:6634 \
     208       | /tmp/of-topo-setup/ovs-id-ports 192.168.1.1=outside 192.168.2.1=left 192.168.3.1=right \
     209       > /tmp/portmap
     210   }}}
     211
     212== 3.3 About the GIMI script you run on !LabWiki ==
     213 - Line 1 to Line 128: the definition of oml trace and oml nmetrics library. It basically defines the command line options for oml2-trace and oml2-nmetrics, as well as the output (the monitoring data that is going to be stored into the oml server)
     214  - users are not supposed to modify them
     215  - the definition we use here is not the same as the one provided by the latest OML2 2.10.0 library because of a version mismatch between the OMF that !LabWiki is using and the OML2 toolkit that we are using. It is a temporary hack for now --> to be fixed
     216  - we added the definition of option "--oml-config" for trace app (Line 27-28) so that oml2-trace accepts configuration files:
     217  {{{
     218  app.defProperty('config', 'config file to follow', '--oml-config',
     219                  :type => :string, :default => '"/tmp/monitor/conf.xml"')
     220  }}}
     221 - Line 134 to Line 137: the user defines the monitoring interfaces here. In our case, we want to monitor the interfaces on node "Switch" that connect to the left path (with IP 192.168.2.1) and to the right path (with IP 192.168.3.1)
     222 - Line 139 to Line 169: defines on which node the user wants to run which monitoring app; and the "display graph" option.
     223  - group "Monitor" monitors the left path statistics using nmetrics and trace.
     224  - group "Monitor1" monitors the right path statistics using nmetrics and trace.
     225  - To monitor throughput, we used oml2-trace with the "--oml-config" option, which points to the configuration file we created at /tmp/monitor/conf.xml. The configuration simply sums tcp_packet_size (in bytes) over each second and saves the result to the OML server (in a PostgreSQL database):
     226  {{{
     227<omlc id="switch" encoding="binary">
     228  <collect url="tcp:emmy9.casa.umass.edu:3004" name="traffic">
     229    <stream mp="tcp" interval="1">
     230      <filter field="tcp_packet_size" operation="sum" rename="tcp_throughput" />
     231    </stream>
     232  </collect>
     233</omlc>
     234  }}}
     235  - More information about nmetrics and trace can be found here: http://oml.mytestbed.net/projects/omlapp/wiki/OML-instrumented_Applications#Packet-tracer-trace-oml2
     236 - Line 173 to Line 218: defines the experiment:
     237  - Line 175-177: starts the monitoring app
     238  - Line 179-181: starts the TCP receiver (using iperf)
     239  - Line 183-189: starts the load balancer and connects ovs switch to the load balancer (controller)
     240  - Line 191-200: starts 20 TCP flows, with a 5-second interval between the start of each flow
     241  - Line 205-209: stop the load balancer controller, disconnect the ovs switch from the controller and finish the experiment
     242 - Line 217 to Line 234: defines the two graphs we want to plot:
     243  - The first graph uses the monitoring data from oml2-nmetrics to display the cumulative number of bytes observed on each of the interfaces;
     244  - The second graph uses the monitoring results from oml2-trace to display the throughput observed from each of the interfaces.
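
 The per-second summing that Section 3.3 attributes to the oml2-trace configuration (the "sum" filter on tcp_packet_size with interval="1") can be pictured with a short Ruby sketch. This is not how OML implements its filters; `throughput_per_second` and the sample data are hypothetical and only show the arithmetic behind the tcp_throughput value.

```ruby
# Each sample is [timestamp_in_seconds, tcp_packet_size_in_bytes].
# Group samples into 1-second buckets and sum the sizes in each bucket,
# mirroring the described "sum" filter with a 1-second interval.
def throughput_per_second(samples)
  samples
    .group_by { |ts, _size| ts.floor }
    .transform_values { |group| group.sum { |_ts, size| size } }
end

# Made-up packet observations for illustration.
samples = [[0.10, 1500], [0.55, 1500], [1.20, 400], [1.90, 1500], [3.05, 60]]

throughput_per_second(samples).each do |second, bytes|
  puts "t=#{second}s tcp_throughput=#{bytes} bytes"
end
```

 Seconds in which no packet was seen simply do not appear in the result of this sketch.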
     245
     246= 4. Tips: Debugging an OpenFlow Controller =
     247You will find it helpful to know what is going on inside your OpenFlow controller and its associated switch when implementing these exercises. [[BR]]
     248This section contains a few tips that may help you out if you are using the Open vSwitch implementation provided with this tutorial.
     249If you are using a hardware OpenFlow switch, your instructor can help you find equivalent commands. [[BR]]
     250The Open vSwitch installation provided by the RSpec included in this tutorial is located in ''/opt/openvswitch-1.6.1-F15''. You will find Open vSwitch commands in ''/opt/openvswitch-1.6.1-F15/bin'' and ''/opt/openvswitch-1.6.1-F15/sbin''. Some of these commands may be helpful to you. If you add these paths to your shell’s ''$PATH'', you will be able to access their manual pages with man. Note that ''$PATH'' will not affect sudo, so you will still have to provide the absolute path to sudo; the absolute path is omitted from the following examples for clarity and formatting.
     251
     252 - '''ovs-vsctl'''[[BR]]
     253 Open vSwitch switches are primarily configured using the ''ovs-vsctl'' command. For exploring, you may find the ''ovs-vsctl show'' command useful, as it dumps the status of all virtual switches on the local Open vSwitch instance. Once you have some information on the local switch configurations, ''ovs-vsctl'' provides a broad range of capabilities that you will likely find useful for expanding your network setup to more complex configurations for testing and verification. In particular, the subcommands ''add-br'', ''add-port'', and ''set-controller'' may be of interest.
     254 - '''ovs-ofctl''' [[BR]]
     255 The switch host configured by the given rspec listens for incoming OpenFlow connections on localhost port 6634.
     256 You can use this to query the switch state using the ''ovs-ofctl'' command. In particular, you may find the ''dump-tables'' and ''dump-flows'' subcommands useful. For example, ''sudo ovs-ofctl dump-flows tcp:127.0.0.1:6634'' will output lines that look like this:
     257 {{{
     258cookie=0x4, duration=6112.717s, table=0, n_packets=1, n_bytes=74, idle_age=78,priority=5,tcp,
     259nw_src=10.10.10.0/24 actions=CONTROLLER:65535
     260 }}}
     261 This indicates that any TCP segment with source IP in the 10.10.10.0/24 subnet should be sent to the OpenFlow controller for processing, that it has been 78 seconds since such a segment was last seen, that one such segment has been seen so far, and the total number of bytes in packets matching this rule is 74. The other fields are perhaps interesting, but you will probably not need them for debugging. (Unless, of course, you choose to use multiple tables — an exercise in OpenFlow 1.1 functionality left to the reader.)
     262 - '''Unix utilities'''[[BR]]
     263 You will want to use a variety of Unix utilities, in addition to the tools listed in [http://groups.geni.net/geni/wiki/GENIEducation/SampleAssignments/OpenFlowAssignment/ExerciseLayout ExerciseLayout], to test your controllers. The standard ping and ''/usr/sbin/arping'' tools are useful for debugging connectivity (but make sure your controller passes ''ICMP ECHO REQUEST'' and ''REPLY'' packets and ''ARP'' traffic, respectively!), and the command ''netstat -an'' will show all active network connections on a Unix host; the TCP connections of interest in this exercise will be at the top of the listing. The format of netstat output is out of the scope of this tutorial, but information is available online and in the manual pages.
     264 - '''Linux netem''' [[BR]]
     265 Use the ''tc'' command to enable and configure delay and loss rate constraints on the outgoing interfaces for traffic traveling from the OpenFlow switch to the Aggregator node. To configure a path with a 20 ms delay and a 2% loss rate on eth2, you would issue the command:
     266{{{
     267sudo tc qdisc add dev eth2 root handle 1:0 netem delay 20ms loss 2%
     268}}}
     269 Use the "tc qdisc change" command, instead of "tc qdisc add", to reconfigure existing links. [[BR]]
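
 When you check flow counters repeatedly while debugging, it can help to turn a line of ''ovs-ofctl dump-flows'' output into named fields. The Ruby sketch below is one possible way to do that; `parse_flow_line` is a hypothetical helper, and it assumes the simple comma-and-space separated key=value layout of the sample line from the ovs-ofctl tip above, so it may need adjusting for other Open vSwitch versions or more complex action lists.

```ruby
# Parse the key=value fields of one `ovs-ofctl dump-flows` line into a Hash.
# Bare words such as "tcp" are stored with the value true.
# Everything after "actions=" is kept as a single string.
def parse_flow_line(line)
  match_part, actions = line.strip.split(/\s+actions=/, 2)
  fields = { 'actions' => actions }
  match_part.split(/[,\s]+/).each do |token|
    key, value = token.split('=', 2)
    fields[key] = value.nil? ? true : value
  end
  fields
end

line = 'cookie=0x4, duration=6112.717s, table=0, n_packets=1, n_bytes=74, ' \
       'idle_age=78,priority=5,tcp,nw_src=10.10.10.0/24 actions=CONTROLLER:65535'

flow = parse_flow_line(line)
puts "packets=#{flow['n_packets']} bytes=#{flow['n_bytes']} idle=#{flow['idle_age']}s"
puts "actions=#{flow['actions']}"
```

 All values are left as strings; convert them with to_i or to_f where you need numbers.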
     270
     271
     272= [wiki:GENIEducation/SampleAssignments/OpenFlowLoadBalancerTutorial/ExerciseLayout Introduction] =
     273= [wiki:GENIEducation/SampleAssignments/OpenFlowLoadBalancerTutorial/ExerciseLayout/Finish Next: Teardown Experiment] =