[[PageOutline]]

= Detailed test plan for EG-MON-3: GENI Active Experiment Inspection Test =

''This page is GPO's working page for performing EG-MON-3. It is public for informational purposes, but it is not an official status report. See [wiki:GENIRacksHome/ExogeniRacks/AcceptanceTestStatus] for the current status of ExoGENI acceptance tests.''

== Page format ==

 * The status chart summarizes the state of this test
 * The high-level description from test plan contains text copied exactly from the public test plan and acceptance criteria pages.
 * The steps contain things I will actually do/verify:
   * Steps may be composed of related substeps where I find this useful for clarity
   * Each step is either a preparatory step (identified by "(prep)") or a verification step (the default):
     * Preparatory steps are just things we have to do. They're not tests of the rack, but are prerequisites for subsequent verification steps
     * Verification steps are steps in which we will actually look at rack output and make sure it is as expected. They contain a '''Using:''' block, which lists the steps to run the verification, and a '''Verify:''' block, which lists what outcome is expected for the test to pass.
== Status of test ==

Meaning of states:
 * [[Color(lightgreen,Pass)]]: Step is completed and passed (for a verification step), or is completed (for a prep step)
 * [[Color(red,Fail)]]: Step is completed and failed, and is not being revisited
 * in progress: We are currently testing or iterating on this step
 * [[Color(orange,Blocked)]]: Step is blocked by some other step or activity

|| '''Step''' || '''State''' || '''Date completed''' || '''Tickets''' || '''Closed tickets / Comments''' ||
|| 1 || [[Color(lightgreen,Pass)]] || 2012-08-14 || || ||
|| 2 || [[Color(lightgreen,Pass)]] || 2012-08-14 || || ||
|| 3 || [[Color(orange,Blocked)]] || || || blocked until the BBN rack is available with ORCA 4.0 ||
|| 4 || [[Color(orange,Blocked)]] || || || blocked until the BBN rack is available with ORCA 4.0 ||
|| 5 || [[Color(orange,Blocked)]] || || || blocked until the BBN rack is available with ORCA 4.0 ||
|| 6 || [[Color(orange,Blocked)]] || || || (exoticket:10) [[BR]] blocked until the BBN rack is available with ORCA 4.0 ||
|| 7 || [[Color(orange,Blocked)]] || || || (exoticket:10) [[BR]] blocked until the BBN rack is available with ORCA 4.0 ||
|| 8 || [[Color(orange,Blocked)]] || || || (exoticket:10) [[BR]] blocked until the BBN rack is available with ORCA 4.0 ||

== High-level description from test plan ==

This test inspects the state of the rack data plane and control networks when experiments are running, and verifies that a site administrator can find information about running experiments.

==== Procedure ====

 * An experimenter from the GPO starts up experiments to ensure there is data to look at:
   * An experimenter runs an experiment containing at least one rack VM, and terminates it.
   * An experimenter runs an experiment containing at least one rack VM, and leaves it running.
 * A site administrator uses available system and experiment data sources to determine current experimental state, including:
   * How many VMs are running and which experimenters own them
   * How many VMs were terminated within the past day, and which experimenters owned them
   * What !OpenFlow controllers the data plane switch, the rack !FlowVisor, and the rack FOAM are communicating with
 * A site administrator examines the switches and other rack data sources, and determines:
   * What MAC addresses are currently visible on the data plane switch and what experiments do they belong to?
   * For some experiment which was terminated within the past day, what data plane and control MAC and IP addresses did the experiment use?
   * For some experimental data path which is actively sending traffic on the data plane switch, do changes in interface counters show approximately the expected amount of traffic into and out of the switch?

=== Criteria to verify as part of this test ===

 * VII.09. A site administrator can determine the MAC addresses of all physical host interfaces, all network device interfaces, all active experimental VMs, and all recently-terminated experimental VMs. (C.3.f)
 * VII.11. A site administrator can locate current configuration of flowvisor, FOAM, and any other !OpenFlow services, and find logs of recent activity and changes. (D.6.a)
 * VII.18. Given a public IP address and port, an exclusive VLAN, a sliver name, or a piece of user-identifying information such as e-mail address or username, a site administrator or GMOC operator can identify the email address, username, and affiliation of the experimenter who controlled that resource at a particular time.
   (D.7)

== Step 1 (prep): start a local experiment and terminate it ==

=== Overview of Step 1 ===

 * An experimenter requests an experiment from the local SM containing two rack VMs and a dataplane VLAN
 * The experimenter logs into a VM, and sends dataplane traffic
 * The experimenter terminates the experiment

=== Results of Step 1 from 2012-08-14 ===

I used this rspec:

{{{
}}}

I created the sliver successfully:

{{{
slicename=jbstmp
rspec=/home/jbs/subversion/geni/GENIRacks/ExoGENI/Spiral4/Rspecs/AcceptanceTests/EG-MON-3/step-1.rspec
am=https://bbn-hn.exogeni.gpolab.bbn.com:11443/orca/xmlrpc
omni createslice $slicename
omni -a $am createsliver $slicename $rspec
}}}

I got two VMs:

{{{
[14:32:04] jbs@jericho:/home/jbs +$ omni -a $am listresources $slicename |& grep hostname=
}}}

I created an account for myself on them and installed my preferred config files:

{{{
logins='192.1.242.5 192.1.242.6'
rootlogins=$(echo $logins | sed -re 's/([^ ]+)/root@\1/g')
for login in $rootlogins ; do ssh $login date ; done
shmux -c "apt-get install sudo" $rootlogins
shmux -c 'sed -i -e "s/^%sudo ALL=(ALL) ALL$/%sudo ALL=(ALL) NOPASSWD:ALL/" /etc/sudoers' $rootlogins
shmux -c 'sed -i -re "s/^(127.0.0.1.+localhost)$/\1 $(hostname)/" /etc/hosts' $rootlogins
shmux -c 'useradd -c "Josh Smift" -G sudo -m -s /bin/bash jbs' $rootlogins
shmux -c "sudo -u jbs mkdir ~jbs/.ssh" $rootlogins
shmux -c "grep jbs /root/.ssh/authorized_keys > ~jbs/.ssh/authorized_keys" $rootlogins
shmux -c "sudo chown jbs:jbs ~/.ssh/authorized_keys" $logins
shmux -c "rm ~jbs/.profile" $logins
for login in $logins ; do rsync -a ~/.cfhome/ $login: && echo $login & done
shmux -c 'sudo apt-get install iperf' $logins
}}}

I then logged in to the two nodes and ran a five-minute, 10 Mbit/s iperf UDP test between the two.
On martin (receiver):

{{{
[15:27:11] jbs@martin:/home/jbs +$ nice -n 19 iperf -u -B 172.16.1.12 -s -i 10
------------------------------------------------------------
Server listening on UDP port 5001
Binding to local address 172.16.1.12
Receiving 1470 byte datagrams
UDP buffer size: 122 KByte (default)
------------------------------------------------------------
[ 3] local 172.16.1.12 port 5001 connected with 172.16.1.11 port 57075
[ ID] Interval Transfer Bandwidth Jitter Lost/Total Datagrams
[ 3] 0.0-10.0 sec 11.9 MBytes 10.0 Mbits/sec 0.017 ms 0/ 8504 (0%)
[ 3] 10.0-20.0 sec 11.9 MBytes 10.0 Mbits/sec 0.023 ms 0/ 8503 (0%)
[ 3] 20.0-30.0 sec 11.9 MBytes 10.0 Mbits/sec 0.021 ms 0/ 8504 (0%)
[ 3] 30.0-40.0 sec 11.9 MBytes 10.0 Mbits/sec 0.024 ms 0/ 8504 (0%)
[ 3] 40.0-50.0 sec 11.9 MBytes 10.0 Mbits/sec 0.021 ms 0/ 8504 (0%)
[ 3] 50.0-60.0 sec 11.9 MBytes 10.0 Mbits/sec 0.015 ms 0/ 8504 (0%)
[ 3] 60.0-70.0 sec 11.9 MBytes 10.0 Mbits/sec 0.018 ms 0/ 8503 (0%)
[ 3] 70.0-80.0 sec 11.9 MBytes 10.0 Mbits/sec 0.022 ms 0/ 8504 (0%)
[ 3] 80.0-90.0 sec 11.9 MBytes 10.0 Mbits/sec 0.025 ms 0/ 8504 (0%)
[ 3] 90.0-100.0 sec 11.9 MBytes 10.0 Mbits/sec 0.025 ms 0/ 8504 (0%)
[ 3] 100.0-110.0 sec 11.9 MBytes 10.0 Mbits/sec 0.020 ms 0/ 8504 (0%)
[ 3] 110.0-120.0 sec 11.9 MBytes 10.0 Mbits/sec 0.026 ms 0/ 8503 (0%)
[ 3] 120.0-130.0 sec 11.9 MBytes 10.0 Mbits/sec 0.032 ms 0/ 8504 (0%)
[ 3] 130.0-140.0 sec 11.9 MBytes 10.0 Mbits/sec 0.027 ms 0/ 8504 (0%)
[ 3] 140.0-150.0 sec 11.9 MBytes 10.0 Mbits/sec 0.019 ms 0/ 8504 (0%)
[ 3] 150.0-160.0 sec 11.9 MBytes 10.0 Mbits/sec 0.022 ms 0/ 8504 (0%)
[ 3] 160.0-170.0 sec 11.9 MBytes 10.0 Mbits/sec 0.026 ms 0/ 8504 (0%)
[ 3] 170.0-180.0 sec 11.9 MBytes 10.0 Mbits/sec 0.019 ms 0/ 8503 (0%)
[ 3] 180.0-190.0 sec 11.9 MBytes 10.0 Mbits/sec 0.020 ms 0/ 8504 (0%)
[ 3] 190.0-200.0 sec 11.9 MBytes 10.0 Mbits/sec 0.027 ms 0/ 8504 (0%)
[ 3] 200.0-210.0 sec 11.9 MBytes 10.0 Mbits/sec 0.020 ms 0/ 8504 (0%)
[ 3] 210.0-220.0 sec 11.9 MBytes 10.0 Mbits/sec 0.022 ms 0/ 8503 (0%)
[ 3] 220.0-230.0 sec 11.9 MBytes 10.0 Mbits/sec 0.018 ms 0/ 8504 (0%)
[ 3] 230.0-240.0 sec 11.9 MBytes 10.0 Mbits/sec 0.020 ms 0/ 8504 (0%)
[ 3] 240.0-250.0 sec 11.9 MBytes 10.0 Mbits/sec 0.033 ms 0/ 8504 (0%)
[ 3] 250.0-260.0 sec 11.9 MBytes 10.0 Mbits/sec 0.031 ms 0/ 8503 (0%)
[ 3] 260.0-270.0 sec 11.9 MBytes 10.0 Mbits/sec 0.018 ms 0/ 8504 (0%)
[ 3] 270.0-280.0 sec 11.9 MBytes 10.0 Mbits/sec 0.022 ms 0/ 8504 (0%)
[ 3] 280.0-290.0 sec 11.9 MBytes 10.0 Mbits/sec 0.021 ms 0/ 8504 (0%)
[ 3] 0.0-300.0 sec 358 MBytes 10.0 Mbits/sec 0.016 ms 0/255103 (0%)
[ 3] 0.0-300.0 sec 1 datagrams received out-of-order
}}}

On rowan (sender):

{{{
[15:27:10] jbs@rowan:/home/jbs +$ nice -n 19 iperf -u -c 172.16.1.12 -t 300 -b 10M
------------------------------------------------------------
Client connecting to 172.16.1.12, UDP port 5001
Sending 1470 byte datagrams
UDP buffer size: 122 KByte (default)
------------------------------------------------------------
[ 3] local 172.16.1.11 port 57075 connected with 172.16.1.12 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-300.0 sec 358 MBytes 10.0 Mbits/sec
[ 3] Sent 255104 datagrams
[ 3] Server Report:
[ 3] 0.0-300.0 sec 358 MBytes 10.0 Mbits/sec 0.016 ms 0/255103 (0%)
[ 3] 0.0-300.0 sec 1 datagrams received out-of-order
}}}

(I also had some earlier runs where there was a lot of packet loss, but things smoothed out over time, so I kept this one.)
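
As a rough cross-check on the iperf totals above, the summary line can be recomputed from the datagram count the sender reports. This is a sketch added for illustration, not part of the original test run: the function name `iperf_payload_summary` is hypothetical, and it assumes iperf reports MBytes in units of 2^20 bytes and Mbits in units of 10^6 bits.

```shell
#!/bin/bash
# Recompute an iperf UDP summary from the number of datagrams sent.
# Usage: iperf_payload_summary <datagrams> <bytes_per_datagram> <seconds>
# Assumption: MBytes means 2^20 bytes and Mbits means 10^6 bits.
iperf_payload_summary() {
    awk -v n="$1" -v size="$2" -v secs="$3" 'BEGIN {
        bytes = n * size
        printf "%.1f MBytes %.2f Mbits/sec\n", bytes / (1024 * 1024), bytes * 8 / secs / 1000000
    }'
}

# Values taken from the sender output: 255104 datagrams of 1470 bytes in 300 seconds.
iperf_payload_summary 255104 1470 300
# prints: 357.6 MBytes 10.00 Mbits/sec
```

That agrees with the 358 MBytes and 10.0 Mbits/sec that iperf printed, once rounded to three significant digits.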
I then deleted the sliver:

{{{
omni -a $am deletesliver $slicename $rspec
}}}

== Step 2 (prep): start an ExoSM experiment and terminate it ==

=== Overview of Step 2 ===

 * An experimenter requests an experiment from the ExoSM containing two rack VMs and a dataplane VLAN
 * The experimenter logs into a VM, and sends dataplane traffic
 * The experimenter terminates the experiment

=== Results of Step 2 from 2012-08-14 ===

I used this rspec:

{{{
}}}

I created the sliver successfully:

{{{
slicename=jbstmp
rspec=/home/jbs/subversion/geni/GENIRacks/ExoGENI/Spiral4/Rspecs/AcceptanceTests/EG-MON-3/step-2.rspec
am=https://geni.renci.org:11443/orca/xmlrpc
omni createslice $slicename
omni -a $am createsliver $slicename $rspec
}}}

I got two VMs:

{{{
[16:26:59] jbs@jericho:/home/jbs +$ omni -a $am listresources $slicename |& grep hostname=
}}}

I created an account for myself on them and installed my preferred config files:

{{{
logins='192.1.242.7 192.1.242.8'
rootlogins=$(echo $logins | sed -re 's/([^ ]+)/root@\1/g')
for login in $rootlogins ; do ssh $login date ; done
shmux -c "apt-get install sudo" $rootlogins
shmux -c 'sed -i -e "s/^%sudo ALL=(ALL) ALL$/%sudo ALL=(ALL) NOPASSWD:ALL/" /etc/sudoers' $rootlogins
shmux -c 'sed -i -re "s/^(127.0.0.1.+localhost)$/\1 $(hostname)/" /etc/hosts' $rootlogins
shmux -c 'useradd -c "Josh Smift" -G sudo -m -s /bin/bash jbs' $rootlogins
shmux -c "sudo -u jbs mkdir ~jbs/.ssh" $rootlogins
shmux -c "grep jbs /root/.ssh/authorized_keys > ~jbs/.ssh/authorized_keys" $rootlogins
shmux -c "sudo chown jbs:jbs ~/.ssh/authorized_keys" $logins
shmux -c "rm ~jbs/.profile" $logins
for login in $logins ; do rsync -a ~/.cfhome/ $login: && echo $login & done
shmux -c 'sudo apt-get install iperf' $logins
}}}

I then logged in to the two nodes and ran a five-minute, 10 Mbit/s iperf UDP test between the two.
On martin (receiver):

{{{
[16:29:51] jbs@martin:/home/jbs +$ nice -n 19 iperf -u -B 172.16.1.12 -s -i 10
------------------------------------------------------------
Server listening on UDP port 5001
Binding to local address 172.16.1.12
Receiving 1470 byte datagrams
UDP buffer size: 122 KByte (default)
------------------------------------------------------------
[ 3] local 172.16.1.12 port 5001 connected with 172.16.1.11 port 35124
[ ID] Interval Transfer Bandwidth Jitter Lost/Total Datagrams
[ 3] 0.0-10.0 sec 11.9 MBytes 10.0 Mbits/sec 0.026 ms 4/ 8509 (0.047%)
[ 3] 0.0-10.0 sec 4 datagrams received out-of-order
[ 3] 10.0-20.0 sec 11.9 MBytes 10.0 Mbits/sec 0.018 ms 0/ 8503 (0%)
[ 3] 20.0-30.0 sec 11.9 MBytes 10.0 Mbits/sec 0.015 ms 0/ 8504 (0%)
[ 3] 30.0-40.0 sec 11.9 MBytes 10.0 Mbits/sec 0.015 ms 0/ 8503 (0%)
[ 3] 40.0-50.0 sec 11.9 MBytes 10.0 Mbits/sec 0.019 ms 0/ 8503 (0%)
[ 3] 50.0-60.0 sec 11.9 MBytes 10.0 Mbits/sec 0.027 ms 0/ 8504 (0%)
[ 3] 60.0-70.0 sec 11.9 MBytes 10.0 Mbits/sec 0.017 ms 0/ 8504 (0%)
[ 3] 70.0-80.0 sec 11.9 MBytes 10.0 Mbits/sec 0.022 ms 0/ 8503 (0%)
[ 3] 80.0-90.0 sec 11.9 MBytes 10.0 Mbits/sec 0.028 ms 0/ 8504 (0%)
[ 3] 90.0-100.0 sec 11.9 MBytes 10.0 Mbits/sec 0.015 ms 0/ 8503 (0%)
[ 3] 100.0-110.0 sec 11.9 MBytes 10.0 Mbits/sec 0.013 ms 0/ 8503 (0%)
[ 3] 110.0-120.0 sec 11.9 MBytes 10.0 Mbits/sec 0.018 ms 0/ 8503 (0%)
[ 3] 120.0-130.0 sec 11.9 MBytes 10.0 Mbits/sec 0.022 ms 0/ 8504 (0%)
[ 3] 130.0-140.0 sec 11.9 MBytes 10.0 Mbits/sec 0.023 ms 0/ 8503 (0%)
[ 3] 140.0-150.0 sec 11.9 MBytes 10.0 Mbits/sec 0.019 ms 0/ 8504 (0%)
[ 3] 150.0-160.0 sec 11.9 MBytes 10.0 Mbits/sec 0.022 ms 0/ 8503 (0%)
[ 3] 160.0-170.0 sec 11.9 MBytes 10.0 Mbits/sec 0.020 ms 0/ 8504 (0%)
[ 3] 170.0-180.0 sec 11.9 MBytes 10.0 Mbits/sec 0.017 ms 0/ 8503 (0%)
[ 3] 180.0-190.0 sec 11.9 MBytes 10.0 Mbits/sec 0.018 ms 0/ 8503 (0%)
[ 3] 190.0-200.0 sec 11.9 MBytes 10.0 Mbits/sec 0.012 ms 0/ 8504 (0%)
[ 3] 200.0-210.0 sec 11.9 MBytes 10.0 Mbits/sec 0.017 ms 0/ 8504 (0%)
[ 3] 210.0-220.0 sec 11.9 MBytes 10.0 Mbits/sec 0.016 ms 0/ 8503 (0%)
[ 3] 220.0-230.0 sec 11.9 MBytes 10.0 Mbits/sec 0.021 ms 0/ 8503 (0%)
[ 3] 230.0-240.0 sec 11.9 MBytes 10.0 Mbits/sec 0.024 ms 0/ 8504 (0%)
[ 3] 240.0-250.0 sec 11.9 MBytes 10.0 Mbits/sec 0.025 ms 0/ 8504 (0%)
[ 3] 250.0-260.0 sec 11.9 MBytes 10.0 Mbits/sec 0.015 ms 0/ 8503 (0%)
[ 3] 260.0-270.0 sec 11.9 MBytes 10.0 Mbits/sec 0.017 ms 0/ 8503 (0%)
[ 3] 270.0-280.0 sec 11.9 MBytes 10.0 Mbits/sec 0.016 ms 0/ 8504 (0%)
[ 3] 280.0-290.0 sec 11.9 MBytes 10.0 Mbits/sec 0.014 ms 0/ 8503 (0%)
[ 3] 0.0-300.0 sec 358 MBytes 10.0 Mbits/sec 0.018 ms 3/255103 (0.0012%)
[ 3] 0.0-300.0 sec 5 datagrams received out-of-order
}}}

On rowan (sender):

{{{
[16:29:58] jbs@rowan:/home/jbs +$ nice -n 19 iperf -u -c 172.16.1.12 -t 300 -b 10M
------------------------------------------------------------
Client connecting to 172.16.1.12, UDP port 5001
Sending 1470 byte datagrams
UDP buffer size: 122 KByte (default)
------------------------------------------------------------
[ 3] local 172.16.1.11 port 35124 connected with 172.16.1.12 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-300.0 sec 358 MBytes 10.0 Mbits/sec
[ 3] Sent 255104 datagrams
[ 3] Server Report:
[ 3] 0.0-300.0 sec 358 MBytes 10.0 Mbits/sec 0.017 ms 3/255103 (0.0012%)
[ 3] 0.0-300.0 sec 5 datagrams received out-of-order
}}}

I then deleted the sliver:

{{{
omni -a $am deletesliver $slicename $rspec
}}}

== Step 3 (prep): start an experiment and leave it running ==

=== Overview of Step 3 ===

 * An experimenter requests an experiment from the local SM containing two rack VMs connected by an !OpenFlow-controlled dataplane VLAN
 * The experimenter configures a simple !OpenFlow controller to pass dataplane traffic between the VMs
 * The experimenter logs into one VM, and begins sending a continuous stream of dataplane traffic

== Step 4: view running VMs ==

=== Overview of Step 4 ===

'''Using:'''
 * On bbn-hn, use SM state, logs, or administrator interfaces to determine:
   * What experiments are running right now
   * How many VMs are allocated for those experiments
   * Which worker node is each VM running on
 * On bbn worker nodes, use system state, logs, or administrative interfaces to determine what VMs are running right now, and look at any available configuration or logs of each.

'''Verify:'''
 * A site administrator can determine what experiments are running on the local SM
 * A site administrator can determine the mapping of VMs to active experiments
 * A site administrator can view some state of running VMs on the VM server

== Step 5: get information about terminated VMs ==

=== Overview of Step 5 ===

'''Using:'''
 * On bbn-hn, use SM state, logs, or administrator interfaces to find evidence of the two terminated experiments.
 * Determine how many other experiments were run in the past day.
 * Determine which GENI user created each of the terminated experiments.
 * Determine the mapping of experiments to VM servers for each of the terminated experiments.
 * Determine the control and dataplane MAC addresses assigned to each VM in each terminated experiment.
 * Determine any IP addresses assigned by ExoGENI to each VM in each terminated experiment.

'''Verify:'''
 * A site administrator can get ownership and resource allocation information for recently-terminated experiments which were created on the local SM.
 * A site administrator can get ownership and resource allocation information for recently-terminated experiments which were created using ExoSM.
 * A site administrator can get information about MAC addresses and IP addresses used by recently-terminated experiments.

== Step 6: get !OpenFlow state information ==

=== Overview of Step 6 ===

'''Using:'''
 * On the 8264 (dataplane) switch, get a list of controllers, and see if any additional controllers are serving experiments.
 * On bbn-hn, get a list of active FV slices from the !FlowVisor
 * On bbn-hn, get a list of active slivers from FOAM
 * On bbn-hn, use FV or FOAM to get a list of the flowspace of a running !OpenFlow experiment.

'''Verify:'''
 * A site administrator can get information about the !OpenFlow resources used by running experiments.
 * No new controllers are added directly to the switch when an !OpenFlow experiment is running.
 * A new slice has been added to the !FlowVisor which points to the experimenter's controller.
 * No new sliver has been added to FOAM.

== Step 7: verify MAC addresses on the rack dataplane switch ==

=== Overview of Step 7 ===

'''Using:'''
 * Establish a privileged login to the 8264 (dataplane) switch
 * Obtain a list of the full MAC address table of the switch
 * On bbn-hn and the worker nodes, use available data sources to determine which host or VM owns each MAC address.

'''Verify:'''
 * It is possible to identify and classify every MAC address visible on the switch

== Step 8: verify active dataplane traffic ==

=== Overview of Step 8 ===

'''Using:'''
 * Establish a privileged login to the 8264 (dataplane) switch
 * Based on the information from Step 7, determine which interfaces are carrying traffic between the experimental VMs
 * Collect interface counters for those interfaces over a period of 10 minutes
 * Estimate the rate at which the experiment is sending traffic

'''Verify:'''
 * The switch reports interface counters, and an administrator can obtain plausible estimates of dataplane traffic quantities by looking at them.
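
The rate estimate in Step 8 comes down to dividing the change in an interface octet counter by the elapsed time, then comparing the result with what the experiment should be sending. The following is a minimal sketch of that arithmetic, not the procedure the test will necessarily use. Both function names are hypothetical, the counter readings are made-up illustrative values, and the 10 Mbit/s comparison assumes Step 3 traffic resembles the iperf runs in Steps 1 and 2, which the plan does not specify. It also assumes the switch counts Ethernet header and FCS bytes in its octet counters and that the counter does not wrap between readings.

```shell
#!/bin/bash
# Average rate in Mbit/s between two readings of an interface octet counter.
# Usage: counter_rate_mbps <octets_at_start> <octets_at_end> <elapsed_seconds>
# Assumption: the counter did not wrap or reset between the two readings.
counter_rate_mbps() {
    awk -v a="$1" -v b="$2" -v secs="$3" 'BEGIN {
        printf "%.2f\n", (b - a) * 8 / secs / 1000000
    }'
}

# Expected on-wire rate in Mbit/s for a stream of fixed-size datagrams.
# Usage: expected_wire_mbps <payload_mbps> <payload_bytes> <overhead_bytes>
expected_wire_mbps() {
    awk -v rate="$1" -v payload="$2" -v overhead="$3" 'BEGIN {
        printf "%.2f\n", rate * (payload + overhead) / payload
    }'
}

# Hypothetical counter readings taken 600 seconds (10 minutes) apart.
counter_rate_mbps 1000000000 1773000000 600
# prints: 10.31

# 46 bytes of overhead per datagram, assuming an untagged frame:
# UDP (8) + IPv4 (20) + Ethernet header (14) + FCS (4).
expected_wire_mbps 10 1470 46
# prints: 10.31
```

If the frames carry an 802.1Q tag on the measured interface, the overhead would be 50 bytes rather than 46, so a small difference between the two numbers is still a plausible result.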