wiki:OMF5.3ExperimentAnalysis

Version 2 (modified by hmussman@bbn.com, 8 years ago)

--

11. Processing and plotting measurement data

The results collected in each run of the experiment are stored in a SQLite3 (*.sq3) file. We have created scripts that analyze the collected results and write CSV files based on the analysis. These CSV files are then fed into R scripts that plot graphs. The whole process is automated. These scripts are available at http://software.geni.net/local-sw/

Processing can be done on any Linux server, mobile node or another server

The post-processing scripts run on any flavor of Linux. This machine can be an experimenter's laptop or the central OML server.

Measurement data is retrieved from the OML server SQL database; how?

Measurement data is processed with a script

The code below is a sample Ruby script that analyzes the results of a set of "udp-dual" experiments. It assumes that all the relevant sq3 files are in the "basedir" directory and are named according to the pattern point#{i}_*.sq3, where i is a number (index) indicating the position of the measurement.

The script takes as input a text file listing the points of interest (one position index per line) and prints CSV output in which each line contains point#,avgBW,stdBW,avgRSSI,stdRSSI,avgCINR,stdCINR, that is, the average and standard error of the bandwidth, RSSI, and CINR for each point, based on the experiments performed at that point. Note that this analysis considers only the downlink throughput in the dual-mode test, for which the data is collected on the mobile client. For uplink throughput, you will have to look at the data collected on the server side; a very similar analysis can be performed on the sq3 files collected there. The downlink connection ID can be determined from the local_address and local_port to which the connection is made.
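Separately from the full script listed further below, the input list and the file naming convention can be tried out in isolation. The following is a minimal sketch, not part of the original scripts: the helper names read_points and files_for_point, the file name points.txt, and the run1/run2 suffixes are assumptions made for illustration. It uses only the Ruby standard library and creates empty placeholder files in a temporary directory.

```ruby
require 'tmpdir'
require 'fileutils'

# Read one position index per line, skipping blank lines.
# (Hypothetical helper; the full script uses line.to_i on every line.)
def read_points(path)
  File.readlines(path).map(&:strip).reject(&:empty?).map(&:to_i)
end

# List the sq3 files recorded for point i, following the
# point#{i}_*.sq3 naming pattern described above.
def files_for_point(basedir, i)
  Dir[File.join(basedir, "point#{i}_*.sq3")].sort
end

Dir.mktmpdir do |dir|
  # Made-up file names that follow the naming pattern.
  %w[point1_run1.sq3 point1_run2.sq3 point2_run1.sq3].each do |name|
    FileUtils.touch(File.join(dir, name))
  end

  # Made-up input list: one position index per line.
  list = File.join(dir, 'points.txt')
  File.write(list, "1\n2\n3\n")

  read_points(list).each do |i|
    puts "point #{i}: #{files_for_point(dir, i).length} file(s)"
  end
end
```

Because the pattern includes the underscore after the index, point1 does not pick up files that belong to point10.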

#!/usr/bin/ruby
#----------------------------------------------------------------------
# Copyright (c) 2011 Raytheon BBN Technologies
#
# Permission is hereby granted, free of charge, to any person obtaining
# a copy of this software and/or hardware specification (the "Work") to
# deal in the Work without restriction, including without limitation the
# rights to use, copy, modify, merge, publish, distribute, sublicense,
# and/or sell copies of the Work, and to permit persons to whom the Work
# is furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be
# included in all copies or substantial portions of the Work.
#
# THE WORK IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
# OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
# MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
# HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
# WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE WORK OR THE USE OR OTHER DEALINGS
# IN THE WORK.
#----------------------------------------------------------------------

require 'rubygems'
require 'sqlite3'

######### Analysis File Dir#############

# MAKE SURE THESE PARAMETERS ARE RIGHT #

basedir = './dual_udp'
local_server_addr='192.1.240.106'
local_server_port='5001'
foreign_server_addr='192.1.240.126'
foreign_server_port='5001'

########################################


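module Enumerable

	# sum of all elements (0 for an empty collection)
	def sum
		return self.inject(0) { |acc,i| acc + i }
	end

	# arithmetic mean (NaN for an empty collection)
	def average
		return self.sum/self.count.to_f
	end

	# note: despite its name, this divides by n rather than n-1
	def sample_variance
		avg=self.average
		sum=self.inject(0) { |acc,i| acc + (i-avg)**2 }
		return(1/self.count.to_f*sum)
	end

	def standard_deviation
		return Math.sqrt(self.sample_variance)
	end

	# standard error of the mean: standard deviation divided by sqrt(n)
	def standard_error
		return self.standard_deviation/Math.sqrt(self.count.to_f)
	end

end #module Enumerable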

##################### Analysis starts here ######################

# read path points from the input_file. Note that input_file must 
# have a valid point number and only a valid point number in each
# line.
#################################################################


unless ARGV.length == 1
	puts "Usage: new_clientside_analyze_dual_udp input_file"
	puts "Note: The input_file must have a valid point number"
	puts "      and only a valid point number in each line."
	exit
end

# read path_points from the input_file
input_file = ARGV[0]
if File.exist?(input_file)
	path_points=[]
	# File.foreach closes the file once all lines have been read
	File.foreach(input_file) { |line|
		path_points.push line.to_i
	}
	#puts path_points.inspect

	# uncomment if you're interested in having headers
	#puts "pos#,avgBW,stdBW,avgRSSI,stdRSSI,avgCINR,stdCINR"

	# for each point on the path print out the measurements:
	for i in path_points

		avgBandwidth = Array.new
		avgRSSI = Array.new
		avgCINR = Array.new


		# get corresponding sqlite3 files
		files = Dir["#{basedir}/point#{i}_*.sq3"]
		if files.empty? then
			#puts "No proper files found for i=#{i}"
			puts "#{i},0.0,0.0,0.0,0.0,0.0,0.0"
			next
		end
		

		# each file has the average throughput as its last entry. get that.
		files.each do |f|
			db = SQLite3::Database.new(f)

			# throughput measurements of both directions are pushed into the same table.
			# they can be distinguished by their different connection ids.
			# connection ids can be found in a separate table.
		
			# first find the downlink and uplink connection ids.
			
			#uplink_ID_query = "select connection_id from iperf_connection where foreign_address='#{foreign_server_addr}' and foreign_port='#{foreign_server_port}' ;"
			
			#uplink_ID   = db.get_first_value(uplink_ID_query);

			downlink_conID_query = "select connection_id from iperf_connection where local_address='#{local_server_addr}' and local_port='#{local_server_port}';"			
			downlink_conID = db.get_first_value(downlink_conID_query);
			#puts downlink_conID;
				
			# now calculate the downlink bandwidth
			downlink_result = db.get_first_value( "select size*8/(1024*(end_interval-begin_interval)) from iperf_transfer where (end_interval-begin_interval > 1) and connection_id='#{downlink_conID}' order by oml_seq desc limit 1;" )   

			# note that if the file exists but query result is nil, throughput is 0.0
			if downlink_result.nil? then
				#puts "Query result for (Bandwidth) is nil. using 0.0 for i=#{i}"
				# this could be due to the problem with the last element not being the average (iperf issue).
				# note that this fallback query is not restricted to the downlink connection id.
				downlink_result2 = db.get_first_value( "select sum(size)*8/(1024*End_interval) from iperf_transfer" )
				#puts "for i=#{i} , result2 is #{downlink_result2}"
				if downlink_result2.nil? or (downlink_result2<0) then
					avgBandwidth << 0.0
				else
					avgBandwidth << downlink_result2
				end
			else
				avgBandwidth << downlink_result
			end


			# now let's get average/stdD RSSI
			result = db.get_first_value( "select avg(RSSI) from wimaxcu_wimaxstat" )
			if result.nil? then
				#puts "Query result for (RSSI) is nil. using 0.0 for i=#{i}"
				avgRSSI << 0.0
			else
				avgRSSI << result
			end

			# now let's get average/stdD CINR
			result = db.get_first_value( "select avg(CINR) from wimaxcu_wimaxstat" )
			if result.nil? then
				#puts "Query result is nil. using 0.0 for i=#{i}"
				avgCINR << 0.0
			else
				avgCINR << result
			end

			# release the database handle before moving on to the next file
			db.close
		end
		puts "#{i},#{avgBandwidth.average},#{avgBandwidth.standard_error},#{avgRSSI.average},#{avgRSSI.standard_error},#{avgCINR.average},#{avgCINR.standard_error}"
	end # end for each point

else
	abort("Invalid input_file name.")		
end	
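The statistics that the script reports for each point (average and standard error) can be checked by hand with a small standalone sketch. This is an illustration only and not part of the original scripts: the method names mean, std_dev and std_error are hypothetical, and the throughput values are made up. It keeps the script's convention of dividing by n when computing the variance.

```ruby
# Arithmetic mean of an array of numbers.
def mean(values)
  values.sum / values.length.to_f
end

# Standard deviation with a 1/n divisor, matching the script's sample_variance.
def std_dev(values)
  avg = mean(values)
  Math.sqrt(values.sum { |v| (v - avg)**2 } / values.length.to_f)
end

# Standard error of the mean: standard deviation divided by sqrt(n).
def std_error(values)
  std_dev(values) / Math.sqrt(values.length)
end

# Made-up throughput values (Kb/s) standing in for eight runs at one point.
runs = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
puts format('mean=%.3f sd=%.3f se=%.3f', mean(runs), std_dev(runs), std_error(runs))
# prints mean=5.000 sd=2.000 se=0.707
```

With a single run at a point, the standard deviation and therefore the standard error are 0, so a zero in the stdBW column does not by itself indicate a stable measurement.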

The CSV file created by the analysis script can be fed into R scripts that plot the extracted data. The following is an example of an R script that plots throughput data:

#----------------------------------------------------------------------
# Copyright (c) 2011 Raytheon BBN Technologies
#
# Permission is hereby granted, free of charge, to any person obtaining
# a copy of this software and/or hardware specification (the "Work") to
# deal in the Work without restriction, including without limitation the
# rights to use, copy, modify, merge, publish, distribute, sublicense,
# and/or sell copies of the Work, and to permit persons to whom the Work
# is furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be
# included in all copies or substantial portions of the Work.
#
# THE WORK IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
# OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
# MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
# HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
# WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE WORK OR THE USE OR OTHER DEALINGS
# IN THE WORK.
#----------------------------------------------------------------------

#! /usr/local/bin/Rscript

##########################################################################
#	This is the R script to plot dual udp 'throughput' data         	 #
##########################################################################


# get input/output files as arguments
args <- commandArgs(TRUE);

print (args);
infile <- args[[1]];
outfile <- args[[2]];

print(infile);
print(outfile);

# skip list (puts annotation markers for points in this list). This is to mark points where the node could not connect or no experiment was done.
# skip_note_y : y coordinate of where to put the marker. This should be manually modified based on the range of y axis.
skip_list = c();
skip_note_y = c(1060);

## setup the device
pdf(outfile, width=4, height=3, pointsize=10);
oldpar <- par(font.lab=2, font.axis=2,
		mar = c(4,4,.5,.5),
		oma = c(0,0,0,0),
		mgp = c(2.5,1,0)
	     );

## read in the data
data <- read.table(infile, header=FALSE, sep=",");
locations <- data[[1]];
means.throughput <- data[[2]];

#stderr.throughput <- data[[3]];
names(means.throughput) <- locations

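# load the gplots library (it provides plotCI for drawing error bars); the plot below shows the means only, since stderr.throughput is commented out above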
library(gplots);
plot(x=means.throughput, pch=20, lty=1, xaxt="n",  ylim=c(0,max(means.throughput)*1.5), gap=0, ylab="Avg. Downlink Throughput (Kb/s)", xlab="Location" );

# turn on grids
abline(h=seq(0,max(means.throughput)*1.5,by=1000),lty=2, col="gray");

# Put markers for skip_list
for (i in locations) {
	if (i %in% skip_list)
	text(which(locations==i,arr.ind=TRUE),skip_note_y,labels="*",col="red");
}

# Draw the x-axis (suppressed above with xaxt="n")
axis(side=1, at=seq(1,length(locations), by=1), labels=paste("P",names(means.throughput),sep=""), cex=1.0);

You can automate the process of graph creation by writing a bash script (make_plots.sh) similar to the following and running it in a shell:

#----------------------------------------------------------------------
# Copyright (c) 2011 Raytheon BBN Technologies
#
# Permission is hereby granted, free of charge, to any person obtaining
# a copy of this software and/or hardware specification (the "Work") to
# deal in the Work without restriction, including without limitation the
# rights to use, copy, modify, merge, publish, distribute, sublicense,
# and/or sell copies of the Work, and to permit persons to whom the Work
# is furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be
# included in all copies or substantial portions of the Work.
#
# THE WORK IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
# OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
# MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
# HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
# WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE WORK OR THE USE OR OTHER DEALINGS
# IN THE WORK.
#----------------------------------------------------------------------

#!/bin/bash

./new_clientside_analyze_dual_udp.rb ptest > text/ptest_dual_udp_log.txt

R --no-save --no-restore --no-environ --args text/ptest_dual_udp_log.txt figures/dual/ptest_udp_dual_throughput.pdf < Rscripts/plot_udp_dual_throughput.R

The first command runs the analysis script to pull data out of the sq3 files and write the CSV output to 'text/ptest_dual_udp_log.txt'. The file "ptest" contains EXACTLY two lines, one with the number 1 and the other with the number 2, which are the point indices. The next command calls the R script, which reads that text file and creates a PDF figure containing the throughput data. Similar R scripts, called in the same way, can be used for plotting RSSI and CINR.
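Before plotting, it can be useful to inspect the generated text file. The following is a minimal sketch, assuming the seven-column layout printed by the analysis script (point#,avgBW,stdBW,avgRSSI,stdRSSI,avgCINR,stdCINR) and no header line. The names FIELDS, parse_result_line and no_data? are hypothetical, and the values in the first sample line are made up. The second sample line is the all-zero placeholder that the analysis script prints when it finds no sq3 files for a point.

```ruby
# Column names in the order printed by the analysis script.
FIELDS = %i[pos avg_bw std_bw avg_rssi std_rssi avg_cinr std_cinr].freeze

# Turn one CSV line into a hash keyed by column name.
def parse_result_line(line)
  values = line.strip.split(',')
  raise ArgumentError, "expected #{FIELDS.length} fields" unless values.length == FIELDS.length

  FIELDS.zip(values).map do |key, value|
    [key, key == :pos ? Integer(value) : Float(value)]
  end.to_h
end

# True when every measurement column is 0.0, as in the placeholder line
# the analysis script prints for a point with no sq3 files.
def no_data?(row)
  row.reject { |key, _| key == :pos }.values.all?(&:zero?)
end

sample = [
  '1,5120.5,210.25,-61.5,0.4,24.0,0.3', # made-up values for illustration
  '2,0.0,0.0,0.0,0.0,0.0,0.0'           # placeholder for a point without files
]

sample.each do |line|
  row = parse_result_line(line)
  status = no_data?(row) ? 'no data' : "avgBW=#{row[:avg_bw]} Kb/s"
  puts "P#{row[:pos]}: #{status}"
end
```

Points flagged this way are candidates for the skip_list in the R script, which marks locations where the node could not connect or no experiment was done.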