Changes between Version 6 and Version 7 of WebServerExample


Timestamp:
12/08/14 17:18:30 (9 years ago)
Author:
sedwards@bbn.com
Comment:

--

  • WebServerExample

    v6 v7  
    1919== Setup ==
    2020
    21  * Start Flack and create a new slice
    22  * Load the rspec from this URL to Flack  [http://www.gpolab.bbn.com/experiment-support/WebServer/websrv.rspec].
    23  * submit for sliver creation (also fine to use omni, if you prefer). Your sliver should look something like this:
     21 * Create a new slice using the resource reservation tool of your choice (Jacks/Portal, Omni, Flack).
     22 * Reserve the RSpec from this URL  [http://www.gpolab.bbn.com/experiment-support/WebServer/websrv.rspec].
     23 * Your sliver should look something like this:
    2424
    2525[[Image(WebsrvExampleSliverJacks.png, 25%)]]
    2626
    2727In this setup, there is one host acting as a web server. To test that the web server is up, visit the web page of the Server host using either of the following techniques:
    28    * Open a web browser and go to the webpage !http://<hostname>. In the above example this would be !http://pc484.emulab.net, or
     28   * Open a web browser and go to the webpage !http://<hostname>. In the above example this would be !http://pcvm5-12.lan.sdn.uky.edu/, or
    2929   * Press the (i) button in Flack and then press the Visit button.
    3030
     
    69692012-07-06 04:59:09 (120 MB/s) - “index.html” saved [548/548]
    7070}}}
    71    '''Note:''' In the above command we used `http://server` instead of `http://pc484.emulab.net` so that we can contact the web server over the private connection we have created, instead of the server's public interface. The private connections are the ones that are represented with lines between hosts in Flack. When you do load testing on your web server, you should run tests from the two client machines in your test configuration, using the `http://server` address, so that you are testing the performance of your server and not your Internet connection to the lab.
     71   '''Note:''' In the above command we used `http://server` instead of `http://pcvm5-12.lan.sdn.uky.edu` so that we can contact the web server over the private connection we have created, instead of the server's public interface. The private connections are the ones that are represented with lines between hosts in Jacks and Flack. When you do load testing on your web server, you should run tests from the two client machines in your test configuration, using the `http://server` address, so that you are testing the performance of your server and not your Internet connection to the lab.
    7272 
    73  * The above command only downloads the `index.html` file from the webserver. As we are going to see later a web page may include other web pages or objects such as images, videos etc. In order to force wget to download all dependencies of a page use the following options :
     73 * The above command only downloads the `index.html` file from the web server. As we will see later, a web page may include other web pages or objects such as images, videos, etc. To force `wget` to download all dependencies of a page, use the following options:
    7474   {{{
    7575[inki@Client1 ~]$ wget -m -p http://server
     
    9090(Type two carriage returns after the "GET" command.)  This will return to you (on the command line) the HTML representing the "front page" of the web server that is running on the `Server` host.
    9191
    92 One of the key things to keep in mind in building your web server is that the server is translating relative filenames (such as index.html ) to absolute filenames in a local filesystem.  For example, you might decide to keep all the files for your server in ~10abc/cs339/server/files/, which we call the document root.  When your server gets a request for index.html (which is the default web page if no file is specified), it will prepend the document root to the specified file and determine if the file exists, and if the proper permissions are set on the file (typically the file has to be world readable).  If the file does not exist, a file not found error is returned.  If a file is present but the proper permissions are not set, a permission denied error is returned.  Otherwise, an HTTP OK message is returned along with the contents of a file.
     92One of the key things to keep in mind in building your web server is that the server is translating relative filenames (such as `index.html`) to absolute filenames in a local filesystem.  For example, you might decide to keep all the files for your server in ~10abc/cs339/server/files/, which we call the document root.  When your server gets a request for `index.html` (which is the default web page if no file is specified), it will prepend the document root to the specified file and determine whether the file exists and whether the proper permissions are set on it (typically the file has to be world readable).  If the file does not exist, a file not found error is returned.  If the file is present but the proper permissions are not set, a permission denied error is returned.  Otherwise, an HTTP OK message is returned along with the contents of the file.
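The translation described above can be sketched in C roughly as follows. This is only an illustration under stated assumptions: the function name `resolve_request`, the use of `stat()` with `S_IROTH` to check for world readability, and the rejection of paths containing `..` are choices made for this sketch and are not part of the assignment text.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical helper: build the absolute filename for a request and
 * return the HTTP status code (200, 403 or 404) the server would send.
 * The resulting filename is written to out. */
int resolve_request(const char *doc_root, const char *req_path,
                    char *out, size_t out_len)
{
    struct stat st;
    int n;

    /* No explicit file requested: fall back to the default page. */
    if (strcmp(req_path, "/") == 0)
        req_path = "/index.html";

    /* Assumption for this sketch: crudely refuse any path containing
     * ".." so a request cannot climb out of the document root. */
    if (req_path[0] != '/' || strstr(req_path, "..") != NULL)
        return 403;

    /* Prepend the document root to the requested file. */
    n = snprintf(out, out_len, "%s%s", doc_root, req_path);
    if (n < 0 || (size_t)n >= out_len)
        return 404;   /* path does not fit; treated here as not found */

    if (stat(out, &st) != 0)
        return errno == EACCES ? 403 : 404;   /* unreachable vs. missing */
    if (!(st.st_mode & S_IROTH))
        return 403;   /* present but not world readable */
    return 200;       /* caller can now open the file and send it */
}
```

A caller would pass the document root and the path taken from the request line, then use the returned status to choose between an error response and sending the contents of the file named in `out`.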
    9393
    9494In our setup we are using the [http://httpd.apache.org/ Apache web server]. The default document root for Apache on a host running Fedora 10 is under `/var/www/html`.
     
    9696  * Run
    9797  {{{
    98 [inki@server ~]$ ls /var/www/html/
     98[inki@server ~]$ ls /var/www/
    9999  }}}
    100100  This should give you a similar structure to the directory structure you got when you downloaded the whole site with wget on the previous steps.
    101101
    102 You should also note that since index.html is the default file, web servers typically translate "GET /" to "GET /index.html".  That way index.html is assumed to be the filename if no explicit filename is present.  This is also why the two URLs http://server (or http://pc484.emulab.net) and http://server/index.html  (or http://pc484.emulab.net/index.html) return equivalent results.
     102You should also note that since `index.html` is the default file, web servers typically translate "GET /" to "GET /index.html".  That way `index.html` is assumed to be the filename if no explicit filename is present.  This is also why the two URLs `http://server` (or `http://pcvm5-12.lan.sdn.uky.edu`) and `http://server/index.html` (or `http://pcvm5-12.lan.sdn.uky.edu/index.html`) return equivalent results.
    103103
    104 When you type a URL into a web browser, the server retrieves the contents of the requested file.  If the file is of type text/html and HTTP/1.0 is being used, the browser will parse the html for embedded links (such as images) and then make separate connections to the web server to retrieve the embedded files.  If a web page contains 4 images, a total of five separate connections will be made to the web server to retrieve the html and the four image files.
     104When you type a URL into a web browser, the server retrieves the contents of the requested file.  If the file is of type `text/html` and HTTP/1.0 is being used, the browser will parse the html for embedded links (such as images) and then make separate connections to the web server to retrieve the embedded files.  If a web page contains 4 images, a total of five separate connections will be made to the web server to retrieve the html and the four image files.
    105105
    106106Using HTTP/1.0, a separate connection is used for each requested file. This implies that the TCP connections being used never get out of the slow start phase. HTTP/1.1 attempts to address this limitation. When using HTTP/1.1, the server keeps connections to clients open, allowing for "persistent" connections and pipelining of client requests. That is, after the results of a single request are returned (e.g., index.html), the server should by default leave the connection open for some period of time, allowing the client to reuse that connection to make subsequent requests. One key issue here is determining how long to keep the connection open. This timeout needs to be configured in the server and ideally should be dynamic based on the number of other active connections the server is currently supporting. Thus if the server is idle, it can afford to leave the connection open for a relatively long period of time. If the server is busy servicing several clients at once, it may not be able to afford to have an idle connection sitting around (consuming kernel/thread resources) for very long. You should develop a simple heuristic to determine this timeout in your server.
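One possible shape for such a heuristic is a timeout that shrinks linearly as the number of active connections grows. The sketch below is only an example: the function name `keepalive_timeout` and all three constants are assumptions chosen for illustration, not values prescribed by the assignment.

```c
/* Illustrative constants (assumptions, tune for your own server). */
#define MAX_TIMEOUT_SEC 30   /* idle timeout when the server has no load */
#define MIN_TIMEOUT_SEC 1    /* idle timeout when the server is saturated */
#define MAX_CONNECTIONS 64   /* assumed capacity of the server */

/* Hypothetical heuristic: return how long (in seconds) an idle
 * persistent connection may stay open, given the current load. */
int keepalive_timeout(int active_connections)
{
    int span = MAX_TIMEOUT_SEC - MIN_TIMEOUT_SEC;

    if (active_connections <= 0)
        return MAX_TIMEOUT_SEC;
    if (active_connections >= MAX_CONNECTIONS)
        return MIN_TIMEOUT_SEC;

    /* Linear interpolation between the idle and saturated extremes. */
    return MAX_TIMEOUT_SEC - (span * active_connections) / MAX_CONNECTIONS;
}
```

With these constants an idle server keeps a connection open for 30 seconds, a half-loaded one for 16 seconds, and a saturated one for 1 second. Other curves (for example, exponential decay) would fit the description equally well.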
     
    125125 3. An event-driven architecture will keep a list of active connections and loop over them, performing a little bit of work on behalf of each connection.  For example, there might be a loop that first checks to see if any new connections are pending to the server (performing appropriate bookkeeping if so), and then it will loop over all existing client connections and send a "block" of file data to each (e.g., 4096 bytes, or 8192 bytes, matching the granularity of disk block size).  This event-driven architecture has the primary advantage of avoiding any synchronization issues associated with a multi-threaded model (though synchronization effects should be limited in your simple web server) and avoids the performance overhead of context switching among a number of threads.
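The per-connection half of the event-driven loop in item 3 might look like the following sketch. It covers only the "send one block to each connection" pass; a real server would first use `select()` to find writable descriptors and `accept()` to pick up new clients. The `struct conn` layout, the name `pump_connections`, and the 4096-byte block size are assumptions made for this illustration.

```c
#include <stddef.h>
#include <unistd.h>

#define BLOCK_SIZE 4096   /* assumed block size, as in the example above */

/* Hypothetical bookkeeping record for one client connection. */
struct conn {
    int fd;              /* client descriptor, or -1 if the slot is free */
    const char *data;    /* next byte of file data to send */
    size_t remaining;    /* bytes still to send */
};

/* One pass over the connection table: send at most one block on each
 * active connection. Returns how many connections still have data left. */
int pump_connections(struct conn *conns, int n)
{
    int active = 0;

    for (int i = 0; i < n; i++) {
        struct conn *c = &conns[i];
        size_t want;
        ssize_t sent;

        if (c->fd < 0)
            continue;                 /* unused slot */

        want = c->remaining < BLOCK_SIZE ? c->remaining : BLOCK_SIZE;
        sent = write(c->fd, c->data, want);
        if (sent < 0) {               /* write error: drop the client */
            close(c->fd);
            c->fd = -1;
            continue;
        }

        c->data += sent;
        c->remaining -= (size_t)sent;
        if (c->remaining == 0) {      /* transfer finished: free the slot */
            close(c->fd);
            c->fd = -1;
        } else {
            active++;
        }
    }
    return active;
}
```

Because every connection is handled by the same thread, no locking is needed around the connection table, which is the advantage the item above describes. The sketch closes the connection once the file is sent, which matches HTTP/1.0 behavior; a persistent HTTP/1.1 connection would instead be kept open until its idle timeout expires.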
    126126
    127 You may choose from C or C++ to build your web server but you must do it in Linux (although the code should run on any Unix system).  In C/C++, you will want to become familiar with the interactions of the following system calls to build your system: socket(), select(), listen(), accept(), connect() .  We outline a number of resources below with additional information on these system calls.  A good book is also available on this topic (there is a reference copy of this in the lab).
     127You may choose from C or C++ to build your web server, but you must do it in Linux (although the code should run on any Unix system).  In C/C++, you will want to become familiar with the interactions of the following system calls to build your system: socket(), select(), listen(), accept(), connect().  We outline a number of resources below with additional information on these system calls.  A good book is also available on this topic.
    128128
    129129== What to hand in ==