== Overview of the GIMS Capture Daemon ==

There are two basic components to the capture node software:
capture-daemon and capd_proxy.py.  The proxy handles all XML/RPC
communication with the GIMS AM and UI, and manages starting any
necessary capture processes and the resulting capture files.

Below, we describe how to install these components and get the
capture daemon proxy running.

'''capture-daemon'''

----

The capture daemon can be run as a standalone process, but is normally
started by capd_proxy.  capd_proxy should run unprivileged, so
capture-daemon should be installed setuid root (or as a privileged
user with appropriate access to network devices via libpcap) in order
for it to have sufficient permissions to open capture devices, etc.
It can be installed anywhere, but should be placed on a file system
with enough space for temporary measurement files to reside.

The present capabilities of the capture daemon with respect to packet
transformations are:
 * prefix-preserving IP address anonymization
 * sampling (1-in-N or uniform probabilistic sampling; a short sketch of the two disciplines follows this list)
 * aggregation into either simple byte/packet counts or flow records.  By default, no
   aggregation is done and you'll get libpcap files with raw packet headers.
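
To make the sampling options concrete, here is a rough sketch of the two
disciplines in Python (capture-daemon itself implements sampling internally;
the function names below are purely illustrative):

{{{
#!python
# Illustrative sketch of the two sampling disciplines; not capture-daemon's
# actual implementation.
import random

def one_in_n(packets, n):
    """Deterministic 1-in-N sampling: keep every Nth packet."""
    return [pkt for i, pkt in enumerate(packets) if i % n == 0]

def uniform_probabilistic(packets, p):
    """Uniform probabilistic sampling: keep each packet independently
    with probability p."""
    return [pkt for pkt in packets if random.random() < p]
}}}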

Separate capture-daemon processes will be started by capd_proxy for
each active experiment.  While there is some overhead involved with
having multiple capture processes running at the same time, this
architecture simplifies the problem of demultiplexing the appropriate
packets for a given experiment.  It also provides a level of isolation
among experiments, and enables a separate O&M capture process to run
alongside any active experiment process (e.g., in order to capture
*all* traffic on the wire).
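
For intuition, a minimal sketch of this process-per-experiment model as the
proxy might drive it (the binary path and command-line flags here are
assumptions for illustration, not capture-daemon's actual interface):

{{{
#!python
# Hypothetical sketch: one capture-daemon child process per active experiment.
import subprocess

CAPTURE_DAEMON = "./capture-daemon"   # assumed location of the binary

def start_capture(name, device, bpf_filter, outfile):
    """Launch a dedicated capture process; flags are illustrative only."""
    cmd = [CAPTURE_DAEMON, "-i", device, "-f", bpf_filter, "-w", outfile]
    return subprocess.Popen(cmd)

# e.g., two experiments plus an O&M process capturing all traffic:
#   start_capture("exp1", "eth0", "tcp port 80", "exp1.pcap")
#   start_capture("exp2", "eth0", "udp", "exp2.pcap")
#   start_capture("oam",  "eth0", "", "oam.pcap")
}}}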

The only software dependency for the basic functionality of
capture-daemon is a working libpcap library.

For flow aggregation in IPFIX records, libyaf and libfixbuf are
required.  They are available at
http://tools.netsa.cert.org/index.html.  These tools should be
compiled and installed prior to compiling the capture-daemon code.
They have their own set of prerequisites, notably glib 2.6.4 or
better.  See the yaf/fixbuf documentation for more details.

From inside the capture-daemon subdirectory, run configure and make,
then sudo make install, as shown below.  (make install simply marks
the capture-daemon binary setuid root so that it has permission to put
a network device into promiscuous mode.  Alternatively, run
capd_proxy.py as root.)
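
In other words (assuming the standard autoconf-style build described above):

{{{
cd capture-daemon
./configure
make
sudo make install    # marks the capture-daemon binary setuid root
}}}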

'''capd_proxy.py'''

----

Requires Python 2.6.

The capd_proxy handles all XML/RPC functions for configuring,
starting, and stopping experiments.  There are also interfaces for
testing storage capabilities for an experiment, and for gathering
information on running experiments.
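
To give a sense of how a client interacts with the proxy, a minimal XML/RPC
call might look like the following.  The method name and arguments are
hypothetical (consult the proxy code for the real interface); the port number
matches the default of 8001 mentioned below.

{{{
#!python
# Hypothetical client sketch; the method name and arguments are assumptions,
# not the proxy's documented API.
import xmlrpclib

proxy = xmlrpclib.ServerProxy("http://capture-node.example.org:8001/")

# Something along these lines (hypothetical):
# proxy.startExperiment("exp1", {"device": "eth0", "filter": "tcp port 80"})
}}}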

Presently, storage functionality is included for:
 * SFTP
 * Amazon S3

All storage functionality resides in capd_storage.py.  It is designed
so that new storage repositories can be added relatively easily.
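
To give a flavor of what adding a repository might involve, here is a purely
hypothetical sketch of a pluggable backend; the class and method names are
invented and may not match capd_storage.py's actual interface.

{{{
#!python
# Hypothetical sketch of a pluggable storage backend; not capd_storage.py's
# actual interface.
class StorageBackend(object):
    def check_for_new_files(self, directory):
        """Return a list of capture files that are ready for upload."""
        raise NotImplementedError

    def upload(self, path, metadata):
        """Push one capture file (plus its metadata) to the repository."""
        raise NotImplementedError

class SftpBackend(StorageBackend):
    pass    # would be implemented with paramiko

class S3Backend(StorageBackend):
    pass    # would be implemented with boto
}}}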

The capture daemon proxy starts a separate process to handle each
storage type (S3, SSH, and local storage).  This storage process
checks for new files that can be uploaded, and also annotates the
existing metadata for raw capture files prior to upload.  Each
individual file transfer is handled in a transient (Python) thread in
order to avoid blocking the entire process on a single transfer.
(Note that some work remains to be done on propagating errors from
storage uploads to the UI and experimenters.)  Storage functionality
is almost entirely housed in the capd_storage.py module.
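
The thread-per-transfer idea is roughly the following (a simplified sketch;
upload_one_file() is a stand-in for the real backend-specific transfer code,
and the real module also handles metadata annotation and error reporting):

{{{
#!python
# Simplified sketch of the thread-per-transfer pattern described above.
import threading

def upload_one_file(path):
    # ... backend-specific transfer (SFTP, S3, or local copy) would go here ...
    pass

def upload_all(ready_files):
    """Hand each transfer to a short-lived thread so that one slow upload
    does not block the entire storage process."""
    threads = []
    for path in ready_files:
        t = threading.Thread(target=upload_one_file, args=(path,))
        t.start()
        threads.append(t)
    for t in threads:
        t.join()
}}}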

The software dependencies for capd_proxy are related to the storage
capabilities: the boto Python library is required for S3, and paramiko
and pyCrypto are required for SSH.  Each of these libraries is easy to
install (see the depend subdirectory), but capd_proxy will start up
successfully even if they are not present (you simply won't have the
affected storage capabilities).  (On Debian-based systems, just run:
apt-get install python-paramiko python-crypto python-boto.)
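
The "start up even if a library is missing" behavior amounts to the usual
optional-import pattern, roughly as follows (a sketch, not capd_proxy's
actual code):

{{{
#!python
# Sketch of the optional-dependency pattern described above.
try:
    import boto                  # needed only for S3 storage
    HAVE_S3 = True
except ImportError:
    HAVE_S3 = False

try:
    import paramiko              # needed only for SSH/SFTP storage
    HAVE_SSH = True
except ImportError:
    HAVE_SSH = False
}}}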

There are a few options for starting capd_proxy.py (python
capd_proxy.py -h will show them).  If capture-daemon is suid root, or
if capd_proxy.py is run as root, you should simply be able to run
"python capd_proxy.py" to get started.  The output logging will
immediately show which storage capabilities have been found (via
installed Python libraries).  A simple script (runproxy.sh) is
supplied to do a basic startup of the proxy; it causes the proxy to
listen on all locally configured IP addresses on port 8001.
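
In summary, typical startup looks like the following (assuming runproxy.sh is
invoked directly from the source directory):

{{{
python capd_proxy.py -h    # list the available startup options
python capd_proxy.py       # basic startup (suid-root capture-daemon, or run as root)
./runproxy.sh              # convenience script: listen on all local addresses, port 8001
}}}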