== Overview of the GIMS Capture Daemon ==

The capture node software has two basic components: capture-daemon
and capd_proxy.py. The proxy handles all XML/RPC communication with
the GIMS AM and UI, and manages starting any necessary capture
processes and the resulting capture files.

Specific instructions for installing and running the capture daemon and proxy are given below.

'''capture-daemon'''

----

The capture daemon can be run as a standalone process, but is normally
started by capd_proxy. capd_proxy should run unprivileged, so
capture-daemon should be installed setuid root (or setuid to a
privileged user with appropriate access to network devices via
libpcap) so that it has the permissions needed to open capture
devices. It can be installed anywhere, but should be placed on a file
system with enough space for temporary measurement files.

The capture daemon currently supports the following packet transformations:
* prefix-preserving IP address anonymization
* sampling (1-in-N or uniform probabilistic sampling)
* aggregation into either simple byte/packet counts or flow records. By default, no
aggregation is done and you'll get libpcap files containing raw packet headers.

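The two sampling modes can be illustrated with a short sketch (illustrative Python only, not the daemon's actual implementation):

```python
import random

def one_in_n(packets, n):
    # Deterministic 1-in-N sampling: keep every Nth packet.
    return [p for i, p in enumerate(packets) if i % n == 0]

def uniform_probabilistic(packets, rate, rng=random):
    # Uniform probabilistic sampling: keep each packet independently
    # with probability `rate`.
    return [p for p in packets if rng.random() < rate]

kept = one_in_n(list(range(10)), 2)  # -> [0, 2, 4, 6, 8]
```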
capd_proxy starts a separate capture-daemon process for each active
experiment. While there is some overhead in running multiple capture
processes simultaneously, this architecture simplifies demultiplexing
the appropriate packets for a given experiment. It also provides a
level of isolation among experiments, and allows a separate O&M
capture process to run alongside any active experiment process (e.g.,
in order to capture *all* traffic on the wire).

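In Python terms, the per-experiment process model looks roughly like the sketch below. The `-i`/`-f`/`-w` flags are hypothetical placeholders, not capture-daemon's actual command line:

```python
import subprocess

def start_capture(binary, interface, bpf_filter, outfile):
    # Launch one independent capture process for an experiment.
    # The flags shown are placeholders for the real daemon's options.
    return subprocess.Popen([binary, "-i", interface, "-f", bpf_filter, "-w", outfile])

# One process per active experiment, plus an O&M process capturing everything:
# procs = {eid: start_capture("capture-daemon", "eth0", flt, eid + ".pcap")
#          for eid, flt in [("exp1", "port 80"), ("oam", "")]}
```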
The only software dependency for the basic functionality of
capture-daemon is a working libpcap library.

For flow aggregation into IPFIX records, libyaf and libfixbuf are
required. They are available at
http://tools.netsa.cert.org/index.html. These tools should be
compiled and installed before compiling the capture-daemon code.
They have their own prerequisites, notably glib 2.6.4 or
later. See the yaf/fixbuf documentation for more details.

From inside the capture-daemon subdirectory, run configure and make,
then sudo make install. (make install simply makes the capture-daemon
binary suid root so that it has permission to put a network device
into promiscuous mode. Alternatively, run capd_proxy.py as root.)

'''capd_proxy.py'''

----

Requires Python 2.6.

capd_proxy handles all XML/RPC functions for configuring, starting,
and stopping experiments. There are also interfaces for testing the
storage capabilities of an experiment and for gathering information
on running experiments.

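As a sketch of the XML/RPC interaction, the snippet below runs a stand-in server and calls one method over XML/RPC. The method name `configureExperiment` is hypothetical, and on Python 2.6 the modules are `SimpleXMLRPCServer` and `xmlrpclib` rather than the Python 3 names used here:

```python
import threading
from xmlrpc.server import SimpleXMLRPCServer  # SimpleXMLRPCServer on Python 2
from xmlrpc.client import ServerProxy         # xmlrpclib on Python 2

# A stand-in for the proxy: register one hypothetical method.
server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
server.register_function(lambda eid: "configured " + eid, "configureExperiment")
port = server.server_address[1]

# Serve exactly one request in the background for this demo.
t = threading.Thread(target=server.handle_request)
t.start()

client = ServerProxy("http://127.0.0.1:%d" % port)
result = client.configureExperiment("exp1")
t.join()
server.server_close()
```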
Presently, storage functionality is included for:
* sftp
* Amazon S3

All storage functionality resides in capd_storage.py, which is
designed to make adding new storage repositories relatively easy.

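One common way to structure such pluggable backends is a small registry of backend classes; this is a sketch only, and capd_storage.py's actual class layout may differ:

```python
class StorageBackend(object):
    """Base class for storage repositories; subclass and register to add one."""
    registry = {}

    @classmethod
    def register(cls, name):
        def deco(subcls):
            cls.registry[name] = subcls
            return subcls
        return deco

    def upload(self, local_path):
        raise NotImplementedError

@StorageBackend.register("sftp")
class SftpBackend(StorageBackend):
    def upload(self, local_path):
        return "sftp://host/" + local_path  # placeholder for a real transfer

@StorageBackend.register("s3")
class S3Backend(StorageBackend):
    def upload(self, local_path):
        return "s3://bucket/" + local_path  # placeholder for a real transfer

backend = StorageBackend.registry["s3"]()
```

A new repository type then only needs a subclass and a `register` call; nothing else in the proxy has to change.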
The capture daemon proxy starts a separate process to handle each
storage type (s3, ssh, and local storage). This storage process
checks for new files that can be uploaded, and also annotates the
existing metadata for raw capture files prior to upload. Each
individual file transfer is handled in a transient (Python) thread to
avoid blocking the entire process on a single transfer. (Note that
some work remains to be done on propagating errors from storage
uploads to the UI and experimenters.) Storage functionality is almost
entirely housed in the capd_storage.py module.

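The one-thread-per-transfer pattern can be sketched as follows (illustrative only, not capd_storage.py's actual code):

```python
import threading

def upload_all(paths, do_upload):
    # Spawn a short-lived thread per file so that one slow transfer
    # does not block the others.
    done, lock = [], threading.Lock()

    def worker(path):
        result = do_upload(path)
        with lock:
            done.append(result)

    threads = [threading.Thread(target=worker, args=(p,)) for p in paths]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return done

uploaded = upload_all(["a.pcap", "b.pcap"], lambda p: p)
```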
The software dependencies of capd_proxy are related to its storage
capabilities: the boto Python library is required for s3, and paramiko
and pyCrypto are required for ssh. Each of these libraries is easy to
install (see the depend subdirectory), but capd_proxy will start
successfully even if they are not present; you simply won't have the
affected storage capabilities. (On Debian-based Linuxes, just run:
apt-get install python-paramiko python-crypto python-boto.)

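That graceful degradation amounts to probing for each optional library at startup, roughly like this sketch:

```python
# Probe for optional storage libraries; a missing one simply disables
# the corresponding backend instead of aborting startup.
available = {}
for backend, module in [("s3", "boto"), ("ssh", "paramiko")]:
    try:
        __import__(module)
        available[backend] = True
    except ImportError:
        available[backend] = False
```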
There are a few options for starting capd_proxy.py (python
capd_proxy.py -h will show them). If capture-daemon is suid root or
capd_proxy.py is running as root, you should simply be able to run
"python capd_proxy.py" to get started. The output logging will
immediately show which storage capabilities have been found (via
installed Python libraries). A simple script (runproxy.sh) is
supplied to do a basic startup of the proxy. This causes the proxy to
listen on any locally configured IP address, on port 8001.