OpenGENI GRAM Installation Guide
Introduction
This document describes the procedures and context for installing the GRAM software. The following aspects are covered individually:
- Configuration Overview
- Hardware Requirements
- Software Requirements
- Network configuration
- OpenStack Installation and Configuration
- GRAM Installation and Configuration
Hardware Requirements
The minimum requirements are:
- 1 Control Server
- 1 Compute Server (Can be more)
- 1 Switch with at least (number of servers)*3 ports [For non-dataplane traffic]
- 1 OpenFlow Switch with at least (number of servers) ports [For dataplane traffic]
- Each server should have at least 4 Ethernet ports
- Each server should have internet connectivity for downloading packages
Software Requirements
Packages
The following Debian packages are required on the controller node
- git
Ports
The following ports are used by GRAM components. Verify that these ports are not already in use; if any are, change the configuration of the corresponding GRAM component to use a different port. (A quick way to check is shown after the list.)
- Controller node
- 8000: GRAM Clearinghouse (Unless you are using a different clearinghouse). See this section to change this port.
- 8001: GRAM Aggregate Manager. See this section to change this port.
- 9000: VMOC Default Controller
- 7001: VMOC Management. See this section to change this port.
- 6633: VMOC
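A simple way to verify that none of these ports are already taken on the controller node is to list listening sockets before installing. This check is only a suggestion and is not part of the GRAM install itself:

# List listening TCP sockets and look for the GRAM/VMOC ports
sudo netstat -lnt | egrep ':(8000|8001|9000|7001|6633) '
# No output means the ports are free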
OpenStack Requirements
- This guide was written for Ubuntu 12.04
- All dependencies will be downloaded from the Ubuntu repositories.
Image requirements
- Currently, nova images must meet the following requirements for GRAM:
- Must have the following packages installed:
- cloud-utils
- openssh-server
- bash
- apt
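One way to confirm that a candidate image satisfies these requirements (a suggested spot check only; how you inspect the image depends on how it was built) is to boot a VM from it and query the package manager:

# Run inside a VM booted from the candidate image; each package should show as installed (ii)
dpkg -l cloud-utils openssh-server bash apt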
Configuration Overview
OpenStack and GRAM present software layers on top of rack hardware. Rack nodes are expected to fall into two categories:
- Controller Node: The central management and coordination point for OpenStack and GRAM operations and services. There is one of these per rack.
- Compute Node: The resource from which VMs and network connections are sliced and allocated on request. There are many of these per rack.
OpenStack and GRAM require establishing four distinct networks among the different nodes of the rack:
- Control Network: The network over which OpenFlow and GRAM commands flow between control and compute nodes. This network is NOT OpenFlow controlled and has internal IP addresses for all nodes.
- Data Network: The allocated network and associated interfaces between created VM's representing the requested compute/network resource topology. This network IS OpenFlow controlled.
- External Network: The network connecting the controller node to the external internet. The compute nodes may or may not also have externally visible addresses on this network, for convenience.
- Management Network: Enables SSH entry into and between created VM's. This network is NOT OpenFlow controlled and has internal IP addresses for all nodes.
The mapping of the networks to interfaces is arbitrary and can be changed by the installer. For this document we assume the following convention:
- eth0: Control network
- eth1: Data network
- eth2: External network
- eth3: Management network
The Controller node will have four interfaces, one for each of the above networks. The Compute nodes will have three (Control, Data and Management), with a fourth (External) optional.
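For illustration only, under this convention the controller's /etc/network/interfaces would end up looking roughly like the sketch below. All addresses here are placeholders; the GRAM install scripts generate the real configuration from the values in /etc/gram/config.json:

# Control network (private, not OpenFlow controlled)
auto eth0
iface eth0 inet static
    address 10.10.8.100
    netmask 255.255.255.0

# Data network (OpenFlow controlled)
auto eth1
iface eth1 inet static
    address 10.10.9.100
    netmask 255.255.255.0

# External network (publicly routable)
auto eth2
iface eth2 inet static
    address 128.89.72.112
    netmask 255.255.255.0
    gateway 128.89.72.1

# Management network (private, not OpenFlow controlled)
auto eth3
iface eth3 inet static
    address 10.10.10.100
    netmask 255.255.255.0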
More details on the network configuration are provided in wiki:"GENIRacksHome/OpenGENIRacks/ArchitectureDescription".
Network Configuration
The first step of OpenStack/GRAM configuration is establishing the networks described above.
We need to define a range of VLANs for the data network (say, 1000-2000) and separate VLANs for the external, control, and management networks (say, 5, 6, and 7) on the management switch. The external and control network ports should be configured untagged and the management port should be configured tagged.
The Control, External and Management networks are connected between the rack management switch and ethernet interfaces on the Controller or Compute nodes.
The Data network is connected between the rack OpenFlow switch and an ethernet interface on the Control and Compute nodes.
OpenFlow Switch for the Data Network
The ports on the OpenFlow switch to which data network interfaces have been connected need to be configured to trunk the VLANs of the data network. How this is done varies from switch to switch but typical commands look something like
conf t
vlan <vlanid>
tagged <ports>
exit
exit
write memory
On the OpenFlow switch, for each VLAN used in the data network (1000-2000), set the controller to point to the VMOC running on the control node. The command will vary from switch to switch but this is typical:
conf t
vlan <vlanid>
openflow controller tcp:<controller_addr>:6633 enable
openflow fail-secure on
exit
exit
write memory
For the Dell Force10 switch, the following lines set up the VLAN trunking on the data network and point the default OpenFlow controller at the VMOC.
!
interface Vlan 1001
 of-instance 1
 no ip address
 tagged TenGigabitEthernet 0/0-2
 no shutdown
!
.........
!
openflow of-instance 1
 controller 1 128.89.72.112 tcp
 flow-map l2 enable
 flow-map l3 enable
 interface-type vlan
 multiple-fwd-table enable
 no shutdown
!
The above snippet assumes that the controller node, running VMOC, is at 128.89.72.112
For a sample configuration file for the Dell Force10, see attachment:force10-running
Management Switch
The ports on the management switch to which management network interfaces have been connected need to be configured to trunk the VLAN of the management network. How this is done varies from switch to switch, but typical commands look something like:
conf t
int gi0/1/3
switchport mode trunk
switchport trunk native vlan 1
switchport trunk allowed vlan add 7
no shutdown
end
write memory
Here is a config file for a Dell Powerconnect 7048: attachment:powerconnect-running. We use VLAN 200, 300 and 2500 for the control plane, management plane and external network respectively.
GRAM and OpenStack Installation and Configuration
GRAM provides a custom installation script for installing and configuring OpenStack/Folsom particularly for GRAM requirements as well as GRAM itself.
- Install fresh Ubuntu 12.04 image on control and N compute nodes
- From among the rack nodes, select one to be the 'control' node and the others to be compute nodes. The control node should have at least 4 NICs and the compute nodes at least 3 NICs.
- Install an Ubuntu 12.04 image on each selected node. The server edition is preferred.
- Create 'gram' user with sudo/admin privileges
- If there are additional admin accounts, you must manually install omni for each of these accounts.
- Install mysql on the control node
sudo apt-get install mysql-server python-mysqldb
- You will be prompted for the password of the mysql admin. Type it in (twice) and remember it: it will be needed in the config.json file for the value of mysql_password.
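That same password is what later goes into /etc/gram/config.json as the mysql_password value, along the lines of this illustrative fragment:

  "mysql_password": "<the mysql admin password you just set>",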
- Install OpenStack and GRAM on the control and compute nodes
- Get the DEBIAN files gram_control.deb and gram_compute.deb. These are not available on an apt server currently but can be obtained by request from gram-dev@bbn.com.
- Set up the APT repository to read the correct version of grizzly packages:
sudo apt-get install -y ubuntu-cloud-keyring
echo deb http://ubuntu-cloud.archive.canonical.com/ubuntu precise-updates/grizzly main >> grizzly.list
sudo mv grizzly.list /etc/apt/sources.list.d/
- Update the current repository:
sudo apt-get -y update && sudo apt-get -y dist-upgrade
- Get the gdebi package for direct installation of deb files
sudo apt-get install gdebi-core
- Install the gram package (where <type> is control or compute depending on what machine type is being installed):
sudo gdebi gram_<control/compute>.deb
- Edit /etc/gram/config.json. NOTE: This is the most critical step of the process. It specifies your passwords and network configuration so that OpenStack will be configured properly. [See section "Configuring config.json" below for details on the variables in that file.]
- Run the GRAM installation script (where <type> is control or compute depending on what machine type is being installed):
sudo /etc/gram/install_gram.sh <control/compute>
- Configure the OS and network. You will lose network connectivity during this step, so it is recommended that the following command be run directly on the machine console or inside the Linux 'screen' program.
sudo /tmp/install/install_operating_system_[control/compute].sh
- Configure everything else. Use a root shell
/tmp/install/install_[control/compute].sh
This last command will do a number of things:
- Read in all apt dependencies required
- Configure the OpenStack configuration files based on values set in config.json
- Start all OpenStack services
- Start all GRAM services
If something goes wrong (you'll see errors in the output stream), the scripts being run are in /tmp/install/install*.sh (install_compute.sh or install_control.sh). You can usually run the failed commands by hand to get things working, or at least see where things went wrong (often a problem in the configuration file).
- Set up the namespace only on the control node. Use a root shell.
- Check that sudo ip netns has two entries - the qrouter-* is the important one.
- If the qdhcp-* namespace is not there, type sudo service quantum-dhcp-agent restart
- If you still cannot get 2 entries, try restarting all the quantum services:
- sudo service quantum-server restart
- sudo service quantum-plugin-openvswitch-agent restart
- sudo service quantum-dhcp-agent restart
- sudo service quantum-l3-agent restart
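For reference, a healthy control node shows output along these lines (the UUIDs here are purely illustrative):

sudo ip netns
qdhcp-3f1a62a1-52f8-4f5e-92db-1f5c37f8a1b2
qrouter-78c6d3af-8455-4c4a-9fd3-884f92c61125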
On the control node ONLY, in the root shell, type:
export PYTHONPATH=$PYTHONPATH:/opt/gcf/src:/home/gram/gram/src:/home/gram/gram/src/gram/am/gram
python /home/gram/gram/src/gram/am/gram/set_namespace.py
- Edit /etc/hosts - Not clear that this is necessary anymore.
Each control/compute node must be associated with the external ip address. It should look similar to:
127.0.0.1      localhost
128.89.72.112  bbn-cam-ctrl-1
128.89.72.113  bbn-cam-cmpe-1
128.89.72.114  bbn-cam-cmpe-2
- Installing OS Images: Only on the Control Node
At this point, OS images must be placed in OpenStack Glance (the image repository service) to support creation of virtual machines.
The choice of images is installation-specific, but these commands are provided as a reasonable example of a first image, a 64-bit Ubuntu 12.04 server in qcow2 format (http://cloud-images.ubuntu.com/releases/precise/release/ubuntu-12.04-server-cloudimg-amd64-disk1.img)
wget http://cloud-images.ubuntu.com/releases/precise/release/ubuntu-12.04-server-cloudimg-amd64-disk1.img
glance image-create --name "ubuntu-12.04" --is-public=true \
   --disk-format=qcow2 --container-format=bare < \
   ubuntu-12.04-server-cloudimg-amd64-disk1.img
# Make sure your default_OS_image in /etc/gram/config.json is set to
# the name of an existing image
Another image, a 64-bit Fedora 19 in qcow2 format (http://download.fedoraproject.org/pub/fedora/linux/releases/19/Images/x86_64/Fedora-x86_64-19-20130627-sda.qcow2)
wget http://download.fedoraproject.org/pub/fedora/linux/releases/19/Images/x86_64/Fedora-x86_64-19-20130627-sda.qcow2
glance image-create --name "fedora-19" --is-public=true \
   --disk-format=qcow2 --container-format=bare < \
   Fedora-x86_64-19-20130627-sda.qcow2
Another image, a 64-bit CentOS 6.5 in qcow2 format (http://repos.fedorapeople.org/repos/openstack/guest-images/centos-6.5-20140117.0.x86_64.qcow2)
wget http://repos.fedorapeople.org/repos/openstack/guest-images/centos-6.5-20140117.0.x86_64.qcow2
glance image-create --name "centos-6.5" --is-public=true \
   --disk-format=qcow2 --container-format=bare < \
   centos-6.5-20140117.0.x86_64.qcow2
In the event these links no longer work, copies of the images have been put on an internal projects directory in the GPO infrastructure.
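Whichever images you load, it is worth confirming they are registered and active before moving on, e.g.:

glance image-list
# The names shown (e.g. ubuntu-12.04) must match default_OS_image and the
# disk_image_metadata entries in /etc/gram/config.json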
- Adding Another OpenStack OS Flavor
Some OS flavors are created by default by OpenStack; we also wanted to add another, a 'super' flavor. As sudo, type:
nova flavor-create m1.super 7 32768 160 16
nova flavor-list
Here 7 is the ID of the new flavor, followed by the RAM in MB (32768), the disk size in GB (160) and the number of VCPUs (16). Generally only 5 flavors are installed by default, so using ID 7 should be safe; otherwise pick a number one larger than the highest flavor ID you already have. Check using nova flavor-list.
- Edit gcf_config
If using the GENI Portal as the clearinghouse:
- Copy the cert:
cp nye-ca-cert.pem /etc/gram/certs/trusted_roots/
sudo service gram-am restart
- Install user certs and configure omni (instructions: http://trac.gpolab.bbn.com/gcf/wiki/OmniConfigure/Automatic )
If using the local gcf clearinghouse, set up gcf_config: in ~/.gcf/gcf_config, change the host entry to be the fully qualified domain name of the control host in both the clearinghouse portion and the aggregate manager portion (two places), e.g.,
host=boscontroller.gram.gpolab.bbn.com
Change the base_name to reflect the service token (the same service token used in config.json). Use the FQDN of the control for the token.
base_name=geni//boscontroller.gram.gpolab.bbn.com//gcf
Generate new credentials:
cd /opt/gcf/src
./gen-certs.py --exp -u <username>
./gen-certs.py --exp -u <username> --notAll
The command is run twice: the first run creates certificates for the aggregate manager and the clearinghouse, and the second creates the user certificates based on those.
Generate an SSH key pair:
ssh-keygen -t rsa -C "gram@bbn.com"
Modify ~/.gcf/omni_config to reflect the service token used in config.json: (Currently using FQDN as token)
authority=geni:boscontroller.gram.gpolab.bbn.com:gcf
Set the IP addresses of the ch and sa to the external IP address of the controller:
ch = https://128.89.91.170:8000
sa = https://128.89.91.170:8000
or
ch = https://boscontroller.gram.gpolab.bbn.com:8000
sa = https://boscontroller.gram.gpolab.bbn.com:8000
- Enable Flash for Flack
Install xinetd:
apt-get install xinetd
Add this line to /etc/services:
flashpolicy 843/tcp # ProtoGENI flashpolicy service
Add this file to /etc/xinetd.d as flashpolicy:
# The flashpolicy service allows connections to ports 443 (HTTPS) and 8443
# (geni-pgch), as well as ports 8001-8002 which may be used by gcf-am
# or related local services. It is harmless to allow these ports via
# flashpolicy if they are closed in the firewall.
service flashpolicy
{
    disable     = no
    id          = flashpolicy
    protocol    = tcp
    user        = root
    wait        = no
    server      = /bin/echo
    server_args = <cross-domain-policy> <site-control permitted-cross-domain-policies="master-only"/> <allow-access-from domain="*" to-ports="80,443,5001,5002"/> </cross-domain-policy>
}
Restart xinetd
sudo service xinetd restart
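To confirm the policy service is actually listening on port 843, a simple check (a suggested verification only, not part of the install) is:

sudo netstat -lnt | grep ':843 '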
Configuring config.json
The config.json file (in /etc/gram) is a JSON file that is parsed by GRAM code at configure/install time as well as at run time.
JSON is a format for expressing dictionaries of name/value pairs where the values can be constants, lists or dictionaries. There are no comments, per se, in JSON, but the file as provided has some 'dummy' variables (e.g. "000001") against which comments can be added.
The following is a list of all the configuration variables that can be set in the config.json JSON file. For some, defaults are provided in the code but it is advised that the values of these parameters be explicitly set.
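As an orientation aid, a heavily abbreviated config.json might look like the fragment below. Every value is a placeholder, and the exact value formats (for example for internal_vlans) should follow the comments in the config.json shipped with the GRAM package; the full set of parameters is described in the table that follows:

{
  "external_interface": "eth2",
  "external_address": "128.89.72.112",
  "external_netmask": "255.255.255.0",
  "control_interface": "eth0",
  "control_address": "10.10.8.100",
  "data_interface": "eth1",
  "data_address": "10.10.9.100",
  "management_interface": "eth3",
  "management_address": "10.10.10.100",
  "mysql_user": "gram",
  "mysql_password": "<mysql admin password>",
  "service_token": "boscontroller.gram.gpolab.bbn.com",
  "default_VM_flavor": "m1.small",
  "default_OS_image": "ubuntu-12.04"
}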
parameter | definition |
default_VM_flavor | Name of the default VM flavor (if not provided in request RSpec), e.g. 'm1.small' |
default_OS_image | Name of default VM image (if not provided in request RSpec), e.g. 'ubuntu-12.04' |
default_OS_type | Name of OS of default VM image, e.g. 'Linux' |
default_OS_version | Version of OS of default VM image, e.g. '12' |
external_interface | name of the nic connected to the external network (internet) e.g. eth0. GRAM configures this interface with a static IP address to be specified by the user |
external_address | IP address of the interface connected to the external network |
external_netmask | netmask associated with the above IP address |
control_interface | name of the nic that is to be on the control plane |
control_address | IP address of control address. This should be a private address |
data_interface | name of the nic that is to be on the data plane |
data_address | IP address of the data interface |
internal_vlans | Set of VLAN tags for internal links and networks, not for stitching, this must match the OpenFlow switch configuration |
management_interface | name of the nic that is to be on the management plane |
management_address | IP address of the management interface |
management_network_name | Quantum will create a network with this name to provide an interface to the VMs through the controller |
management_network_cidr | The cidr of the quantum management network. It is recommended that this address space be different from the addresses used on the physical interfaces (control, management, data interfaces) of the control and compute nodes |
management_network_vlan | The vlan used on the management switch to connect the management interfaces of the compute/control nodes. |
mysql_user | The name of the mysql_user for OpenStack operations |
mysql_password | The password of the mysql_user for OpenStack operations ([1] above). |
rabbit_password | The password for RabbitMQ interface OpenStack operations |
nova_password | The password of the nova mysql database for the nova user |
glance_password | The password of the glance mysql database for the glance user |
keystone_password | The password of the keystone mysql database for the keystone user |
quantum_password | The password of the quantum mysql database for the quantum user |
os_tenant_name | The name of the OpenStack admin tenant (e.g. admin) |
os_username | The name of the OpenStack admin user (e.g. admin) |
os_password | The password of the OpenStack admin user |
os_auth_url | The URL for accessing OpenStack authorization services |
os_region_name | The name of the OpenStack region namespace (default = RegionOne) |
os_no_cache | Whether to enable/disable caching (default = 1) |
service_token | The unique token for identifying this rack, shared by all control and compute nodes of the rack in the same OpenStack instance (ie. the name of the rack, suggest FQDN of host) |
service_endpoint | The URL by which OpenStack services are identified within keystone |
public_gateway_ip | The address of the default gateway on the external network interface |
public_subnet_cidr | the range of address from which quantum may assign addresses on the external network |
public_subnet_start_ip | the first address of the public addresses available on the external network |
public_subnet_end_ip | the last address of the public addresses available on the external network |
metadata_port | The port on which OpenStack shares meta-data (default 8775) |
backup_directory | The directory in which the GRAM install process places original versions of config files in case of the need to roll-back to a previous state. |
allocation_expiration_minutes | Time at which allocations expire (in minutes), default=10 |
lease_expiration_minutes | Time at which provisioned resources expire (in minutes), default = 7 days |
gram_snapshot_directory | Directory of GRAM snapshots, default '/etc/gram/snapshots' |
recover_from_snapshot | Whether GRAM should, on initialization, reinitialize from a particular snapshot (default = None or "" meaning no file provided) |
recover_from_most_recent_snapshot | Whether GRAM should, on initialization, reinitialize from the most recent snapshot (default = True) |
snapshot_maintain_limit | Number of most recent snapshots maintained by GRAM (default = 10) |
subnet_numfile | File where gram stores the subnet number for last allocated subnet, default = '/etc/gram/GRAM-next-subnet.txt'. Note: This is temporary until we have namespaces working. |
port_table_file | File where GRAM stores the SSH proxy port state table, default = '/etc/gram/gram-ssh-port-table.txt' |
port_table_lock_file | File where SSH port table lock state is stored, default = '/etc/gram/gram-ssh-port-table.lock' |
ssh_proxy_exe | Location of GRAM SSH proxy utility, which enables GRAM to create and delete proxies for each user requested, default = '/usr/local/bin/gram_ssh_proxy' |
ssh_proxy_start_port | Start of SSH proxy ports, default = 3000 |
ssh_proxy_end_port | End of SSH proxy ports, default = 3999 |
vmoc_interface_port | Port on which to communicate to VMOC interface manager, default = 7001 |
vmoc_slice_autoregister | Should GRAM automatically register slices with VMOC? Default = True |
vmoc_set_vlan_on_untagged_packet_out | Should VMOC set VLAN on untagged outgoing packet, default = False |
vmoc_set_vlan_on_untagged_flow_mod | Should VMOC set VLAN on untagged outgoing flowmod, default = True |
vmoc_accept_clear_all_flows_on_startup | Should VMOC clear all flows on startup, default = True |
control_host_address | The IP address of the controller node's control interface (used to set /etc/hosts on the compute nodes) |
mgmt_ns | DO NOT set this field; it is set during installation and is the name of the namespace containing the Quantum management network. This namespace can be used to access the VMs using their management addresses |
disk_image_metadata | This provides a dictionary mapping names of images (as registered in Glance) with tags for 'os' (operating system of image), 'version' (version of OS of image) and 'description' (human readable description of image) e.g. |
{
  "ubuntu-2nic": {
    "os": "Linux",
    "version": "12.0",
    "description": "Ubuntu image with 2 NICs configured"
  },
  "cirros-2nic-x86_64": {
    "os": "Linux",
    "version": "12.0",
    "description": "Cirros image with 2 NICs configured"
  }
}
control_host | The name or IP address of the control node host |
compute_hosts | The names/addresses of the compute node hosts, e.g. |
{
  "boscompute1": "10.10.8.101",
  "boscompute2": "10.10.8.102",
  "boscompute4": "10.10.8.104"
}
host_file_entries | The names/addresses of machines to be included in /etc/hosts, e.g. |
{
  "boscontrol": "128.89.72.112",
  "boscompute1": "128.89.72.113",
  "boscompute2": "128.89.72.114"
}
stitching_info | Information necessary for the Stitching Infrastructure |
aggregate_id | The URN of this AM |
aggregate_url | The URL of this AM |
edge_points | A list of dictionaries, each with the following fields: |
local_switch | URN of local switch | mandatory |
port | URN of port on local switch leading to remote switch | mandatory |
remote_switch | URN of remote switch | mandatory |
vlans | VLAN tags configured on this port | mandatory |
traffic_engineering_metric | Configurable metric for traffic engineering | optional, default value = 10 (no units) |
capacity | Capacity of the link between endpoints | optional, default value = 1000000000 (bytes/sec) |
interface_mtu | MTU of interface | optional, default value = 900 (bytes) |
maximum_reservable_capacity | Maximum reservable capacity between endpoints | optional, default value = 1000000000 (bytes/sec) |
minimum_reservable_capacity | Minimum reservable capacity between endpoints | optional, default value = 1000000 (bytes/sec) |
granularity | Increments for reservations | optional, default value = 1000000 (bytes/sec) |
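As an illustration, a stitching_info block patterned on the fields above might look like the fragment below. The URNs, URL and VLAN range are invented examples, not values from a working rack:

"stitching_info": {
  "aggregate_id": "urn:publicid:IDN+gram+authority+am",
  "aggregate_url": "https://boscontroller.gram.gpolab.bbn.com:5001",
  "edge_points": [
    {
      "local_switch": "urn:publicid:IDN+gram+node+force10",
      "port": "urn:publicid:IDN+gram+port+force10:Te0/47",
      "remote_switch": "urn:publicid:IDN+transitnet+node+switch1",
      "vlans": "1000-1050",
      "traffic_engineering_metric": 10,
      "capacity": 1000000000
    }
  ]
}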
Installing Operations Monitoring
Monitoring can be installed after testing the initial installation of GRAM. Most supporting infrastructure was installed by the steps above. Some steps, however, still need to be done by hand and the instructions can be found here: Installing Monitoring on GRAM
Testing GRAM installation
This simple rspec can be used to test the gram installation - attachment:2n-1l.rspec
# Restart gram-am and clearinghouse
sudo service gram-am restart
sudo service gram-ch restart
# check omni/gcf config
cd /opt/gcf/src
./omni.py getusercred
# allocate and provision a slice
# I created an rspec in /home/gram called 2n-1l.rspec
./omni.py -V 3 -a http://130.127.39.170:5001 allocate a1 ~/2n-1l.rspec
./omni.py -V 3 -a http://130.127.39.170:5001 provision a1 ~/2n-1l.rspec
# check that the VMs were created
nova list --all-tenants
# check that the VMs booted, using the VM IDs from the above command:
nova console-log <ID>
# look at the 192.x.x.x IP in the console log
# find the namespace for the management plane:
sudo ip netns list
# look at each qrouter-... for one that has the external (130) and management (192) addresses
sudo ip netns exec qrouter-78c6d3af-8455-4c4a-9fd3-884f92c61125 ifconfig
# using this namespace, ssh into the VM:
sudo ip netns exec qrouter-78c6d3af-8455-4c4a-9fd3-884f92c61125 ssh -i ~/.ssh/id_rsa gramuser@192.168.10.4
# verify that the data plane is working by pinging across VMs on the 10.x.x.x addresses
# The above VM has 10.0.21.4 and the other VM I created has 10.0.21.3
ping 10.0.21.3
Turn off Password Authentication on the Control and Compute Nodes
- Generate an RSA SSH key pair on the control node for the gram user, or use the one previously generated if it exists (i.e. ~gram/.ssh/id_rsa and ~gram/.ssh/id_rsa.pub):
ssh-keygen -t rsa -C "gram@address"
- Generate a DSA SSH key pair on the control node for the gram user, or use the one previously generated if it exists (i.e. ~gram/.ssh/id_dsa and ~gram/.ssh/id_dsa.pub). Some components only handle DSA keys well, so access from the control node to other resources on the rack should use the DSA key.
ssh-keygen -t dsa -C "gram@address"
- Copy the public key to the compute nodes, i.e. id_dsa.pub
- On the control and compute nodes, cat id_rsa.pub >> ~/.ssh/authorized_keys
- As sudo, edit /etc/ssh/sshd_config and ensure that these entries are set this way:
RSAAuthentication yes
PubkeyAuthentication yes
PasswordAuthentication no
- Restart the ssh service, sudo service ssh restart.
- Verify by logging in using the key: ssh -i ~/.ssh/id_dsa gram@address. A consolidated sketch of these steps is shown below.
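Pulled together, the sequence looks roughly like the sketch below. The host name boscompute1 and the use of ssh-copy-id are illustrative assumptions; adapt them to your rack, and keep an existing session open until the key login is verified:

# On the control node, as the gram user
ssh-keygen -t rsa -C "gram@address"
ssh-keygen -t dsa -C "gram@address"
# Push the DSA public key to each compute node (boscompute1 is an example host name)
ssh-copy-id -i ~/.ssh/id_dsa.pub gram@boscompute1
# On every node, turn off password logins (assumes an uncommented
# PasswordAuthentication line already exists in /etc/ssh/sshd_config)
sudo sed -i 's/^PasswordAuthentication.*/PasswordAuthentication no/' /etc/ssh/sshd_config
sudo service ssh restart
# Verify key-based login still works
ssh -i ~/.ssh/id_dsa gram@boscompute1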
TODO
- Need to make link to /opt/gcf in compute nodes
- Make sure that your rabbitMQ IP in /etc/quantum/quantum.conf is set to the controller node: (broken sed in OpenVSwitch.py)
- Service token not set in keystone.conf
- add a step in the installation process that checks the status of the services before we start our installation scripts - check dependencies
- fix installation so that gcf_config has the proper entry for host in the aggregate and clearinghouse portions; also need to check where the port number for the AM is actually read from, as it is not gcf_config
DEBUGGING NOTES
- If it gets stuck at provisioning, you may have lost connectivity with one or more compute nodes. Check that network-manager is removed
- If ip addresses are not being assigned and the VMs stall on boot: quantum port-delete 192.168.10.2 (the dhcp agent) and restart quantum-* services
- To create the deb package, check the Software Release Procedure page for instructions
Attachments (5)
- force10-running (102.2 KB)
- GRAMSwitchDiag.jpg (48.9 KB)
- powerconnect-running (1.6 KB)
- powerconnect-running.rtf (1.6 KB)
- 2n-1l.rspec (1.0 KB)