Version 1 (modified by Jeanne Ohren, 5 years ago)

--

GRAM Rack Power Down Sequence

Before a GRAM rack can be shut down, GENI experimenters must be notified of the outage window. To schedule a site outage, either send an email to the GMOC (gmoc@grnoc.iu.edu) or submit a GMOC ticket. You can also subscribe to the GENI Response Team mailing list to receive all GENI outage and maintenance notifications from the GMOC. The GMOC also posts notifications on the GMOC calendar.

All GRAM services must be shut down before the rack devices are shut down.

Note: In the instructions, replace instances of admin_user with your admin account. Log in to the Control Node with the SSH key you provided, and replace the identity file path (the argument to -i) with the correct path name if your SSH keys are not in a standard location. You can log in to the Compute Nodes from the Control Node with the SSH keys that were installed when your account was created.

Shutting Down GRAM Services

  1. Log in to the Control Node with your administrative account. For example, type "ssh -Y -i ~/.ssh/id_rsa admin_user@<control addr>"
  1. Make sure that no experiments are running and check for VMs that are in an ACTIVE state:
    1. Type "source /etc/novarc"
    2. Type "nova list --all-tenants". If all experiments have been stopped, no resources are listed. If experiments are still running, you will see a list of resources like this:
      +--------------------------------------+------+--------+--------------------------------------------+
      | ID                                   | Name | Status | Networks                                   |
      +--------------------------------------+------+--------+--------------------------------------------+
      | b840058e-4511-420c-aaf4-577562b2dce6 | VM-1 | ACTIVE | GRAM-mgmt-net=192.168.10.3; lan0=10.0.37.1 |
      | cfa1aa58-e68f-4176-beed-60e9a4257ab3 | VM-2 | ACTIVE | GRAM-mgmt-net=192.168.10.4; lan0=10.0.37.2 |
      +--------------------------------------+------+--------+--------------------------------------------+
      
    3. You can use the "suspend" command to store the contents of the VMs on disk. Type "nova suspend ID", for example "nova suspend b840058e-4511-420c-aaf4-577562b2dce6". The expected output of "nova list --all-tenants" would then be:
      +--------------------------------------+------+-----------+--------------------------------------------+
      | ID                                   | Name | Status    | Networks                                   |
      +--------------------------------------+------+-----------+--------------------------------------------+
      | b840058e-4511-420c-aaf4-577562b2dce6 | VM-1 | SUSPENDED | GRAM-mgmt-net=192.168.10.3; lan0=10.0.37.1 |
      | cfa1aa58-e68f-4176-beed-60e9a4257ab3 | VM-2 | ACTIVE    | GRAM-mgmt-net=192.168.10.4; lan0=10.0.37.2 |
      +--------------------------------------+------+-----------+--------------------------------------------+
      
    4. Suspended VMs can be resumed on startup, so keep track of the IDs that were suspended so that they can be resumed once the rack is back up and operational.

  1. Stop all GRAM processes on the Control Node before shutting down any of the rack devices:
    1. "sudo service gram-am stop"
    2. "sudo service gram-ch stop"
    3. "sudo service gram-ctrl stop"
    4. "sudo service gram-vmoc stop"
    5. "sudo service gram-mon stop"
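The suspend and stop steps above can be scripted. The following is a minimal sketch, not part of GRAM: the function names and the default ID log path (./suspended_vm_ids.txt) are hypothetical, and it assumes the "nova list --all-tenants" table has the column layout shown above (ID, Name, Status, Networks).

```shell
#!/bin/sh
# Hypothetical helpers; names and the log path are illustrative, not part of GRAM.

# Print the ID of every VM whose Status column is exactly ACTIVE.
# Reads the table printed by "nova list --all-tenants" on stdin.
active_vm_ids() {
    awk -F'|' '$4 ~ /^ *ACTIVE *$/ { gsub(/ /, "", $2); print $2 }'
}

# Suspend each ACTIVE VM and append its ID to a log file,
# so the same VMs can be resumed once the rack is back up.
# $1: path of the ID log (assumed default: ./suspended_vm_ids.txt)
suspend_active_vms() {
    log=${1:-./suspended_vm_ids.txt}
    nova list --all-tenants | active_vm_ids | while read -r id; do
        nova suspend "$id" && echo "$id" >> "$log"
    done
}

# Stop the GRAM services in the order listed in the steps above.
stop_gram_services() {
    for svc in gram-am gram-ch gram-ctrl gram-vmoc gram-mon; do
        sudo service "$svc" stop
    done
}
```

To use it, run "source /etc/novarc" first, then call suspend_active_vms followed by stop_gram_services. After the rack is powered back up, each ID recorded in the log can be passed to "nova resume ID".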

Shutting Down Rack Devices

Once the GRAM services have been stopped, you can shut down the GRAM devices in the following order:

  1. OpenFlow Switch - Dell Force10
  2. Compute Nodes
  3. Control Node
  4. Management Switch - Dell Powerconnect 7048
  5. UPS (if included)

Make sure you follow this order, as you may lose access otherwise:

  1. Shutting Down Force10 OpenFlow Switch:
    1. From the control node, type "ssh 10.10.8.200"
    2. From the Force10 console, type "enable"
    3. Then type "reload"
    4. And then detach the power cord.
  1. Shutting Down Compute Nodes:
    1. From the Control Node, type "ssh <hostname>". The hostnames for the compute nodes can be found in /etc/hosts.
    2. Then type "sudo sync; sudo init 0"
  1. Shutting Down Control Node:
    1. After shutting down all other resources, on the Control Node type "sudo sync; sudo init 0"
  1. Shutting Down PowerConnect 7048:
    1. Detach the power cord.
  1. Shutting Down the UPS (if included):
    1. Turn off the power switch.
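When a rack has several Compute Nodes, the per-node shutdown step above can be looped from the Control Node. This is a minimal sketch under stated assumptions: the function name and the DRY_RUN variable are hypothetical, and the hostnames must be taken from /etc/hosts on your rack. The -t option asks ssh for a terminal so that sudo can prompt for a password if it needs to.

```shell
#!/bin/sh
# Hypothetical helper: shut down each compute node named on the command line.
# Set DRY_RUN=1 to print the ssh commands instead of running them.
shutdown_compute_nodes() {
    for host in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "ssh -t $host 'sudo sync; sudo init 0'"
        else
            # Same commands as the manual step: flush disks, then halt the node.
            ssh -t "$host" 'sudo sync; sudo init 0'
        fi
    done
}
```

For example, "DRY_RUN=1 shutdown_compute_nodes compute1 compute2" (placeholder hostnames) prints the two ssh commands without contacting the nodes, which is a cheap way to check the host list before running it for real. Shut down the Control Node only after every Compute Node is down.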