Version 35 (modified by Aaron Helsinger, 12 years ago)

GENI Aggregate Manager API Draft Revisions

This page documents DRAFT revisions to the GENI Aggregate Manager API, proposed for the next version of the API. As indicated below, some of the revisions documented here have been discussed on the GENI developer mailing list, and during at least one GEC. Other revisions are in early discussions and subject to change or abandonment.

The GENI Aggregate Manager API allows aggregates to advertise resources and to allocate resources to Slices in the form of Slivers. A Sliver is a set of resources allocated by one Aggregate to one Slice. See below for a proposed complete definition.

The current officially adopted version of the API is 2 and is documented on the main API page.

API Version 2 was adopted based on changes previously listed on this page. Those changes have been removed from this page, and are now documented on a separate page. They include:

  1. RSpec related changes, specifying that RSpecs are XML following GENI standard schemas.
    • Since June 2011, the latest software from ProtoGENI and SFA (as of code tag 1.0-24) has complied with these changes.
    • Omni version 1.3 (released June 2011) added client software support for these changes.
  2. Supporting flexible arguments and returns. Specifically, adding a property list to all calls, and making all returns be a property list.

This page documents proposed changes for AM API version 3. These changes are grouped into sets. API Version 3 will be the collection of changes from the change sets below which we next agree on. Change sets still under discussion will then be targeted at a future release.

Proposing Additional Changes

GENI community members are encouraged to propose changes to the GENI Aggregate Manager API.

Technical discussions are generally held on the Developers mailing list.

Specific questions may be directed to the software team at the GPO (Tom Mitchell, Aaron Helsinger, and Sarah Edwards) at {tmitchel, ahelsing, sedwards} at geni.net

Proposed changes for GENI Aggregate Manager API version 3

This page documents a long list of proposed changes for AM API version 3. These changes provide ways for aggregates to expand GENI functionality, without requiring further API modifications.

There are many changes here. As such, aggregates may implement these to varying degrees.

  • Clients are reminded that these methods are requests - based on the AM type and resource types, these requests may fail or not make sense. Clients should watch for UNSUPPORTED returns.
  • AMs are encouraged to implement as much of this API as reasonable to provide a common front for clients. When a function is not possible, return UNSUPPORTED, document publicly what functions do work, and suggest alternative ways to get the result the client desired.

Summary

Proposed Changes

At the top level, the proposed changes for AM API v3 include:

  • Change Set C: Add the ability to UpdateSlivers to immediately modify your reservation
  • Change Set D: Slivers: Change methods to clarify that there may be multiple slivers per slice at an AM, and to allow operating on individual slivers
  • Change Set E: Tickets: Add methods using tickets to do negotiated reservations
  • Change Set F1: Define sliver states, and the state changes that various methods cause
  • Change Set F2: Add a new general ActOnSlivers method allowing AMs to support AM and resource-type specific operations
  • Change Set G: Generalize the credentials argument, allowing ABAC support
  • Change Set H: Clarify: A second call to CreateSlivers without an intervening DeleteSlivers is an error.
  • Change Set I1: SliversStatus return structure includes sliver expiration
  • Change Set I2: SliversStatus return includes SSH logins/key for nodes that support SSH access
  • Change Set I3: CreateSlivers return becomes a struct, adds sliver expiration
  • Change Set I4: CreateSlivers optionally does not start resources.
  • Change Set J: Support proxy aggregates with 1 new option and 1 new GetVersion entry
  • Change Set K: Standardize certificate contents, etc
    • Include a real serial number, holder email, holder uuid, and optionally authority URL in certificates
    • Define slice ID as the UUID in slice certificates
    • Define slice name, sliver name, and user name restrictions, and similar for URNs
    • Publish schemas for credentials and certificates
  • Change Set L: Standardize slice credential privileges

For a full listing of the proposed API methods after all these changes, see below.

Unspecified items

  • Semantics of start/restart after UpdateSlivers
  • Define the ticket service classes
  • Publish ticket schema
  • Publish credential schema
  • Define error codes returned by new methods, conditions

Change Set C: UpdateSlivers

Add an ability for experimenters to modify their allocated resources at an aggregate without deleting (and possibly losing) existing resource allocations.

Motivation

A common complaint among experimenters about the current AM API is that there is no way to add or remove resources from a slice at an aggregate, without completely deleting the slice at that aggregate and recreating it, possibly losing resources to another experimenter and certainly losing state. This proposal aims to address that, by introducing a method to update the slice at an aggregate.

The SFA calls for an UpdateSlice method, "to request that additional resources—as specified in the RSpec—be allocated to the slice".

In the PlanetLab implementation of the SFA, UpdateSliver is in fact a synonym for CreateSliver - the server will ensure that your allocated resources match your request RSpec, adding, removing and modifying resources as needed. It immediately allocates and boots nodes to match the request RSpec.

The ProtoGENI CMV2 API has UpdateSliver, which is described as the way to "Request a change of resources for an existing sliver. The new set of resources that are desired are specified in the rspec." At ProtoGENI, as at PlanetLab, this method takes the full RSpec description of the resources the experimenter wants, and the server computes the difference from what the experimenter already has. At ProtoGENI, though, this method returns a ticket. The experimenter must then redeem the ticket to actually acquire the resources.

This topic was discussed at the GEC12 AM API session and on the GENI dev mailing list (in October and November).

UpdateSlivers

This change would add a new method UpdateSlivers, which takes a full request RSpec of the desired final state of the slice at this aggregate. This proposal calls for adding this functionality in the context of tickets: the UpdateSlivers method returns a ticket. See Change Set E: Tickets for details. If tickets are not adopted, consider the alternative proposal below, where UpdateSlivers immediately allocates and starts the requested resources, as in CreateSlivers.

Some points about this change:

  • The method takes a full request RSpec - not a diff.
    • Note that we want the manifest to be readily modifiable to be a request (include component_ids and sliver_ids), but it is not yet.
    • AMs may, as always, return UNSUPPORTED, e.g. if they are incapable of determining what changes to apply (computing a diff).
  • The request is either fully satisfied, or fails (returns an error code).
  • AMs must document the level of service they provide: will any state be lost on existing resources?
    • Typically this would be a per node or resource-type specification.
    • Use the levels of guarantee in geni_state_guarantee below.
    • Default is to provide no guarantee.
    • This API does not define where AMs provide detailed documentation, but AMs must return this value for the entire change as part of the return struct.
  • Experimenters may specify what level of disruption they can tolerate, using the geni_state_guarantee below. AMs are expected to fail a request with a specified service guarantee that they cannot satisfy. Default is to request no guarantee.
  • Options include geni_end_time, an RFC3339 requested end time for the reservation. See below for details.
    • If omitted, the AM may reserve the resources for the default sliver duration.
    • AMs should follow the logic of RenewSlivers to determine if the requested duration of the sliver is acceptable.
    • The request should Fail (return an error code) if the resources cannot be reserved until the requested time.

This change adds a new option geni_state_guarantee with these possible values (case insensitive string or integer):

  • 0=NO_GUARANTEE (default: all state may be lost)
  • 1=SAVE_DISK (disk state will be preserved but running processes will be lost)
  • 2=SAVE_DISK_AND_PROCESSES (both disk state and running processes will be preserved, like migrating a VM)
  • 3=NO_DISRUPTION (no noticeable service disruption)

AMs which cannot meet the implied limit to service disruption should fail the request (return an error code).
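
The option accepts either an integer or a case-insensitive string, so a client or AM needs to normalize it before comparing levels. The following is a minimal sketch of that normalization, not part of the proposal. The names `StateGuarantee`, `parse_state_guarantee`, and `can_satisfy` are hypothetical, and the sketch assumes that the levels are ordered, so that an AM offering a higher level also satisfies any lower requested level.

```python
from enum import IntEnum


class StateGuarantee(IntEnum):
    """The four geni_state_guarantee levels listed above."""
    NO_GUARANTEE = 0
    SAVE_DISK = 1
    SAVE_DISK_AND_PROCESSES = 2
    NO_DISRUPTION = 3


def parse_state_guarantee(value=None):
    """Normalize an integer or case-insensitive string; default is NO_GUARANTEE."""
    if value is None:
        return StateGuarantee.NO_GUARANTEE
    if isinstance(value, str):
        text = value.strip()
        if text.isdigit():
            return StateGuarantee(int(text))
        try:
            return StateGuarantee[text.upper()]
        except KeyError:
            raise ValueError("unknown geni_state_guarantee: %r" % (value,))
    # Raises ValueError for integers outside 0..3.
    return StateGuarantee(int(value))


def can_satisfy(requested, offered):
    """Assumption: levels are ordered, so a higher offer covers a lower request."""
    return parse_state_guarantee(offered) >= parse_state_guarantee(requested)
```

Under these assumptions, an AM that can only offer `SAVE_DISK` would fail a request that asks for `NO_DISRUPTION`, because `can_satisfy("no_disruption", "save_disk")` is false.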

struct UpdateSlivers(string slice_urn, string credentials[], string rspec,
                     struct options)

Returns a struct:

{
  string ticket=<ticket>,
  string geni_status=<sliver state - ticketed>,
  string geni_state_guarantee=<promise from AM of what experimenter state will be lost on trying to 'start' this allocation>,
  <others that are AM specific>
}

See Change Set E: Tickets for details on the ticket returned and ticket semantics.

Alternative proposal: UpdateSlivers with immediate allocation

An alternative proposal would add two versions of UpdateSlivers. Method 1 would return a ticket as above. Method 2 would immediately allocate the resources, as with CreateSlivers.

This proposed version of UpdateSlivers is substantially the same as the main proposal above, with a few differences:

  • Under this alternative proposal, on success the new resources are allocated to the slice. As with CreateSlivers, by default those new resources are initialized or booted or started, such that they will shortly become available for experimenter use.
  • As with CreateSlivers, the AM should start/restart resources immediately, as necessary.
    • This change introduces a new option geni_donotstart. When supplied and true (boolean: 0 or 1 in XML-RPC), aggregates should allocate the resources but not start them. Experimenters will have to explicitly use ActOnSlivers to start or restart resources as necessary. Note that there may be no such distinction for some resources.
      • Removed resources are stopped by the aggregate automatically.
  • Note that the tickets proposal includes a method that updates the resource reservation without allocating them.
  • On success, this method moves the overall sliver state to allocated and then, unless the experimenter specified geni_donotstart, to changing (previously known as configuring) and finally to ready.

Proposed method signature:

struct UpdateSlivers(string slice_urn,
                     string credentials[],
                     <GENI request RSpec schema compliant XML string> rspec,
                     struct users[],
                     struct options)

Return value:

{
 string rspec=<manifest>,
 string geni_start_time=<optional (may be omitted altogether) RFC3339 start time for the allocation: now if not specified>,
 string geni_expires=<RFC3339 sliver expiration>,
 string geni_status=<sliver state - allocated or changing or ready>,
 string geni_state_guarantee=<promise from AM of what experimenter state will be lost on trying to 'start' this allocation>,
 <others that are AM specific>
}

Change Set D: Sliver-specific operations

A slice may have multiple slivers at a single AM. Experimenters can operate on slivers independently, if the AM supports it. AMs define slivers as groups of resources, and give them locally unique sliver_urns for identifying that group of resources.

Motivation

This change set was discussed at the GEC12 AM API session.

The current AM API calls take a Slice URN, and operate on all resources under that label at the given aggregate - all the resources for that slice at the aggregate are allocated, renewed, and deleted together. There is no provision for releasing some of the resources allocated to the slice at that aggregate, or for adding new resources to the reservation for that slice at a given aggregate.

This ties closely to the precise definition of a Slice vs a Sliver. The current AM API methods imply that a sliver represents all resources at an aggregate for a given slice. However, this does not match the definition that previous GENI documents have used, nor the functionality that experimenters desire.

Previous GENI documents have used this definition: A sliver is the smallest set of resources at an aggregate that can be independently reserved and allocated. A given slice may contain multiple slivers at a single aggregate. A sliver may contain multiple components.

Given this definition, the current AM API methods in fact operate on a group of slivers.

This change set would provide a means for experimenters to operate on individual slivers within their slice at a given aggregate.

Define sliver

A Sliver is an aggregate defined grouping of resources within a slice at this aggregate, whose URN identifies the sliver, and can be used as an argument to methods such as DeleteSlivers or RenewSlivers, and whose status can be independently reported in the return from SliversStatus. The AM defines 1 or more of these groupings to satisfy a given resource request for a slice. All reserved resources are directly contained by exactly 1 such sliver container, which is in precisely 1 slice.

Slivers are identified by an aggregate selected URN. See other change proposals for details on standardizing such URNs.

Addressable Slivers

Considering the clarified sliver definition, several API names are misleading. This change proposal modifies those method names to clarify that they may work with multiple slivers. Additionally, some methods can logically operate on individual slivers: this change modifies those methods' arguments to allow specifying a particular sliver.

  1. Rename some existing methods to clarify that they act on 1+ slivers:
    • CreateSliver -> CreateSlivers
    • RenewSliver -> RenewSlivers
    • DeleteSliver -> DeleteSlivers
    • SliverStatus -> SliversStatus
  2. Some methods that take slice_urn now take a urn that may be a slice or sliver:
    • RenewSlivers, DeleteSlivers, SliversStatus
    • AMs are responsible for distinguishing whether the request operates on a slice or a sliver (see Change Set K which defines how slice and sliver URNs differ).
    • AMs are free to refuse to Renew, Delete, or provide status on an individual sliver, if the local AM or that resource type does not support it.
      • AMs should return an error message. Clients may often use UpdateSlivers instead to similar effect.
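
As an illustration of how an AM might tell the two kinds of URN apart, here is a minimal sketch. It assumes that slice and sliver URNs follow the same `urn:publicid:IDN+<authority>+<type>+<name>` pattern as the ticket URN shown later on this page, with `slice` or `sliver` in the type field. The actual rules are those of Change Set K, which this sketch does not reproduce, and the function names and example URNs are hypothetical.

```python
GENI_URN_PREFIX = "urn:publicid:IDN+"


def urn_type(urn):
    """Return the type field of a URN shaped like urn:publicid:IDN+auth+type+name."""
    if not urn.startswith(GENI_URN_PREFIX):
        raise ValueError("not a GENI-style URN: %r" % (urn,))
    # Split into authority, type, and name; the name keeps any further '+'.
    parts = urn[len(GENI_URN_PREFIX):].split("+", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError("expected authority+type+name: %r" % (urn,))
    return parts[1].lower()


def target_kind(urn):
    """Classify the urn argument of RenewSlivers, DeleteSlivers or SliversStatus."""
    kind = urn_type(urn)
    if kind not in ("slice", "sliver"):
        raise ValueError("urn must name a slice or a sliver, got %r" % (kind,))
    return kind
```

An AM that does not support per-sliver operations could call `target_kind` first and return an error whenever the result is `"sliver"`.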

Change Set E: Tickets

AM APIv3 adds support for negotiated reservations or two-phase commit, by adding methods that allow an experimenter to reserve resources for a slice without committing to using them, or forcing the AM to incur the cost of instantiating them.

Motivation

This possible change was discussed at the GEC12 AM API session.

The SFA defines the concept of a ticket. SFA1.0 section 5.3 says "A component signs an RSpec to produce a ticket, indicating a promise by the component to bind resources to the ticket-holder at some point in time." Tickets are promises to allocate resources.

Tickets are used in the ProtoGENI CMV2 interface, and are discussed on the PG wiki. Tickets with slightly different semantics (and leases) are also used extensively in Orca. For details on the use of leases and tickets in Orca, see the Orca Book. However, each of these uses of the notion of tickets differs.

Tickets would potentially enable a number of useful and possibly critical features:

  • Coordinated or negotiated reservations: reserving resources from aggregate B only if aggregate A can give you a complementary resource. For example, a VLAN tag. This is related to stitching, both network stitching and the more general form.
  • Two phase commit reservations (similar to the above).
  • Scheduled reservations in the future.
  • Brokers: 3rd parties consolidating, scheduling and allocating resources on behalf of a number of other aggregates
  • Lending resources to other experimenters
  • Giving experimenters explicit control over when resources are started, stopped, and restarted (see the discussion on UpdateSlivers).

Tickets semantics

This proposal would add tickets to the existing AM API, allowing experimenters to reserve (hold) resources temporarily and cheaply. Tickets represent a promise to the named slice to allocate the specified resources, if the ticket is 'redeemed' while the ticket is still valid. Tickets describe a complete specification of the resources that will be allocated to the slice at this aggregate, if the ticket is redeemed.

Some key properties of the tickets proposed here:

  • Tickets are IOUs from an AM to a slice (not to an experimenter - no delegation is necessary or possible).
  • Experimenters do not need to use tickets to reserve resources: existing methods without tickets still work.
  • A ticket is a promise by the AM to give the specified resources to the slice if an authorized slice member requests them.
    • The aggregate is saying they will not give away these resources to other slices, but only to this slice.
    • AMs must document how firm their promises are. See the attribute geni_ticket_service_class.
      • Some aggregates may only offer soft promises, as in PlanetLab.
  • Tickets are signed by the AM: non-repudiable.
  • Tickets are bound to a slice: they contain the slice certificate.
  • Tickets may be passed from 1 researcher on a slice to another freely - no explicit delegation is required.
    • Indeed, any experimenter with appropriate slice credentials can retrieve the ticket from the aggregate.
    • Tickets may not be delegated to another slice or other entity; these tickets do not support brokers.
  • Tickets promise a particular set of resources: they include an RSpec. Note that this may be an unbound RSpec.
    • Note that we do not currently have unbound manifest RSpecs. For now we specify only that this is an RSpec.
  • Tickets are good for a limited time.
    • They must be redeemed by a specified time, redeem_before, after which the aggregate is free to assign the resources elsewhere.
      • Aggregates determine redeem_before, which is some epsilon in the near future.
      • Aggregates may accept a new option geni_reserve_until which is a request for a particular redeem_before, but are not required to support this (they may ignore the option).
    • Tickets specify when the resources will be available from (starts, typically essentially now), and when they will be available until (typically now plus the aggregate-local default sliver expiration time).
      • The resources may be available even longer, but that would require a separate RenewSlivers call.
  • Tickets specify the full final state of the slice after applying this ticket.
    • Tickets are not incremental changes, and are not additive.
    • The implication is that there may be only 1 ticket outstanding for a slice per aggregate (except for scheduled reservations, see below).
    • This also implies that these tickets are not suitable for use by brokers.
  • Aggregates must attempt to honor their promises. As a result, aggregates must remember all outstanding tickets until they are redeemed or expire.
  • All ticket related timestamps must be in the format of RFC3339 (http://www.ietf.org/rfc/rfc3339.txt)
    • Full date and time with explicit timezone: offset from UTC or in UTC.
    • eg: 1985-04-12T23:20:50.52Z or 1996-12-19T16:39:57-08:00
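
The timestamp requirement can be checked mechanically. The sketch below parses the two example timestamps above with the Python standard library and rejects timestamps without an explicit offset. The function name `parse_rfc3339` is hypothetical. The sketch assumes Python 3.7 or later (where `%z` accepts both `Z` and offsets with a colon) and handles at most six fractional-second digits, which is narrower than RFC 3339 itself.

```python
from datetime import datetime

# With and without fractional seconds; both require an explicit offset.
_RFC3339_FORMATS = ("%Y-%m-%dT%H:%M:%S.%f%z", "%Y-%m-%dT%H:%M:%S%z")


def parse_rfc3339(text):
    """Return a timezone-aware datetime, or raise ValueError."""
    # RFC 3339 allows lowercase 't' and 'z'; normalize before matching.
    candidate = text.strip().upper()
    for fmt in _RFC3339_FORMATS:
        try:
            return datetime.strptime(candidate, fmt)
        except ValueError:
            continue
    raise ValueError("not an RFC 3339 timestamp with explicit offset: %r" % (text,))
```

Because the result is always timezone-aware, values such as redeem_before and expires can be compared directly even when they were issued with different UTC offsets.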

Ticket contents

Tickets have an ID, the certificate of the slice to whom the resources are promised, an RSpec representing the promised resources, several timestamps, other attributes, and a signature by the aggregate (including the aggregate's certificate).

Tickets are externally represented as signed XML documents following the XML Digital Signatures specification.

Tickets contain:

  • owner_gid = the certificate of the experimenter who requested the ticket
  • target_gid = slice certificate
  • uuid
    • Unique ID for the ticket, in the hexadecimal digit string format given in RFC 4122
  • expires - RFC3339 compliant Date/Time when the resources will no longer be yours per this reservation (eg sliver duration+now)
  • redeem_before: RFC3339 compliant Date/Time when you must redeem this reservation, or your resources will be returned to the available pool (eg now+epsilon)
  • starts - RFC3339 compliant Date/Time when the resources will be yours, per this reservation (eg now)
  • RSpec (not specified as request or manifest)
  • Attributes (AM/resource-type specific name/value pairs)
    • Including optionally geni_state_guarantee as defined below, to indicate if existing slivers will be disrupted (default is no guarantee).
    • Including geni_ticket_service_class as defined below, to indicate the firmness of the promise this ticket represents
  • signature including issuing AM's certificate

More formally:

{
 owner_gid = <the certificate of the experimenter who requested the ticket>,
 target_gid = <slice certificate, following GENI AM API certificate specification>,
 uuid = <RFC 4122 compliant string>,
 expires = <RFC3339 compliant Date/Time when the resources will no longer be yours per this reservation (eg sliver duration+now)>,
 redeem_before = <RFC3339 compliant Date/Time when you must redeem this reservation, or your resources will be returned to the available pool (eg now+epsilon)>,
 starts = <RFC3339 compliant Date/Time when the resources will be yours, per this reservation (eg now)>,
 rspec = <RSpec (not specified as request or manifest)>,
 attributes = { 
    geni_state_guarantee = <string>,
    geni_ticket_service_class = <string>,
    <others>
  },
  signature
}

Tickets may include in the attributes element the attribute geni_state_guarantee, indicating whether the AM will preserve the state of any existing resources (case insensitive string or integer).

  • 0=NO_GUARANTEE (Default: all state may be lost)
  • 1=SAVE_DISK (disk state will be preserved but running processes will be lost)
  • 2=SAVE_DISK_AND_PROCESSES (both disk state and running processes will be preserved, like migrating a VM)
  • 3=NO_DISRUPTION (no noticeable service disruption)

Tickets should include the geni_ticket_service_class attribute for advertising the firmness of the promise that a ticket represents (case insensitive string or integer).

  • FIXME: Provide definitions for these service classes.
  • 1=WEAK_EFFORT
  • 2=BEST_EFFORT
  • 3=ELASTIC_RESERVATION
  • 4=HARD_RESERVATION

Tickets will follow a defined schema, to be published on geni.net.

Tickets logically have a URN (not included in the ticket): urn:publicid:IDN+<AM name>+ticket+<uuid>
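
To make the timestamps and the logical URN concrete, here is a minimal client-side sketch. It is not the ticket schema, which is still to be published. The class `TicketTimes` and the function `ticket_urn` are hypothetical names, and the sketch assumes that a ticket may still be redeemed exactly at redeem_before.

```python
import uuid
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class TicketTimes:
    """The ID and the three timestamps a ticket carries (timezone-aware)."""
    ticket_uuid: str
    starts: datetime
    redeem_before: datetime
    expires: datetime

    def is_redeemable(self, now: datetime) -> bool:
        # Assumption: redeeming exactly at redeem_before is still allowed.
        return now <= self.redeem_before


def ticket_urn(am_name: str, ticket_uuid: str) -> str:
    """Build urn:publicid:IDN+<AM name>+ticket+<uuid> as described above."""
    # uuid.UUID validates the string and str() gives the RFC 4122 text form.
    canonical = str(uuid.UUID(ticket_uuid))
    return "urn:publicid:IDN+%s+ticket+%s" % (am_name, canonical)
```

For example, `ticket_urn("example-am", "123e4567-e89b-12d3-a456-426614174000")` yields a URN ending in `+ticket+123e4567-e89b-12d3-a456-426614174000`. The AM name here is made up.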

For a similar structure in ProtoGENI, see https://www.protogeni.net/trac/protogeni/attachment/wiki/Authentication/credential.rnc

Methods

  1. GetTicket
    struct GetTicket (string slice_urn, string credentials[], string requestRSpec, 
                               struct options)
    
  • Get a ticket promising resources requested in the rspec.
  • If there is already an outstanding ticket for the slice, an error is returned.
  • Return: ticket
  • Result State: ticketed
  • Options may include geni_start_time and geni_end_time (see below)
  2. RedeemTicket
    struct RedeemTicket(string slice_urn, string credentials[], string ticket, 
                                     struct users[] (as in CreateSlivers), struct options)
    
  • Return:
    {
     string rspec=<manifest>,
     string geni_start_time=<optional (may be omitted altogether): now if not specified>,
     string geni_expires=<RFC3339 sliver expiration>,
     string geni_status=<sliver state - allocated (or optionally changing or ready)>,
     string geni_state_guarantee=<promise from AM of what experimenter state will be lost on trying to 'start' this allocation>,
     <others that are AM specific>
    }
    
  • Begin allocating the resources promised in the ticket.
  • Option geni_auto_start:
    • If supplied and true (boolean: 0 or 1 in XML-RPC), the aggregate automatically starts or restarts resources as necessary, as though the experimenter called ActOnSlivers(start).
      • State will be changing and then ready
    • If omitted the aggregate does not start resources (default behavior). The final state is allocated, and the experimenter must explicitly start or restart resources using ActOnSlivers
    • Note that resources which do not require a 'start' may already be ready on normal return from RedeemTicket.
  • Omitting the ticket causes the aggregate to redeem the outstanding ticket for this slice if any. If none, return an error code.
  • The ticket must be valid: not expired, previously redeemed, or replaced. If the ticket is not valid, an error is returned.
  3. ReleaseTicket
    struct ReleaseTicket(string slice_urn, string credentials[], string ticket, struct options)
    
  • Give up the reservation for resources.
  • Return: True or error
  • Omitting the ticket causes the aggregate to release the 0 or 1 outstanding tickets for this slice.
  • If this ticket was from UpdateSlivers, then the sliver returns to the allocated state and existing resources are not modified.
  4. UpdateTicket (atomic release/get)
    struct UpdateTicket(string slice_urn, string credentials[], string requestRSpec,
                        string ticket, struct options)
  • For updating a reservation in place, replacing one ticket with a new one. On success, the old ticket is invalid.
  • Return: Ticket
  • Result State: ticketed
  • Options may include geni_start_time and geni_end_time (see below)
  • The ticket must be valid: not expired, previously redeemed, or replaced. If the ticket is not valid, an error is returned.
  5. UpdateSlivers
    struct UpdateSlivers(string slice_urn, string credentials[], string requestRSpec, 
                                                     struct options)
    
  • Returns a struct:
    {
      string ticket=<ticket>,
      string geni_status=<sliver state - ticketed>,
      string geni_state_guarantee=<promise from AM of what experimenter state will be lost on trying to 'start' this allocation>,
      <others that are AM specific>
    }
    
  • Get a promise for resources that would replace currently allocated resources, as defined in Change Set C.
  • Result State: ticketed
  • On completion, the slice has both a ticket and a set of slivers at this aggregate. Overall it is both allocated and ticketed, which is called ticketed.
  • Options may include geni_start_time and geni_end_time, RFC3339 requested start and end times for the reservation (neither option is required).
    • The request should Fail (return an error code) if the resources cannot be reserved from or until the requested time.
  • The method takes a full request RSpec - not a diff.
    • AMs may, as always, return UNSUPPORTED, e.g. if they are incapable of determining what changes to apply (computing a diff).
  • The request is either fully satisfied, or fails (returns an error code).
  • AMs must document the level of service they provide using levels from geni_state_guarantee: will any state be lost on existing resources?
    • Default is to provide no guarantee.
  • Experimenters may specify what level of disruption they can tolerate, using the geni_state_guarantee option.
    • AMs are expected to fail a request with a specified service guarantee that they cannot satisfy. Default is to request no guarantee.
  • For further details on the UpdateSlivers semantics, see Change Set C.
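
One way a client might combine these methods into the coordinated, two-phase reservation described in the motivation is sketched below. This is an illustration under stated assumptions, not a prescribed client. The function `reserve_at_all` is hypothetical, and each aggregate is any object exposing GetTicket, RedeemTicket, and ReleaseTicket with the argument order given above (for instance an XML-RPC proxy). The sketch also assumes that a failed call raises an exception. An aggregate that reports errors inside its return struct would need an explicit check instead.

```python
def reserve_at_all(aggregates, slice_urn, credentials, rspecs, users, options=None):
    """Reserve at every aggregate or at none.

    aggregates: dict mapping a name to a proxy object for that AM.
    rspecs:     dict mapping the same names to request RSpec strings.
    Returns a dict mapping each name to the RedeemTicket result.
    """
    options = options or {}
    tickets = {}
    try:
        # Phase 1: collect a promise from every aggregate before committing.
        for name, am in aggregates.items():
            tickets[name] = am.GetTicket(slice_urn, credentials, rspecs[name], options)
    except Exception:
        # One aggregate could not promise: give back what we already hold.
        for name, ticket in tickets.items():
            aggregates[name].ReleaseTicket(slice_urn, credentials, ticket, options)
        raise
    # Phase 2: every aggregate has promised, so redeem all tickets.
    return {
        name: aggregates[name].RedeemTicket(slice_urn, credentials, ticket, users, options)
        for name, ticket in tickets.items()
    }
```

The sketch does not handle a failure during the redeem phase. A real client would also need to decide what to do when one RedeemTicket call fails after others have succeeded.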

For a similar set of functions in ProtoGENI, see: https://www.protogeni.net/trac/protogeni/wiki/ComponentManagerAPIV2

Other changes to support tickets

  • CreateSlivers remains the first call: do not use it to add resources to the slice.
  • ListResources return value changes to be:
    {
       string rspec (ad or Manifest - may be empty though)
       string tickets[] (required but may be an empty list)
    }
    
  • For ListResources with no slice_urn, tickets shall be an empty list, and rspec shall be an ad RSpec.
  • For ListResources with a slice_urn, rspec is the manifest RSpec for everything belonging to that slice at this AM, if anything is currently allocated (not just a ticket). tickets is then any outstanding ticket(s) for this slice.

Scheduling support using Tickets

This ticket structure and these methods, with small additions, support using tickets for scheduling. This proposal does not require support for scheduling at aggregates.

  • We are not explicitly supporting scheduling, but the timestamps here should be sufficient.
  • GetTicket, CreateSlivers, ListResources, UpdateTicket, UpdateSlivers all accept new RFC3339 compliant geni_start_time and geni_end_time options to support scheduling in the future.
    • For GetTicket and CreateSlivers, if left out then the reservation start is 'now or really soon' and the end is start plus the default sliver duration.
    • AMs that do not support scheduling return UNSUPPORTED when passed geni_start_time.
    • AMs should still support geni_end_time, following the logic of RenewSlivers to determine if the requested duration of the sliver is acceptable.
      • That is, at CreateSlivers, GetTicket, and UpdateSlivers in particular.
      • The request should Fail (return an error code) if the resources cannot be reserved until the requested time.
  • redeem_before in tickets should be starts+epsilon. That epsilon is AM specific, but typically a small number of minutes.
  • Multiple tickets may be outstanding for a single slice at a single AM only for non overlapping time intervals.
    • For example, you could request 2 tickets simultaneously: 1 for machines 1-3 on Tuesday and 1 for machines 4-6 on Thursday.
  • These options are accepted in ListResources as well.
    • Specifying geni_start_time means tell me what will be available at that time. Default is now.
    • Specifying both geni_end_time and geni_start_time means show me only things available for that entire duration.
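
The rule that multiple tickets may be outstanding only for non-overlapping time intervals can be sketched as a simple interval check. The function names are hypothetical, and the sketch assumes half-open intervals, so that one reservation may begin at the instant another ends. The proposal does not state how the boundary case is treated.

```python
def overlaps(start_a, end_a, start_b, end_b):
    """True if the half-open intervals [start_a, end_a) and [start_b, end_b) intersect."""
    return start_a < end_b and start_b < end_a


def may_issue_ticket(outstanding, starts, expires):
    """Check a new (starts, expires) request against outstanding tickets.

    outstanding: iterable of (starts, expires) pairs for tickets already
    issued to this slice at this AM. All values are timezone-aware datetimes.
    """
    if starts >= expires:
        raise ValueError("starts must be before expires")
    return not any(overlaps(s, e, starts, expires) for s, e in outstanding)
```

In the Tuesday and Thursday example above, the second request would be accepted because its interval does not intersect the first.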

An Alternative: Provide two UpdateSlivers methods

One alternative would be to define two versions of UpdateSlivers, with and without an intermediate ticket. The no-ticket version of this method would behave like CreateSlivers, immediately allocating requested resources. For details on this proposal, see above.


Change Set F: Support AM and resource-type specific methods.

Define the control API (the AM API) as about moving slivers through various states at an AM.

Motivation

AM API methods logically change the state of the slivers at this AM. But the API is not clear what experimenters should expect, and does not provide easy ways for experimenters to control when and how states change. There is in particular no way to move slivers through states and change them in ways otherwise undefined by the API.

Change Set F1: Define Sliver States

Currently the AM API defines several possible states as valid returns in SliversStatus: configuring, ready, unknown, and failed. This change changes and expands that list of valid states, and explicitly defines the expected states after each AM API method call. Additionally, this change provides a mechanism for aggregates to supply their own states.

The GENI AM API can be thought of as manipulating slivers. As such, each method potentially changes the state of 1 or more slivers. With the changes proposed here, several of the methods return a new geni_status field, whose value is one of the standard GENI sliver status values. Aggregates must use one of the standard GENI values for that return.

geni_status legal values (case insensitive):

  • uninitialized: This is the state before any AM-local operation for this slice.
  • ticketed: The resources are reserved for the slice, but not currently provisioned for the slice. Slivers are ticketed after GetTicket, UpdateTicket, or after UpdateSlivers. Note in particular that a slice may have some resources that are ready and others which are ticketed after an UpdateSlivers call: we call the whole slice ticketed in this case.
  • allocated: The sliver(s) are currently provisioned for the slice, but not necessarily fully ready for experimental use (eg, not booted). This is the state after RedeemTicket, or after CreateSlivers with the geni_donotstart option.
  • ready: The resources are ready for experimental use, as in after CreateSlivers completes any booting or starting. Similarly after ActOnSlivers with the start command. Note that each of those methods starts a process that may take significant time to complete. During that time the sliver will not yet be ready.
  • closed: When the slice was previously provisioned resources, which have now expired or been de-allocated with DeleteSlivers, we call the sliver closed. Note that this state is rarely seen in practice - aggregates do not respond in this API to queries about slices that do not currently have outstanding allocations or tickets.
  • changing: This is the state of a sliver in transition. For example, while a machine is booting (changing from allocated to ready). This state used to be known as configuring.
  • shutdown: This is the state of a sliver after the Shutdown operation - the sliver is still allocated to the slice, possibly still booted and configured for the slice, but is not available for experimental use. An administrator must intervene to recover or delete the slivers.
  • failed: When an operation fails leaving the sliver unusable and requiring administrative intervention, it will be marked failed.
  • unknown: If the aggregate does not know the state of a sliver, it will be marked unknown. This state may be transient, or may require an admin to recover.

As in previous versions of this API, the state of the full set of slivers in a slice at an aggregate is a roll-up of the states of each sliver. For each of ticketed, allocated, and ready, the set of slivers is only in that state if all individual slivers are in that state. If any sliver is shutdown or failed or changing (in order of decreasing precedence), then the set of slivers is in that state. If all slivers are unknown or closed, then the slice at this aggregate is unknown or closed.

  • If not all resources in the sliver/slice can be moved to the desired next state, then the call fails.
  • When moving from state 1 to 2, the slice is in state 1 until all slivers are in state 2 (EG moving from ticketed->allocated).
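The roll-up rules above can be sketched in a few lines of Python. This is only an illustration, not part of the API: the function name rollup_status is hypothetical, the handling of an empty sliver list is an assumption, and the treatment of mixed states (report the earliest lifecycle state present) is one reading of the "state 1 until all slivers are in state 2" rule.

```python
# States that apply to the whole set if ANY sliver is in them,
# listed in order of decreasing precedence (from the text above).
DOMINANT = ("shutdown", "failed", "changing")

# Lifecycle order, used only to resolve mixed sets (an assumption).
LIFECYCLE = ("uninitialized", "ticketed", "allocated", "ready")


def rollup_status(sliver_states):
    """Return the slice-level geni_status for a list of sliver states."""
    states = [s.lower() for s in sliver_states]  # values are case insensitive
    if not states:
        return "uninitialized"  # assumption: no slivers means nothing done yet
    for dominant in DOMINANT:
        if dominant in states:
            return dominant
    if all(s == states[0] for s in states):
        return states[0]  # ticketed, allocated, ready, unknown or closed
    for stage in LIFECYCLE:
        if stage in states:
            return stage  # mixed set: still in the earlier state
    return "unknown"  # assumption: e.g. a mix of unknown and closed
```

For instance, a slice with one ready sliver and one ticketed sliver rolls up to ticketed, matching the note in the ticketed definition above.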

Aggregates are free to also return an aggregate specific status - either in an AM-specifically-named entry, or in am_specific_status. Such values should be thought of as sub-states within the GENI state. For example, where the GENI state might be changing, the AM specific state might also be imaging or booting. Methods which accept a state (ActOnSlivers) may accept either one of the geni_status values, or an aggregate specific value. Aggregates must document the meaning and use of aggregate specific status values.

State changes by method:

  • GetTicket: From uninitialized to ticketed
  • UpdateTicket: From ticketed to ticketed
  • ReleaseTicket: From ticketed to uninitialized (or allocated if this was an update)
  • RedeemTicket: From ticketed to allocated
  • CreateSlivers: From uninitialized to allocated
    • And then to ready via changing if the geni_donotstart option is not supplied
  • UpdateSlivers: From allocated or ready to ticketed
  • DeleteSlivers: From ready or allocated (or changing, etc) to closed (not ticketed)
  • Shutdown: From allocated or ready to shutdown

Note that changing or unknown may be a source state for any of these methods. Operations may fail, leaving a sliver failed, and operations may take time leaving a sliver changing for some time.
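As a rough illustration, the transitions listed above can be written as a lookup table. The names TRANSITIONS, is_legal and next_state are hypothetical, and the sketch ignores failures and the intermediate changing state on the way to ready.

```python
# Source states and resulting state per method, transcribed from the list above.
TRANSITIONS = {
    "GetTicket": ({"uninitialized"}, "ticketed"),
    "UpdateTicket": ({"ticketed"}, "ticketed"),
    "ReleaseTicket": ({"ticketed"}, "uninitialized"),
    "RedeemTicket": ({"ticketed"}, "allocated"),
    "CreateSlivers": ({"uninitialized"}, "allocated"),
    "UpdateSlivers": ({"allocated", "ready"}, "ticketed"),
    "DeleteSlivers": ({"ready", "allocated"}, "closed"),
    "Shutdown": ({"allocated", "ready"}, "shutdown"),
}

# changing or unknown may be a source state for any method.
ALWAYS_POSSIBLE_SOURCES = {"changing", "unknown"}


def is_legal(method, current):
    """True if `method` may be called on slivers in state `current`."""
    sources, _target = TRANSITIONS[method]
    return current.lower() in sources | ALWAYS_POSSIBLE_SOURCES


def next_state(method, current, ticket_was_update=False):
    """Return the expected state once `method` succeeds."""
    if not is_legal(method, current):
        raise ValueError(f"{method} is not defined from state {current!r}")
    if method == "ReleaseTicket" and ticket_was_update:
        return "allocated"  # releasing an update ticket keeps the old allocation
    return TRANSITIONS[method][1]
```

A client could use such a table to warn an experimenter before issuing a call that the aggregate is expected to reject.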

Methods and state transitions as a picture:

[Figure: sliver-states.jpg, a diagram of the methods and state transitions; the image is not attached to this page]

Note: some resources may not require an explicit 'start' operation. In this case CreateSlivers may leave some slivers ready, skipping right past allocated.

Summary of changes:

  • configuring becomes changing, which can be used in many other cases, in returns from SliversStatus
  • New states uninitialized, ticketed, allocated, closed, and shutdown are added
  • State transitions for each method are defined
  • am_specific_status optional return defined
  • geni_status is returned by SliversStatus, RedeemTicket, UpdateSlivers, CreateSlivers, and ActOnSlivers (if all relevant change sets are adopted).

Change Set F2: ActOnSlivers

This change introduces a new method, providing a generic way to act on slivers in an AM or resource type specific way. This method shall be used to 'start' or 'stop' or 'restart' resources that have been allocated but not started by CreateSlivers or RedeemTicket. It may also be used to change the state of slivers (or their contained resources) in an aggregate or resource specific way. Some aggregates may use this method to change configuration details of allocated resources. This might include changing acceptable login keys.

ActOnSlivers takes a command, urn, state, and options. The method return is a struct that includes the urn, geni_status of the sliver(s), and any other AM and operation specific options. The URN may be a slice urn, meaning all slivers in that slice at this AM are affected. Or the URN may be a particular sliver URN. The state argument is one of the geni_status values, or an AM-specific value. The state meaning depends on the command, but typically indicates the desired or resulting new state of the sliver(s). If the AM wishes to return an aggregate specific sliver status, it should still return a valid geni_status, and use an additional entry to also return the aggregate specific state. The command argument is aggregate defined. This API does not specify how aggregates advertise valid commands.

Three particular commands are specified however: start, stop, and restart (case insensitive). If an aggregate provides resources which require an explicit action to make allocated resources ready for experimenter use (booting, applying a configuration change) then the aggregate must make that operation available using these commands. These commands are used after RedeemTicket or when the geni_donotstart option is supplied to CreateSlivers for example.

For example, to start allocated resources:

Arguments:
  command = start
  urn = <slice or sliver urn>
  state = ready
  options = <none required>
Result: 
  urn = <same as input>
  geni_status = changing or ready on success

FIXME: After UpdateSlivers, does start on the slice start only new stuff? How do changes to existing resources take effect? Does restart on the slice restart everything or only changed things? Must the experimenter selectively restart changed things and use start to start new things?

Method signature:

struct ActOnSlivers(string command, string credentials[], string urn, string state, struct options)

Return struct:

{
 string urn=<urn of sliver or slice>,
 string geni_status=<new state of the slivers>,
 <other entries specific to the AM or resources - specifically am_specific_status>
}
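A client-side sketch of this call might look like the following. The helper names are hypothetical and nothing here contacts an aggregate; the functions only assemble the positional arguments in the order of the signature above and sanity-check a returned struct.

```python
# Standard geni_status values from Change Set F1.
GENI_STATES = {
    "uninitialized", "ticketed", "allocated", "ready", "closed",
    "changing", "shutdown", "failed", "unknown",
}


def build_act_on_slivers_args(command, credentials, urn, state, options=None):
    """Positional arguments in the order of the proposed signature."""
    return (command, list(credentials), urn, state, dict(options or {}))


def check_act_on_slivers_result(result, requested_urn):
    """Validate the return struct and give back the normalized geni_status."""
    status = result["geni_status"].lower()  # values are case insensitive
    if status not in GENI_STATES:
        raise ValueError(f"not a standard geni_status: {status!r}")
    if result["urn"] != requested_urn:
        raise ValueError("returned urn differs from the requested urn")
    return status
```

With Python's standard xmlrpc.client, the tuple could be passed on as server.ActOnSlivers(*args), where server is a ServerProxy for the aggregate URL; any aggregate specific entry such as am_specific_status is left for the caller to interpret.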

Change Set G: Credentials are general authorization tokens.

Motivation

Most AM API methods take a list of credentials to authorize operations. Currently the API requires credentials in a particular format, and would disallow others, such as ABAC. The API should allow for other innovative authorization tokens.

Make credentials more general

This change modifies the credentials argument to all methods. Each credential is now defined only as a signed document. A given list of credentials may contain credentials in multiple formats. The list may be empty. A given authorization policy at an AM may require 0, 1, or many credentials. Aggregates are required to allow credentials which are not used by local authorization policy or engines, using only credentials locally relevant.

  • An AM must pick credentials out of the list that it understands and be robust to receiving credentials it does not understand.
    • Current slice and user credentials will be recognizable for following the schema defined in Change Set K.
  • AMs are required to continue to accept current-format credentials.
    • In particular, a single standard slice credential remains sufficient for most authorization policies.
  • Other credential formats acceptable to some aggregates might include, for example, ABAC x509 Attribute certificates.
  • AMs may get other authorization material from other sources: EG a future Credential Store service.

Changes to existing methods

Modify a few existing methods to make certain operations easier or more experimenter friendly.

Change Sets H&I: Misc other method changes

  • Change Set H: A second call to CreateSlivers without an intervening DeleteSlivers is an error.
    • This change just clarifies expected behavior that was left under-specified in AM API v1.
    • CreateSlivers takes a full Request RSpec; it is not a way to 'add' resources.
    • Silently replacing the existing slivers with new slivers (similar to a call to UpdateSlivers) is not acceptable.
  • Change Set I1: Add geni_expires to return from SliversStatus for whole slice and then each sliver
    • This change standardizes behavior necessary for experimenters to determine their sliver expiration times.
    • Format is RFC3339 (http://www.ietf.org/rfc/rfc3339.txt)
      • Full date and time with explicit timezone (offset from UTC or in UTC)
      • eg: 1985-04-12T23:20:50.52Z or 1996-12-19T16:39:57-08:00
  • Change Set I2: Add SSH logins/keys to each node that supports SSH login in the return from SliversStatus

This change standardizes behavior so experimenters can readily find how to log in to reserved resources.

'users' => [{'urn'   => $user1_urn,
             'login' => $login,
             'protocol' => [ssh, or ?],
             'port' => [22 or ?],
             'keys'  => [...] },
            {'urn'   => $user2_urn,
             'login' => $login,
             'protocol' => [ssh, or ?],
             'port' => [22 or ?],
             'keys'  => [...] }
           ]
  • A note on distinguishing ListResources from SliversStatus:
    • ListResources in the context of a slice URN is for listing the reserved resources. It provides mostly static information. (But if the manifest contains things which can change, then the manifest must change when those things (like say IP addresses) change.)
    • SliversStatus is for everything else: anything which the AM can change for you using API calls, or which changes over time. So it has up/down state, expiration time, and now login keys. It provides that for your whole slice at this aggregate and all contained slivers.
  • Change Set I3: Return sliver expiration from CreateSlivers

Experimenters currently do not know the expiration of their slivers without explicitly asking. This change makes the CreateSlivers return value become a struct:

{
 string rspec=<manifest>,
 string geni_start_time=<optional (may be omitted altogether): now if not specified>,
 string geni_expires=<RFC3339 sliver expiration, as in geni_expires from SliversStatus>,
 string geni_status=<sliver state - allocated or changing or ready>,
 <others that are AM specific>
}
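A client reading geni_expires needs to parse an RFC3339 timestamp. The sketch below uses only the Python standard library and a hand-written pattern so that it handles both example formats given above; parse_rfc3339 and is_expired are hypothetical helper names, and fractional seconds are truncated to microseconds.

```python
import re
from datetime import datetime, timedelta, timezone

_RFC3339 = re.compile(
    r"^(\d{4})-(\d{2})-(\d{2})[Tt](\d{2}):(\d{2}):(\d{2})"
    r"(\.\d+)?([Zz]|[+-]\d{2}:\d{2})$"
)


def parse_rfc3339(text):
    """Parse a full RFC3339 date-time into a timezone-aware datetime."""
    match = _RFC3339.match(text)
    if match is None:
        raise ValueError(f"not an RFC3339 date-time: {text!r}")
    year, month, day, hour, minute, second = (int(g) for g in match.groups()[:6])
    fraction, zone = match.group(7), match.group(8)
    # Pad or truncate the fractional part to exactly six digits.
    micro = int((fraction[1:] + "000000")[:6]) if fraction else 0
    if zone in ("Z", "z"):
        tz = timezone.utc
    else:
        sign = -1 if zone[0] == "-" else 1
        offset = timedelta(hours=int(zone[1:3]), minutes=int(zone[4:6]))
        tz = timezone(sign * offset)
    return datetime(year, month, day, hour, minute, second, micro, tzinfo=tz)


def is_expired(geni_expires, now=None):
    """True if the sliver expiration time has passed."""
    now = now or datetime.now(timezone.utc)
    return parse_rfc3339(geni_expires) <= now
```

Because the result is timezone-aware, expiration times returned with different UTC offsets compare correctly against each other.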
  • Change Set I4: CreateSlivers optionally does not start resources.

Currently, CreateSlivers auto starts resources, moving them from allocated through changing to ready.

  • Add a new option geni_donotstart:
    • If supplied and true (boolean: 0 or 1 in XML-RPC), final state is allocated, and experimenter must explicitly start resources using ActOnSlivers.
    • If omitted or false (0) (default), AM automatically starts resources as before, and state will be changing and then ready.

Change Set J: Proxy aggregate managers are supported

The GENI architecture that the architecture group is developing uses a proxy aggregate to handle standardized authorization and logging. This change is necessary to make such aggregates possible.

Proxy aggregates will be supported by having client tools delegate slice credentials to the proxy aggregate (preferably without undue work by experimenters). This delegated credential will typically be marked non-delegatable and otherwise limited where possible. Note that experimenters are trusting proxy aggregates to perform only the desired action on their slice, although the delegated slice credential would allow other actions. This trust however is no more than the trust experimenters place in other AMs.

Clients would contact the proxy aggregate URL, having learned that the proxy's URL is the way to contact the aggregate whose resources the client wants. Then through slice credential delegation the proxy AM can apply standard AM API authorization logic to authorize the operation at the proxy AM, and the 'real AM' can receive the call from the proxy AM, and also apply standard AM API authorization logic to validate the call from the proxy AM. Some proxied aggregate implementations may accept connections only from a proxy AM.

Two changes are required to make this possible:

  1. Add a new optional return entry from GetVersion: geni_is_proxy_am, a boolean (0 or 1 in XML-RPC). When present and true, the aggregate is advertising that it is a proxy aggregate. Clients should delegate any slice credentials to this aggregate when making subsequent AM API calls. This API does not specify how proxy aggregates advertise their certificate, which is required for clients to delegate slice credentials to the proxy.
  2. Define a new option geni_experimenter_urn.
    • Proxy AMs will retrieve the experimenter URN from the subjectAltName of the client SSL certificate, and then supply this value in calls to the 'real AM' where the call is being redirected.
    • Such 'real' AMs should recognize they are being invoked by a proxy AM (having received a slice credential delegated to a proxy aggregate). In that case and that case only, the AM should include this geni_experimenter_urn in all logs to indicate the actual experimenter being granted resources.
    • The supplied slice credential should be a delegated credential, indicating that this 'real' experimenter was granted the required rights on this slice at some point in this chain - typically directly by the slice authority, and as the signer of the current credential delegating rights to the proxy AM.
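The handling of geni_experimenter_urn on both sides might be sketched as below. The function names are hypothetical, and two details are assumptions rather than part of this proposal: that the proxy overwrites any value the client supplied, and that the 'real' AM has already determined (from the delegated credential) whether it is being called by a proxy.

```python
def proxy_options(client_options, experimenter_urn):
    """Options the proxy AM forwards to the 'real' AM.

    `experimenter_urn` is the URN the proxy read from the subjectAltName
    of the client SSL certificate.
    """
    forwarded = dict(client_options)
    # Assumption: always overwrite, so a client cannot claim another identity.
    forwarded["geni_experimenter_urn"] = experimenter_urn
    return forwarded


def identity_to_log(caller_urn, options, called_by_proxy):
    """URN the 'real' AM records as the experimenter being granted resources."""
    if called_by_proxy and "geni_experimenter_urn" in options:
        return options["geni_experimenter_urn"]
    # In every other case the option is ignored, as the text requires.
    return caller_urn
```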

Change Set K: Standardize certificates and credentials

Motivation

The current AM API specifies that certificates and credentials follow a particular format, using URNs that are based on experimenter supplied names. However that specification is not sufficiently specific, and there are currently differences in implementation among existing certificate and credential producers. This has led to errors, experimenter confusion, and messy code.

Changes

This proposal requires that certificates include a UUID and email address for the subject. It codifies restrictions on usernames, sliver names, and slice names. The proposal specifies that slices have a UUID to be used to identify slices in a consistent way over time (slice names may be re-used).

Some overall points:

  • Aggregates are expected to fail requests that use certificates or URNs or names that violate this API.
  • Aggregates are required to consult and accept Certificate Revocation Lists in accordance with RFC 3280 and RFC 5280.
  • Schemas for certificates & credentials will be published on geni.net.

Certificates:

  • GENI uses x509v3 identity certificates to identify users, slices, aggregates, and authorities, and these restrictions apply to all such certificates.
  • See http://groups.geni.net/geni/wiki/GeniApiCertificates.
  • Aggregates are required to properly validate all certificates to authenticate access to AM API calls, and fail calls that supply invalid certificates.

Certificate contents:

  • Version shall be properly marked: 3
  • serialNum is required to be unique within the certificate authority: each newly issued certificate must have a unique serial number.
  • The Distinguished Name should include a human readable identifier, for both subject and issuer. Details are not specified.
  • User (and most authority) certificates shall be marked CA:TRUE in the x509 v3 basic constraints; Slices shall be marked FALSE
  • The Subject Alternative Name field must include 3 pieces of information
    • Entries are comma separated (', '), and may be in any order.
    • The URN identifier, following GENI URN standards as described here: http://groups.geni.net/geni/wiki/GeniApiIdentifiers
      • The URN is identifiable by looking for the entry beginning "URI:urn:publicid:IDN", for example: URI:urn:publicid:IDN+emulab.net+user+stoller.
    • A UUID, providing a globally unique ID for the entity.
      • In the hexadecimal digit string format given in RFC 4122
      • The UUID is identified with this prefix: "URI:urn:uuid" (as specified by RFC4122), for example: URI:urn:uuid:33178d77-a930-40b1-9469-3aae08755743.
    • The email address is an RFC2822 compliant and working address for contacting the subject of the certificate (experimenter, authority administrator, or slice owner).
      • The email entry is identified by the prefix "email:", for example: email:stoller@emulab.net
  • Recommendation: Authorities are encouraged but not required to include a URL where more information about the subject is available (eg slice authority registry URL). That URL may be included in a certificate extension, in the DN, or in the subjectAltName.
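To illustrate the three required Subject Alternative Name entries, here is a sketch that splits an already-extracted subjectAltName string into its URN, UUID, and email parts using the prefixes given above. The function name is hypothetical, and extracting the string from the certificate itself is outside the scope of the example.

```python
import uuid


def parse_subject_alt_name(san):
    """Split a comma separated subjectAltName string into urn, uuid and email."""
    fields = {}
    for entry in san.split(","):  # entries may appear in any order
        entry = entry.strip()
        if entry.startswith("URI:urn:publicid:IDN"):
            fields["urn"] = entry[len("URI:"):]
        elif entry.startswith("URI:urn:uuid:"):
            # uuid.UUID raises ValueError on a malformed hexadecimal string.
            fields["uuid"] = uuid.UUID(entry[len("URI:urn:uuid:"):])
        elif entry.startswith("email:"):
            fields["email"] = entry[len("email:"):]
    missing = {"urn", "uuid", "email"} - set(fields)
    if missing:
        raise ValueError(f"subjectAltName is missing: {sorted(missing)}")
    return fields
```

Entries with other prefixes, such as the optional information URL, are ignored by this sketch.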

Slices:

  • Slice ID is the UUID in the slice certificate, which is unique over time and space.
  • Slice URN is a label - unique at a point in time but not over time.
    • Format: urn:publicid:IDN+<SA name>+slice+<slice name>
  • Slice names are <=19 characters, only alphanumeric plus hyphen (no hyphen in first character): '^[a-zA-Z0-9][-a-zA-Z0-9]+$'
  • Aggregates are required to accept any compliant slice name and URN.
    • Note that this currently causes problems at PlanetLab/SFA aggregates, where node login names are based on slice names and are limited to 31 characters.
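The slice naming rules can be checked mechanically. The sketch below applies the pattern and length limit quoted above and builds a slice URN in the stated format; the function names are hypothetical.

```python
import re

# Pattern as given above. Note that, as written, it requires at least
# two characters, since the first class is followed by "+".
SLICE_NAME = re.compile(r"[a-zA-Z0-9][-a-zA-Z0-9]+")
MAX_SLICE_NAME_LENGTH = 19


def is_valid_slice_name(name):
    """True if `name` satisfies the proposed slice name restrictions."""
    return (len(name) <= MAX_SLICE_NAME_LENGTH
            and SLICE_NAME.fullmatch(name) is not None)


def slice_urn(sa_name, slice_name):
    """Build urn:publicid:IDN+<SA name>+slice+<slice name>."""
    if not is_valid_slice_name(slice_name):
        raise ValueError(f"invalid slice name: {slice_name!r}")
    return f"urn:publicid:IDN+{sa_name}+slice+{slice_name}"
```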

Slivers:

  • Have a URN (returned in manifest RSpec), determined by the aggregate.
  • This URN should be unique over time within an AM for record-keeping / operations purposes.
    • Format: urn:publicid:IDN+<AM name>+sliver+<sliver name>
  • Sliver names
    • Must be unique over time within that AM, and are selected by the AM.
    • May use only alphanumeric characters plus hyphen.

Usernames:

  • Usernames (user identifiers to the system) are set at the authority.
  • Usernames are case-insensitive internally, though they may be case-sensitive in display.
    • EG JohnSmth as a display name is johnsmth internally, and there cannot also be a user JOHNSMTH.
  • Usernames should begin with a letter and be alphanumeric or underscores - no hyphen or '.': ('^[a-zA-Z][\w]+$').
  • Usernames are limited to 8 characters.
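An authority could enforce the username rules along these lines. The class and function names are hypothetical, and restricting \w to ASCII is an assumption (in Python, \w would otherwise also match non-ASCII letters).

```python
import re

# Pattern as given above, compiled with re.ASCII so that \w means [a-zA-Z0-9_].
USERNAME = re.compile(r"[a-zA-Z][\w]+", re.ASCII)
MAX_USERNAME_LENGTH = 8


def normalize_username(display_name):
    """Return the internal (lowercase) form, or raise ValueError if invalid."""
    if len(display_name) > MAX_USERNAME_LENGTH or not USERNAME.fullmatch(display_name):
        raise ValueError(f"invalid username: {display_name!r}")
    return display_name.lower()


class UsernameRegistry:
    """Tracks usernames case-insensitively while keeping the display form."""

    def __init__(self):
        self._display = {}  # internal name -> display name

    def is_available(self, display_name):
        return normalize_username(display_name) not in self._display

    def register(self, display_name):
        internal = normalize_username(display_name)
        if internal in self._display:
            raise ValueError(f"username already taken: {internal!r}")
        self._display[internal] = display_name
        return internal
```

After registering JohnSmth, a later attempt to register JOHNSMTH is rejected because both map to johnsmth internally.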

Change Set L: Change SFA credentials' privileges

Our goal is to simplify and standardize privilege strings. Currently there are wildcards, bind, embed, and others. They are confusing. We also want extensibility to use these credentials elsewhere in future.

Credentials should support these kinds of operations:

  • Learn about the slice
  • Add/Modify/Delete resources in the slice
  • Read slice details like I&M?
  • Use the slice
  • Operator shutdown

Proposal - Replace all existing privileges with only the following possible strings (case insensitive):

  • CanWrite
    • If present in a valid slice credential, aggregates may permit CreateSlivers, RenewSlivers, DeleteSlivers, Shutdown, plus new methods ActOnSlivers, UpdateSlivers, GetTicket, RedeemTicket, UpdateTicket, ReleaseTicket
    • Thus it replaces bind, embed, control, instantiate, sa, pi, or * in various places
  • CanRead
    • If present in a valid slice credential, aggregates may permit ListResources with a slice_urn, SliversStatus
    • Thus it replaces info or * in various places
  • CanReadDetails
  • CanUse

Note that those last 2 may never get used, but are there in case I&M or opt-in make those useful.

Note also that operators who wish to shut down a slice would need a slice credential with the CanWrite privilege.

Privilege and credential semantics are defined as follows:

  • Aggregates may only grant access to a method if at least one valid credential
    • grants the required privilege (if any)
    • to the caller of the API method
      • (identified by their SSL client certificate and the owner_gid in the credential)
    • over the slice (if any) on which they are operating
      • (target_gid in the credential).
  • Other privileges may be present in the same or other credentials.
  • Local aggregate policy may deny access to a particular method even in the presence of a valid credential granting the required privilege.
  • Some operations (e.g. GetVersion) may either simply require a valid credential with no particular privilege, or have no credentials argument at all.

Note also that some current AMs do not require any particular privileges to do ListResources, even with a slice_urn. This change explicitly requires that aggregates require a valid slice credential with CanRead privileges to perform this operation.
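The privilege semantics above amount to a small check per call. In the sketch below, each credential is represented as a plain dictionary with hypothetical keys (valid, owner_gid, target_gid, privileges) standing in for an already verified credential; the table covers only slice-level methods, and local aggregate policy could still deny a call this check allows.

```python
WRITE_METHODS = (
    "CreateSlivers", "RenewSlivers", "DeleteSlivers", "Shutdown",
    "ActOnSlivers", "UpdateSlivers", "GetTicket", "RedeemTicket",
    "UpdateTicket", "ReleaseTicket",
)

# Privilege strings are case insensitive, so they are stored lowercase.
REQUIRED_PRIVILEGE = {
    "ListResources": "canread",  # when called with a slice_urn
    "SliversStatus": "canread",
    **{method: "canwrite" for method in WRITE_METHODS},
}


def is_authorized(method, caller_gid, slice_gid, credentials):
    """True if at least one valid credential grants the required privilege
    to this caller over this slice."""
    required = REQUIRED_PRIVILEGE[method]
    for cred in credentials:
        privileges = {p.lower() for p in cred["privileges"]}
        if (cred["valid"]
                and cred["owner_gid"] == caller_gid
                and cred["target_gid"] == slice_gid
                and required in privileges):
            return True
    return False
```

Note that, following the text above, CanWrite is not treated as implying CanRead; a credential must list each privilege it grants.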


Changes Not Included

RSpec changes resulting in GENI v4 RSpecs

  • Support unbound manifests
  • Make manifest an extension of Request, so you can readily edit & resubmit a manifest
  • Make configuration information in request and manifest optional, so it can be supplied/returned separately
  • Fully implement the compute ontology from Ilia
  • Ilia's other requests (Openflow related information)
  • Document process for updates per my dev list email
  • Be consistent: ref vs idref vs href
  • Include AM name/URL in RSpecs? Experimenter who allocated it URN?
  • Incorporate stitching extension as part of the 'base' RSpec

Misc

  • Split ListResources with slice_urn from ListResources without. ListResources with slice_urn we call Resolve(slice_urn).
  • Add geni_credential_types to GetVersion return?
  • Add geni_am_info block to GetVersion return (name, id, url, location, description, is_proxy, proxy: {(a geni_am_info block)}, proxy_for[] (list of geni_am_info blocks))
  • Define a new option geni_am_id for all methods. Proxy AMs may use this ID to look up the URL of the real AM, and pass the call along.
    • Instead just encode this am_id in the URL that the experimenter accesses, so that the proxy AM knows where to direct the call.
  • Allow the update methods to take a generic rspec argument, allowing AMs to accept full or diff RSpecs
  • Tickets
    • Remove requestor certificate?
    • Support brokers: Make ticket methods return multiple tickets. Define tickets as optionally diffs (additive). Make RedeemTicket and UpdateTicket take a list of tickets. Tickets are delegatable via signing, but the delegated ticket must be for a strict subset of the resources in the original.

Stitching

AMs must support the stitching extension where Layer 2 connections are available:


Change summary - method signatures

If all change sets listed here are adopted, the final method signatures will be as follows:

GetVersion

struct GetVersion([optional: struct options])

Return struct:

      {
        int geni_api;
        struct geni_api_versions {
             URL <this API version #>; # value is a URL, name is a number
             [optional: other supported API versions and the URLs where they run]
        }
        array geni_request_rspec_versions of {
             string type;
             string version;
             string schema;
             string namespace;
             array extensions of string;
        };
        array geni_ad_rspec_versions of {
             string type;
             string version;
             string schema;
             string namespace;
             array extensions of string;
        };
      }
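As an illustration of how a client might use geni_api_versions, the sketch below picks the highest API version that both the client and the aggregate support and returns the URL where it runs. The function name and the example URLs are hypothetical; struct member names arrive as strings over XML-RPC, so they are converted to integers first.

```python
def pick_api_version(get_version_result, client_versions):
    """Return (version, url) for the highest mutually supported API version,
    or None if there is no version in common."""
    offered = {
        int(version): url
        for version, url in get_version_result["geni_api_versions"].items()
    }
    common = set(offered) & set(client_versions)
    if not common:
        return None
    best = max(common)
    return best, offered[best]


# Hypothetical GetVersion return, trimmed to the fields used here.
example = {
    "geni_api": 2,
    "geni_api_versions": {
        "2": "https://am.example.net/xmlrpc/am/2.0",
        "3": "https://am.example.net/xmlrpc/am/3.0",
    },
}
```

A client that speaks versions 1 and 2 would be directed to the version 2 URL, while a client that speaks only a version the aggregate does not list gets None and can report the mismatch.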

ListResources

struct ListResources(string credentials[], struct options)

Where options include:

{
  boolean geni_available;
  boolean geni_compressed;
  string geni_slice_urn;
  struct geni_rspec_version {
    string type;
    string version;
  };
  string geni_start_time;
  string geni_end_time;
}

Return struct:

{
   rspec (ad or Manifest - may be empty though)
   tickets[] (required but may be an empty list)
}

GetTicket

struct GetTicket (string slice_urn, string credentials[], string requestRSpec, 
                           struct options)

Options include geni_start_time and geni_end_time

Return: ticket

UpdateTicket

struct UpdateTicket(string slice_urn, string credentials[], string requestRSpec, 
                                string ticket, struct options)

Options include geni_start_time and geni_end_time

Return: ticket

RedeemTicket

struct RedeemTicket(string slice_urn, string credentials[], string ticket, 
                                 struct users[], struct options)

Options include geni_auto_start

Return struct:

{
 string rspec=<manifest>,
 geni_start_time=<optional (may be omitted altogether): now if not specified>,
 geni_expires=<RFC3339 sliver expiration>,
 string geni_status=<sliver state - allocated or changing or ready>,
 string geni_state_guarantee=<promise from AM of what experimenter state will be lost on trying to 'start' this allocation>,
 <others that are AM specific>
}

UpdateSlivers

struct UpdateSlivers(string slice_urn, string credentials[], string rspec, 
                                                 struct options)

Options include geni_start_time and geni_end_time

Return struct:

{
  string ticket=<ticket>,
  string geni_status=<sliver state - ticketed>,
  string geni_state_guarantee=<promise from AM of what experimenter state will be lost on trying to 'start' this allocation>,
 <others that are AM specific>
}

ReleaseTicket

struct ReleaseTicket(string slice_urn, string credentials[], string ticket, struct options)

Return: boolean

CreateSlivers

struct CreateSlivers(string slice_urn,
                    string credentials[],
                    string rspec,
                    struct users[],
                    struct options)

Options include:

{
  boolean geni_donotstart (optional),
  string geni_start_time <datetime> (optional),
  string geni_end_time <datetime> (optional)
}

Return struct:

{
 string rspec=<manifest>,
 geni_start_time=<optional (may be omitted altogether): now if not specified>,
 geni_expires=<RFC3339 sliver expiration, as in geni_expires from SliversStatus>,
 string geni_status=<sliver state - allocated or changing or ready>,
 <others that are AM specific>
}

RenewSlivers

struct RenewSlivers(string urn,
                    string credentials[],
                    string expiration_time, 
                    struct options)

Return: boolean

SliversStatus

struct SliversStatus(string slice_urn, string credentials[], struct options)

Return:

{
  string geni_urn: <sliver URN>
  string geni_status: ready
  geni_expires: <datetime of expiration>
  struct geni_resources: [ { geni_urn: <resource URN>
                      geni_status: ready
                      geni_expires: <datetime of individual sliver expiration>
                      geni_error: ''},
                    { geni_urn: <resource URN>
                      geni_status: ready
                      geni_expires: <datetime of individual sliver expiration>
                      geni_error: ''}
                  ]
}

Where for individual resources this block may be returned:

'users' => [{'urn'   => $user1_urn,
             'login' => $login,
             'protocol' => [ssh, or ?],
             'port' => [22 or ?],
             'keys'  => [...] },
            {'urn'   => $user2_urn,
             'login' => $login,
             'protocol' => [ssh, or ?],
             'port' => [22 or ?],
             'keys'  => [...] }
           ]

ActOnSlivers

struct ActOnSlivers(string command, string credentials[], string urn, string state, struct options)

Return struct:

{
 string urn=<urn of sliver or slice>,
 string geni_status=<new state of the slivers>,
 <other entries specific to the AM or resources - specifically am_specific_status>
}

DeleteSlivers

struct DeleteSlivers(string urn, string credentials[], struct options)

Return: boolean

Shutdown

struct Shutdown(string slice_urn, string credentials[], struct options)

Return: boolean
