ClusterDMtg070209: cluster-D-review-notes-2July09.txt

File cluster-D-review-notes-2July09.txt, 15.7 KB (added by hmussman@bbn.com, 15 years ago)
Cluster D Review
July 2nd, 2009

GPO: Chip, Harry, Heidi, Aaron
PIs: Prashant Shenoy (vise), Michael Zink, David Irwin, Ilia, Brian
     Lynn, Rajiv Rambanth?, Brian Levine,
     phone: Keren Bergman, Hongwei, Franz (Columbia), Caroline
     (Columbia), Michael (Columbia)

Slides, agenda on wiki at
http://groups.geni.net/geni/wiki/ClusterDMtg070209

======

Harry: Introduction (from slides)

- review of spiral 1 goals
  - for GENI/ORCA clearinghouse
  - for each substrate

- GENI goals for spiral 2
  - lots of live experiments
  - continuous operation (may be challenging)
  - identity management
  - improved integration of data and control planes
  - instrumentation
  - interoperability of clusters, permitting clusters to access the
    widest number of aggregates

- Process
  - aim for GENI spiral 2 goals
  - start with ORCA integration roadmap
  - add input from all projects & GPO
  - to be completed by mid-July
    - want consensus of GPO and all PIs
  - will drive spiral 2 milestones for each project

======

ORCA/BEN, Ilia (from slides)

Progress so far
- Deployed ORCA into BEN
- Demonstrated VLANs across BEN using ORCA at GEC4
- Developed drivers for BEN
- Developed NDL-OWL RSpecs
- Connected BEN to NLR FrameNet via RENCI and Duke
- Collaborated with Cluster D: software releases, ORCA-fest,
  integration assistance

Challenges
- Resource representation for heterogeneous substrates
- Stitching of slivers to slices; requires information not available
  until after provisioning, true for networks and VMs
- Connectivity issues: delays in dealing with campus IT, cost

Integration process
- Produced an integration document that outlines how projects of
  different types can choose to integrate
- Code releases: tarball
- Clearinghouse to be stood up in July after GEC5 demo

Backbone connectivity
- NLR connectivity sorted out at two BEN PoPs; 10 VLANs available from
  NLR (only 2 remapped by Duke)
- Working on getting I2 DCN connectivity; talking to John Vollbrecht at
  I2 for research; aiming for demo before 10/31; some costs from local
  provider, 100 miles between I2 drop-off and Research Triangle; I2 DCN
  people are different from the network engineering people and are
  different from the GENI wave

Spiral 2 plans
- Ontological resource representation; working with Keren on
  measurements and Cluster E&D on wireless
- Generic solution to sliver stitching
- Introducing measurement devices into BEN and adding ORCA support
- Identity management using Shib
- Cloud substrate and control
- Feature roadmap items

Progress against milestones
- Planning to make ORCA a production capability for BEN; 2 issues:
  will the software work? and will there be sufficient documentation
  that people can use it?
- Can ORCA be used to request connections to I2?  Yes, working with
  Chris Tracy.  More than 1 way to do it.  Could create an ORCA AM for
  I2.  Complicated by I2's work on their next-gen network
  architecture.
- Experimenters in mind for spiral 1?  Mostly Keren, already share
  cluster & interests.  More detail later today.  Other experimenters
  will need patience and expertise.  Plan to take a cautious approach.
- Could move faster with more funding.  Could GPO help with
  documentation?  Maybe a little but probably not.

ORCA Milestone Review, Harry
- on geni.net wiki page
- 6/1 milestones will be done by 7/7 demo

=======

DOME, Brian Levine (from slides)

Progress in year 1
- hardware upgraded, OS virtualized, wifi virtualized
- basic ORCA setup on buses, not integrated into clearinghouse yet
- v1 software release started, completed portal, experiment control
- preparing for demo later this summer
- started cluster integration

Integration with ORCA
- developed a controller, which creates leases, and a handler, which
  performs actions based on leases
- need DOME portal interfaces to ORCA
- ORCA instance will be shared with VISE project on geni.cs.umass.edu

Connection to Internet2
- In discussions with local OIT staff
- Will connect geni.cs.umass.edu to I2 directly via fiber from campus
  to Springfield to Boston, working in tandem with VISE team;
  requesting 1 VLAN ID, expecting little traffic; will have the
  capability to inject frames with VLAN tags

Experiments by outsiders
- Have portal for job submission, one resource: wifi card, 900MHz
  radio coming next year; gets status; interfaces to ORCA
- Experiments on buses. DTN tolerant downloading of experiments to
  buses; dynamic creation of VM sandbox to execute experiments,
  includes partitions, networking, devices; scheduling of experiments
  based on ORCA leases
- Instrumented measurements.  Half the core is about remote management
  and diagnostics.  Researcher data is uploaded automatically if a
  certain file format is used.

Plans for Spiral 2

- Goal is to deploy on buses this summer.  5-6 out of 40 buses in field per
  day.

- Will bring on undergrad projects in the fall from "Intro to
  Networking" class.  Imagine only 1 or 2 projects will get far enough
  to use the real system.  Have a PhD graduating this year heading to
  Arkansas who is expected to do experiments.

- Plan to allow experimenters to access XTend radios

- Want 'true and open access to testbed by experimenters'

- Tutorials on how to use?  May require some separate funding.  Keren
  very interested in such a tutorial.

  Concern about maintaining documents after publishing.  This is made
  more difficult since the prototypes are rapidly evolving.

  Concern that supporting users will delay system development because
  of the need to have a stable system.  GPO thinks projects have
  latitude to define the level of stability and support that makes
  sense for them.  Currently, GENI is in the 'exploratory prototyping'
  phase; it is likely that composition and design will change over
  time.

  Anish: We've found that tutorials are good as a way to train PIs,
  grad students, and colleagues.  There's a real cost to keeping it up
  but it is in general a good idea.  What events would be a good fit?
  Having a pull from users helps a lot.

  Chip: one approach is that folks want to use a remote system
  initially then make their own local copy.  Anish: yes, that's our
  experience but doing 'cross-experiments' has been quite hard.  Not
  clear what the benefit is to other users.

  What are the GPO's incentives to get people to do experiments?  Not
  much.  The GPO doesn't sponsor research.  This is what NSF does.
  The GPO sponsors trial experiments to help shake down the system.

Milestone Review, Harry (from wiki)

======

VISE, Prashant Shenoy with assist from David Irwin (from slides)

Year 1 progress
- completed assembly, deployment of 3 nodes
- initial ORCA integration
- GEC4 demo (no radars but pan-zoom cameras)
- outreach with UPRM underway
- clearinghouse integration underway
- sensor virtualization underway with some issues
- can virtualize an actuator in a guest VM
- testbed available for use within cluster

Challenges: year 1
- Getting up to speed on radar/sensor interfaces; upgrading nodes to
  support multiple users; lost domain expert; simplifying
  radar/sensor control code.
- Radar/sensor virtualization: problems controlling some devices in Xen
  and other virtualization domains; temporarily switching to VServers
  to continue progress; Xen folks seem to be fixing the problems.

ORCA integration
- Developing plug-in points: resource handlers, slice controllers,
  table-driven allocation policy
- set up geni.cs.umass.edu with DOME; will host both DOME and ViSE
  actor servers; will transition from using a local clearinghouse to
  the one at RENCI when necessary

Internet2
- Multiple meetings with UMass-Amherst OIT; working to get VLAN
  connection over NEREN to an I2 PoP in Boston (NOX).
- What happens to our traffic at NOX?  Ilia introduced ViSE to John
  Vollbrecht at I2 DCN.  Coordinating with him once I2 PoP connection
  is in place.

Plans: Experiment Examples

- Sensor-centric experiments.  E.g., comparing fidelity from Furuno,
  Raymarine, MA1 (students from Puerto Rico) to focus on ground truth
  verification and long-term data collection with the UMass trace
  repository.  Puerto Rico very interested in low-infrastructure
  radars due to its geography.  Have a PhD student going back to PR,
  might be a good GEC grant candidate.  Virtualization of these
  systems might be of great interest to engineering communities, not
  just networking.  PR is developing a student testbed with tight
  collaboration with UMass.

- Long-distance wireless experiments such as looking at wireless BW
  over long distances.  One REU and one UMass undergrad.  Doing
  experiments now, would like to migrate to GENI.

- Longer term vision.  Connect sensed data to cloud-based storage and
  processing. Run complete experiments: sensing + storage +
  processing.  BEN is applying for some processing; has a state
  project for micro-rain radar.

Current Spiral 2 Milestones
- 12/2009
  - sensor slivering
  - VISE integration with clearinghouse
- 1/2010
  - installation of rapidly-deployed node
  - installation of camera devices
  - 3-day class at Univ. of PR on virtualization & GENI
- 4/2010
  - virtualize camera devices
  - integration of slivering into TB
  - TB allocation policy for sensors
  - experiment control framework from updated reference software
- 10/2010
  - demo with multiple experiments
  - make federated TB available to outside GENI users

- Some spiral 2 milestones accelerated to spiral 1

Discussion

- At a high level, looks like the VISE and DOME milestones are
  independent from ORCA.  Is this true?  No.  Original milestones
  pre-date ORCA-awareness.  Big ORCA contribution is in resource
  representation.

======

KanseiGeni, Anish (from slides)

- 2 primary activities this year:

  - phase 1 GENIfication: required re-factoring component & aggregate
    managers as web services

  - phase 2 ORCAfication: decomposed researcher portal to introduce
    ORCA actors, implementing some parts, specifying some policies for
    RSpec definition.

- Experiments: tend to be either a) those interested in understanding
  low-power networking phenomena, e.g., security without shared keys
  requiring modification of MAC layer or b) those using sensors, e.g.,
  collecting sensor data in a portable array then injecting it into a
  sensornet application running in the testbed, can be used for
  protocol evaluation or tuning.  A recent focus has been on energy
  monitoring.

  - External users at UCLA, Northwestern, UT Dallas, ICT Australia,
    Michigan State, Wayne State, SUNY Buffalo

  - NetEye in use by universities in the US, Hong Kong, China and
    maybe Bangalore.

  - Motivated by federation scenarios: multi-fabric sensing, seamless
    regression testing, portable and stationary arrays, and
    fabric-via-cloud

Challenges

- What is the motivation for engaging external partners in GENI?

- Overly-constrained budget. Interested in science education in later
  spirals; additional capabilities in D&P proposal; still interested
  in SunSPOTs.

  NSF doesn't have a way to pay for operations costs. Might want to
  have a workshop or other organized forum to educate CISE on
  operations costs.

- Effort needed for GENI-fication and ORCA-fication.  Got a lot of
  help with ORCA-fication.

Internet2 connection

- Can get VLAN connection into Chicago PoP.  Would cost ~$5k/mo.
  100Mbps would be lower, cost unknown.  Just an L3 node is $235/mo.
  At Wayne via MERIT L3 is available at no cost, L2 too high at
  $30k/yr.

Spiral 2 plans
- Focus is on federation.  Need to make NetEye KanseiGENI-compatible.
- Experiment interaction user service, i.e., GENIfication of Kansei
  researcher portal
- Basic federation of resource discovery, embedding, and scheduling
- Support for experiments; make Kansei ORCA integration model
  available to other testbeds.

Milestone Review, Harry (via wiki)

- split some milestones to reflect work accomplished and highlight
  ORCA integration necessary

=================

Embedded Real-Time Measurements, Franz Fidler (from slides)

- Motivation: Emphasis on monitoring PHY layer conditions; seeing
  interest by others in making measured data available.  Networks will
  have greater diversity of bit-rates, waveforms, as well as dynamic
  optical routing and cross-layer optimization.  The flexibility
  needed by PHY layers will cause relaxed performance constraints, and
  that will create the need for greater monitoring.

- GENI challenges: integration of measurement resources into substrate
  and control plane frameworks.

- Progress in year 1:
  - assessed requirements for real-time measurements within future
    GENI infrastructure
  - assessed interface specifications based on GENI requirements
  - proposed unified measurement framework (UMF)
  - performed performance simulations
  - joined cluster D
    - Working towards I2 connection via perfSONAR
    - Developing plans for experiments by outside GENI researchers

- Year 1: unified measurement framework requirements & example

- Plans for Spiral 2

  - drive prototyping forward: hardware part of UMF, and demonstrate
    interface between UMF and a network device capable of PHY layer
    monitoring

  - start integration efforts with BEN, which has fiber switches

======

Remaining tasks for Spiral 1
  - see Harry's notes

======

Discussion on monitoring

- Jeff: Hyperic may be a good tool that uses Ganglia to monitor VM
  status, can be extended to things other than VMs; have a student
  working on this; looking for input as to whether this is a good
  direction.

  Ilia: have looked at PCP for a monitoring system.  Some concern that
  Hyperic may have an uncertain future as it was acquired last month.
  Have indicated to GMOC that there is a strong preference for
  developing a SOAP interface.

  Heidi: need to think about what data you want to expose and share

  Anish: Kansei has an interface for getting experiment and substrate
  status.

=======

Discussion of Spiral 2 Capabilities  (see Harry's notes)

- Chip: central goal of Spiral 2 is live experimentation; won't be
  easy; everything else is secondary

- Experiments: Want to see experiments of end-to-end systems, e.g.,
  matching networks with servers, sensors with clouds, content with
  buses, radars & buses

- Identity management: DOME plans to plug into some foreign
  authentication manager.  GPO plans to try to go to Shib & InCommon,
  since it looks like it might take hold broadly.  Rough mental model
  is that in the early days, almost anybody can use this.  Need ways
  to say 'these people are out' but probably don't need elaborate
  policies early on.  GPO is interested in knowing whether folks
  believe moving to Shibboleth is a bad idea.  Shib has been most
  successful for managing user access to a portal, seems good for GENI
  as far as that goes and that's what ORCA uses it for.  What that
  user is allowed to do is another issue and not currently handled by
  Shib attributes.  Shib is only starting to handle delegated
  authority, still early days.

- Improved integration.

  Need to improve distribution of keys to containers.  Currently
  manual, ugly, error prone, requires documentation but will work
  until the number of containers reaches the tens.  Currently no
  privacy.  Could be added by using HTTPS on transport.

- Measurements

  DOME collects location information on buses but needs a solution for
  providing it to GENI users.  Has also been thinking about
  longitudinal studies, Brian will send Chip a copy of a paper.

  ERTM will prototype hardware, interface it with BEN, integrate with
  ORCA.

- Interoperability.  Want to avoid CF balkanization.