IEEEComputer Gil
Yolanda Gil1, Ewa Deelman1, Mark Ellisman2, Thomas Fahringer3, Geoffrey Fox4, Dennis
Gannon4, Carole Goble5, Miron Livny6, Luc Moreau7, Jim Myers8
1USC Information Sciences, 2University of California San Diego, 3Innsbruck University,
4Indiana University, 5Manchester University, 6University of Wisconsin Madison, 7University of
Southampton, 8National Center for Supercomputing Applications
Abstract
Workflows have recently emerged as a paradigm
for representing and managing complex distributed
scientific computations, thereby accelerating the
pace of scientific progress. A recent workshop on the
Challenges of Scientific Workflows, sponsored by the
National Science Foundation and held on May 1-2,
2006, brought together domain scientists, computer
scientists, and social scientists to discuss requirements
of future scientific applications and the challenges that
they present to current workflow technologies. This
paper reports on the discussions and recommendations
of the workshop. The full report can be found at
http://www.isi.edu/nsf-workflows06.
1. Introduction
Significant scientific advances are increasingly
achieved through complex sets of computations and
data analyses. These computations may comprise
thousands of steps, where each step may integrate
diverse models and data sources developed by
different groups. The applications and data may also be
distributed in the execution environment. The
assembly and management of such complex distributed
computations present many challenges, and
increasingly ambitious scientific inquiry is
continuously pushing the limits of current technology.
Workflows have recently emerged as a paradigm for
representing and managing complex distributed
scientific computations, thereby accelerating the
pace of scientific progress [1,2,3,4,5,6]. Scientific
workflows capture the individual data transformations
and analysis steps as well as the mechanisms to carry
them out in a distributed environment. Each step in the
Figure 1: A view of the Rho Oph dark cloud constructed with Montage from deep exposures made
with the Two Micron All Sky Survey (2MASS) Extended Mission [10]. Montage relies on workflow
technologies to generate science-grade mosaics of the sky that are used by astronomers to discover new
phenomena.
background in the infrared images hid the structure of
the galaxy. By using Montage, which is able to
rectify the backgrounds to a common level,
astronomers were finally able to see the bar structure

Much research is underway to address issues of
creation, reuse, provenance tracking, performance
optimization, and reliability. However, to fully realize
the promise of workflow technologies, many
additional requirements and challenges must be met.
Scientific applications are driving workflow systems to
examine issues such as supporting dynamic event-driven
analyses, handling streaming data,
accommodating interaction with users, intelligent
assistance and collaborative support for workflow
design, and enabling result sharing across
collaborations. As a result, a more comprehensive
treatment of workflows is needed to meet long-term
requirements of scientific applications.

To examine the nature of these challenges and to
consider what steps should be taken to address them, a
Workshop on the Challenges of Scientific Workflows
was held at the National Science Foundation on May
1-2, 2006. The meeting brought together domain
scientists, computer scientists, and social scientists to
discuss requirements of future scientific applications
and the challenges that they present to current
workflow technologies.

This report summarizes the discussions and
recommendations of the workshop. The workshop
discussions focused on four main topics, summarized
in the following four sections. The final section of the
2. Discussion Topic I: Application Requirements
Dynamic
System-level
www.teragrid.org, www.nsfmiddleware.org.
Recommendations
6. Concluding Remarks
7. Summary of Recommendations
The following recommendations were made by the
workshop participants:
8. Acknowledgements
This workshop was sponsored by the National Science
Foundation under grant # 0629361. We would like to
thank Maria Zemankova, Program Manager of the
Information and Intelligent Systems Division, for
supporting the workshop and contributing to the
discussions. We would also like to thank all the
workshop attendees for their contributions: Mark
Ackerman, Ilkay Altintas, Roger Barga, Francisco
Curbera, Constantinos Evangelinos, Juliana Freire, Ian
Foster, Alexander Gray, Jeffrey Grethe, Jim Hendler,
Carl Kesselman, Craig Knoblock, Chuck Koelbel,
Karen Myers, Walt Scacchi, Ashish Sharma, Amit
Sheth, Alex Szalay, and Gregor Von Laszewski. The
authors would also like to thank Bruce Berriman for
his input and discussions.
9. References
[1] Deelman, E. and Gil, Y. (Eds.) Final Report of the
NSF Workshop on Challenges of Scientific
Workflows, National Science Foundation,
Arlington, VA, May 1-2, 2006. Available at
http://www.isi.edu/nsf-workflows06.
[2] Deelman, E. and Taylor, I. (Eds.) Journal of Grid
Computing, Special Issue on Scientific
Workflows, Volume 3, Number 3-4, September
2005.
[3] Deelman, E., Zhao, Z., and Belloum, A. (Eds.)
Scientific Programming Journal, Special Issue on
Workflows to Support Large-Scale Science. 2006.
[4] Fox, G. and Gannon, D. (Eds.) Concurrency and
Computation: Practice and Experience, Special
Issue on Workflow in Grid Systems. Volume 18,
Issue 10, August 2006.
[5] Ludaescher, B. and Goble, C. (Eds.) SIGMOD
Record, Special Issue on Scientific Workflows,
Volume 34, Number 3, September 2005.