WORLDCOMP'09 Tutorial: Prof. H. J. Siegel

Last modified 2009-07-03 10:49


Robust Resource Management for Parallel and Distributed Computing Systems: Models and Methods
Prof. H. J. Siegel
Abell Endowed Chair Distinguished Professor of Electrical and Computer Engineering and Professor of Computer Science
Director, CSU Information Science and Technology Center (ISTeC)
Colorado State University, Fort Collins, Colorado, USA

Date: July 13, 2009
Time: 6:00-9:00 PM
Location: Ballroom 6


DESCRIPTION

    What does “robust” mean? Often people state that their system software component, piece of hardware, application code, or technique is “robust,” but never define what they mean by “robust.” How does one determine if a claim of robustness is true when it is not defined? Furthermore, without a definition, robustness cannot be quantified, so if two people claim to have robust computing systems, for example, how can one decide which is the more robust? These are the types of issues we address in this tutorial. We study robustness in the context of resource allocation in heterogeneous parallel and distributed computing systems, but the robustness concepts presented have broad applicability.

    In heterogeneous parallel and distributed computing environments, a collection of different machines is interconnected and provides a variety of computational capabilities. These capabilities can be used to execute a workload composed of different types of applications, each of which may consist of multiple tasks, where the tasks have diverse computational requirements. The execution time of a task may vary from one machine to the next, and a machine A that is faster than a machine B for task 1 will not necessarily be faster for task 2. Furthermore, there can be inter-task data dependencies. Tasks must share the computing and communication resources of the system. A critical research problem for heterogeneous computing is how to assign tasks to machines and schedule the order of their execution.
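
    As a small illustration of this heterogeneity, the sketch below stores hypothetical estimated execution times in a task-by-machine table and compares two task-to-machine assignments. The numbers, the function names, the use of makespan (the largest machine finish time) as the performance measure, and the assumption that tasks are independent are all illustrative choices, not details taken from the tutorial.

```python
# Hypothetical estimated execution times in seconds (invented for illustration).
# Machine A is faster than machine B for task1, but slower for task2.
ETC = {
    "task1": {"A": 2.0, "B": 5.0},
    "task2": {"A": 9.0, "B": 3.0},
    "task3": {"A": 4.0, "B": 4.5},
}


def machine_finish_times(mapping, etc):
    """Sum the estimated execution times of the tasks assigned to each machine.

    Assumes independent tasks (no inter-task data dependencies) that run
    one after another on their assigned machine.
    """
    finish = {}
    for task, machine in mapping.items():
        finish[machine] = finish.get(machine, 0.0) + etc[task][machine]
    return finish


def makespan(mapping, etc):
    """Return the largest machine finish time for the given mapping."""
    return max(machine_finish_times(mapping, etc).values())


# Putting every task on the machine that is fastest for task1 ...
all_on_a = {"task1": "A", "task2": "A", "task3": "A"}
# ... versus matching each task to a machine that suits it.
mixed = {"task1": "A", "task2": "B", "task3": "A"}

print(makespan(all_on_a, ETC))  # 15.0
print(makespan(mixed, ETC))     # 6.0
```

    In this toy instance, matching task2 to machine B cuts the estimated makespan from 15 to 6 seconds, which is why the assignment decision matters.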

    The resources in heterogeneous parallel and distributed computing systems should be allocated to the computational tasks in a way that optimizes some given system performance measure. However, allocation decisions and associated performance prediction are often based on estimated values of task and system parameters. The actual values of these parameters are uncertain, and may differ from the estimates. For example, the estimates may represent only average values, the models used to generate the estimates may have limited accuracy, or there may be changes in the environment. Because the actual values of these parameters are uncertain, the actual system performance may differ from the predicted performance. Thus, it is important to develop resource management strategies that strive to meet particular system performance requirements even when such uncertainties are present.

    To address this problem, we have designed two models for deriving the degree of robustness of a resource allocation. One model is based on having deterministic estimates of the parameters whose exact values are uncertain. In this case, the degree of robustness of a resource allocation is quantified as the maximum collective difference between the actual and estimated values of these system parameters within which a user-specified level of system performance (QoS) can be guaranteed. The second model assumes that stochastic information is available about the parameters whose actual values are uncertain. With this model, the degree of robustness is quantified as the probability that a user-specified level of system performance can be met.
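
    The two models can be sketched for one simple, assumed setting: independent tasks, a performance requirement that every machine finishes by a deadline TAU, and execution times as the only uncertain parameters. Under those assumptions, the deterministic metric below is the smallest Euclidean distance by which the execution times on any one machine can collectively exceed their estimates before that machine misses TAU, and the stochastic metric treats execution times as independent normal random variables. The data, names, choice of norm, and choice of distribution are hypothetical; the tutorial's own formulations may differ.

```python
import math
from statistics import NormalDist

# Hypothetical estimated (mean) execution times and standard deviations, in seconds.
ETC = {
    "task1": {"A": 2.0, "B": 5.0},
    "task2": {"A": 9.0, "B": 3.0},
    "task3": {"A": 4.0, "B": 4.5},
}
STD = {t: {m: 0.3 * v for m, v in row.items()} for t, row in ETC.items()}
TAU = 8.0  # assumed QoS requirement: every machine must finish by TAU


def robustness_radius(mapping, etc, tau):
    """Deterministic model (sketch).

    For a machine with n tasks and estimated load F, the nearest point at
    which the load reaches tau lies at Euclidean distance (tau - F) / sqrt(n)
    from the estimates. The allocation's radius is the minimum over machines.
    """
    load, count = {}, {}
    for task, machine in mapping.items():
        load[machine] = load.get(machine, 0.0) + etc[task][machine]
        count[machine] = count.get(machine, 0) + 1
    return min((tau - load[m]) / math.sqrt(count[m]) for m in load)


def robustness_probability(mapping, mean, std, tau):
    """Stochastic model (sketch).

    Assumes independent, normally distributed execution times, so each
    machine's finish time is normal and machines are independent.
    Returns the probability that every machine finishes by tau.
    """
    prob = 1.0
    for machine in set(mapping.values()):
        tasks = [t for t, m in mapping.items() if m == machine]
        mu = sum(mean[t][machine] for t in tasks)
        sigma = math.sqrt(sum(std[t][machine] ** 2 for t in tasks))
        prob *= NormalDist(mu, sigma).cdf(tau)
    return prob


mixed = {"task1": "A", "task2": "B", "task3": "A"}
alt = {"task1": "A", "task2": "B", "task3": "B"}

for name, mapping in (("mixed", mixed), ("alt", alt)):
    print(name,
          round(robustness_radius(mapping, ETC, TAU), 3),
          round(robustness_probability(mapping, ETC, STD, TAU), 3))
```

    Both metrics rank the "mixed" allocation above "alt" here, which shows how a quantified definition lets two allocations be compared directly.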

    Both robustness models, and the robustness metrics associated with each, will be presented. It will be shown how they can be used to evaluate and compare the robustness of different resource allocations. In addition, it will be demonstrated how these models can be incorporated into resource management heuristics that produce robust allocations to optimize some user-specified performance criterion. Robust resource allocation heuristics for a variety of environments will be discussed and compared. This will be done for both static heuristics, which are executed off-line for production environments, and dynamic heuristics, which are executed on-line for environments where tasks must be assigned resources as they arrive into the system.
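
    To make the static versus dynamic distinction concrete, the sketch below maximizes a deterministic robustness radius in two ways. The "static" routine searches every mapping exhaustively, which is only feasible for tiny instances and stands in for the off-line heuristics discussed in the tutorial; the "dynamic" routine assigns each task greedily as it arrives. The radius formula (independent tasks, Euclidean norm, deadline TAU on every machine), the data, and all names are assumptions made for illustration.

```python
import itertools
import math

# Hypothetical estimated execution times in seconds.
ETC = {
    "task1": {"A": 2.0, "B": 5.0},
    "task2": {"A": 9.0, "B": 3.0},
    "task3": {"A": 4.0, "B": 4.5},
}
MACHINES = ["A", "B"]
TAU = 8.0  # assumed deadline for every machine


def robustness_radius(mapping, etc, tau):
    """Minimum over used machines of (tau - load) / sqrt(number of tasks)."""
    load, count = {}, {}
    for task, machine in mapping.items():
        load[machine] = load.get(machine, 0.0) + etc[task][machine]
        count[machine] = count.get(machine, 0) + 1
    return min((tau - load[m]) / math.sqrt(count[m]) for m in load)


def static_exhaustive(tasks, machines, etc, tau):
    """Off-line: all tasks are known in advance, so try every mapping."""
    candidates = (
        dict(zip(tasks, choice))
        for choice in itertools.product(machines, repeat=len(tasks))
    )
    return max(candidates, key=lambda mp: robustness_radius(mp, etc, tau))


def dynamic_greedy(arrivals, machines, etc, tau):
    """On-line: map each task on arrival to the machine that leaves the
    partial allocation with the largest robustness radius."""
    mapping = {}
    for task in arrivals:
        mapping[task] = max(
            machines,
            key=lambda m: robustness_radius({**mapping, task: m}, etc, tau),
        )
    return mapping


static_map = static_exhaustive(list(ETC), MACHINES, ETC, TAU)
dynamic_map = dynamic_greedy(["task1", "task2", "task3"], MACHINES, ETC, TAU)
print(static_map, round(robustness_radius(static_map, ETC, TAU), 3))
print(dynamic_map, round(robustness_radius(dynamic_map, ETC, TAU), 3))
```

    On this three-task instance the greedy on-line choice happens to match the exhaustive off-line optimum. In general an on-line heuristic cannot revisit earlier decisions, so it may end with a smaller radius.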

    The tutorial material is applicable to various types of heterogeneous computing and communication environments, including parallel, distributed, cluster, grid, Internet, cloud, embedded, and multicore systems, as well as content distribution networks, wireless networks, and sensor networks. Furthermore, the robustness models, concepts, and metrics presented are generally applicable to design problems throughout various scientific and engineering fields.



OBJECTIVES

    This course will enable you to:

    • Understand the problem of robust resource allocation in heterogeneous parallel and distributed computing systems
    • Ask the “three robustness questions” that must be answered whenever anyone makes robustness claims
    • Apply the appropriate model of robustness depending on the information available about the system uncertainties
    • Develop and use robustness metrics to quantify the robustness of a particular resource allocation for a given computational environment
    • Design resource allocation heuristics that incorporate robustness, for both static (off-line) and dynamic (on-line) environments
    • Use the concepts of robustness in a variety of problem domains in the scientific and engineering fields


INTENDED AUDIENCE

    This course is intended for faculty, engineers, scientists, and graduate students who want to learn how to define, model, and quantify robustness when designing and using heterogeneous suites of computers (including clusters and clouds) to execute applications in a way that will optimize some performance criterion.

BIOGRAPHY OF INSTRUCTOR

    H. J. Siegel is the George T. Abell Endowed Chair Distinguished Professor of Electrical and Computer Engineering at Colorado State University (CSU), where he is also a Professor of Computer Science. He is the Director of the CSU Information Science and Technology Center (ISTeC), a university-wide organization for enhancing CSU’s research, education, and outreach activities pertaining to the design and innovative application of computer, communication, and information systems. From 1976 to 2001, he was a professor in the School of Electrical and Computer Engineering at Purdue University. He received two B.S. degrees from the Massachusetts Institute of Technology (MIT), and the M.A., M.S.E., and Ph.D. degrees from Princeton University. He is a Fellow of the IEEE and a Fellow of the ACM. Prof. Siegel has co-authored over 360 published technical papers in the areas of parallel and distributed computing and communications. He was a Coeditor-in-Chief of the Journal of Parallel and Distributed Computing, and was on the Editorial Boards of the IEEE Transactions on Parallel and Distributed Systems and the IEEE Transactions on Computers. For more information, please see www.engr.colostate.edu/~hj.

CONTACT INFORMATION

    Professor H. J. Siegel
    Department of Electrical and Computer Engineering and Department of Computer Science
    Colorado State University
    Fort Collins, CO 80523-1373
    Office: (970) 491-7982
    Fax: (970) 491-2249
    E-mail: hj@colostate.edu
    www.engr.colostate.edu/~hj
