Important Dates
July 17, 2013
Registration Deadline

July 22-25, 2013
The 2013 World Congress in Computer Science, Computer Engineering, and Applied Computing



WORLDCOMP'13 Featured Tutorial: Prof. H. J. Siegel

Last modified 2013-07-07 17:49

Robust Resource Management for Parallel and Distributed Computing Systems

Prof. H. J. Siegel


Abell Endowed Chair Distinguished Professor of Electrical and Computer Engineering and Professor of Computer Science
Director, CSU Information Science and Technology Center (ISTeC)
Colorado State University, Fort Collins, Colorado, USA


Date: July 22, 2013
Time: 6:00pm
Location: Montecristo 2



DESCRIPTION

    What does “robust” mean? Often people state that their system software component, piece of hardware, application code, or technique is “robust,” but never define what they mean by “robust.” How does one define robustness in a given situation so it can be determined if a system is indeed robust? Furthermore, how does one quantify robustness so that, if two people each claim to have a robust computing system, one can decide which is the more robust? These are the types of issues we address in this tutorial. We study robustness in the context of resource allocation in heterogeneous parallel and distributed computing systems (including energy-awareness), but the robustness concepts presented have broad applicability.

    In heterogeneous parallel and distributed computing environments, a collection of different machines is interconnected and provides a variety of computational capabilities. These capabilities can be used to execute a workload composed of different types of tasks, where the tasks have diverse computational requirements. The execution time of a task on a machine is based on how the task’s computational requirements interact with the machine’s capabilities. Thus, a task’s execution time may vary from one machine to the next, and just because some machine A is faster than some machine B for task 1 does not mean it will be faster for task 2. Tasks must share the computing resources of the system.
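
    As a minimal sketch of this point, the table below stores one estimated execution time per (task, machine) pair, a layout often called an estimated time to compute (ETC) matrix in the heterogeneous computing literature. The task names, machine names, and all timing values are invented for illustration and do not come from the tutorial.

```python
# Hypothetical estimated execution times in seconds, indexed as etc[task][machine].
# Every number here is made up purely to illustrate heterogeneity.
etc = {
    "task1": {"A": 4.0, "B": 9.0},
    "task2": {"A": 7.0, "B": 3.0},
}


def fastest_machine(task):
    """Return the machine with the smallest estimated execution time for `task`."""
    times = etc[task]
    return min(times, key=times.get)


for task in etc:
    print(task, "runs fastest on machine", fastest_machine(task))
```

    With these assumed values, machine A is the better choice for task1 while machine B is the better choice for task2, so no single machine is "the fast one" for the whole workload.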

    A critical research problem for heterogeneous parallel and distributed computing systems is how to assign tasks to machines, and schedule the order of their execution, to optimize some given system performance measure, possibly under a given constraint. Often, these allocation decisions must be made when there is uncertainty in relevant system parameters, such as the data-dependent execution time of a given task on a given machine. Because the actual values of these parameters are uncertain, the actual system performance may differ from the predicted performance. Thus, it is important to develop resource management strategies that strive to meet particular system performance requirements even when such uncertainties are present; that is, system performance should be robust against uncertainty.

    To accomplish this, we present a model for quantifying the robustness of a resource allocation. This model assumes that historical information is available for a parameter whose actual values are uncertain. The robustness of a resource allocation is quantified as the probability that a user-specified level of system performance can be met. We show how to use historical data to build a probabilistic (stochastic) model to evaluate the robustness of resource assignments and to design resource management techniques that produce robust allocations.
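
    One way to picture this definition is the sketch below, which estimates the probability that an allocation finishes by a deadline by repeatedly resampling historical execution times. It is an illustrative Monte Carlo approximation under stated assumptions, not the tutorial's own model: the performance requirement is assumed to be "makespan no greater than a deadline", execution times are assumed independent, and the function name `estimate_robustness` and all data values are hypothetical.

```python
import random


def estimate_robustness(allocation, history, deadline, trials=10_000, seed=0):
    """Estimate P(makespan <= deadline) by resampling historical execution times.

    allocation: dict mapping machine -> list of tasks it runs one after another
    history:    dict mapping (task, machine) -> list of observed execution times
    """
    rng = random.Random(seed)
    met = 0
    for _ in range(trials):
        # Makespan of one sampled scenario: finish time of the busiest machine.
        makespan = max(
            sum(rng.choice(history[(task, machine)]) for task in tasks)
            for machine, tasks in allocation.items()
        )
        if makespan <= deadline:
            met += 1
    return met / trials


# Hypothetical historical observations (seconds); values are invented.
history = {
    ("t1", "A"): [4, 5, 6], ("t1", "B"): [8, 9, 10],
    ("t2", "A"): [6, 7, 8], ("t2", "B"): [2, 3, 4],
}
alloc_1 = {"A": ["t1"], "B": ["t2"]}
alloc_2 = {"A": ["t2"], "B": ["t1"]}

r1 = estimate_robustness(alloc_1, history, deadline=7)
r2 = estimate_robustness(alloc_2, history, deadline=7)
print("allocation 1:", r1, "allocation 2:", r2)
```

    Under these assumed data, the first allocation always meets a deadline of 7 seconds and the second never does, so the two estimates give a direct, quantitative way to say which allocation is more robust. Tightening the deadline to 5 seconds yields an intermediate probability for the first allocation.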

    This robustness model, and example associated robustness metrics, will be presented. It will be shown how they can be used to evaluate and compare the robustness of different resource allocations. In addition, it will be demonstrated how this model can be incorporated into resource management heuristics that produce robust allocations to optimize some user-specified performance criterion.

    Robust resource allocation heuristics for a variety of environments will be discussed and compared, including energy-aware resource management. This will be done for both static heuristics, which are executed off-line for production environments, and dynamic heuristics, which are executed on-line for environments where tasks must be assigned resources as they arrive into the system.
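
    To make the on-line case concrete, the sketch below assigns each task, at the moment it arrives, to the machine that would finish it earliest. This is a simple minimum-completion-time rule chosen only for illustration; it is not claimed to be one of the heuristics covered in the tutorial, it uses single expected execution times rather than a stochastic model, and the function name `assign_on_arrival` and all values are hypothetical.

```python
def assign_on_arrival(arrivals, etc, machines):
    """Greedy on-line mapping: each arriving task goes to the machine
    with the earliest estimated completion time.

    arrivals: list of (arrival_time, task) pairs in arrival order
    etc:      dict mapping task -> {machine: estimated execution time}
    """
    ready = {m: 0.0 for m in machines}  # time at which each machine becomes free
    schedule = []
    for arrival_time, task in arrivals:
        def completion(m):
            # A task cannot start before it arrives or before the machine is free.
            return max(ready[m], arrival_time) + etc[task][m]

        best = min(machines, key=completion)
        ready[best] = completion(best)
        schedule.append((task, best, ready[best]))
    return schedule


# Hypothetical estimated execution times (seconds) and arrival times.
etc = {
    "t1": {"A": 4.0, "B": 9.0},
    "t2": {"A": 7.0, "B": 3.0},
    "t3": {"A": 2.0, "B": 6.0},
}
arrivals = [(0.0, "t1"), (1.0, "t2"), (2.0, "t3")]

schedule = assign_on_arrival(arrivals, etc, ["A", "B"])
for task, machine, finish in schedule:
    print(task, "->", machine, "finishes at", finish)
```

    A static (off-line) heuristic would instead see the whole task set before making any decision, which allows it to spend more time searching for a good mapping.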

    The tutorial material is applicable to various types of heterogeneous computing and communication environments, including parallel, distributed, cluster, grid, Internet, cloud, embedded, multicore, content distribution networks, wireless networks, and sensor networks. Furthermore, the robustness model, concepts, and metrics presented are generally applicable to resource allocation problems throughout various fields, such as smart grids and search and rescue operations.


OBJECTIVES

    This course will enable you to:

    • understand the problem of robust resource allocation in heterogeneous parallel and distributed computing systems
    • ask the “three robustness questions” that must be answered whenever anyone makes robustness claims
    • develop and use robustness metrics to quantify the robustness of a particular resource allocation for a given computational environment
    • design resource allocation heuristics that incorporate robustness, for both static (off-line) and dynamic (on-line) environments
    • use the concepts of robustness for resource allocation in a variety of problem domains
    • create energy-aware resource management methods that use system energy consumption as a performance measure or constraint

INTENDED AUDIENCE

    This course is intended for faculty, engineers, scientists, and graduate students who want to learn how to define, model, and quantify robustness when designing and using heterogeneous suites of computers (including clusters and clouds) to execute applications in a way that will optimize some performance criterion, which may include energy.

BIOGRAPHY OF INSTRUCTOR

    H. J. Siegel is the George T. Abell Endowed Chair Distinguished Professor of Electrical and Computer Engineering at Colorado State University (CSU), where he is also a Professor of Computer Science. He is the Director of the CSU Information Science and Technology Center (ISTeC), a university-wide organization for enhancing CSU’s research, education, and outreach activities pertaining to the design and innovative application of computer, communication, and information systems. From 1976 to 2001, he was a professor in the School of Electrical and Computer Engineering at Purdue University. He received two B.S. degrees from the Massachusetts Institute of Technology (MIT), and the M.A., M.S.E., and Ph.D. degrees from Princeton University. He is a Fellow of the IEEE and a Fellow of the ACM. Prof. Siegel has co-authored over 400 published technical papers in the areas of parallel and distributed computing and communications. He was a Coeditor-in-Chief of the Journal of Parallel and Distributed Computing, and was on the Editorial Boards of the IEEE Transactions on Parallel and Distributed Systems and the IEEE Transactions on Computers. For more information, please see www.engr.colostate.edu/~hj.

CONTACT INFORMATION

    Professor H. J. Siegel
    Department of Electrical and Computer Engineering and Department of Computer Science
    Colorado State University
    Fort Collins, CO 80523-1373
    Office: (970) 491-7982
    Fax: (970) 491-2249
    E-mail: hj@colostate.edu
    www.engr.colostate.edu/~hj

Administered by UCMSS
Universal Conference Management Systems & Support
San Diego, California, USA
Contact: Kaveh Arbtan
