WORLDCOMP'12 Keynote Lecture - Dr. Flavio Villanustre
ECL and Distributed Machine Learning with the HPCC Systems Platform
Dr. Flavio Villanustre, MD.
Vice-President of Technology Architecture and Product, HPCC Systems, USA
Date: July 16, 2012
Time: 11:00 - 11:55am
Location: The Monte Carlo Pavilion
The exponential proliferation of data, and the recognition of the value in extracting knowledge from it, have made Machine Learning methodologies critical to many domains. Data mining techniques, applied for example to large scale classification problems, recommendation systems, sentiment analysis, social graph traversal and fraud detection, are becoming pervasive across different industries. However, Big Data, as defined by its attributes of volume, velocity and variety, challenges traditional analytical approaches. It requires innovative algorithms that scale processing through parallel execution across multiple computing cores and interconnected commodity nodes, in order to leverage cost-efficient architectures.
But explicit parallelism comes at a cost: it pushes complexity into the high level algorithms, and it shifts the attention of the data analyst or application software developer away from the core task at hand and toward the intrinsic details of the specific architecture and of the parallel execution itself. This is especially common among imperative programming frameworks.
These challenges can be overcome with approaches that provide for implicit parallelism, particularly through declarative programming paradigms. Declarative programming normally allows for significant compile time optimizations and more succinct solutions, resulting in more efficient overall software development models. ECL, a high level, declarative, data oriented programming language that is part of the open source HPCC Systems platform, together with its associated Machine Learning toolkit, offers effortless linear scalability across entire clusters. In addition, a set of fully parallel linear algebra operations, written in ECL, provides extensibility, allowing for the quick implementation of new high level algorithms.
We'll explore the existing capabilities and algorithms of the platform and the ECL language, some of the leading edge developments in this field, and ongoing research in the area.
Flavio Villanustre is the Vice-President of Technology Architecture and Product at HPCC Systems. In this position, Flavio is responsible for overall platform architecture strategy and new product development.
Prior to 2001, Flavio served in a variety of roles at different companies, in areas including Infrastructure, Information Security and Information Technology. In addition, Villanustre has been involved with the Open source community for over 15 years through multiple initiatives. These include founding the first Linux User Group in Buenos Aires (BALUG) in 1994, releasing several pieces of software under different Open source licenses, and evangelizing Open source to different audiences through conferences, training and education. He is a frequent speaker in the areas of data-intensive computing and machine learning, and has coauthored papers on parallel and data-intensive computing. In a prior life, Flavio was a neurosurgeon.