A heterogeneous computing system for data mining workflows in multi-agent environments


Article ID: iaor20071997
Country: United Kingdom
Volume: 23
Issue: 5
Start Page Number: 258
End Page Number: 272
Publication Date: Nov 2006
Journal: Expert Systems
Authors: , , , ,
Abstract:

The computing-intensive data mining (DM) process calls for the support of a heterogeneous computing system, which consists of multiple computers with different configurations connected by a high-speed large-area network for increased computational power and resources. The DM process can be described as a multi-phase pipeline, and in each phase there may be many alternative methods. This makes the DM workflow very complex, so that it can only be modeled as a directed acyclic graph (DAG). A heterogeneous computing system needs an effective and efficient scheduling framework that orchestrates all the computing hardware to run multiple competing DM workflows. Motivated by the need for a practical solution to the scheduling problem for DM workflows, this paper proposes a dynamic DAG scheduling algorithm built on the characteristics of an execution time estimation model for DM jobs. Based on an approximate estimate of job execution time, the algorithm first maps DM jobs to machines in a decentralized and diligent (defined in this paper) manner. The performance of this initial mapping can then be improved through job migrations when necessary. The scheduling heuristic considers both the minimal completion time criterion and the critical path in the DAG. We implement this system in an established multi-agent system environment, in which existing DM algorithms are reused by encapsulating them into agents. The system evaluation and its use in oil well logging analysis are also discussed.
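
The abstract names two ingredients of the scheduling heuristic: the critical path in the DAG and the minimal completion time criterion. The sketch below is not the paper's algorithm; it is a generic, centralized list-scheduling illustration of how those two ideas can be combined, and it leaves out the decentralized mapping and job migration steps entirely. The function names (`upward_rank`, `schedule`), the relative machine speeds, and the job time estimates are all hypothetical values chosen for illustration.

```python
from collections import defaultdict


def upward_rank(jobs, succ, est_time):
    """Length of the longest estimated path from each job to a sink.

    A job with a larger rank lies on a longer (more critical) path.
    """
    rank = {}

    def visit(job):
        if job not in rank:
            longest_tail = max((visit(s) for s in succ.get(job, [])), default=0.0)
            rank[job] = est_time[job] + longest_tail
        return rank[job]

    for job in jobs:
        visit(job)
    return rank


def schedule(jobs, succ, est_time, speed):
    """Greedy mapping: critical-path priority, then minimal completion time.

    Assumes every estimate in est_time is positive, so that sorting by
    descending rank always places a job after all of its predecessors.
    """
    pred = defaultdict(list)
    for job, successors in succ.items():
        for s in successors:
            pred[s].append(job)

    rank = upward_rank(jobs, succ, est_time)
    machine_free = {m: 0.0 for m in speed}  # time at which each machine is idle
    finish, placement = {}, {}

    for job in sorted(jobs, key=lambda j: -rank[j]):
        # A job can start only after all of its predecessors have finished.
        ready = max((finish[p] for p in pred[job]), default=0.0)
        # Pick the machine on which this job would complete earliest.
        best = min(
            speed,
            key=lambda m: max(machine_free[m], ready) + est_time[job] / speed[m],
        )
        start = max(machine_free[best], ready)
        finish[job] = start + est_time[job] / speed[best]
        machine_free[best] = finish[job]
        placement[job] = best
    return placement, finish


# Hypothetical four-job workflow: a -> {b, c} -> d, on two machines.
jobs = ["a", "b", "c", "d"]
succ = {"a": ["b", "c"], "b": ["d"], "c": ["d"]}
est_time = {"a": 2.0, "b": 4.0, "c": 1.0, "d": 2.0}  # estimated work units
speed = {"fast": 2.0, "slow": 1.0}                   # relative machine speeds

placement, finish = schedule(jobs, succ, est_time, speed)
print(placement, max(finish.values()))
```

Ranking by longest remaining path is one common way to express critical-path priority; a dynamic scheduler of the kind the abstract describes would additionally revise the mapping as estimates change, which this static sketch does not attempt.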
