Resource Search in HPC Systems using Lévy Flights
Abstract
Parallel applications modeled as Parallel Task Graphs (PTGs), a class of Directed Acyclic Graphs (DAGs), require long execution times and large amounts of storage, and are executed on High Performance Computing (HPC) Systems such as clusters. To execute these applications, a scheduler schedules and allocates the resources contained in the HPC System. One of the scheduler's activities is the search for idle resources, geographically dispersed across the clusters, so that they can be scheduled and allocated to tasks. This search consumes time and system resources because of the geographical distances that must be covered and because it runs repeatedly and permanently within the system. A sequential search for resources slows down and pauses scheduling and allocation in the HPC System, and increases the waiting times of the tasks that remain in the queue. Techniques that shorten resource location times and perform more exhaustive searches over dispersed geographic spaces can reduce the delays caused by sequential searches. The open access paper "Scheduling in Heterogeneous Distributed Computing Systems Based on Internal Structure of Parallel Tasks Graphs with Meta-Heuristics" presents the Array Method, a scheduler that schedules and allocates resources in an HPC System. The Array Method uses a sequential search for the idle resources that are geographically dispersed in the clusters and saves their location and characteristics in an array, which is updated every time idle resources are located in the clusters. Considering the above, this paper presents a search for idle resources using Lévy random walks, a technique used for searching for resources in large geographical spaces; this technique avoids sequential node-by-node searches of each cluster and promotes short-range and long-range searches over the entire geographical extent of the HPC Systems.
To obtain experimental results, the sequential resource search algorithm is compared with Lévy random walks using the synthetic loads, the different clusters, and the different numbers of resources per cluster proposed in the aforementioned open access paper. The results show that Lévy random walks locate more idle resources in less time, shorten resource search times in the clusters, and update the array of available resources more frequently. With more idle resources found, the total execution times of the tasks are reduced.
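As an illustration of the contrast between the two search strategies, the following minimal Python sketch compares a sequential scan with a Lévy random walk over a toy ring of nodes. It is not the implementation evaluated in this paper: the one-dimensional ring topology, the function names (levy_step, sequential_search, levy_search), the exponent alpha = 1.5, and the visit budget are all assumptions chosen for illustration. Step lengths are drawn from a heavy-tailed power law by inverse-transform sampling, so most moves are short and a few are very long.

```python
import random


def levy_step(rng, alpha=1.5, min_step=1):
    """Draw a heavy-tailed step length (Pareto law, inverse-transform sampling).

    alpha and min_step are illustrative assumptions, not values from the paper.
    """
    u = 1.0 - rng.random()  # uniform in (0, 1], avoids division by zero
    return max(min_step, int(min_step * u ** (-1.0 / alpha)))


def sequential_search(idle, budget):
    """Visit nodes 0, 1, 2, ... in order and collect the idle ones.

    At most `budget` nodes are visited.
    """
    found = []
    for node in range(min(budget, len(idle))):
        if idle[node]:
            found.append(node)
    return found


def levy_search(idle, budget, rng, alpha=1.5):
    """Visit `budget` nodes on a ring, jumping by Levy-distributed steps.

    Returns the sorted indices of the distinct idle nodes that were visited.
    """
    n = len(idle)
    pos = rng.randrange(n)  # hypothetical random starting node
    found = set()
    for _ in range(budget):
        if idle[pos]:
            found.add(pos)
        direction = rng.choice((-1, 1))
        # Wrap around the ring so long jumps stay inside the node range.
        pos = (pos + direction * levy_step(rng, alpha)) % n
    return sorted(found)


if __name__ == "__main__":
    rng = random.Random(42)
    # Toy system: 1000 nodes, roughly 10% of them idle.
    idle_nodes = [rng.random() < 0.1 for _ in range(1000)]
    print("sequential:", len(sequential_search(idle_nodes, 200)))
    print("levy walk: ", len(levy_search(idle_nodes, 200, rng)))
```

In this toy model the sequential scan only ever covers the first nodes of the ring, whereas the Lévy walk samples positions across its whole extent. Which strategy finds more idle nodes depends on how the idle nodes are distributed, so the sketch makes no claim about the results reported here.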
Keywords
High performance computing systems, clusters, Lévy flights, resource scheduling, resource allocation