2.5: Scheduling multiprocessor systems
Introduction
This section introduces the learner to multiprocessor scheduling, in which a workload divided into tasks is spread across several processors so that it can finish sooner than it would on a single processor.
Activity Details
In computer science, multiprocessor scheduling is an NP-hard optimization problem. The problem statement is: “Given a set J of jobs where job j_i has length l_i and a number of processors m, what is the minimum possible time required to schedule all jobs in J on m processors such that none overlap?”
The applications of this problem are numerous, but are, as suggested by the name of the problem, most strongly associated with the scheduling of computational tasks in a multiprocessor environment.
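Because the problem is NP-hard, practical schedulers usually rely on heuristics rather than exact search. The sketch below illustrates one well-known heuristic, longest-processing-time-first (LPT): sort the jobs by decreasing length and always give the next job to the least-loaded processor. The function name lpt_schedule and the sample job lengths are assumptions made for illustration, and the result is not guaranteed to be optimal.

```python
import heapq

def lpt_schedule(lengths, m):
    """Greedy LPT heuristic: returns (makespan, jobs assigned to each processor)."""
    # Min-heap of (current load, processor index), so the least-loaded
    # processor is always popped first.
    heap = [(0, p) for p in range(m)]
    assignment = [[] for _ in range(m)]
    # Visit jobs from longest to shortest; the sort is stable for equal lengths.
    for job, length in sorted(enumerate(lengths), key=lambda jl: -jl[1]):
        load, p = heapq.heappop(heap)
        assignment[p].append(job)
        heapq.heappush(heap, (load + length, p))
    makespan = max(load for load, _ in heap)
    return makespan, assignment

makespan, assignment = lpt_schedule([3, 3, 2, 2, 2], m=2)
print(makespan)    # 7
print(assignment)  # [[0, 2, 4], [1, 3]]
```

For these five jobs the heuristic finishes at time 7, whereas the best possible schedule finishes at time 6 (jobs of length 3 and 3 on one processor, 2, 2 and 2 on the other). The gap shows why the exact problem is harder than a simple greedy rule suggests.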
Multiprocessor schedulers have to schedule tasks which may or may not be dependent upon one another. For example, consider reading user credentials from the console, using them to authenticate, and then, if authentication succeeds, displaying some data on the console. Each task depends on the one before it, so an ordering exists between the tasks. Such dependencies can be modelled as a partial order: the set of tasks together with this ordering forms a partially ordered set, which is commonly drawn as a directed acyclic graph of precedence constraints.
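The credentials example can be sketched as a small dependency graph. The code below uses Python's standard graphlib module (available from Python 3.9) to derive an execution order that respects the dependencies. The task names are illustrative labels, not part of any real API.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each task maps to the set of tasks that must finish before it can start.
# The task names are illustrative labels for the example in the text.
dependencies = {
    "read_credentials": set(),
    "authenticate": {"read_credentials"},
    "display_data": {"authenticate"},
}

def valid_order(deps):
    """Return one execution order that respects every dependency."""
    return list(TopologicalSorter(deps).static_order())

order = valid_order(dependencies)
print(order)  # ['read_credentials', 'authenticate', 'display_data']
```

Because these three tasks form a single chain, there is only one valid order and extra processors cannot speed them up. Tasks with no ordering between them are the ones a multiprocessor scheduler is free to run in parallel.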
The general multiprocessor scheduling problem is a generalization of the optimization version of the number partitioning problem, which considers the case of partitioning a set of numbers (jobs) into two subsets (processors) whose sums are as nearly equal as possible.
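The link between the two problems can be checked on small inputs. The sketch below finds the minimum finishing time by exhaustive search and then, for m = 2, converts it into the smallest possible difference between the two subset sums. The function names are assumptions made for illustration, and the search tries m to the power n assignments, so it is only usable for a handful of jobs.

```python
from itertools import product

def optimal_makespan(lengths, m):
    """Exhaustive search over every assignment of jobs to m processors."""
    best = None
    for assignment in product(range(m), repeat=len(lengths)):
        loads = [0] * m
        for length, p in zip(lengths, assignment):
            loads[p] += length
        if best is None or max(loads) < best:
            best = max(loads)
    return best

def partition_difference(numbers):
    """Smallest achievable difference between the sums of two subsets."""
    # With two processors the loads are (makespan, total - makespan),
    # so their difference is 2 * makespan - total.
    return 2 * optimal_makespan(numbers, 2) - sum(numbers)

print(optimal_makespan([4, 5, 6, 7, 8], 2))  # 15
print(partition_difference([4, 5, 6, 7, 8]))  # 0
print(partition_difference([1, 2, 4]))        # 1
```

In the first example the numbers split evenly into {7, 8} and {4, 5, 6}. In the last example no even split exists, and the best the two processors can do is loads of 4 and 3.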
Purpose-specific processors: graphics and GPU
General-purpose computing on graphics processing units (GPGPU, rarely GPGP) is the use of a graphics processing unit (GPU), which typically handles computation only for computer graphics, to perform computation in applications traditionally handled by the central processing unit (CPU). The use of multiple graphics cards in one computer, or large numbers of graphics chips, further parallelizes the already parallel nature of graphics processing. In addition, even a single GPU-CPU framework provides advantages that multiple CPUs on their own do not offer, because each chip is specialized for a different kind of work.
A GPGPU pipeline is a kind of parallel processing between one or more GPUs and CPUs that analyzes data as if it were in image or other graphic form. While GPUs generally operate at lower frequencies than CPUs, they usually have many times more cores (hundreds or more) and can thus operate on pictures and graphical data much faster than a traditional CPU, by dozens or even hundreds of times. Migrating data into graphical form and then using the GPU to “look” at it and analyze it can therefore yield a substantial speedup.
Reconfigurable Logic and Purpose-specific Processors
Reconfigurable computing is a computer architecture combining some of the flexibility of software with the high performance of hardware by processing with very flexible, high-speed computing fabrics such as field-programmable gate arrays (FPGAs). The principal difference when compared to using ordinary microprocessors is the ability to make substantial changes to the data path itself in addition to the control flow. On the other hand, the main difference from custom hardware, i.e. application-specific integrated circuits (ASICs), is the possibility to adapt the hardware during runtime by “loading” a new circuit on the reconfigurable fabric.
The concept of reconfigurable computing has existed since the 1960s, when Gerald Estrin’s paper proposed the concept of a computer made of a standard processor and an array of “reconfigurable” hardware. The main processor would control the behavior of the reconfigurable hardware. The latter would then be tailored to perform a specific task, such as image processing or pattern matching, as quickly as a dedicated piece of hardware. Once the task was done, the hardware could be adjusted to do some other task. This resulted in a hybrid computer structure combining the flexibility of software with the speed of hardware. Reconfigurable architectures can thus bring unique capabilities to computational tasks, including the performance and energy efficiency of dedicated hardware.
Conclusion
This section introduced the learner to multiprocessor scheduling, in which the jobs to be performed are partitioned across processors, and to purpose-specific processors such as GPUs and reconfigurable logic.
Assessment
1. Briefly describe multiprocessor scheduling
Multiprocessor scheduling is an NP-hard optimization problem. The problem statement is: “Given a set J of jobs where job j_i has length l_i and a number of processors m, what is the minimum possible time required to schedule all jobs in J on m processors such that none overlap?” The applications of this problem are numerous, but are, as suggested by the name of the problem, most strongly associated with the scheduling of computational tasks in a multiprocessor environment.
Multiprocessor schedulers have to schedule tasks which may or may not be dependent upon one another. For example, consider reading user credentials from the console, using them to authenticate, and then, if authentication succeeds, displaying some data on the console. Each task depends on the one before it, so an ordering exists between the tasks. Such dependencies can be modelled as a partial order: the set of tasks together with this ordering forms a partially ordered set, which is commonly drawn as a directed acyclic graph of precedence constraints.
The general multiprocessor scheduling problem is a generalization of the optimization version of the number partitioning problem, which considers the case of partitioning a set of numbers (jobs) into two subsets (processors) whose sums are as nearly equal as possible.