Principles of Parallel Algorithm Design

Principles of Parallel Algorithm Design. Ananth Grama, Anshul Gupta, George Karypis, and Vipin Kumar. To accompany the text “Introduction to Parallel Computing”, Addison Wesley, 2003. Chapter Overview: Algorithms and Concurrency . Introduction to Parallel Algorithms Tasks and Decomposition
Slide 1

Principles of Parallel Algorithm Design
Ananth Grama, Anshul Gupta, George Karypis, and Vipin Kumar
To accompany the text "Introduction to Parallel Computing", Addison Wesley, 2003.

Slide 2

Chapter Overview: Algorithms and Concurrency
Introduction to Parallel Algorithms: Tasks and Decomposition; Processes and Mapping; Processes Versus Processors.
Decomposition Techniques: Recursive Decomposition; Data Decomposition; Exploratory Decomposition; Hybrid Decomposition.
Characteristics of Tasks and Interactions: Task Generation, Granularity, and Context; Characteristics of Task Interactions.

Slide 3

Chapter Overview: Concurrency and Mapping
Mapping Techniques for Load Balancing: Static and Dynamic Mapping.
Methods for Minimizing Interaction Overheads: Maximizing Data Locality; Minimizing Contention and Hot-Spots; Overlapping Communication and Computations; Replication versus Communication; Group Communications versus Point-to-Point Communication.
Parallel Algorithm Design Models: Data-Parallel, Work-Pool, Task Graph, Master-Slave, Pipeline, and Hybrid Models.

Slide 4

Preliminaries: Decomposition, Tasks, and Dependency Graphs
The first step in developing a parallel algorithm is to decompose the problem into tasks that can be executed concurrently. A given problem may be decomposed into tasks in many different ways. Tasks may be of the same, different, or even indeterminate sizes. A decomposition can be illustrated as a directed graph with nodes corresponding to tasks and edges indicating that the result of one task is required for processing the next. Such a graph is called a task dependency graph.

Slide 5

Example: Multiplying a Dense Matrix with a Vector
Computation of each element of the output vector y is independent of the other elements. Based on this, a dense matrix-vector product can be decomposed into n tasks. The figure (not reproduced) highlights the portion of the matrix and vector accessed by Task 1.
Observations: While tasks share data (namely, the vector b), they do not have any control dependencies; that is, no task needs to wait for the (partial) completion of any other. All tasks are of the same size in terms of number of operations. Is this the maximum number of tasks we could decompose this problem into?
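The row-wise decomposition described on this slide can be sketched in Python as follows. This is a minimal illustration, not code from the slides: the names `row_task` and `matvec_by_rows` and the sample data are assumptions, and the thread pool only shows the task structure (it is not expected to speed up pure-Python arithmetic).

```python
from concurrent.futures import ThreadPoolExecutor


def row_task(row, b):
    """Task i: compute y[i] as the dot product of row i of A with b."""
    return sum(a_ij * b_j for a_ij, b_j in zip(row, b))


def matvec_by_rows(A, b):
    """Decompose y = A b into len(A) independent tasks, one per output element."""
    with ThreadPoolExecutor() as pool:
        # Every task reads the shared vector b but waits on no other task.
        return list(pool.map(lambda row: row_task(row, b), A))


# Hypothetical 3 x 3 example data.
A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
b = [1, 0, 2]
y = matvec_by_rows(A, b)
```

Each call to `row_task` touches one row of A and all of b, which mirrors the access pattern the slide attributes to a single task.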

Slide 6

Example: Database Query Processing
Consider the execution of the query: MODEL = "CIVIC" AND YEAR = 2001 AND (COLOR = "GREEN" OR COLOR = "WHITE") on the following database: [table not reproduced]

Slide 7

Example: Database Query Processing
The execution of the query can be divided into subtasks in various ways. Each task can be thought of as generating an intermediate table of entries that satisfy a particular clause.
Figure (not reproduced): Decomposing the given query into a number of tasks. Edges in this graph denote that the output of one task is needed to accomplish the next.

Slide 8

Example: Database Query Processing
Note that the same problem can be decomposed into subtasks in other ways as well.
Figure (not reproduced): An alternate decomposition of the given problem into subtasks, along with their data dependencies.
Different task decompositions may lead to significant differences in their eventual parallel performance.

Slide 9

Granularity of Task Decompositions
The number of tasks into which a problem is decomposed determines its granularity. Decomposition into a large number of tasks results in a fine-grained decomposition, and decomposition into a small number of tasks results in a coarse-grained decomposition.
Figure (not reproduced): A coarse-grained counterpart to the dense matrix-vector product example. Each task in this example corresponds to the computation of three elements of the result vector.
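A coarse-grained variant of the matrix-vector product can be sketched as below, with each task computing a block of consecutive output elements. The function names, the `block_size` parameter, and the sample matrix are assumptions made for illustration; the tasks are run serially here only to keep the sketch short.

```python
def block_task(A, b, start, stop):
    """One coarse-grained task: compute the output elements y[start:stop]."""
    return [sum(a * x for a, x in zip(A[i], b)) for i in range(start, stop)]


def matvec_by_blocks(A, b, block_size):
    """Split the n rows into ceil(n / block_size) tasks and concatenate results."""
    n = len(A)
    tasks = [(s, min(s + block_size, n)) for s in range(0, n, block_size)]
    y = []
    for start, stop in tasks:  # the tasks are independent of one another
        y.extend(block_task(A, b, start, stop))
    return y, len(tasks)


# Hypothetical 6 x 6 matrix with A[i][j] = i + j and a vector of ones.
A = [[i + j for j in range(6)] for i in range(6)]
b = [1] * 6
y, num_tasks = matvec_by_blocks(A, b, block_size=3)
```

With `block_size=3`, six rows yield two tasks of three output elements each; `block_size=1` recovers the fine-grained, one-task-per-element decomposition.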

Slide 10

Degree of Concurrency
The number of tasks that can be executed in parallel is the degree of concurrency of a decomposition. Since the number of tasks that can be executed in parallel may change over program execution, the maximum degree of concurrency is the maximum number of such tasks at any point during execution. What is the maximum degree of concurrency of the database query examples?
The average degree of concurrency is the average number of tasks that can be processed in parallel over the execution of the program. Assuming that each task in the database example takes identical processing time, what is the average degree of concurrency in each decomposition?
The degree of concurrency increases as the decomposition becomes finer in granularity, and vice versa.
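One way to explore these questions is to simulate an eager schedule in which every task takes one time unit and starts as soon as its inputs are ready. The sketch below does that for a hypothetical seven-task graph loosely modeled on the query; the graph and all names are assumptions, since the slide's figures are not reproduced here. The widest step of such a schedule is a simple estimate of the maximum degree of concurrency, not a guaranteed maximum for arbitrary graphs.

```python
def concurrency_profile(deps):
    """Return the number of tasks that run in each unit-time step.

    deps maps each task to the set of tasks whose results it needs.
    Every ready task starts immediately (unlimited processes assumed).
    """
    done, profile = set(), []
    while len(done) < len(deps):
        ready = [t for t in deps if t not in done and deps[t] <= done]
        if not ready:
            raise ValueError("dependency graph contains a cycle")
        profile.append(len(ready))
        done.update(ready)
    return profile


# Hypothetical graph: four independent filters, two combiners, one final join.
deps = {
    "civic": set(), "y2001": set(), "green": set(), "white": set(),
    "civic_and_2001": {"civic", "y2001"},
    "green_or_white": {"green", "white"},
    "result": {"civic_and_2001", "green_or_white"},
}
profile = concurrency_profile(deps)
max_degree = max(profile)                 # widest step of the eager schedule
avg_degree = sum(profile) / len(profile)  # tasks per step for unit-time tasks
```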

Slide 11

Critical Path Length
A directed path in the task dependency graph represents a sequence of tasks that must be processed one after the other. The longest such path determines the shortest time in which the program can be executed in parallel. The length of the longest path in a task dependency graph is called the critical path length.

Slide 12

Critical Path Length
Consider the task dependency graphs of the two database query decompositions. What are the critical path lengths for the two task dependency graphs? If each task takes 10 time units, what is the shortest parallel execution time for each decomposition? How many processors are needed in each case to achieve this minimum parallel execution time? What is the maximum degree of concurrency?
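The critical path length can be computed as a longest-path calculation over the dependency graph. The sketch below assumes two plausible seven-task decompositions of the query (the slide's figures are not reproduced, so these graphs are assumptions, as are all names) and a cost of 10 time units per task. It also reports total work divided by critical path length, a common way to express the average degree of concurrency.

```python
from functools import lru_cache


def critical_path_length(deps, cost):
    """Length of the longest path, measured as the sum of task costs along it."""
    @lru_cache(maxsize=None)
    def finish(task):
        # Earliest finish time = own cost + latest finish time among the inputs.
        return cost[task] + max((finish(d) for d in deps[task]), default=0)
    return max(finish(t) for t in deps)


# Assumed decomposition 1: combine the clauses pairwise, then join.
first = {
    "civic": (), "y2001": (), "green": (), "white": (),
    "civic_and_2001": ("civic", "y2001"),
    "green_or_white": ("green", "white"),
    "result": ("civic_and_2001", "green_or_white"),
}
# Assumed decomposition 2: fold the clauses in one after another.
second = {
    "civic": (), "y2001": (), "green": (), "white": (),
    "green_or_white": ("green", "white"),
    "y2001_and_color": ("y2001", "green_or_white"),
    "result": ("civic", "y2001_and_color"),
}

cost = {task: 10 for task in first.keys() | second.keys()}
cp_first = critical_path_length(first, cost)
cp_second = critical_path_length(second, cost)
avg_first = sum(cost[t] for t in first) / cp_first
avg_second = sum(cost[t] for t in second) / cp_second
```

Under these assumptions the pairwise graph has a shorter critical path than the chained one, even though both contain the same number of tasks.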

Slide 13

Limits on Parallel Performance
It would appear that the parallel time can be made arbitrarily small by making the decomposition finer in granularity. However, there is an inherent bound on how fine the granularity of a computation can be. For example, in the case of multiplying a dense matrix with a vector, there can be no more than n^2 concurrent tasks.
Concurrent tasks may also have to exchange data with other tasks. This results in communication overhead. The tradeoff between the granularity of a decomposition and the associated overheads often determines performance bounds.

Slide 14

Task Interaction Graphs
Subtasks generally exchange data with others in a decomposition. For example, even in the trivial decomposition of the dense matrix-vector product, if the vector is not replicated across all tasks, they will have to communicate elements of the vector. The graph of tasks (nodes) and their interactions/data exchange (edges) is referred to as a task interaction graph.
Note that task interaction graphs represent data dependencies, whereas task dependency graphs represent control dependencies.

Slide 15

Task Interaction Graphs: An Example
Consider the problem of multiplying a sparse matrix A with a vector b. The following observations can be made: As before, the computation of each element of the result vector can be viewed as an independent task. Unlike a dense matrix-vector product, though, only the non-zero elements of matrix A participate in the computation. If, for memory optimality, we also partition b across tasks, then one can see that the task interaction graph of the computation is identical to the graph of the matrix A (the graph for which A represents the adjacency structure).
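The correspondence between the sparsity structure of A and the task interaction graph can be sketched as follows. The row-dictionary storage format, the function names, and the 4 x 4 sample matrix are assumptions made for illustration; task i is assumed to own row i of A and element b[i].

```python
def interaction_graph(rows):
    """rows[i] maps column index j to the nonzero value A[i][j].

    Task i owns row i and element b[i]; it must obtain b[j] from task j
    for every nonzero A[i][j] with j != i.
    """
    edges = set()
    for i, row in enumerate(rows):
        for j in row:
            if j != i:
                edges.add((min(i, j), max(i, j)))  # undirected interaction
    return edges


def sparse_matvec(rows, b):
    """Compute y = A b using only the stored nonzero entries."""
    return [sum(v * b[j] for j, v in row.items()) for row in rows]


# Hypothetical symmetric 4 x 4 sparse matrix.
rows = [
    {0: 2.0, 1: 1.0},
    {0: 1.0, 1: 3.0, 3: 4.0},
    {2: 5.0},
    {1: 4.0, 3: 6.0},
]
b = [1.0, 2.0, 3.0, 4.0]
y = sparse_matvec(rows, b)
edges = interaction_graph(rows)
```

Task 2 has no off-diagonal nonzeros, so it appears in no edge and needs no element of b other than its own.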

Slide 16

Task Interaction Graphs, Granularity, and Communication
In general, if the granularity of a decomposition is finer, the associated overhead (as a ratio of the useful work associated with a task) increases.
Example: Consider the sparse matrix-vector product example from the previous slide. Assume that each node takes unit time to process and each interaction (edge) causes an overhead of one time unit. Viewing node 0 as an independent task involves a useful computation of one time unit and an overhead (communication) of three time units. Now, if we consider nodes 0, 4, and 5 as one task, then the task has useful computation totaling three time units and communication corresponding to four time units (four edges). Clearly, this is a more favorable ratio than the former case.
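The arithmetic in this example can be reproduced with a small helper that counts the nodes inside a task and the edges crossing its boundary. The edge list below is hypothetical: it was chosen only so that node 0 has three neighbors and the group {0, 4, 5} has four outgoing edges, as in the text, and it is not claimed to match the slide's figure.

```python
def task_costs(edges, group):
    """Return (useful computation, communication overhead) for one task.

    Each node costs one time unit; each edge leaving the group costs one unit.
    """
    group = set(group)
    cut = sum(1 for u, v in edges if (u in group) != (v in group))
    return len(group), cut


# Hypothetical interaction graph (undirected edges).
edges = [(0, 1), (0, 4), (0, 5), (4, 5), (1, 5), (4, 8), (5, 8), (1, 2)]

fine_work, fine_comm = task_costs(edges, {0})
coarse_work, coarse_comm = task_costs(edges, {0, 4, 5})
fine_ratio = fine_comm / fine_work        # overhead per unit of useful work
coarse_ratio = coarse_comm / coarse_work
```

Edges between two nodes of the same group, such as (0, 4) once 0 and 4 are merged, no longer count as communication, which is why the coarser task has the lower overhead ratio.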

Slide 17

Processes and Mapping
In general, the number of tasks in a decomposition exceeds the number of processing elements available. For this reason, a parallel algorithm must also provide a mapping of tasks to processes.
Note: We refer to the mapping as being from tasks to processes, as opposed to processors. This is because typical programming APIs, as we shall see, do not allow easy binding of tasks to physical processors. Rather, we aggregate tasks into processes and rely on the system to map these processes to physical processors. We use the term process not in the UNIX sense, but simply to mean a collection of tasks and associated data.

Slide 18

Processes and Mapping
Appropriate mapping of tasks to processes is critical to the parallel performance of an algorithm. Mappings are determined by both the task dependency and task interaction graphs. Task dependency graphs can be used to ensure that work is equally spread across all processes at any point (minimum idling and optimal load balance). Task interaction graphs can be used to make sure that processes need minimum interaction with other processes (minimum communication).

Slide 19

Processes and Mapping
An appropriate mapping must minimize parallel execution time by: mapping independent tasks to different processes; assigning tasks on the critical path to processes as soon as they become available; and minimizing interaction between processes by mapping tasks with dense interactions to the same process.
Note: These criteria often conflict with each other. For example, a decomposition into one task (or no decomposition at all) minimizes interaction but does not result in a speedup at all! Can you think of other such conflicting cases?

Slide 20

Processes and Mapping: Example
Figure (not reproduced): Mapping tasks in the database query decomposition to processes. These mappings were arrived at by viewing the dependency graph in terms of levels (no two nodes in a level have dependencies). Tasks within a single level are then assigned to different processes.
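The level-based idea can be sketched as below: group tasks by depth in the dependency graph, then hand the tasks of each level to distinct processes. The graph is a hypothetical decomposition of the query and the round-robin assignment is an assumption chosen for brevity; it ignores interaction costs, so it is not claimed to reproduce the mapping in the slide's figure.

```python
def levels(deps):
    """Group tasks by depth: a task's level is one more than its deepest input."""
    depth = {}

    def visit(task):
        if task not in depth:
            depth[task] = 1 + max((visit(d) for d in deps[task]), default=0)
        return depth[task]

    for task in deps:
        visit(task)
    by_level = {}
    for task, d in depth.items():
        by_level.setdefault(d, []).append(task)
    return [by_level[d] for d in sorted(by_level)]


def map_by_level(deps, num_processes):
    """Assign the tasks within each level to processes in round-robin order."""
    mapping = {}
    for level in levels(deps):
        for k, task in enumerate(level):
            mapping[task] = k % num_processes
    return mapping


# Hypothetical dependency graph for the query.
deps = {
    "civic": (), "y2001": (), "green": (), "white": (),
    "civic_and_2001": ("civic", "y2001"),
    "green_or_white": ("green", "white"),
    "result": ("civic_and_2001", "green_or_white"),
}
task_levels = levels(deps)
mapping = map_by_level(deps, num_processes=4)
```

A refinement would place each task on a process that already holds one of its inputs, which reduces interaction between processes.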

Slide 21

Decomposition Techniques
So how does one decompose a task into various subtasks? While t
