BSP on the Origin2000.

Slide 1

BSP on the Origin2000
Lab for the course: Seminar in Scientific Computing with BSP
Dr. Anne Weill, ph: 4997

Slide 2

Origin2000 (SGI) 32 processors

Slide 3

Origin2000/3000 architecture features
Important hardware and software components:
* node board: processors + memory
* node interconnect topology and configurations
* scalability of the architecture
* directory-based cache coherency
* single system image components

Slide 4

Origin2000 node board

Slide 5

Origin2000: two nodes

Slide 6

Origin2000 interconnect

Slide 7

Origin2000 interconnect: 32 processors, 64 processors

Slide 8

Origin router interconnect
- Router chip has 6 CrayLink interfaces: 2 for connections to nodes (HUBs) and 4 for connections to other routers in the network
  * 4-dimensional interconnect
- Router links are point-to-point connections, 17+7 wires @ 400 MHz (that is, wire speed 800 MB/s)
- Wormhole routing with static routing table loaded at boot
- Router delay is 50 ns in one direction
- The interconnect topology is determined by the size of the computer (number of nodes):
  * direct (back-to-back) connection for 2 nodes (4 cpu)
  * strongly connected cube up to 32 cpu
  * hypercube for up to 64 cpu
  * hypercube of hypercubes for up to 256 cpu
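The hypercube case can be illustrated with a small sketch. With 2 cpu per node and 2 nodes per router, 64 cpu correspond to 16 routers, which fits a 4-dimensional hypercube (4 router-to-router links each). Assuming the routers are labelled 0..15 so that neighbours differ in exactly one bit (this labelling and the function name `hypercube_hops` are hypothetical, not taken from the slides), the number of router-to-router links on a shortest path is the number of differing bits:

```c
/* Sketch only: generic hypercube distance, not SGI's actual routing tables.
 * Assumes router labels in which directly linked routers differ in one bit. */
unsigned hypercube_hops(unsigned a, unsigned b)
{
    unsigned diff = a ^ b;   /* bits where the two labels differ */
    unsigned hops = 0;

    while (diff != 0) {      /* count the set bits (Hamming distance) */
        hops += diff & 1u;
        diff >>= 1;
    }
    return hops;
}
```

Under this labelling, no two routers in a 16-router hypercube are more than 4 links apart.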

Slide 9

Origin address space
- Physically the memory is distributed and not contiguous
- Node id is assigned at boot time
- Logically memory is a shared single contiguous address space; the virtual address space is 44 bits (16 TB)
- A program (compiler) uses the virtual address space
- CPU translates from virtual to physical address space
[Figure: physical address layout with bits 39-32 = node id (8 bits) and bits 31-0 = node offset (32 bits, 4 GB). Virtual pages 0, 1, 2, ..., n are mapped through the TLB (Translation Look-aside Buffer) to physical memory on nodes 0, 1, 2, 3, ...; some slots are empty where no memory is present.]
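The physical address layout on this slide (8-bit node id in bits 39-32, 32-bit node offset in bits 31-0) can be sketched in C as below. The function names are hypothetical and only illustrate the bit layout; the real translation is done by the CPU and its TLB, not by user code.

```c
#include <stdint.h>

/* Sketch of the 40-bit physical address layout shown on the slide:
 *   bits 39..32  node id     (8 bits)
 *   bits 31..0   node offset (32 bits, 4 GB per node)
 * Function names are illustrative assumptions. */

/* Extract the node id from a physical address. */
unsigned phys_node_id(uint64_t paddr)
{
    return (unsigned)((paddr >> 32) & 0xFFu);
}

/* Extract the offset inside the node's local memory. */
uint32_t phys_node_offset(uint64_t paddr)
{
    return (uint32_t)(paddr & 0xFFFFFFFFu);
}

/* Build a physical address from a node id and an offset. */
uint64_t make_phys_addr(unsigned node_id, uint32_t offset)
{
    return ((uint64_t)(node_id & 0xFFu) << 32) | (uint64_t)offset;
}
```

For example, offset 0x1000 on node 3 corresponds to the physical address 0x0300001000.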

Slide 10

Login to carmel
1. Open an ssh window to :
2. Username : course01-course20
   Password : bsp2006
Contact : Dr. Anne Weill, telephone : 4997

Slide 11

Compiling and running codes
1. Setting the path
   set path=($path /u/tcc/anne/BSP/bin)
2. Compiling
   % bspcc prog1.c -o prog1
   % bspcc -flibrary-level 1 prog1.c -o prog1 (for non-dedicated machine)
3. Running
   % bsprun -npes 4 prog1

Slide 12

Running on carmel
1. Interactive mode : % ./prog.exe <parameters>
2. NQE queues : % qsub -q qcourse script.bat

Slide 13

BSP functions

Slide 14

Sample program

Slide 15

Output of hello program

Slide 16

How it works
[Figure: bsprun starts one copy of Prog.exe on each of the processors P0, P1, P2, P3.]

Slide 17

SPMD: single program, multiple data
Each processor sees only its local memory. The contents of variable X are different on different processors. Transfer of data can in principle happen through one-sided or two-sided communication.

Slide 18

DRMA: direct remote memory access
- All processors must register the space into which remote "read" and "write" will happen
- Calls to bsp_put
- Calls to bsp_get
- Call to bsp_sync: all processors synchronize, all communication is completed after the call
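The key point of this slide, that a put only becomes visible at the destination after the synchronization, can be imitated with a toy model in plain C. This is not BSPlib: `toy_put`, `toy_sync` and `toy_read` are hypothetical stand-ins that run in a single process, and the four "processors" are just array slots.

```c
#define NPROCS 4        /* number of simulated processors (assumption) */
#define MAX_PENDING 16  /* capacity of the toy communication queue */

/* One private int per simulated processor, like variable X on the SPMD slide. */
static int x[NPROCS];

/* A queued put: value to be stored into x[dest] at the next sync. */
struct pending_put { int dest; int value; };
static struct pending_put queue[MAX_PENDING];
static int n_pending = 0;

/* Toy stand-in for bsp_put: records the request but does not touch
 * the destination yet. Returns 0 on success, -1 on a bad destination
 * or a full queue. */
int toy_put(int dest, int value)
{
    if (dest < 0 || dest >= NPROCS || n_pending >= MAX_PENDING)
        return -1;
    queue[n_pending].dest = dest;
    queue[n_pending].value = value;
    n_pending++;
    return 0;
}

/* Toy stand-in for bsp_sync: delivers every queued put, then empties
 * the queue, which ends the superstep. */
void toy_sync(void)
{
    for (int i = 0; i < n_pending; i++)
        x[queue[i].dest] = queue[i].value;
    n_pending = 0;
}

/* Read the current value held by simulated processor pid. */
int toy_read(int pid)
{
    return x[pid];
}
```

In this model, calling `toy_put(0, 42)` and then `toy_read(0)` still returns the old value; only after `toy_sync()` does `toy_read(0)` return 42.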

Slide 19

BSP functions for communication

Slide 20

Running on carmel
1. Interactive mode : % ./prog.exe <parameters>
2. NQE queues : % qsub -q qcourse script.bat

Slide 21

Script file for batch

Slide 22

Output of command: "qstat -a"

Slide 23

Another example
* What does the following program do? What will the program print?

Slide 25

Output of program

Slide 26

Another example
* Is there a problem with the following example? What will the program print?

Slide 28

Answer
As it is written, the program won't print any output: the data is actually transferred only at the bsp_sync statement.
Additional question: what will the program print if bsp_sync is placed right after the put statement?
NB: the programs are in directory /u/tcc/anne/BSPcourse, under prog2.c and prog2wrong.c. Try them.

Slide 29

Exercise 1 (due Nov. 26th 2006)
Copy the directory /u/tcc/anne/BSPcourse over to your directory. Study the bspedupack.h file. Write a C program in which every processor writes its pid into an array PIDS(0:p-1) on p0 (PIDS(i)=i). Run the program for p=1,2,4,8,16 processors and print PIDS. You can run it intuiti
