Newest Viewed Downloaded

Chapter 5 The Sieve of Eratosthenes

Parallel Programming in C with MPI and OpenMP Michael J. Quinn

Chapter 5 The Sieve of Eratosthenes

Chapter Objectives Analysis of block allocation schemes Function MPI_Bcast Performance enhancements

Outline Sequential algorithm Sources of parallelism Data decomposition options Parallel algorithm development, analysis MPI program Benchmarking Optimizations

Sequential Algorithm 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 32 34 36 38 40 42 44 46 48 50 52 54 56 58 60 3 9 15 21 27 33 39 45 51 57 5 25 35 55 7 49 Complexity: (n ln ln n)

Pseudocode 1. Create list of unmarked natural numbers 2, 3, …, n 2. k  2 3. Repeat (a) Mark all multiples of k between k2 and n (b) k  smallest unmarked number > k until k2 > n 4. The unmarked numbers are primes

Sources of Parallelism Domain decomposition Divide data into pieces Associate computational steps with data One primitive task per array element

Making 3(a) Parallel Mark all multiples of k between k2 and n  for all j where k2  j  n do if j mod k = 0 then mark j (it is not a prime) endif endfor

Making 3(b) Parallel Find smallest unmarked number > k  Min-reduction (to find smallest unmarked number > k) Broadcast (to get result to all tasks)

Agglomeration Goals Consolidate tasks Reduce communication cost Balance computations among processes

Data Decomposition Options Interleaved (cyclic) Easy to determine “owner” of each index Leads to load imbalance for this problem Block Balances loads More complicated to determine owner if n not a multiple of p

Block Decomposition Options Want to balance workload when n not a multiple of p Each process gets either n/p or n/p elements Seek simple expressions Find low, high indices given an owner Find owner given an index

Method #1 Let r = n mod p If r = 0, all blocks have same size Else First r blocks have size n/p Remaining p-r blocks have size n/p

Examples 17 elements divided among 7 processes 17 elements divided among 5 processes 17 elements divided among 3 processes

Method #1 Calculations First element controlled by process i Last element controlled by process i Process controlling element j

Method #2 Scatters larger blocks among processes First element controlled by process i Last element controlled by process i Process controlling element j

Examples 17 elements divided among 7 processes 17 elements divided among 5 processes 17 elements divided among 3 processes

Comparing Methods Operations Method 1 Method 2 Low index 4 2 High index 6 4 Owner 7 4 Assuming no operations for “floor” function Our choice

Pop Quiz Illustrate how block decomposition method #2 would divide 13 elements among 5 processes. 13(0)/ 5 = 0 13(1)/5 = 2 13(2)/ 5 = 5 13(3)/ 5 = 7 13(4)/ 5 = 10

Block Decomposition Macros #define BLOCK_LOW(id,p,n) ((id)*(n)/(p)) #define BLOCK_HIGH(id,p,n) \ (BLOCK_LOW((id)+1,p,n)-1) #define BLOCK_SIZE(id,p,n) \ (BLOCK_LOW((id)+1)-BLOCK_LOW(id)) #define BLOCK_OWNER(index,p,n) \ (((p)*(index)+1)-1)/(n))

Showing 1 - 20 of 38 items Details

Name: 
Chapter 5
Author: 
Michael J. Quinn
Company: 
Oregon State University
Description: 
Chapter 5 The Sieve of Eratosthenes
Tags: 
mpi | block | prime | process | size | low | elements | processes
Created: 
12/21/2001 12:01:37 AM
Slides: 
38
Views: 
1
Downloads: 
0
Rating: 
0


> Comment



Share this presentation
|

Comments

Share this presentation:

|
Sitemap