Newest Viewed Downloaded

Storing VectorsDivide vector elements among processes Replicate vector elements Vector replication acceptable because vectors have only n elements, versus n2 elements in matrices

Parallel Programming in C with MPI and OpenMP Michael J. Quinn

Chapter 8 Matrix-vector Multiplication

Chapter Objectives

Review matrix-vector multiplicaiton Propose replication of vectors Develop three parallel programs, each based on a different data decomposition

Outline

Sequential algorithm and its complexity Design, analysis, and implementation of three parallel programs Rowwise block striped Columnwise block striped Checkerboard block

Sequential Algorithm

2 1 0 4 3 2 1 1 4 3 1 2 3 0 2 0 1 3 4 1 =  2 2 1 5 0 4 1 3 5 9 4 1 9 2 1 0 4 1 3 4 1 3 3 1 9 2 3 13 1 4 14 1 1 14 1 3 4 1 3 2 1 1 4 4 1 13 3 3 17 1 4 19 2 1 19 1 3 4 1 4 3 1 2 3 3 1 3 3 0 11 2 4 11 0 1 11 1 3 4 1 3 0 2 0

Storing Vectors

Divide vector elements among processes Replicate vector elements Vector replication acceptable because vectors have only n elements, versus n2 elements in matrices

Rowwise Block Striped Matrix

Partitioning through domain decomposition Primitive task associated with Row of matrix Entire vector

Phases of Parallel Algorithm

Row i of A b Row i of A b ci Inner product computation Row i of A b c All-gather communication

Agglomeration and Mapping

Static number of tasks Regular communication pattern (all-gather) Computation time per task is constant Strategy: Agglomerate groups of rows Create one task per MPI process

Complexity Analysis

Sequential algorithm complexity: (n2) Parallel algorithm computational complexity: (n2/p) Communication complexity of all-gather: (log p + n) Overall complexity: (n2/p + log p)

Isoefficiency Analysis

Sequential time complexity: (n2) Only parallel overhead is all-gather When n is large, message transmission time dominates message latency Parallel communication time: (n) n2  Cpn  n  Cp and M(n) = n2 System is not highly scalable

Block-to-replicated Transformation

MPI_Allgatherv

MPI_Allgatherv

int MPI_Allgatherv ( void *send_buffer, int send_cnt, MPI_Datatype send_type, void *receive_buffer, int *receive_cnt, int *receive_disp, MPI_Datatype receive_type, MPI_Comm communicator)

MPI_Allgatherv in Action

Function create_mixed_xfer_arrays

First array How many elements contributed by each process Uses utility macro BLOCK_SIZE Second array Starting position of each process’ block Assume blocks in process rank order

Function replicate_block_vector

Create space for entire vector Create “mixed transfer” arrays Call MPI_Allgatherv

Function read_replicated_vector

Process p-1 Opens file Reads vector length Broadcast vector length (root process = p-1) Allocate space for vector Process p-1 reads vector, closes file Broadcast vector

Function print_replicated_vector

Process 0 prints vector Exact call to printf depends on value of parameter datatype

Run-time Expression

: inner product loop iteration time Computational time:  nn/p All-gather requires log p messages with latency  Total vector elements transmitted: (2log p -1) / 2log p Total execution time:  nn/p + log p + (2log p -1) / (2log p )

Showing 1 - 20 of 68 items Details

Name: 
Chapter 8
Author: 
Michael J. Quinn
Company: 
Oregon State University
Description: 
Storing VectorsDivide vector elements among processes Replicate vector elements Vector replication acceptable because vectors have only n elements, versus n2 elements in matrices
Tags: 
mpi | int | comm | process | all | processes | grid | vector
Created: 
12/21/2001 12:01:37 AM
Slides: 
68
Views: 
0
Downloads: 
0
Rating: 
0


> Comment



Share this presentation
|

Comments

Share this presentation:

|
Sitemap