Newest Viewed Downloaded

Example 195% of a program’s execution time occurs inside a loop that can be executed in parallel. What is the maximum speedup we should expect from a parallel version of the program executing on 8 CPUs?

Parallel Programming with MPI and OpenMP Michael J. Quinn

Chapter 7 Performance Analysis

Learning Objectives

Predict performance of parallel programs Understand barriers to higher performance

Outline

General speedup formula Amdahl’s Law Gustafson-Barsis’ Law Karp-Flatt metric Isoefficiency metric

Speedup Formula

Execution Time Components

Inherently sequential computations: (n) Potentially parallel computations: (n) Communication operations: (n,p)

Speedup Expression

(n)/p

(n,p)

(n)/p + (n,p)

Speedup Plot

“elbowing out”

Efficiency

0  (n,p)  1

All terms > 0  (n,p) > 0 Denominator > numerator  (n,p) < 1

Amdahl’s Law

Let f = (n)/((n) + (n))

Example 1

95% of a program’s execution time occurs inside a loop that can be executed in parallel. What is the maximum speedup we should expect from a parallel version of the program executing on 8 CPUs?

Example 2

20% of a program’s execution time is spent within inherently sequential code. What is the limit to the speedup achievable by a parallel version of the program?

Pop Quiz

An oceanographer gives you a serial program and asks you how much faster it might run on 8 processors. You can only find one function amenable to a parallel solution. Benchmarking on a single processor reveals 80% of the execution time is spent inside this function. What is the best speedup a parallel version is likely to achieve on 8 processors?

Pop Quiz

A computer animation program generates a feature movie frame-by-frame. Each frame can be generated independently and is output to its own file. If it takes 99 seconds to render a frame and 1 second to output it, how much speedup can be achieved by rendering the movie on 100 processors?

Limitations of Amdahl’s Law

Ignores (n,p) Overestimates speedup achievable

Amdahl Effect

Typically (n,p) has lower complexity than (n)/p As n increases, (n)/p dominates (n,p) As n increases, speedup increases

Showing 1 - 20 of 49 items Details

Name: 
Chapter 7
Author: 
Michael J. Quinn
Company: 
Oregon State University
Description: 
Example 195% of a program’s execution time occurs inside a loop that can be executed in parallel. What is the maximum speedup we should expect from a parallel version of the program executing on 8 CPUs?
Tags: 
parallel | speedup | the | time | processors | overhead | execution | isoefficiency
Created: 
12/21/2001 12:01:37 AM
Slides: 
49
Views: 
0
Downloads: 
0
Rating: 
0


> Comment



Share this presentation
|

Comments

Share this presentation:

|
Sitemap