Newest Viewed Downloaded

3. Interconnection Networks

3. Interconnection Networks

Historical Perspective

Early machines were: Collection of microprocessors. Communication was performed using bi-directional queues between nearest neighbors. Messages were forwarded by processors on path. “Store and forward” networking There was a strong emphasis on topology in algorithms, in order to minimize the number of hops = minimize time

Network Analogy

To have a large number of transfers occurring at once, you need a large number of distinct wires. Networks are like streets: Link = street. Switch = intersection. Distances (hops) = number of blocks traveled. Routing algorithm = travel plan. Properties: Latency: how long to get between nodes in the network. Bandwidth: how much data can be moved per unit time. Bandwidth is limited by the number of wires and the rate at which each wire can accept data.

Design Characteristics of a Network

Topology (how things are connected): Crossbar, ring, 2-D and 3-D meshs or torus, hypercube, tree, butterfly, perfect shuffle .... Routing algorithm (path used): Example in 2D torus: all east-west then all north-south (avoids deadlock). Switching strategy: Circuit switching: full path reserved for entire message, like the telephone. Packet switching: message broken into separately-routed packets, like the post office. Flow control (what if there is congestion): Stall, store data temporarily in buffers, re-route data to other nodes, tell source node to temporarily halt, discard, etc.

Performance Properties of a Network: Latency

Diameter: the maximum (over all pairs of nodes) of the shortest path between a given pair of nodes. Latency: delay between send and receive times Latency tends to vary widely across architectures Vendors often report hardware latencies (wire time) Application programmers care about software latencies (user program to user program) Observations: Hardware/software latencies often differ by 1-2 orders of magnitude Maximum hardware latency varies with diameter, but the variation in software latency is usually negligible Latency is important for programs with many small messages

Performance Properties of a Network: Bandwidth

Routing and control header Data payload Error code Trailer Bandwidth is important for applications with mostly large messages The bandwidth of a link = w * 1/t w is the number of wires t is the time per bit Bandwidth typically in Gigabytes (GB), i.e., 8* 220 bits Effective bandwidth is usually lower than physical link bandwidth due to packet overhead.

Performance Properties of a Network: Bisection Bandwidth

bisection cut not a bisection cut bisection bw= link bw bisection bw = sqrt(n) * link bw Bisection bandwidth is important for algorithms in which all processors need to communicate with all others Bisection bandwidth: bandwidth across smallest cut that divides network into two equal halves Bandwidth across “narrowest” part of the network

Network Topology

In the past, there was considerable research in network topology and in mapping algorithms to topology. Key cost to be minimized: number of “hops” between nodes (e.g. “store and forward”) Modern networks hide hop cost (i.e., “wormhole routing”), so topology is no longer a major factor in algorithm performance. Example: On IBM SP system, hardware latency varies from 0.5 usec to 1.5 usec, but user-level message passing latency is roughly 36 usec. Need some background in network topology Algorithms may have a communication topology Topology affects bisection bandwidth.

Linear and Ring Topologies

Linear array Diameter = n-1; average distance ~n/3. Bisection bandwidth = 1 (in units of link bandwidth). Torus or Ring Diameter = n/2; average distance ~ n/4. Bisection bandwidth = 2. Natural for algorithms that work with 1D arrays.

Meshes and Tori

Generalizes to higher dimensions (Cray T3D used 3D Torus). Natural for algorithms that work with 2D and/or 3D arrays. Two dimensional torus Diameter = sqrt( n ) Bisection bandwidth = 2* sqrt(n) Two dimensional mesh Diameter = 2 * (sqrt( n ) – 1) Bisection bandwidth = sqrt(n)

Hypercubes

001 000 100 010 011 111 101 110 Number of nodes n = 2d for dimension d. Diameter = d. Bisection bandwidth = n/2. 0d 1d 2d 3d 4d Popular in early machines (Intel iPSC, NCUBE). Lots of clever algorithms. Greycode addressing: Each node connected to d others with 1 bit different.

Trees

Diameter = log n. Bisection bandwidth = 1. Easy layout as planar graph. Many tree algorithms (e.g., summation). Fat trees avoid bisection bandwidth problem: More (or wider) links near top. Example: Thinking Machines CM-5.

Butterflies with n = (k+1)2^k nodes

O 1 O 1 O 1 O 1 butterfly switch multistage butterfly network Diameter = 2k. Bisection bandwidth = 2^k. Cost: lots of wires. Used in BBN Butterfly. Natural for FFT.

Topologies in Real Machines

Butterfly BBN Butterfly (really old) Fat tree (approx) IBM SP Arbitrary Myricom (Millennium) Hypercube SGI Origin Fat tree Quadrics (in HP Alpha server clusters) 2D Mesh Intel Paragon (old) 4D Hypercube* Cray X1 Fat tree SGI Altix 3D Torus Blue Gene/L 3D Mesh Red Storm (Opteron + Cray network, future) older newer Many of these are approximations: E.g., the X1 is really a “quad bristled hypercube” and some of the fat trees are not as fat as they should be at the top

Performance Models

Showing 1 - 20 of 30 items Details

Name: 
3networks
Author: 
Kathy Yelick
Company: 
N/A
Description: 
3. Interconnection Networks
Tags: 
bandwidth | latenc | network | bisect | time | topolog | messag | algorithm
Created: 
1/20/1997 7:06:50 AM
Slides: 
30
Views: 
21
Downloads: 
1
Rating: 
0


> Comment



Share this presentation
|

Comments

Share this presentation:

|
Sitemap