Storage Deduplication and Management for Application Testing over a Virtual Network Testbed

Presented by: Taichuan Ted Lu
April 19, 2011
Chang-Han Chong (U. Maryland), Ping-Rong Chiang (CTL, Taiwan), Taichuan Ted Lu & C. Jason Chiang
TRIDENTCOM 2011

Outline

- Introduction
  - Storage challenge in MANET application testing
- Related work
- Proposed method
  - Give storage file-level semantics
- File systems
  - How file system layout affects the proposed method
- Conclusion

Introduction

- Testing MANET applications
  - Simulation is cheap but not realistic
  - Running applications in the real world is realistic but costly
- Telcordia's approach: Virtual Ad hoc Network (VAN)
  - Virtual machines running MANET applications are interconnected by a simulated network
- Storage challenge in VAN (the subject of this paper)
  - Virtual disk usage is huge
  - E.g. 50 nodes * 6 GB (OS + application) * {fedora, redhat} * 4 tests = 2400 GB

[Figure labels: battlefield applications; real applications; simulated networks]
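
The storage estimate on this slide can be reproduced with a few lines of arithmetic. This is only a sketch of the slide's own example; the function name and parameter names are illustrative and not part of the original work.

```python
def full_copy_storage_gb(nodes, gb_per_disk, distros, tests):
    """Storage needed when every node, distro, and test gets a full disk copy."""
    return nodes * gb_per_disk * len(distros) * tests

# Values taken from the slide's example.
total_gb = full_copy_storage_gb(
    nodes=50, gb_per_disk=6, distros=["fedora", "redhat"], tests=4
)
print(total_gb, "GB")
```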

Virtual Ad Hoc Network Testbed

- Real applications run on a VM
- Each VM has a delegate inside the simulator through tunneling/NAT
- The simulator (OPNET or QualNet) running on a PC simulates the network interconnecting the delegates, including multicast, broadcast, routing, etc.

[Figure: VMs attached to a PC whose simulator contains delegate network nodes and ordinary network nodes]

Storage Challenges in VAN

- Storage redundancy
  - Same operating system
  - Same application packages
- A simple solution: copy-on-write (CoW)
  - Initially, an object (file or disk) has N clones
  - If a clone is modified, save only the modified part

[Figure: logical view (Object, Clone 1, Clone 2, ..., Clone N) versus actual storage (Object plus the modified part)]
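
The copy-on-write idea above can be sketched in a few lines. This is a toy in-memory model, not the testbed's storage layer: the class name `CowClone` and the block contents are made up for illustration.

```python
class CowClone:
    """A clone that shares a read-only base and stores only its own changes."""

    def __init__(self, base):
        self.base = base    # shared blocks, never modified through a clone
        self.delta = {}     # block index -> this clone's modified contents

    def read(self, index):
        # Prefer the clone's private copy; fall back to the shared base.
        return self.delta.get(index, self.base[index])

    def write(self, index, data):
        # Only the modified block is stored; the base stays untouched.
        self.delta[index] = data


base = [b"os", b"app", b"cfg"]
clones = [CowClone(base) for _ in range(3)]
clones[0].write(2, b"cfg-node0")

logical_blocks = len(base) * len(clones)
stored_blocks = len(base) + sum(len(c.delta) for c in clones)
print(logical_blocks, "logical blocks,", stored_blocks, "stored blocks")
```

Here three clones expose nine logical blocks while only four blocks are actually stored.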

Sharing Contents of Virtual Disks

- File-level approaches
  - Files inside virtual disks are shared
  - Mount a read-only shared directory (e.g. NFS)
  - Apply CoW (e.g. a union file system)
  - Pro: just administrative work
  - Con: OS/file system specific; needs human intervention (bad for VAN)
- Block-level approaches (preferred)
  - Blocks inside virtual disks are shared
  - Pro: no need to worry about the OS and file systems (good for VAN)
  - Con: the post-snapshot block sharing problem
    - After snapshots are instantiated, modified blocks with identical contents won't be shared

[Figure: logical view (Object, Clone 1, Clone 2, ..., Clone N) versus actual storage, where each clone keeps its own copy of the same modified part, which is redundant]

Sharing Redundant Blocks, Conceptually

- Redundancy exists
- Modified blocks with the same content should keep only one instance

Current Solutions to the Post-snapshot Block Sharing Problem

- Content Addressable Storage (CAS)
  - Compute a hash (e.g. MD5) for every written block in real time
  - Coalesce blocks with the same hash (may verify the contents)
- Implementations
  - IBM Duplicate Data Elimination (DDE)
    - Centralized; has a single-server bottleneck
  - VMware Decentralized Deduplication (DeDe)
    - Decentralized; improved performance
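
The CAS write path described above (hash each written block, coalesce blocks whose hashes match, optionally verify contents) might look like the following minimal sketch. It is an in-memory illustration only and does not reflect how DDE or DeDe are implemented; the class and method names are hypothetical.

```python
import hashlib


class ContentAddressableStore:
    """Toy CAS: one stored instance per distinct block content."""

    def __init__(self):
        self.blocks = {}  # digest -> block contents (single instance)
        self.refs = {}    # (disk, block_no) -> digest

    def write(self, disk, block_no, data):
        # The hash is computed on every write, which is the online cost of CAS.
        digest = hashlib.md5(data).hexdigest()
        stored = self.blocks.setdefault(digest, data)
        # Optional verification step: equal hashes should mean equal contents.
        if stored != data:
            raise ValueError("hash collision between different blocks")
        self.refs[(disk, block_no)] = digest
        # Note: this sketch never reclaims blocks that are no longer referenced.

    def read(self, disk, block_no):
        return self.blocks[self.refs[(disk, block_no)]]


store = ContentAddressableStore()
store.write("disk1", 7, b"same contents")
store.write("disk2", 42, b"same contents")
store.write("disk2", 43, b"different contents")
print(len(store.refs), "logical blocks,", len(store.blocks), "stored blocks")
```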

IBM DDE

[Figure: client computers 1 and 2, a meta-data server, and central storage (SAN)]

1. Client computer 1 writes block #i
2. Update block #i's hash
3. Client computer 2 writes block #j
4. Update block #j's hash
5. If #i and #j have the same hash/content, they can be coalesced

Figure annotation: performance bottleneck

VMware DeDe

[Figure: client computers 1, 2, and 3 and central storage (SAN)]

1. Write block #i
2. Calculate block #i's hash
3. Write block #j
4. Calculate block #j's hash
5. Since #i and #j have the same hash/content, they can be coalesced

Figure annotation: performance bottleneck

- Doesn't fit MANET application testing
  - Hash computation costs X times CPU
  - Not acceptable for VAN: hashing eats CPU, so either the running applications are affected or some CPU cores must be dedicated to hash computation, which raises the operating cost

Is CAS Necessary for Deduplication?

- Why is hashing required in CAS?
  - To check whether two blocks have the same content
- Is there an alternative way to achieve this?
  - Yes. If blocks #i and #j in different virtual disks belong to files with the same name, they are likely to have the same content
- How do we know which file a block belongs to?
  - Snoop block-level I/O and parse the file system information
    - An undesirable dirty hack
    - Block-to-file mapping (difficult)
  - Our approach: let the storage parse the file system information
    - File-to-block mapping (easy)
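
The "same name, likely same content" heuristic can be sketched as follows. This is a simplified illustration, not the FBS implementation: each virtual disk is modeled as a plain dictionary from path to file contents, and the function name is hypothetical. The name match only nominates candidates; the contents are still compared before anything is treated as sharable.

```python
def find_sharable_files(disk_a, disk_b):
    """Return paths present on both disks whose contents are identical.

    disk_a, disk_b: dict mapping file path -> bytes (a stand-in for reading
    files from virtual disks mounted read-only).
    """
    sharable = []
    for path in sorted(disk_a.keys() & disk_b.keys()):
        # Same name is only a hint; confirm by comparing contents.
        if disk_a[path] == disk_b[path]:
            sharable.append(path)
    return sharable


disk1 = {"/bin/gzip": b"\x7fELF-gzip", "/etc/hostname": b"node1\n"}
disk2 = {"/bin/gzip": b"\x7fELF-gzip", "/etc/hostname": b"node2\n"}
print(find_sharable_files(disk1, disk2))
```

No hash is computed on the write path here; the comparison can run offline on the candidate files.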

File-to-block Mapping

- Mount the virtual disks in read-only mode
- Use the file system debug tool/library
  - E.g. debugfs for Ext2/3/4
  - E.g. libntfs for NTFS
- Example: the file /bin/gzip in an Ext2 file system
  - Besides the i-node, this file has 12 data blocks, followed by an indirect block for meta-data and another data block
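
To illustrate what a file-to-block mapping looks like, the sketch below parses a block listing in the style that debugfs prints for its `stat` request on Ext2/3. The sample string and block numbers are invented for illustration, the exact output format is an assumption that may differ between e2fsprogs versions, and the function name is hypothetical.

```python
import re

# Assumed shape of a BLOCKS line, e.g. from: debugfs -R "stat /bin/gzip" disk.img
# (block numbers are made up).
SAMPLE = "(0-11):4097-4108, (IND):4109, (12):4110"


def parse_blocks(line):
    """Split a debugfs-style BLOCKS line into data and meta-data block numbers."""
    data, meta = [], []
    for label, first, last in re.findall(r"\(([^)]+)\):(\d+)(?:-(\d+))?", line):
        start = int(first)
        end = int(last) if last else start
        blocks = list(range(start, end + 1))
        # Numeric labels are logical offsets of data blocks; labels such as
        # IND mark indirect blocks, which hold pointers (meta-data).
        if label[0].isdigit():
            data.extend(blocks)
        else:
            meta.extend(blocks)
    return data, meta


data_blocks, meta_blocks = parse_blocks(SAMPLE)
print(len(data_blocks), "data blocks;", len(meta_blocks), "indirect block(s)")
```

With this sample, the first 12 data blocks are followed by one indirect block and one more data block, matching the /bin/gzip layout described above.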

File-level Block Sharing (FBS)


FBS Software Components


FBS Prototype


Evaluation

- 42 virtual machines
  - Ubuntu Linux Server 10.04
- Software packages to share
  - OpenOffice and its required packages (438 MB), chosen for size and popularity
- Online overhead
- Offline overhead

Comparison with Related Work

- Sharing unit
  - CAS: a block in a virtual disk
  - FBS: a file in a virtual disk
- Non-sharable blocks
  - CAS: file system meta-data
  - FBS: file system meta-data; files which have some identical blocks but have at least one non-identical block
- Back-end storage
  - CAS: cluster file system
  - FBS: Linux Logical Volume Manager (LVM)
- Volumes needed for k virtual disks
  - CAS: 1 template volume + k copy-on-write volumes
  - FBS: 1 template volume + k copy-on-write volumes + 1 common volume
- Major online overhead
  - CAS: hashing of written blocks
  - FBS: none
- Major offline overhead
  - CAS: coalescing blocks
  - FBS: comparing files with files in the common volume + coalescing blocks of files

FBS and File Systems

- Meta-data blocks are not sharable
  - The same block in a file may be allocated to different locations in different virtual disks
  - E.g. inodes and indirect blocks in Ext2/3
- Meta-data blocks determine the space efficiency of block-level deduplication

[Figure: Ext2/3 file system layout]

Meta-data Overhead in Ext2/Ext3

- File size ~4 MBytes
- >99% of the files in our test OpenOffice package
- Meta-data: 2/1038 = 0.19%
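
One way to arrive at the 2/1038 figure is the arithmetic below. It is a reading of the slide, not something the slide states: it assumes 4 KiB blocks, 4-byte block pointers, 12 direct pointers per inode, and that the inode is counted as one meta-data unit alongside a single indirect block.

```python
BLOCK_SIZE = 4096       # assumed 4 KiB file system blocks
POINTER_SIZE = 4        # assumed 32-bit block pointers
DIRECT_POINTERS = 12    # direct block pointers in an Ext2/3 inode

# Largest file that needs only one (single) indirect block.
pointers_per_indirect = BLOCK_SIZE // POINTER_SIZE
data_blocks = DIRECT_POINTERS + pointers_per_indirect

meta_units = 2          # the inode plus one indirect block (assumed counting)
total_units = data_blocks + meta_units
ratio = meta_units / total_units

size_mib = data_blocks * BLOCK_SIZE / 2**20
print(f"{size_mib:.2f} MiB of data, meta-data {meta_units}/{total_units} = {ratio:.2%}")
```

Under these assumptions, a file of about 4 MB has 1036 data blocks and two meta-data units, giving the 0.19% overhead.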

Extents Reduce Meta-data

- Extents
  - Extent = (logical address, physical address, length)
  - Consecutive blocks are represented by a single extent
  - Supported by modern file systems, e.g. NTFS and Ext4
- Benefits
  - Reduces meta-data
  - Deleting big files is fast
  - Good for FBS (less overhead to remap the blocks)

[Figure: meta-data size in Ext4 with extents]
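
The extent representation can be illustrated by collapsing a file's list of physical block numbers into (logical address, physical address, length) tuples. This is a conceptual sketch with made-up block numbers and a hypothetical function name; it does not model the on-disk extent tree of Ext4 or NTFS.

```python
def blocks_to_extents(physical_blocks):
    """Collapse per-block physical addresses into (logical, physical, length) extents.

    physical_blocks[i] is the physical block holding logical block i of a file.
    """
    extents = []
    for logical, physical in enumerate(physical_blocks):
        if extents:
            start_logical, start_physical, length = extents[-1]
            # Extend the current extent when the next block is physically adjacent.
            if physical == start_physical + length:
                extents[-1] = (start_logical, start_physical, length + 1)
                continue
        extents.append((logical, physical, 1))
    return extents


# 19 blocks in two contiguous runs, with a one-block gap between the runs.
blocks = list(range(5000, 5012)) + list(range(5013, 5020))
extents = blocks_to_extents(blocks)
print(extents)
```

Nineteen block pointers shrink to two extents, which is why remapping a shared file touches less meta-data.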

Name: jong-trident2011
Author: Natarajan, Narayanan
Company: Telcordia
Description: FBS Software Components
Tags: block | file | system | data | virtual | network | storag | share
Created: 12/20/2010 7:35:24 PM
Slides: 22