What are the ones and zeroes?Hard drive Computer "Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris. 1011010
Data storage
Charles McAnany
What are the ones and zeroes?
Hard drive Computer "Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris. 1011010
Definitions
A bit is a single one or zero. A byte is eight bits. Numbers stored as ones and zeroes are stored in binary.
Binary numbers
A list of ones and zeroes is a number. It’s just like a number in base ten. 1576 10110 Ones place Thousands place Hundreds place Tens place Ones place Eights place Fours place Twos place Sixteens place
Converting from binary to decimal
10110 Ones place Eights place Fours place Twos place Sixteens place Place Power Value Present? 1 2^0 1 No 2 2^1 2 Yes 3 2^2 4 Yes 4 2^3 8 No 5 2^4 16 Yes Then, add up all the present values. 16 + 4 + 2 = 22
From decimal to binary
Harder! Find the largest power of two that fits in the decimal number. Subtract that number, and mark it as present in the binary. Repeat until the decimal number is zero. 1577
1577
Place Power Value Present? 1 2^0 1 2 2^1 2 3 2^2 4 4 2^3 8 5 2^4 16 6 2^5 32 7 2^6 64 8 2^7 128 9 2^8 256 10 2^9 512 11 2^10 1,024 12 2^11 2,048 13 2^12 4,096 14 2^13 8,192 15 2^14 16,384 Largest power of two that fits: 1024 (not 2048, because 2048 > 1577.) 1577 -1024 = 553 Mark the 1024 spot, and continue with 553.
553
Place Power Value Present? 1 2^0 1 2 2^1 2 3 2^2 4 4 2^3 8 5 2^4 16 6 2^5 32 7 2^6 64 8 2^7 128 9 2^8 256 10 2^9 512 11 2^10 1,024 yes 12 2^11 2,048 13 2^12 4,096 14 2^13 8,192 15 2^14 16,384 Largest power of two that fits: 512 553 - 512 = 41 Mark the 512 spot, and continue with 41.
41
Place Power Value Present? 1 2^0 1 2 2^1 2 3 2^2 4 4 2^3 8 5 2^4 16 6 2^5 32 7 2^6 64 8 2^7 128 9 2^8 256 10 2^9 512 yes 11 2^10 1,024 yes 12 2^11 2,048 13 2^12 4,096 14 2^13 8,192 15 2^14 16,384 Largest power of two that fits: 32 41 - 32 = 9 Mark the 32 spot, and continue with 9.
9
Place Power Value Present? 1 2^0 1 2 2^1 2 3 2^2 4 4 2^3 8 5 2^4 16 6 2^5 32 yes 7 2^6 64 8 2^7 128 9 2^8 256 10 2^9 512 yes 11 2^10 1,024 yes 12 2^11 2,048 13 2^12 4,096 14 2^13 8,192 15 2^14 16,384 Largest power of two that fits: 8 9 - 8 = 1 Mark the 8 spot, and continue with 1.
1
Place Power Value Present? 1 2^0 1 2 2^1 2 3 2^2 4 4 2^3 8 5 2^4 16 6 2^5 32 yes 7 2^6 64 8 2^7 128 9 2^8 256 10 2^9 512 yes 11 2^10 1,024 yes 12 2^11 2,048 13 2^12 4,096 14 2^13 8,192 15 2^14 16,384 Largest power of two that fits: 1 1 - 1 = 0 Mark the 1 spot, the number is zero, so you’re done.
Wherever you marked present, that’s a 1. If there’s no mark, that number’s a 0. Starting at the bottom, fill in the binary number. The number is 000011000100001.
Playing for money.
The first person to convert the following number to binary will receive a cash prize. The number is: 200,000,000.
Text
Each byte is a character. So, each character has a number. The capital letters are in the handout. The following string (bytes separated by commas) is: 01001000, 01000101, 01001100, 01001100, 01001111 H E L L O Please take a moment to write your name in binary.
Images: Images are broken into pixels.
Each color is given a number. Blue = 0, Light blue = 1, red = 2, white = 3 00 01 00 01 10 10 10 00 00 00 00 11 11 11 00 01 00 01 10 10 10 11 11 11 11 11 11 11 10 10 10 10 10 10 10 Then, store the numbers, and any other info needed to make the image. The image format might specify, for instance, the first eight bits is the row length. So, our file would be Glue the rows together. (remembering how long a row is elsewhere.) 000001110001000110101000000000111111000100011010101111111111111110101010101010
Recreating an image.
The first eight bits are the row length. Use the coloring scheme in the handout. Here’s the file as it appears on the disk. Recreate the image. (I’ve broken it into bytes for ease of reading.) 00000101 00010001 00000010 00001100 00001100 11111100
Compression
If a particular pattern occurs often in a file, it may be possible to compress the contents. We’ll use Huffman coding to deflate a text file. The original text is “this is an example of a huffman tree”. We start by analyzing letter frequency. The most common letters should have the shortest codes.
Huffman coding
Char Freq Code space 7 111 a 4 010 e 4 000 f 3 1101 h 2 1010 i 2 1000 m 2 0111 n 2 0010 s 2 1011 t 2 0110 l 1 11001 o 1 00110 p 1 10011 r 1 11000 u 1 00111 x 1 10010 To encode, we replace each character with its Huffman code. The word “tree” is originally 01010100 01010010 01000101 01000101 Replacing it with the codes, we get: 011011000000000
Image compression using Huffman coding
Our image example used two bits for every pixel. That’s great for images with four colors. But most images are stored using 32 or even 64 bits per pixel. If most of the image is one color, we can give that color a code of very few bits, converting all of those 64-bit pixels into 1 bit pixels.
Other compression methods
Huffman coding is very widely used in lossless compression. When you view the data that was encoded, you get the EXACT same data back. But some things (music and images in particular) may not need to be stored with perfect accuracy. For these, we use lossy compression.
Comments