Foundations of Computer Science Chapter 2
Data Representation
Outline
2.1 Data Types
2.2 Data Inside the Computer
2.3 Representing Data
2.4 Hexadecimal Notation
2.5 Octal Notation
2.1
Data Types
Different Types of Data
[Figure: data divided into five types: text, number, image, audio, and video]
Numbers, text, images, audio, and video are all forms of data. Computers need to process all types of data.
Multimedia
The computer industry uses the term "multimedia" (多媒體) to describe information that contains numbers, text, images, audio, and video.
2.2
Data Inside the Computer
Bit Pattern
All data types from outside a computer are transformed into a uniform representation called a bit pattern for processing by computers.
A bit is the smallest unit of data that can be stored in a computer
A bit pattern of length 8 is called a byte
1 byte = 8 bits
2 bytes = 16 bits
1 KB = 1024 bytes (2^10 = 1024)
1 MB = 1024 KB
1 GB = 1024 MB
K: kilo (千, thousand)
M: mega (百萬, million)
G: giga (十億, billion)
The byte is also used to measure the size of memory and other storage devices
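The unit relationships above can be checked with a few lines of Python. This is a minimal sketch that assumes the binary multiples used on this slide (1 KB = 1024 bytes); the helper name `describe` is hypothetical and only serves the illustration.

```python
# Unit sizes as given on the slide (binary multiples, 2**10 = 1024).
BITS_PER_BYTE = 8
KB = 2 ** 10        # 1 KB = 1024 bytes
MB = 1024 * KB      # 1 MB = 1024 KB
GB = 1024 * MB      # 1 GB = 1024 MB


def bytes_to_bits(n_bytes):
    """Convert a byte count to a bit count (1 byte = 8 bits)."""
    return n_bytes * BITS_PER_BYTE


def describe(n_bytes):
    """Hypothetical helper: express a byte count in the largest fitting unit."""
    for name, size in (("GB", GB), ("MB", MB), ("KB", KB)):
        if n_bytes >= size:
            return f"{n_bytes / size:.2f} {name}"
    return f"{n_bytes} bytes"


print(bytes_to_bits(2))   # 2 bytes expressed in bits
print(describe(1536))     # a size between 1 KB and 1 MB
```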
A switch, with its two states of on and off, can represent a bit (0 or 1)
Example of a 16-bit pattern: 1000101010111111
A bit pattern is a sequence of bits that can represent a symbol
Examples of Bit Patterns
Data are coded (編碼) when they enter a computer and decoded (解碼) when they are presented to the user
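As an illustration of coding and decoding, the sketch below turns a short text into a bit pattern and back. It assumes the text is plain ASCII (a code introduced later in this chapter) and stores each character in 8 bits; the function names are hypothetical.

```python
def to_bit_pattern(text):
    """Code text into a string of bits, 8 bits per character (ASCII input assumed)."""
    return "".join(format(byte, "08b") for byte in text.encode("ascii"))


def from_bit_pattern(bits):
    """Decode a bit string back into text, reading 8 bits at a time."""
    chunks = [bits[i:i + 8] for i in range(0, len(bits), 8)]
    return bytes(int(chunk, 2) for chunk in chunks).decode("ascii")


pattern = to_bit_pattern("BYTE")
print(pattern)                    # the coded form stored inside the computer
print(from_bit_pattern(pattern))  # the decoded form presented to the user
```

The round trip returns the original text, which is the property that makes a code usable: every symbol maps to exactly one bit pattern and back.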
2.3
Representing Data
Representing Symbols Using Bit Patterns
How many bits are needed in a bit pattern to represent a symbol in a language?
It depends on how many symbols are in the set: a bit pattern of length n can represent 2^n different symbols
Number of Symbols and Bit Pattern Length
Number of Symbols    Bit Pattern Length
-----------------    ------------------
2                    1
4                    2
8                    3
16                   4
…                    …
128                  7
256                  8
…                    …
65,536               16
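The table follows from the rule that n bits give 2^n distinct patterns, so the required length is the smallest n with 2^n at least the number of symbols. A minimal sketch, with a hypothetical function name, that reproduces the table rows and also handles symbol counts that are not powers of two:

```python
def bits_needed(num_symbols):
    """Return the smallest n such that 2**n >= num_symbols."""
    if num_symbols < 2:
        raise ValueError("a symbol set needs at least two symbols")
    # (num_symbols - 1).bit_length() is the number of bits needed to
    # write the largest index (num_symbols - 1) in binary.
    return (num_symbols - 1).bit_length()


for count in (2, 4, 8, 16, 128, 256, 65536):
    print(count, bits_needed(count))

# A set of 26 letters does not fill a power of two and still needs 5 bits.
print(26, bits_needed(26))
```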
Codes
Code: different sets of bit patterns have been designed to represent text symbols
ASCII: popular code for symbols
EBCDIC: used in IBM mainframes
Unicode (萬國碼): originally a 16-bit code; allows a greater number of symbols
ISO: 32-bit code; allows an even greater number of symbols
Coding
Coding is the process of transforming data into a bit pattern
ASCII
American Standard Code for Information Interchange (美國資訊交換標準碼)
The code was developed by the American National Standards Institute (ANSI, 美國國家標準局)
ASCII uses 7 bits for each symbol, so it can represent 2^7 = 128 symbols
Some Features of ASCII
ASCII uses a 7-bit pattern: 0000000 ~ 1111111
0000000 null character; 1111111 delete character
Besides null and delete, there are 31 control (nonprintable) characters: 0000001 ~ 0011111
The numeric characters (0~9) are coded before letters
There are several special printable characters
The uppercase letters (A~Z) come before the lowercase letters (a~z)
The upper and lowercase characters are distinguished by only 1 bit. (A 1000001; a 1100001)
There are six special characters between the upper and lowercase letters
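These features can be observed directly, because Python's built-in `ord` returns a character's code, which matches ASCII for these characters. The sketch below is only an illustration; the helper names are hypothetical.

```python
def ascii_bits(ch):
    """Return the 7-bit ASCII pattern of a character as a string."""
    return format(ord(ch), "07b")


def differing_bits(a, b):
    """Count the bit positions in which the codes of a and b differ."""
    return bin(ord(a) ^ ord(b)).count("1")


def chars_between(first, last):
    """List the characters whose codes lie strictly between first and last."""
    return [chr(code) for code in range(ord(first) + 1, ord(last))]


print(ascii_bits("A"), ascii_bits("a"))  # uppercase and lowercase patterns
print(differing_bits("A", "a"))          # number of bits that differ
print(ord("9") < ord("A") < ord("a"))    # digits, then uppercase, then lowercase
print(chars_between("Z", "a"))           # the special characters between Z and a
```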
Extended ASCII
To make the size of each pattern 1 byte (8 bits), the ASCII bit patterns are augmented with an extra 0 at the left.
Extended ASCII uses an 8-bit pattern: 00000000 ~ 01111111
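The padding step can be sketched as follows. The function names are hypothetical, and "extended ASCII" here means only what the slide describes: the 7-bit pattern with a 0 added at the left.

```python
def ascii7(ch):
    """Return the 7-bit ASCII pattern of a character."""
    code = ord(ch)
    if code > 127:
        raise ValueError("not a 7-bit ASCII character")
    return format(code, "07b")


def extended_ascii(ch):
    """Augment the 7-bit pattern with an extra 0 at the left to fill one byte."""
    return "0" + ascii7(ch)


print(ascii7("A"))          # 7-bit pattern
print(extended_ascii("A"))  # same pattern stored in 1 byte
```

The padded pattern has the same numeric value as the original, which is why a byte holding an ASCII character always has its leftmost bit set to 0.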
EBCDIC
Extended Binary Coded Decimal Interchange Code (延伸的二進位十進制交換碼)
EBCDIC uses 8 bits for each symbol, so it can represent 2^8 = 256 symbols
It is used only on IBM mainframes
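To see how EBCDIC patterns differ from ASCII, the sketch below uses Python's built-in `cp037` codec. Note the assumption: `cp037` is one of several EBCDIC code pages (the US/Canada variant), chosen here only for illustration; the function name is hypothetical.

```python
def ebcdic_bits(ch):
    """Return the 8-bit pattern of a character in EBCDIC code page 037 (assumed variant)."""
    (byte,) = ch.encode("cp037")
    return format(byte, "08b")


def ascii_bits8(ch):
    """Return the 8-bit (zero-padded) ASCII pattern of a character."""
    (byte,) = ch.encode("ascii")
    return format(byte, "08b")


for ch in ("A", "a", "0"):
    print(ch, ascii_bits8(ch), ebcdic_bits(ch))
```

In this code page the ordering is also different from ASCII: lowercase letters come before uppercase letters, and the digits come last.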