Data Representation
January 9–14, 2013

Quick logistical notes

In-class exercises
Bring paper and pencil (or a laptop) to each lecture!
Goals:
• break up lectures, keep you engaged
• chance to work through problems in class
• ask questions!
First homework will be posted before Friday’s lecture!

Outline

• Internal vs. external representations
• Representing the natural numbers
• Binary number system
• Binary arithmetic
• Hexadecimal and base-N number systems
• Fixed-size integer representations
• Representing negative numbers
• Big endian vs. little endian

Internal vs. external representations

Internal representation
  How the data is actually represented in the computer hardware
External representation
  How we interpret or conceptualize the internal representation

Internal representations

Usually two states, which we interpret as 0 and 1

Volatile representations:
• Capacitor (DRAM)
  • charged or not
• Flip-flop circuit (SRAM)
  • one of two output signals is high

Non-volatile representations:
• Region of a magnetized surface (hard disks, tape)
  • positive or negative
• Floating gate transistor (flash)
  • change in voltage
  • one cell can represent more than two states!
    • e.g. one 16-level cell ≈ four flip-flops

Interacting with the internal representation

Architecture provides an interface
• can interact with the internal representation
• using the abstraction of the external representation

Advantages:
• Don’t have to think about internal representation
• Architecture can be implemented by different hardware

Organization of the internal representation

Usually can’t refer to individual bits
• Internal representation organized into groups
• Through the ISA, can read/write a group by an address

Addressable groups in MIPS
• byte = 8 bits
• word = 4 bytes = 32 bits
• (also halfword = 2 bytes = 16 bits)

External representations

Conceptually, view data as a sequence of 0s and 1s
The same data can be interpreted in different ways:

Example: 1111 0110

  ö      extended ASCII character
  246    unsigned integer
  −10    signed 8-bit integer

Decimal number system (base 10)

How it works (positional number system):
• 10 digits, used in sequence
• each position corresponds to a power of 10
• value = sum of each digit multiplied by its position value

Example: 2037

  ...     10^5     10^4     10^3     10^2     10^1     10^0  ...
  ...  100,000   10,000     1000      100       10        1
           (0)      (0)        2        0        3        7

2·1000 + 0·100 + 3·10 + 7·1 = 2000 + 0 + 30 + 7 = 2037

Binary number system (base 2)

Works the same way!
• 2 digits (0 and 1), used in sequence; each one is a bit (binary digit)
• each position corresponds to a power of 2
• value = sum of each bit multiplied by its position value

Example: 110101

  ...  2^7  2^6  2^5  2^4  2^3  2^2  2^1  2^0  ...
  ...  128   64   32   16    8    4    2    1
       (0)  (0)    1    1    0    1    0    1

1·32 + 1·16 + 0·8 + 1·4 + 0·2 + 1·1 = 32 + 16 + 0 + 4 + 0 + 1 = 53

Converting from binary to decimal

Very easy:
• Since binary is just 0s and 1s, no need to multiply
• Just add up the position values of the 1 bits

Example: 1011 0010

  ...  2^7  2^6  2^5  2^4  2^3  2^2  2^1  2^0  ...
  ...  128   64   32   16    8    4    2    1
         1    0    1    1    0    0    1    0

128 + 32 + 16 + 2 = 178

Converting from decimal to binary

Method 1: Subtracting powers of 2
For each position p from left to right, with n the value still to be represented:
• If $2^p \le n$, subtract $2^p$ from n and write 1
• Otherwise, write 0

Example: 157

  157 − 128 = 29    1 for 128’s position
                    0 for 64, 0 for 32
   29 −  16 = 13    1 for 16
   13 −   8 =  5    1 for 8
    5 −   4 =  1    1 for 4
                    0 for 2
    1 −   1 =  0    1 for 1

  Result: 1001 1101

Converting from decimal to binary

Method 2: Successive division by 2
• Divide by 2 until you reach 0, keeping track of remainders
• Write the remainders, from last to first

Example: 157

  157 ÷ 2 = 78 R 1
   78 ÷ 2 = 39 R 0
   39 ÷ 2 = 19 R 1
   19 ÷ 2 =  9 R 1
    9 ÷ 2 =  4 R 1
    4 ÷ 2 =  2 R 0
    2 ÷ 2 =  1 R 0
    1 ÷ 2 =  0 R 1

  Result: 1001 1101

In-class exercises

Convert from binary to decimal:
• 0010 1010
• 1001 0101

Convert from decimal to binary:
• 169, by subtracting powers of 2
• 84, by successive division by 2

Binary addition

Just like adding decimal numbers!

To add two binary numbers
• Pairwise add each bit, starting from the right
• 0 + 0 = 0 and 0 + 1 = 1
• On 1 + 1, write 0 and carry a 1 to the left

Example: 0110 + 0011

    11        (carries)
    0110
  + 0011
    ----
    1001

Example: 0011 + 0011

     11       (carries)
    0011
  + 0011
    ----
    0110

Binary multiplication

Same algorithm as decimal (only easier)

To multiply two binary numbers A and B
1. For each bit b in B:
   • Multiply b × A, aligning the result with b
     (since b is 0 or 1, each step yields 0 or A!)
2. Sum the results

Example: 1101 × 1101

1. Partial products:

        1101
      × 1101
      ------
        1101
          0
      1101
     1101

2. Sum:

    11111        (carries)
        1101
        0000
      110100
  +  1101000
    --------
    10101001

Often easiest to add results two at a time

Special case: multiplying by a power of 2

Super easy, just like multiplying by powers of 10 in decimal

To multiply a binary number by $2^p$
  Add p 0s on the right

Examples
• 100 × 1101 = 110100
• 1010 × 1000 = 1010000

Hexadecimal number system (base 16)

Very useful for representing binary data concisely!
• 16 digits: 0–9, A, B, C, D, E, F
• each position corresponds to a power of 16
• usually prefixed with 0x

Each hex digit corresponds to 4 bits

  0  0000    4  0100    8  1000    C  1100
  1  0001    5  0101    9  1001    D  1101
  2  0010    6  0110    A  1010    E  1110
  3  0011    7  0111    B  1011    F  1111

One byte = 2 hex digits

Converting hexadecimal ⇔ binary

Replace each hex digit with its 4 bits (or each group of 4 bits with its hex digit)

Examples
• 0xA4F7 = 1010 0100 1111 0111
• 0x0B60 = 0000 1011 0110 0000

We will be doing this a lot this quarter. :)

Converting hexadecimal ⇔ decimal

Two strategies:
• Convert directly
• Convert hexadecimal ⇔ binary ⇔ decimal

Example: 0xB6A4 (direct conversion)

  ...    16^4    16^3    16^2    16^1    16^0  ...
  ...  65,536   4,096     256      16       1
          (0)       B       6       A       4

11·4096 + 6·256 + 10·16 + 4·1 = 45056 + 1536 + 160 + 4 = 46,756

Representation in other bases

In general, we can represent numbers in any base

Some other significant bases:
• Base 8 (octal)
  • each octal digit is equivalent to three bits
    (000 = $0_8$, 001 = $1_8$, 010 = $2_8$, ..., 111 = $7_8$)
  • useful in old architectures with 12, 24, 36 bit words
  • supported in C and many assembly languages, where a leading 0 marks octal
    (071 = $71_8$ = $57_{10}$)
• Base 64 (A–Z, a–z, 0–9, +, /)
  • each base-64 digit is equivalent to six bits
  • used in MIME to transmit binary data in plain ASCII text

In-class exercises

Add in binary:
• 100 1100 + 1110 1111

Multiply in binary:
• 1011 × 101

Add in hexadecimal:
• 0x28 + 0x4A

Arbitrary vs. fixed precision

So far, we have been assuming arbitrary precision
• to represent a bigger number, just add more bits/digits!

In practice, integers have a fixed size
• commonly 32 or 64 bits
• based on register size of the architecture

This is significant for two reasons:
• risk of overflow
• representation of negative numbers

Representing negative numbers

Must first specify the fixed size of the integer!
• With n bits, we can represent $2^n$ different values
• Idea: split space so half the values represent negatives

Sign and magnitude representation
• First bit represents the sign (0 positive, 1 negative)
• Rest of bits represent the magnitude, that is |x|

Suppose 4-bit integers
• Examples: −1 = 1001, −4 = 1100, −7 = 1111

This is exactly the representation you’re used to in decimal!

Problems with sign and magnitude representation

This turns out not to be a very good representation. Why?
Issue 1: Multiple zeros
• Both 0000 and 1000 represent the same value
• This is strange and requires extra effort

Issue 2: Complicated arithmetic
Simple binary addition doesn’t work
• 0 010 + 0 011 = 0 101 ✓
• 1 010 + 1 011 = 1 101 ✓
• 0 010 + 1 011 = 1 001 ✗
Adding the magnitudes works when the signs match, but not when they differ: 2 + (−3) should be 1 001 (−1), while the magnitudes sum to 101.

One’s complement representation

One’s complement
To represent a negative number x:
1. start with the fixed-size binary representation of |x|
2. invert every bit

Features:
• Binary addition is simple (wrap-around carry)
• Still two zeros (all 0s and all 1s)

Examples

          −2      −3      −5
  1.     0010    0011    0101
  2.     1101    1100    1010

One’s complement addition

Overflow carries “wrap around” (added on the right)

Example: −2 + −3 = −5

   11          (carries)
    1101
  + 1100
    ----
    1001
  +    1       (carry wrapped around)
    ----
    1010

Two’s complement representation

Two’s complement
To represent a negative number x:
1. start with the fixed-size binary representation of |x|
2. invert every bit
3. add 1 to the result

Suppose 4-bit integers:

Examples

          −1      −4      −7
  1.     0001    0100    0111
  2.     1110    1011    1000
  3.     1111    1100    1001

Features of two’s complement representation

Range of expressible values with n bits
• max: $2^{n-1} - 1$ (0 followed by all 1s)
• min: $-2^{n-1}$ (1 followed by all 0s)

Fixes the issues with sign and magnitude:
• Only one zero! (all 0s)
• Binary arithmetic “just works” (discard carry out)

Examples:
• 2 + 3 = 5: 0010 + 0011 = 0101 ✓
• −2 + −3 = −5: 1110 + 1101 = 1011 ✓
• 2 + −3 = −1: 0010 + 1101 = 1111 ✓

Sign extension

Change the size of an integer without changing its value
• if positive (left-most bit 0), pad left with 0s
• if negative (left-most bit 1), pad left with 1s

Works with both one’s and two’s complement representation

Example: Extending from 8 bits to 16 bits
• 1001 0110 ⇒ 1111 1111 1001 0110
• 0001 0011 ⇒ 0000 0000 0001 0011

Carry out vs. overflow

Carry out: carry after most significant bit ⇒ discard, no error
Overflow: result is out of representable range ⇒ error!

Carry out ≠ overflow!
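
The distinction above can be checked mechanically. The following Python sketch is illustrative only and not part of the lecture: the function name `add_twos_complement` is hypothetical, and it assumes both inputs already fit in the chosen width. It adds two values as 4-bit two's complement patterns and reports the carry out and the overflow separately, using the definitions just given (a carry leaving the most significant bit; a true sum outside the representable range).

```python
def add_twos_complement(a, b, bits=4):
    """Add a and b as `bits`-wide two's complement values.

    Assumes a and b both fit in `bits` bits.
    Returns (result, carry_out, overflow).
    """
    mask = (1 << bits) - 1
    raw = (a & mask) + (b & mask)   # plain binary addition of the two bit patterns
    carry_out = raw > mask          # a carry left the most significant bit
    pattern = raw & mask            # discard the carry out
    # Reinterpret the bit pattern as a signed value (leftmost bit 1 means negative)
    result = pattern - (1 << bits) if pattern >> (bits - 1) else pattern
    overflow = result != a + b      # the true sum was out of representable range
    return result, carry_out, overflow


print(add_twos_complement(-2, -3))  # (-5, True, False): carry out, no overflow
print(add_twos_complement(2, -3))   # (-1, False, False): neither
print(add_twos_complement(5, 6))    # (-5, False, True): overflow, no carry out
print(add_twos_complement(-5, -6))  # (5, True, True): both
```

All four combinations occur, so neither flag can stand in for the other.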
Carry out is a normal part of signed integer addition

Will get a carry out when adding:
• two negative numbers
• a negative and a positive, when the result is zero or positive

Just ignore it!

Two’s complement overflow detection

Overflow: result is out of representable range ⇒ error!

When adding ...
• two numbers with different signs
  • overflow can never occur!
• two numbers with the same sign
  • overflow occurs if the sign of the result differs from the sign of the operands

Trade-offs between representations of negatives

In modern architectures, two’s complement is used
• Simple arithmetic operations
• Only one zero
• Hard to read

Unsigned vs. signed integers

Can interpret the same n-bit data as either unsigned or signed

Unsigned integer
• Interpret as a non-negative number
• Range: 0 to $2^n - 1$

Signed integer
• Interpret as two’s complement
• Range: $-2^{n-1}$ to $2^{n-1} - 1$

Only different when the leftmost bit is a 1!

Big endian vs. little endian

Order of the addressable components in a larger data type
• Usually, the order of bytes within a word

Big endian
Bytes ordered from most significant (left) to least (right)
• Example: 256 as a 16-bit halfword: 0x0100

Little endian
Bytes ordered from least significant (left) to most (right)
• Example: 256 as a 16-bit halfword: 0x0001

Big-endian is what we’ve been assuming so far!

Endian conversion

Converting from big endian to little endian
1. Separate the data into addressable components (bytes)
2. Write the components (not the bits!) in reverse order

Examples

          0x12345678      0xE5AD5CCA
  1.      12 34 56 78     E5 AD 5C CA
  2.      0x78563412      0xCA5CADE5

Same algorithm for converting from little to big!

Which architectures are what endian?
Little-endian:
• x86, Atmel
• MIPS (MARS simulator)

Big-endian:
• Motorola 6800 and 68k

Bi-endian (configurable to be big or little):
• ARM, SPARC, PowerPC
• MIPS (specification)

http://en.wikipedia.org/wiki/Endianness

In-class exercises

Assume 8-bit integers, addressable in 2-bit chunks

For each of the following numbers:
1. write in two’s complement binary form
2. convert to little endian

Numbers:
• -50
• -100
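
The two's complement steps and the endian conversion procedure from the earlier slides can be sketched in Python. This is an illustrative sketch, not part of the lecture: the function names `to_twos_complement` and `swap_endianness` are hypothetical, and `swap_endianness` assumes byte-sized components (two hex digits each), unlike the 2-bit chunks in the exercise above.

```python
def to_twos_complement(x, bits):
    """Return the `bits`-wide two's complement pattern of x as a bit string."""
    if not -(1 << (bits - 1)) <= x < (1 << (bits - 1)):
        raise ValueError(f"{x} does not fit in {bits} bits")
    if x >= 0:
        return format(x, f"0{bits}b")
    magnitude = format(-x, f"0{bits}b")                              # step 1: |x| in binary
    inverted = "".join("1" if b == "0" else "0" for b in magnitude)  # step 2: invert every bit
    return format(int(inverted, 2) + 1, f"0{bits}b")                 # step 3: add 1


def swap_endianness(hex_string):
    """Reverse the byte order of a hex string such as '0x12345678'."""
    digits = hex_string[2:] if hex_string.startswith("0x") else hex_string
    if len(digits) % 2 != 0:
        raise ValueError("expected a whole number of bytes")
    # Split into bytes (2 hex digits each), then reverse the bytes, not the bits
    chunks = [digits[i:i + 2] for i in range(0, len(digits), 2)]
    return "0x" + "".join(reversed(chunks))


print(to_twos_complement(-7, 4))      # 1001
print(swap_endianness("0x12345678"))  # 0x78563412
```

Applying `swap_endianness` twice returns the original value, which matches the remark that the same algorithm converts in both directions.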
