Problem description
Floating point number calculation using a 32 bit word
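I'm reading Patterson's Computer Organization, 5th edition, and I'm confused by these two pages of text. First page:
Is the first word equal to 0.5 in decimal? I see a sign of 0, an exponent of -1, and a fraction of 0, with an implicit 1 in the significand. So 1.0_two * 2^-1 = 0.5? Is that right?
Why is 1.0 * 2^1 "the smaller binary number"? Isn't the second word larger? It has a sign of 0, an exponent of 1, and a significand of the implicit 1, so 1.0 * 2^1 = 2? Is that right?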
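I don't understand the following paragraph:
The desirable notation must therefore represent the most negative exponent as 00 ... 00_two and the most positive exponent as 11 ... 11_two. This convention is called biased notation, with the bias being the number subtracted from the normal, unsigned representation to determine the real value.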
Reference solution
Method 1:
If you look at the two words simply as binary numbers, the first one (1.0 * 2^-1, with the exponent -1 stored in two's complement) is 0x7f800000 while the second (1.0 * 2^1) is 0x00800000, so the second is a smaller binary number even though it represents a larger floating point number. A plain binary comparison or sort on those bit patterns would therefore do the wrong thing.
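As a rough illustration of where those two hex values come from, here is a minimal Python sketch. It assumes a hypothetical layout of 1 sign bit, an 8-bit two's-complement exponent, and a 23-bit fraction; the function name `pack_twos_complement_exponent` is made up for this example and is not a real format or library call.

```python
def pack_twos_complement_exponent(sign, exponent, fraction=0):
    """Pack a word in a HYPOTHETICAL format (not IEEE 754):
    1 sign bit | 8-bit two's-complement exponent | 23-bit fraction."""
    exp_field = exponent & 0xFF  # two's-complement encoding in 8 bits
    return (sign << 31) | (exp_field << 23) | fraction

half = pack_twos_complement_exponent(0, -1)  # 1.0 * 2^-1 = 0.5
two = pack_twos_complement_exponent(0, 1)    # 1.0 * 2^1  = 2.0

print(f"{half:#010x}")  # 0x7f800000
print(f"{two:#010x}")   # 0x00800000
print(half > two)       # True: the word for 0.5 is the larger integer
```

The exponent -1 becomes the field 11111111, which is why the word for the smaller value looks like the larger binary number.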
So instead the biased representation for the exponent is used, which means the binary value for 0.5 is 0x3f000000 and the binary value for 2.0 is 0x40000000, and the binary comparison "works" for comparing and sorting floating point numbers.
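These bit patterns can be checked with Python's standard `struct` module. The sketch below is only an illustration; `float_bits` is a helper name introduced here, and 127 is the single-precision exponent bias.

```python
import struct

def float_bits(x):
    """Return the IEEE 754 single-precision bit pattern of x as an unsigned int."""
    return struct.unpack(">I", struct.pack(">f", x))[0]

BIAS = 127  # exponent bias for single precision

for value, exponent in [(0.5, -1), (2.0, 1)]:
    bits = float_bits(value)
    exp_field = (bits >> 23) & 0xFF      # the 8 stored exponent bits
    assert exp_field == exponent + BIAS  # stored field = real exponent + bias
    print(f"{value}: {bits:#010x}, exponent field {exp_field}")

# With the biased exponent, integer order matches floating point order
# for these two positive values.
print(float_bits(0.5) < float_bits(2.0))  # True
```

Subtracting the bias from the stored field (126 and 128 here) gives back the real exponents -1 and 1, which is what the quoted paragraph means by "the number subtracted from the normal, unsigned representation".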
The catch is that this is still a sign+magnitude representation, so you need a sign+magnitude binary comparison, while most hardware uses two's-complement integers. So you still end up needing special floating point comparison instructions/hardware.
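To see the sign+magnitude problem concretely, the following Python sketch sorts a few floats by their raw bit patterns. The helper names (`float_bits`, `as_signed32`, `sortable_key`) are introduced here for illustration. The key transform at the end is a common software workaround, not something the answer above describes, and it ignores NaN and treats -0.0 as smaller than +0.0.

```python
import struct

def float_bits(x):
    """IEEE 754 single-precision bit pattern of x as an unsigned 32-bit int."""
    return struct.unpack(">I", struct.pack(">f", x))[0]

def as_signed32(u):
    """Reinterpret an unsigned 32-bit pattern as a two's-complement integer."""
    return u - (1 << 32) if u & 0x80000000 else u

def sortable_key(x):
    """Map the bits so that unsigned integer order matches float order."""
    u = float_bits(x)
    if u & 0x80000000:
        return u ^ 0xFFFFFFFF  # negative: flip all bits to reverse the order
    return u ^ 0x80000000      # positive: set the sign bit to sort above negatives

values = [2.0, -0.5, 0.5, -2.0]

# Unsigned comparison: negatives land above positives, in reverse order.
print(sorted(values, key=float_bits))                          # [0.5, 2.0, -0.5, -2.0]
# Two's-complement comparison: negatives are still in reverse order.
print(sorted(values, key=lambda v: as_signed32(float_bits(v))))  # [-0.5, -2.0, 0.5, 2.0]
# After the key transform the integer order agrees with the float order.
print(sorted(values, key=sortable_key))                        # [-2.0, -0.5, 0.5, 2.0]
```

Neither plain integer comparison handles negative values correctly, which is the reason dedicated floating point comparison logic is still needed.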
(by Jwan622, Chris Dodd)