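Problem description
Can I substitute a MOV operation with an OR operation?
First of all, I want to say that I am new to ASM, so please forgive me if this is a silly question.
I was reading the section on partial register stalls in Agner Fog's microarchitecture manual (it seems a bit advanced, but I was curious why 32-bit instructions in 64-bit mode zero the upper half of the register). Example 6.13 gives a solution for avoiding the register stall. I am still a bit confused by it: why not use an OR operation instead of MOV, for example: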
xor eax, eax
mov al, byte [mem8]
; or al, byte [mem8] ; why not this?
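I think the effect is the same. Do they take the same number of cycles? Is one more efficient than the other? Is anything different going on "under the hood"?
Reference solution
Method 1: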
Partial register access in 64‑bit mode
In 64-bit mode, the following rules apply when writing to registers narrower than 64 bits:
- If a 32-bit register is written, the upper 32 bits of the associated 64-bit register are cleared.
- If a 16- or 8-bit register is written, the upper 48 or 56 bits of the associated 64-bit register remain unchanged.
If only an 8-bit register is written, the old value of the associated 64-bit register must first be obtained, the 8-bit sub-register merged into it, and then the new value saved.
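The rules above can be sketched as a small C model. This is only an illustration of the architectural result of each write, not of how a CPU implements it internally; the function names `write32`, `write16`, `write8` and `check_partial_writes`, and the example values, are made up for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Model of a write to a 32-bit register (e.g. eax): the result is
   zero-extended, so the upper 32 bits of the 64-bit register become 0. */
uint64_t write32(uint64_t old, uint32_t value) {
    (void)old;                      /* the old contents are not needed */
    return (uint64_t)value;
}

/* Model of a write to a 16-bit register (e.g. ax): the upper 48 bits
   of the old value are kept, so the old value is an input. */
uint64_t write16(uint64_t old, uint16_t value) {
    return (old & ~UINT64_C(0xFFFF)) | value;
}

/* Model of a write to a low 8-bit register (e.g. al): the upper 56 bits
   of the old value are kept, so the old value is an input. */
uint64_t write8(uint64_t old, uint8_t value) {
    return (old & ~UINT64_C(0xFF)) | value;
}

/* Self-check using arbitrary example values (hypothetical helper). */
void check_partial_writes(void) {
    const uint64_t rax = UINT64_C(0x1122334455667788);
    assert(write32(rax, 0xAABBCCDDu) == UINT64_C(0x00000000AABBCCDD));
    assert(write16(rax, 0xEEFFu)     == UINT64_C(0x112233445566EEFF));
    assert(write8(rax, 0x99u)        == UINT64_C(0x1122334455667799));
}
```

Note that only `write32` can ignore `old`. The 16- and 8-bit models have to read it, which is the read-merge-write step described above.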
Example 6.13 from Agner Fog's microarchitecture manual is not related to this; it is only an alternative to movzx, because that instruction is slow on older Pentium processors.
mov or or?
The two lines
31 C0 xor eax, eax
8A 05 ## ## ## ## mov al, byte [mem8]
(opcodes on the left) are probably faster than if you replaced the second line with
0A 05 ## ## ## ## or al, byte [mem8]
since there is a dependency on the previous line: only once xor eax, eax has been computed can the new value in eax be passed on to or. In addition, just as with the mov variant, there may be a slowdown because only a partial register is accessed. Instead, I would suggest replacing these two lines with
0F B6 05 ## ## ## ## movzx eax, byte [mem8]
This is one byte shorter than the previous approach (7 bytes instead of 2 + 6) and is a single instruction that writes a full 32-bit register. As Agner Fog says:
The easiest way to avoid partial register stalls is to always use full registers and use MOVZX or MOVSX when reading from smaller memory operands.
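The question's assumption that all variants leave the same value in the register can be checked with a small C model. This sketch only models the final register contents and says nothing about timing, dependencies, or stalls, which is where the variants actually differ. The function names (`seq_xor_mov`, `seq_xor_or`, `seq_movzx`, `check_sequences_agree`) and the test value are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Model of:  xor eax, eax ; mov al, byte [mem8] */
uint64_t seq_xor_mov(uint64_t rax, uint8_t mem8) {
    rax = 0;                                  /* 32-bit write: all of rax becomes 0 */
    rax = (rax & ~UINT64_C(0xFF)) | mem8;     /* mov al: replace the low byte only  */
    return rax;
}

/* Model of:  xor eax, eax ; or al, byte [mem8] */
uint64_t seq_xor_or(uint64_t rax, uint8_t mem8) {
    rax = 0;                                  /* 32-bit write: all of rax becomes 0 */
    uint8_t al = (uint8_t)(rax & 0xFF);       /* or must read the old al first      */
    al = (uint8_t)(al | mem8);
    rax = (rax & ~UINT64_C(0xFF)) | al;       /* write the result back into al      */
    return rax;
}

/* Model of:  movzx eax, byte [mem8] */
uint64_t seq_movzx(uint64_t rax, uint8_t mem8) {
    (void)rax;                                /* old register contents are irrelevant */
    return (uint64_t)mem8;                    /* zero-extend the byte into eax/rax    */
}

/* All three sequences should agree for every possible byte value. */
void check_sequences_agree(void) {
    const uint64_t old = UINT64_C(0xDEADBEEFCAFEF00D);  /* arbitrary stale contents */
    for (unsigned v = 0; v <= 0xFF; ++v) {
        assert(seq_xor_mov(old, (uint8_t)v) == v);
        assert(seq_xor_or(old, (uint8_t)v)  == v);
        assert(seq_movzx(old, (uint8_t)v)   == v);
    }
}
```

In this model the `or` variant is the only one that has to read the register before writing it, which mirrors the dependency on `xor eax, eax` mentioned above.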