MIPS (curiosity) faster way of clearing a register?

开发者 https://www.devze.com 2023-01-21 19:56 Source: Internet

What is the fastest way of clearing a register (=0) in MIPS assembly?

Some examples:

xor    $t0, $t0, $t0
and    $t0, $t0, $0
move   $t0, $0
li     $t0, 0
add    $t0, $0, $0

Which is the most efficient?

In many MIPS toolchains, both of these are pseudo-instructions that assemble to a single real instruction: move $a, $b is typically an idiom for or $a, $b, $0 (some assemblers use addu instead), and li $r, x with a small constant x is shorthand for ori $r, $0, x (or addiu $r, $0, x):

move $t0, $0
li $t0, 0

and these will both take place on the same pipeline, being architecturally equivalent:

xor $t0, $t0, $t0
and $t0, $t0, $0

and in every RISC implementation I've ever worked with, add is on the same pipe as xor/and/nor/etc.

Basically, this is all particular to the implementation of a particular chip, but they all ought to be single clock. If the chip is out of order, li or and x, $0, $0 might be fastest because they minimize false dependencies on other registers.
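To make the pseudo-instruction point concrete, here is a rough sketch of the real instructions an assembler might emit for move and li. Which variant you actually get is an assumption that depends on the assembler, so check the disassembly of your own toolchain rather than trusting this listing.

```asm
# Possible expansions of:  move $t0, $0
or    $t0, $0, $0      # $t0 = 0 | 0
addu  $t0, $0, $0      # $t0 = 0 + 0 (addu never traps on overflow)

# Possible expansions of:  li $t0, 0
ori   $t0, $0, 0       # $t0 = 0 | 0 (zero-extended immediate)
addiu $t0, $0, 0       # $t0 = 0 + 0 (sign-extended immediate)
```

Every one of these is a single ALU instruction that reads only $0, so none of them depends on the previous value of $t0.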

I seem to remember that $0 was created specifically for this case, so I would expect move $t0, $0 to be the recommended way to clear a register. But I have not done MIPS for almost 10 years ...

Given that all of those instructions take a single pipeline cycle, there shouldn't be much difference between them.

If anything, I'd expect xor $t0, $t0, $t0 to be best for speed because it doesn't use any other registers, thus keeping them free for other values and potentially reducing register file contention.

The xor method is also treated as a specific idiom on some processors, which allows it to use even fewer resources (e.g. not needing to perform the XOR operation in the ALU).

On most implementations of the MIPS architecture, all of these should offer the same performance. However, one can envision a superscalar system which could execute several instructions simultaneously, as long as they use distinct internal units. I have no actual example of a MIPS system which works like that, but that is how it happens on PowerPC systems. An xor $t0, $t0, $t0 opcode would be executed on the "integer computations" unit (because it is an xor) while move $t0, $0 would not use that unit; conceptually, the latter could be executed in parallel with another opcode that performs integer computations.

In brief, if you find a system where the methods you list are not all equally efficient, then I would expect the move $t0, $0 method to be the most efficient.

It probably depends on what other instructions will be in the pipeline at the same time: when the register was last used, when it will next be used and which internal units are currently in use.

I'm not familiar with the pipeline structure of any particular MIPS processor, but your compiler should be, and I would expect it to choose whichever is fastest in a given code sequence.

You can simply use the $zero register as the source: it always reads as 0 (all 32 bits clear), so copying it into the register you want to clear does the job.

If you're working with floats or doubles, you can declare a float or double variable in .data with the value 0.0 and load it into the register you want to clear whenever you need to.

Example:

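.data
     PI:       .float   3.14
     clear:    .float   0.0        # zero constant used to clear a register
.text
     main:
          lwc1 $f0, PI             # $f0 = 3.14
          lwc1 $f0, clear          # $f0 = 0.0 (register cleared)

          li $v0, 10               # syscall 10: exit (SPIM/MARS)
          syscall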