c - Is < faster than <=?

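I'm reading a book in which the author says that if (a < 901) is faster than if (a <= 900). The book's case is not exactly this simple example, but it claims slight performance differences in complex loop code. I suppose this has something to do with the generated machine code, if it is true at all.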
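Best answer
No, it will not be faster on most architectures. You didn't specify one, but on x86 every integer comparison is typically implemented with two machine instructions: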

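- A test or cmp instruction, which sets EFLAGS
- A Jcc (jump) instruction, chosen according to the comparison type (and code layout):

  - jne: jump if not equal (ZF = 0)
  - jz: jump if zero, i.e. equal (ZF = 1)
  - jg: jump if greater (ZF = 0 and SF = OF)
  - (etc.)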

Example (edited for brevity), compiled with $ gcc -m32 -S -masm=intel test.c

    if (a < b) {
        // Do something 1
    }

Compiles to:

    mov     eax, DWORD PTR [esp+24]      ; a
    cmp     eax, DWORD PTR [esp+28]      ; b
    jge     .L2                          ; jump if a is >= b
    ; Do something 1
.L2:

    if (a <= b) {
        // Do something 2
    }

Compiles to:

    mov     eax, DWORD PTR [esp+24]      ; a
    cmp     eax, DWORD PTR [esp+28]      ; b
    jg      .L5                          ; jump if a is > b
    ; Do something 2
.L5:

So the only difference between the two is a jg versus a jge instruction. The two take the same amount of time.

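I'd like to address the comment that nothing here shows that the different jump instructions take the same amount of time. That is a little tricky to answer, but here is what I can offer: in the Intel Instruction Set Reference, they are all grouped under one common instruction, Jcc (jump if condition is met). The same grouping is used in Appendix C of the Optimization Reference Manual, which covers latency and throughput: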

Latency — The number of clock cycles that are required for the
execution core to complete the execution of all of the μops that form
an instruction.

Throughput — The number of clock cycles required to
wait before the issue ports are free to accept the same instruction
again. For many instructions, the throughput of an instruction can be
significantly less than its latency

The values for Jcc are:

      Latency   Throughput
Jcc     N/A        0.5

with the following footnote on Jcc:

7) Selection of conditional jump instructions should be based on the recommendation of section Section 3.4.1, “Branch Prediction Optimization,” to improve the predictability of branches. When branches are predicted successfully, the latency of jcc is effectively zero.

So nothing in the Intel documentation treats one Jcc instruction any differently from the others.

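If one thinks about the actual circuitry used to implement these instructions, one can assume there are simple AND/OR gates on the different bits of EFLAGS to determine whether a condition is met. There is then no reason why an instruction that tests two bits should take more or less time than one that tests only one bit (ignoring gate propagation delay, which is much shorter than the clock period).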

Edit: Floating point

The same holds for x87 floating point (essentially the same code as above, but with double instead of int):

        fld     QWORD PTR [esp+32]
        fld     QWORD PTR [esp+40]
        fucomip st, st(1)              ; Compare ST(0) and ST(1), and set CF, PF, ZF in EFLAGS
        fstp    st(0)
        seta    al                     ; Set al if above (CF=0 and ZF=0).
        test    al, al
        je      .L2
        ; Do something 1
.L2:

        fld     QWORD PTR [esp+32]
        fld     QWORD PTR [esp+40]
        fucomip st, st(1)              ; (same thing as above)
        fstp    st(0)
        setae   al                     ; Set al if above or equal (CF=0).
        test    al, al
        je      .L5
        ; Do something 2
.L5:
        leave
        ret
