c++ - Do most compilers transform % 2 into bit comparison? Is it really faster? -


in programming, 1 needs check if number odd or even. that, use:

n % 2 == 0 

however, understanding '%' operator performs division , returns remainder; therefore, case above, faster check last bit instead. let's n = 5;

5 = 00000101 

in order check if number odd or even, need check last bit. if it's 1, number odd; otherwise, even. in programming, expressed this:

n & 1 == 0 

in understanding faster % 2 no division performed. mere bit comparison needed.

i have 2 questions then:

1) second way faster first (in cases)?

2) if answer 1 yes, compilers (in languages) smart enough convert % 2 simple bit comparison? or have explicitly use second way if want best performance?

yes, bit-test much faster integer division, by factor of 10 20, or 100 128bit / 64bit = 64bit idiv on intel. esp. since x86 @ least has test instruction sets condition flags based on result of bitwise and, don't have divide , then compare; bitwise and is compare.

i decided check compiler output on godbolt, , got surprise:

it turns out using n % 2 signed integer value (e.g. return n % 2 function return signed int) instead of testing non-zero (if (n % 2)) produces slower code return n & 1. because (-1 % 2) == -1, while (-1 & 1) == 1, compiler can't use bitwise and. compilers still avoid integer division, though, , use clever shift / , / add / sub sequence instead, because that's still cheaper integer division. (gcc , clang use different sequences.)

so if want return truth value based on n % 2, best bet unsigned type. lets compiler optimize single , instruction. (on godbolt, can flip other architectures, arm , powerpc, , see unsigned even (%) function , int even_bit (bitwise &) function have same asm code.)

using bool (which must 0 or 1, not non-zero value) option, compiler have work return (bool) (n % 4) (or test other n%2). bitwise-and version of 0, 1, 2, or 3, compiler has turn non-zero value 1. (x86 has efficient setcc instruction sets register 0 or 1, depending on flags, it's still 2 instructions instead of 1. clang/gcc use this, see aligned4_bool in godbolt asm output.)

with optimization level higher -o0, gcc , clang optimize if (n%2) expect. other huge surprise icc 13 doesn't. don't understand wtf icc thinks it's doing branches.


Comments

Popular posts from this blog

dns - How To Use Custom Nameserver On Free Cloudflare? -

python - Pygame screen.blit not working -

c# - Web API response xml language -