Quote:
@blu:
I know VFP is not good but
did you check the assembly Code that gcc generated?
yes, i did. but actually what is more interesting is the demonstrated throughput in flops per clock (the floating-point counterpart of IPC, instructions-per-clock) in both cases:
603e: 0.378 flops/clock
VFPv3-lite: 0.122 flops/clock
both are quite close to their theoretical limits of 2 flops/5 clocks, and 1 flop/7 clocks, respectively. so both compilers did a good job there, for scalar code, anyway.
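to put a number on "quite close", here is a minimal sketch that divides each measured rate by its theoretical limit. the figures are the ones quoted above; the `efficiency` helper and the dictionary names are made up for illustration and are not part of the original benchmark.

```python
# Measured throughput from the post, in flops per clock.
measured = {"603e": 0.378, "VFPv3-lite": 0.122}

# Theoretical limits quoted in the post:
# 2 flops per 5 clocks (603e) and 1 flop per 7 clocks (VFPv3-lite).
peak = {"603e": 2 / 5, "VFPv3-lite": 1 / 7}


def efficiency(measured_rate, peak_rate):
    """Return the fraction of the theoretical peak that was achieved."""
    return measured_rate / peak_rate


for cpu in measured:
    share = efficiency(measured[cpu], peak[cpu])
    print(f"{cpu}: {share:.1%} of theoretical peak")
```

that works out to about 94.5% of peak on the 603e and about 85.4% on VFPv3-lite.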
Quote:
Performance of GCC may be better for PowerPC than for ARM... (only a guess - I never used PowerPC myself)
actually lately there's been the opposite trend: gcc has been doing worse on ppc with later compiler releases. for instance, gcc 4.2.x produces better code for that little matrix x matrix testcase than 4.3.x does (i've not tried 4.4.x or later there).