Matt just gave me a URL for a very simple floating-point benchmark:
http://svn.arhuaco.org/svn/src/emqbit/t ... bit-bench/
So I downloaded it on both the softfp and the hardfp Efikas, then built and ran it. Here are the results of its two binaries, starting with bench:
softfp:
Code:
$ ./bench
nTimes=93750 16: Dot with C code => (flops 90.691986 : time:0.033079 us)
nTimes=61225 16: Distance with C code => (flops 64.782761 : time:0.046309 us)
nTimes=46875 32: Dot with C code => (flops 96.661934 : time:0.031036 us)
nTimes=30928 32: Distance with C code => (flops 72.299995 : time:0.041494 us)
nTimes=23438 64: Dot with C code => (flops 91.778763 : time:0.032688 us)
nTimes=15545 64: Distance with C code => (flops 75.102257 : time:0.039948 us)
nTimes=11719 128: Dot with C code => (flops 101.638512 : time:0.029517 us)
nTimes=7793 128: Distance with C code => (flops 71.021545 : time:0.042245 us)
nTimes=5860 256: Dot with C code => (flops 102.676842 : time:0.029221 us)
nTimes=3902 256: Distance with C code => (flops 76.789795 : time:0.039076 us)
nTimes=2930 512: Dot with C code => (flops 94.328918 : time:0.031807 us)
nTimes=1952 512: Distance with C code => (flops 76.468056 : time:0.039235 us)
nTimes=1465 1024: Dot with C code => (flops 103.355949 : time:0.029029 us)
nTimes=977 1024: Distance with C code => (flops 72.532097 : time:0.041393 us)
nTimes=733 2048: Dot with C code => (flops 102.796173 : time:0.029207 us)
nTimes=489 2048: Distance with C code => (flops 72.398621 : time:0.041505 us)
nTimes=367 4096: Dot with C code => (flops 103.649727 : time:0.029006 us)
nTimes=245 4096: Distance with C code => (flops 76.943657 : time:0.03913 us)
nTimes=184 8192: Dot with C code => (flops 95.036598 : time:0.031721 us)
nTimes=123 8192: Distance with C code => (flops 77.351425 : time:0.039081 us)
nTimes=92 16384: Dot with C code => (flops 103.635597 : time:0.029089 us)
nTimes=62 16384: Distance with C code => (flops 72.119598 : time:0.042256 us)
nTimes=46 32768: Dot with C code => (flops 88.333809 : time:0.034128 us)
nTimes=31 32768: Distance with C code => (flops 77.195709 : time:0.039477 us)
nTimes=23 65536: Dot with C code => (flops 94.034622 : time:0.032059 us)
nTimes=16 65536: Distance with C code => (flops 70.625801 : time:0.044541 us)
16, 90.691986, 64.782761,
32, 96.661934, 72.299995,
64, 91.778763, 75.102257,
128, 101.638512, 71.021545,
256, 102.676842, 76.789795,
512, 94.328918, 76.468056,
1024, 103.355949, 72.532097,
2048, 102.796173, 72.398621,
4096, 103.649727, 76.943657,
8192, 95.036598, 77.351425,
16384, 103.635597, 72.119598,
32768, 88.333809, 77.195709,
65536, 94.034622, 70.625801
hardfp:
Code:
$ ./bench
nTimes=93750 16: Dot with C code => (flops 124.048958 : time:0.024184 us)
nTimes=61225 16: Distance with C code => (flops 79.258804 : time:0.037851 us)
nTimes=46875 32: Dot with C code => (flops 121.045830 : time:0.024784 us)
nTimes=30928 32: Distance with C code => (flops 80.882591 : time:0.037091 us)
nTimes=23438 64: Dot with C code => (flops 116.150993 : time:0.025829 us)
nTimes=15545 64: Distance with C code => (flops 79.433014 : time:0.03777 us)
nTimes=11719 128: Dot with C code => (flops 112.479904 : time:0.026672 us)
nTimes=7793 128: Distance with C code => (flops 84.996887 : time:0.035299 us)
nTimes=5860 256: Dot with C code => (flops 111.366318 : time:0.026941 us)
nTimes=3902 256: Distance with C code => (flops 84.570282 : time:0.035481 us)
nTimes=2930 512: Dot with C code => (flops 111.581688 : time:0.026889 us)
nTimes=1952 512: Distance with C code => (flops 84.589600 : time:0.035468 us)
nTimes=1465 1024: Dot with C code => (flops 112.115395 : time:0.026761 us)
nTimes=977 1024: Distance with C code => (flops 84.379898 : time:0.035581 us)
nTimes=733 2048: Dot with C code => (flops 111.256500 : time:0.026986 us)
nTimes=489 2048: Distance with C code => (flops 85.648865 : time:0.035084 us)
nTimes=367 4096: Dot with C code => (flops 109.561012 : time:0.027441 us)
nTimes=245 4096: Distance with C code => (flops 84.969376 : time:0.035434 us)
nTimes=184 8192: Dot with C code => (flops 111.856926 : time:0.026951 us)
nTimes=123 8192: Distance with C code => (flops 84.636757 : time:0.035717 us)
nTimes=92 16384: Dot with C code => (flops 110.257339 : time:0.027342 us)
nTimes=62 16384: Distance with C code => (flops 84.866913 : time:0.035909 us)
nTimes=46 32768: Dot with C code => (flops 109.671715 : time:0.027488 us)
nTimes=31 32768: Distance with C code => (flops 85.552208 : time:0.035621 us)
nTimes=23 65536: Dot with C code => (flops 108.386276 : time:0.027814 us)
nTimes=16 65536: Distance with C code => (flops 82.243820 : time:0.038249 us)
16, 124.048958, 79.258804,
32, 121.045830, 80.882591,
64, 116.150993, 79.433014,
128, 112.479904, 84.996887,
256, 111.366318, 84.570282,
512, 111.581688, 84.589600,
1024, 112.115395, 84.379898,
2048, 111.256500, 85.648865,
4096, 109.561012, 84.969376,
8192, 111.856926, 84.636757,
16384, 110.257339, 84.866913,
32768, 109.671715, 85.552208,
65536, 108.386276, 82.243820
And the cfft binary:
softfp:
Code:
$ ./cfft
nTimes=6250 N=16: (flops 43.850990 : time:0.045609 us)
nTimes=2500 N=32: (flops 43.096947 : time:0.046407 us)
nTimes=1042 N=64: (flops 45.744595 : time:0.043735 us)
nTimes=447 N=128: (flops 42.186687 : time:0.047469 us)
nTimes=196 N=256: (flops 44.160267 : time:0.045449 us)
nTimes=87 N=512: (flops 42.001507 : time:0.047724 us)
nTimes=40 N=1024: (flops 43.827175 : time:0.046729 us)
nTimes=18 N=2048: (flops 41.382183 : time:0.048995 us)
nTimes=9 N=4096: (flops 42.877579 : time:0.051585 us)
nTimes=4 N=8192: (flops 40.669060 : time:0.052372 us)
nTimes=2 N=16384: (flops 41.244293 : time:0.055614 us)
nTimes=1 N=32768: (flops 39.966824 : time:0.061491 us)
nTimes=1 N=65536: (flops 36.040470 : time:0.145472 us)
16, 43.850990
32, 43.096947
64, 45.744595
128, 42.186687
256, 44.160267
512, 42.001507
1024, 43.827175
2048, 41.382183
4096, 42.877579
8192, 40.669060
16384, 41.244293
32768, 39.966824
65536, 36.040470
hardfp:
Code:
$ ./cfft
nTimes=6250 N=16: (flops 57.763405 : time:0.034624 us)
nTimes=2500 N=32: (flops 58.339657 : time:0.034282 us)
nTimes=1042 N=64: (flops 57.334785 : time:0.034894 us)
nTimes=447 N=128: (flops 56.781216 : time:0.035268 us)
nTimes=196 N=256: (flops 56.472710 : time:0.03554 us)
nTimes=87 N=512: (flops 54.912746 : time:0.036503 us)
nTimes=40 N=1024: (flops 55.002014 : time:0.037235 us)
nTimes=18 N=2048: (flops 54.731277 : time:0.037045 us)
nTimes=9 N=4096: (flops 54.089794 : time:0.040892 us)
nTimes=4 N=8192: (flops 53.270641 : time:0.039983 us)
nTimes=2 N=16384: (flops 40.333393 : time:0.05687 us)
nTimes=1 N=32768: (flops 50.530472 : time:0.048636 us)
nTimes=1 N=65536: (flops 48.851868 : time:0.107322 us)
16, 57.763405
32, 58.339657
64, 57.334785
128, 56.781216
256, 56.472710
512, 54.912746
1024, 55.002014
2048, 54.731277
4096, 54.089794
8192, 53.270641
16384, 40.333393
32768, 50.530472
65536, 48.851868
That's a speed gain of close to 30% on cfft, and roughly 15% on the dot and distance tests, just from a simple recompile! I knew the system actually *felt* faster, but now the numbers prove it once again. I'm still working on setting up the compile farm (lots of trouble with wanna-build), but I think I've found a solution. Stay tuned!
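For anyone who wants to check the gain per test, here is a quick sketch in Python that averages the flops columns from the summary lines above and compares hardfp against softfp. The dictionary layout and the helper name `mean_gain_percent` are my own choices for illustration and are not part of the benchmark. Averaging over all sizes is also just one reasonable way to boil the runs down to a single figure.

```python
from statistics import mean

# flops columns copied from the summary lines of the runs above,
# in order of size (16 .. 65536).
SOFTFP = {
    "dot": [90.691986, 96.661934, 91.778763, 101.638512, 102.676842,
            94.328918, 103.355949, 102.796173, 103.649727, 95.036598,
            103.635597, 88.333809, 94.034622],
    "distance": [64.782761, 72.299995, 75.102257, 71.021545, 76.789795,
                 76.468056, 72.532097, 72.398621, 76.943657, 77.351425,
                 72.119598, 77.195709, 70.625801],
    "cfft": [43.850990, 43.096947, 45.744595, 42.186687, 44.160267,
             42.001507, 43.827175, 41.382183, 42.877579, 40.669060,
             41.244293, 39.966824, 36.040470],
}
HARDFP = {
    "dot": [124.048958, 121.045830, 116.150993, 112.479904, 111.366318,
            111.581688, 112.115395, 111.256500, 109.561012, 111.856926,
            110.257339, 109.671715, 108.386276],
    "distance": [79.258804, 80.882591, 79.433014, 84.996887, 84.570282,
                 84.589600, 84.379898, 85.648865, 84.969376, 84.636757,
                 84.866913, 85.552208, 82.243820],
    "cfft": [57.763405, 58.339657, 57.334785, 56.781216, 56.472710,
             54.912746, 55.002014, 54.731277, 54.089794, 53.270641,
             40.333393, 50.530472, 48.851868],
}


def mean_gain_percent(soft, hard):
    """Percent by which the mean hardfp figure exceeds the mean softfp one."""
    return (mean(hard) / mean(soft) - 1.0) * 100.0


for name in SOFTFP:
    gain = mean_gain_percent(SOFTFP[name], HARDFP[name])
    print(f"{name:8s} {gain:5.1f}% faster with hardfp")
```

With the numbers above this comes out at roughly 16% for dot, 14% for distance and 28% for cfft. The hardfp cfft run at N=16384 is an obvious outlier (40.3 against about 53 for its neighbours), so the cfft mean is probably a little pessimistic.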