Hmm, I'm getting somewhat different results.
Memorybench - copying a block of 80 MB from a -> b
glibc: 366.8876 MB/sec
Freevec: 381.2597 MB/sec
Motovec: 637.9746 MB/sec
FC64: 611.4247 MB/sec
Cachebench - copying a block of 8 KB from a -> b
glibc: 2637.2605 MB/sec
Freevec: 5510.4557 MB/sec
Motovec: 7355.0409 MB/sec
FC64: 4557.8185 MB/sec
*FC64 is a really simple copy loop using the floating-point registers.
Its loop is unrolled to copy 64 bytes (2 cache lines) per iteration.
The copy gets its speed by using dcbt to prefetch the next two cache lines while copying the current ones.
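
To make the idea concrete, here is a rough C sketch of a loop in the style described above. It is not the actual FC64 source: the function name fc64_copy_sketch is made up, it assumes 32-byte cache lines (as implied by 64 bytes = 2 lines), and it uses the GCC/Clang builtin __builtin_prefetch as a stand-in for a hand-written dcbt.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch, not the original FC64 code.
 * Copies `count` doubles from src to dst, 8 doubles (64 bytes) per
 * iteration, and hints the next 64 bytes of the source into the cache.
 * Assumes 32-byte cache lines and non-overlapping buffers. */
void fc64_copy_sketch(double *dst, const double *src, size_t count)
{
    assert(dst != NULL && src != NULL);

    size_t i = 0;
    for (; i + 8 <= count; i += 8) {
        /* Prefetch the next two cache lines, but only while they are
         * still inside the source buffer. */
        if (i + 16 <= count) {
            __builtin_prefetch(&src[i + 8], 0, 0);   /* next line      */
            __builtin_prefetch(&src[i + 12], 0, 0);  /* line after it  */
        }

        /* Load 64 bytes into eight temporaries first, then store them,
         * so the compiler is free to keep them in float registers. */
        double r0 = src[i + 0], r1 = src[i + 1];
        double r2 = src[i + 2], r3 = src[i + 3];
        double r4 = src[i + 4], r5 = src[i + 5];
        double r6 = src[i + 6], r7 = src[i + 7];

        dst[i + 0] = r0; dst[i + 1] = r1;
        dst[i + 2] = r2; dst[i + 3] = r3;
        dst[i + 4] = r4; dst[i + 5] = r5;
        dst[i + 6] = r6; dst[i + 7] = r7;
    }

    /* Tail: copy whatever is left when count is not a multiple of 8. */
    for (; i < count; i++) {
        dst[i] = src[i];
    }
}
```

Whether the compiler really emits dcbt and lfd/stfd for this depends on the target and the flags, so check the generated assembly before comparing numbers.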
I think improving memory throughput from about 360 to 630 MB/sec
will have a huge impact on many applications.
Freevec does not improve the memory throughput as much as libmotovec does.
But maybe I'm doing something wrong with freevec here.
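
In case someone wants to cross-check numbers like these, below is a minimal sketch of how a MB/sec figure can be measured. It is not the benchmark program used for the results above. The names throughput_mb_per_sec and bench_memcpy are made up, 1 MB is assumed to be 1024 * 1024 bytes, and clock() (CPU time) is used only because it is standard C.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: bytes copied and elapsed seconds to MB/sec.
 * Assumes 1 MB = 1024 * 1024 bytes. Returns 0 if no time was measured. */
double throughput_mb_per_sec(size_t bytes_copied, double seconds)
{
    if (seconds <= 0.0) {
        return 0.0;
    }
    return ((double)bytes_copied / (1024.0 * 1024.0)) / seconds;
}

/* Hypothetical harness: copies a block of block_bytes from a -> b
 * `reps` times with memcpy and returns the measured MB/sec.
 * Returns 0 on allocation failure or for an empty block. */
double bench_memcpy(size_t block_bytes, int reps)
{
    if (block_bytes == 0 || reps <= 0) {
        return 0.0;
    }

    unsigned char *a = malloc(block_bytes);
    unsigned char *b = malloc(block_bytes);
    if (a == NULL || b == NULL) {
        free(a);
        free(b);
        return 0.0;
    }

    /* Touch both buffers so page faults are not part of the timing. */
    memset(a, 0xA5, block_bytes);
    memset(b, 0, block_bytes);

    clock_t start = clock();
    for (int r = 0; r < reps; r++) {
        memcpy(b, a, block_bytes);
    }
    clock_t end = clock();

    /* Read the destination so the copies are not trivially dead code. */
    volatile unsigned char sink = b[block_bytes - 1];
    (void)sink;

    double seconds = (double)(end - start) / (double)CLOCKS_PER_SEC;
    double result = throughput_mb_per_sec(block_bytes * (size_t)reps, seconds);

    free(a);
    free(b);
    return result;
}
```

Something like bench_memcpy(80 * 1024 * 1024, 10) would correspond to the memory case and bench_memcpy(8 * 1024, 100000) to the cache case. To compare another copy routine, swap it in for the memcpy call.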
For some more PowerPC memory benchmarks see here:
glibc benchmarks
Cheers
Gunnar