(pasting this from an email due to high demand)
So, sorry for the delay, but I was at a Debian-Edu/Skolelinux meeting
in Oslo, Norway, and I just came back to Greece.
So, to fill you all in on what I'm doing right now wrt the AltiVec
vectorization of Debian/Linux in general:
* I have right now a pretty generic way for applications to use a
single binary for both vectorized and scalar code. This means the
same program will autodetect AltiVec if present and use the optimized
code. This is for applications/executables only. It is actually a
known technique, but I tried to take it further by using function
pointers to overload the functions I want to use...
* For the same thing to happen in a library (e.g. glibc), it's much
more complicated to do in a proper and consistent way, but I'm
looking into it, and I'm in constant communication with people more
knowledgeable than I am in these things.
* On the matter of optimizations, I've been doing optimized versions
of common routines that are used in many common programs (I've
started with the coreutils package in Debian), and I have also begun
working on the vectorization of encryption routines (those that can
be parallelized, at least). I've made an AltiVec version of the md5sum
algorithm, but due to the nature of this particular algorithm (it's
not parallelizable at all), there is no speed advantage. But there
are other routines (in the libmcrypt library) which are parallelizable,
though the process of vectorizing these algorithms is not an easy
thing to do, and I have to ensure that
* Once I've found out how to do AltiVec auto-detection and optimization
in libraries, it will be a simple thing to include the available
optimizations in glibc and other libraries.
* After basic optimizations are in place, we can work more on
optimizing applications like MySQL, PostgreSQL, Apache, etc.
* I can definitely say from my benchmarks so far that the speed
increase will range from 400% (in cases where vectorization is
difficult) to 3200%!! The average is about 1000% (10x). This will be
directly visible to the user in applications that require a lot of
processing power, and I'm pretty sure that this is what everyone
wants to see eventually.
One thing I can say for sure is that when this thing hits the street,
it will take the market by storm and will totally change the way
people view PowerPC as a potential Linux platform. Esp. for servers,
given the fact that a simple G4 board will probably outperform, or at
the very least actively compete with, a dual Xeon in MySQL/Apache
performance!
* I'm thinking of proposing a website (or a section in some existing
website) where people can vote on the applications/routines/libraries
they want vectorized, and the AltiVec developers will work on those
optimizations.
* [upd] Once I've tidied up the code/patches, I'll commit it to the pegasos CVS repository on alioth.
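
To make the first point above more concrete, here is a minimal sketch of
single-binary dispatch through a function pointer. It is not the actual
code or patches mentioned in this mail. Everything in it is an assumption
made for illustration: the names add_floats, add_floats_scalar,
add_floats_altivec, cpu_has_altivec and init_dispatch are hypothetical, and
the detection uses getauxval(AT_HWCAP) from a modern glibc on Linux/PowerPC,
which is only one of several possible ways to detect AltiVec. On any other
platform the sketch simply keeps the scalar routine.

```c
#include <stddef.h>

#if defined(__linux__) && defined(__powerpc__)
#include <sys/auxv.h>      /* getauxval(), AT_HWCAP (assumes glibc >= 2.16) */
#include <asm/cputable.h>  /* PPC_FEATURE_HAS_ALTIVEC */
#endif

#ifdef __ALTIVEC__
#include <altivec.h>       /* vec_ld, vec_add, vec_st */
#endif

/* Scalar version: always compiled, works on every CPU. */
void add_floats_scalar(float *dst, const float *a, const float *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = a[i] + b[i];
}

#ifdef __ALTIVEC__
/* AltiVec version (hypothetical): handles 4 floats per step.
 * Assumes dst, a and b are all 16-byte aligned, as vec_ld/vec_st require. */
void add_floats_altivec(float *dst, const float *a, const float *b, size_t n)
{
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        vector float va = vec_ld(0, a + i);
        vector float vb = vec_ld(0, b + i);
        vec_st(vec_add(va, vb), 0, dst + i);
    }
    for (; i < n; i++)          /* scalar tail for the leftover elements */
        dst[i] = a[i] + b[i];
}
#endif

/* Returns nonzero if the running CPU reports AltiVec support.
 * Assumption: Linux on PowerPC; everywhere else we report "no". */
static int cpu_has_altivec(void)
{
#if defined(__linux__) && defined(__powerpc__)
    return (getauxval(AT_HWCAP) & PPC_FEATURE_HAS_ALTIVEC) != 0;
#else
    return 0;
#endif
}

/* Callers always go through this pointer; it starts out scalar. */
typedef void (*add_floats_fn)(float *, const float *, const float *, size_t);
add_floats_fn add_floats = add_floats_scalar;

/* Call once at program start to pick the best implementation. */
void init_dispatch(void)
{
#ifdef __ALTIVEC__
    if (cpu_has_altivec())
        add_floats = add_floats_altivec;
#endif
}
```

A program would call init_dispatch() once at startup and then call
add_floats(dst, a, b, n) everywhere, without caring which version runs.
In a real build the AltiVec routine would probably live in its own source
file compiled with -maltivec, so that the compiler cannot emit vector
instructions into code that must also run on CPUs without AltiVec.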
Of course, the only problem with all this is that I have to work on it
in my spare time, and the past 3 months have been quite busy. This
was the only reason I have not had time to post to penguinppc.org.
The fact that I'm also working on a Debian-edu project here in Greece
is another factor that takes up quite a lot of my time.
But I'm confident I can have some results at least on some core
packages of Debian in the next couple of months.
Regards
Konstantinos