Hi Markos,
Sorry, my fault. I think I need to explain better what I need and why.
Often people use a database and want to find "similar" results.
Usually they have a string column and are looking for records that are equal, or nearly equal, to the search string.
A typical DB query of this kind works on more than 1000 rows of strings, typically 32 characters long.
The query should return all rows that are equal or nearly equal to the search pattern.
Normally I would compile levenshtein as a function into the MySQL server
and use it to find strings that differ in fewer than 2 characters.
But levenshtein is quite CPU intensive.
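To make the cost concrete, here is a minimal sketch of the textbook dynamic-programming version in Python. This is only an illustration, not the function I would compile into MySQL, and the name is just for the example. For two 32-char strings the inner loop runs 32 x 32 = 1024 times per row, so roughly a million steps for 1000 rows.

```python
def levenshtein(a: str, b: str) -> int:
    """Textbook edit distance: insertions, deletions, substitutions."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            cur.append(min(prev[j] + 1,          # delete ca
                           cur[j - 1] + 1,       # insert cb
                           prev[j - 1] + cost))  # substitute or match
        prev = cur
    return prev[-1]
```

Each cell depends on its left, upper and upper-left neighbours, which is why this form does not map onto vector instructions in an obvious way.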
Such a SQL query would look like:
SELECT * FROM customer WHERE key=somevalue AND levenshtein("Gunnar von Boehn",contact_name) <= 1;
This query would look for rows that contain a string similar to "Gunnar von Boehn" in the contact_name column.
The <= 1 means that at most one letter may be missing, extra, or different in the string.
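To show which strings a threshold of 1 accepts, here is a small Python sketch. The helper name within_one_edit is made up for this mail, and it only handles the special case of distance 0 or 1, so it is not a replacement for the general function.

```python
def within_one_edit(a: str, b: str) -> bool:
    """True if a and b differ by at most one single-character edit."""
    if abs(len(a) - len(b)) > 1:
        return False
    # Make a the shorter (or equal-length) string.
    if len(a) > len(b):
        a, b = b, a
    i = 0
    # Skip the common prefix.
    while i < len(a) and a[i] == b[i]:
        i += 1
    if len(a) == len(b):
        # Same length: allow one substituted character at position i.
        return a[i + 1:] == b[i + 1:]
    # b is one longer: allow one extra character in b at position i.
    return a[i:] == b[i + 1:]
```

So "Gunnar von Boen" (one letter missing) and "Gunnar van Boehn" (one letter different) would match, while "Gunnar von Bohen" (two letters swapped) would not, because a swap counts as two edits.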
My question is basically whether there is a known algorithm for identifying
similar strings that could be vectorized to get more speed.
Cheers
Gunnar