stuff of bertnase:
|back to bertnase|
udf_levtok is a mysql udf (user defined function) module.
it provides a LEVTOK() function to get the minimum levenshtein distance of multiple strings seperated by whitespaces (like words in a sentence).
the levenshtein distance is a measurement of the differences in two stings (see external www.nist.gov reference).
this allows some fuzzy lookup of a word in a sentence.
select levtok("pamela has great talents", "pamela", 0) -> 0
select levtok("pamela has great talents", "panela", 0) -> 1
select levtok("pamela has great talents", "blond", 0) -> 5
here the udf_levtok.cc source with some comments about, how to compile and how to use.
and here udf_flevtok.cc, a variation giving a similarity value between 0.0 (no similarity found) and 1.0 (exact match found).