stuff of bertnase: | back to bertnase | |
udf_levtok is a mysql udf (user defined function) module.it provides a LEVTOK() function to get the minimum levenshtein distance of multiple strings seperated by whitespaces (like words in a sentence).
the levenshtein distance is a measurement of the differences in two stings (see external www.nist.gov reference).
this allows some fuzzy lookup of a word in a sentence.
example:
select levtok("pamela has great talents", "pamela", 0) -> 0
select levtok("pamela has great talents", "panela", 0) -> 1
select levtok("pamela has great talents", "blond", 0) -> 5here the udf_levtok.cc source with some comments about, how to compile and how to use.
and here udf_flevtok.cc, a variation giving a similarity value between 0.0 (no similarity found) and 1.0 (exact match found).