Information

The lib_mysqludf_stem library provides stemming capability for a variety of languages.

About

MySQL UDFs offer a powerful way to extend the functionality of your MySQL database.

The UDFs on this site are all free: free to use, free to distribute, free to modify and free of charge, even for commercial projects.

lib_mysqludf_stem

The lib_mysqludf_stem library provides stemming capability for a variety of languages using Dr. M.F. Porter's Snowball API.

Stemming is the process of reducing inflected (or sometimes derived) words to their stem, base or root form, generally a written word form. The stem need not be identical to the morphological root of the word; it is usually sufficient that related words map to the same stem, even if this stem is not in itself a valid root.
A stemmer for English, for example, should identify the string "cats" (and possibly "catlike", "catty" etc.) as based on the root "cat", and "stemmer", "stemming", "stemmed" as based on "stem". A stemming algorithm reduces the words "fishing", "fished", "fish", and "fisher" to the root word, "fish".
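
The idea of mapping related words to one stem can be illustrated with a deliberately naive suffix stripper. This is only a sketch: it is not the Snowball algorithm the library uses, and the function name naive_stem and its suffix list are made up for this example.

```shell
# Naive suffix stripping, for illustration only (NOT the Snowball algorithm).
# naive_stem is a hypothetical helper, not part of lib_mysqludf_stem.
naive_stem() {
  word=$1
  # Strip the first matching suffix; leave the word unchanged otherwise.
  case $word in
    *ing) word=${word%ing} ;;
    *ed)  word=${word%ed} ;;
    *er)  word=${word%er} ;;
    *s)   word=${word%s} ;;
  esac
  printf '%s\n' "$word"
}

# Print each sample word next to its naive stem.
for w in fishing fished fisher fish cats; do
  printf '%s -> %s\n' "$w" "$(naive_stem "$w")"
done
```

A real stemmer applies many more rules than this (for example, handling the doubled consonant in "stemming"), which is why the library delegates the work to Snowball.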

Build / Install

Preparation

Log in as root
Download and unpack lib_mysqludf_stem-*.tar.gz
Change to the lib_mysqludf_stem source directory
If you've used apt-get to install MySQL, <MYSQL_PREFIX> is '/usr'.

Build and full install for MySQL 4.x and 5.0

./configure [--prefix=<MYSQL_PREFIX>]
make
[sudo] make install && [sudo] make installdb
  

Build and full install for MySQL 5.1+

./configure [--prefix=<MYSQL_PREFIX>] --libdir=`mysql_config --plugindir`
make
[sudo] make install && [sudo] make installdb
  

API

lib_mysqludf_stem_info ( )
Returns the library version.
stem_word ( string language, string words )
Stems each word in a string.
NOTE: The language argument should be the same constant value for every row in a query.
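
As a rough usage sketch, the shell helper below builds a stem_word() statement that can be piped to the mysql client. The helper name stem_sql, the language identifier 'english', the column alias and the connection options are all assumptions made for this example; check which language names your build accepts.

```shell
# Build a SQL statement that stems a phrase with stem_word().
# Assumptions: stem_sql is a hypothetical helper, and 'english' is
# assumed to be an accepted language name.
stem_sql() {
  # $1 = language (kept constant for the whole query), $2 = text to stem
  # No SQL escaping is done here; pass trusted input only.
  printf "SELECT stem_word('%s', '%s') AS stemmed;\n" "$1" "$2"
}

# Show the generated statement.
stem_sql english 'fishing fished fisher'

# To execute it on a server where the UDFs are installed
# (connection options are placeholders):
#   stem_sql english 'fishing fished fisher' | mysql -u root -p
```

Passing the language as a literal, as here, keeps it constant across all rows, in line with the note above.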