
Abstract

Stemming is the process of reducing a word to its base form by stripping the derivational
affixes from each word. Stemming plays an important role in machine translation and other
areas of computational linguistics. For Malay, a stemming algorithm known as the Othman
algorithm has been developed and tested for use in information retrieval. Because the
morphology of Bahasa Indonesia differs from that of Malay in several respects, the Othman
algorithm cannot be applied directly to Bahasa Indonesia. Furthermore, the accuracy of the
Othman algorithm is not good. This paper proposes several modifications to the Othman
algorithm, covering the stemming procedures, the affix rules, and the root-word dictionary.
Experiments show that the modified method achieves better accuracy when stemming
Bahasa Indonesia words.
Keywords: stemming, word-lemmatization, affix-stripping
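
As a rough illustration of dictionary-checked affix stripping in general, the sketch below removes at most one prefix and one suffix and accepts a candidate only if it appears in a root-word dictionary. It is not the Othman algorithm and not the modified method proposed in the paper: the names ROOT_WORDS, PREFIXES, SUFFIXES and stem are hypothetical, and the word and affix lists are tiny samples chosen for the example.

```python
# Toy root-word dictionary (hypothetical; a real stemmer needs a full lexicon).
ROOT_WORDS = {"makan", "baca", "ajar", "main"}

# Small illustrative subsets of Indonesian affixes, not a complete rule set.
PREFIXES = ["mem", "ber", "di", "me", "pe"]
SUFFIXES = ["kan", "an", "i"]


def stem(word: str) -> str:
    """Return a dictionary root reachable by stripping at most one prefix and one suffix."""
    word = word.lower()
    # A word that is already a root is returned untouched, which avoids over-stemming.
    if word in ROOT_WORDS:
        return word

    # Try the word as is (empty suffix), then with each matching suffix removed.
    for suffix in [""] + SUFFIXES:
        if suffix and not word.endswith(suffix):
            continue
        base = word[: len(word) - len(suffix)]
        if base in ROOT_WORDS:
            return base
        # Then try removing one prefix from the same candidate.
        for prefix in PREFIXES:
            if base.startswith(prefix) and base[len(prefix):] in ROOT_WORDS:
                return base[len(prefix):]

    # No dictionary root found: leave the word unchanged.
    return word


print(stem("makanan"))    # suffix -an removed
print(stem("diajarkan"))  # prefix di- and suffix -kan removed
```

The dictionary check is what keeps a root such as "makan" from being cut further even though it ends in "-an". Words whose prefixes change shape, such as "pelajaran" (root "ajar"), are left unchanged by this sketch; handling them requires additional affix rules.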
