nlp - Morphology software for English
In my application I need to use a piece of software that is able to: a) convert words to their basic forms, and b) find out whether they are 'nouns', 'verbs', etc.
I found a list of software that is able to do the job.
http://aclweb.org/aclwiki/index.php?title=morphology_software_for_english
Does anyone have experience with any of these? Which one would you recommend?
You can use NLTK (Python) to perform these tasks.
Find out whether they are 'nouns', 'verbs'...
This task is called part-of-speech tagging. You can use the nltk.pos_tag
function. (See the Penn Treebank tagset.)
Convert words to their basic forms
This task is called lemmatization. You can use the nltk.stem.wordnet.WordNetLemmatizer.lemmatize
function.
Example
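    # Requires the NLTK data for tokenizing, tagging, and WordNet (see nltk.download).
    import nltk
    from nltk.stem.wordnet import WordNetLemmatizer
    from nltk.corpus import wordnet as wn

    # Map a Penn Treebank tag to a WordNet POS constant via its first two
    # characters; anything unmapped falls back to noun.
    penn_to_wn = lambda penn_tag: {'NN': wn.NOUN, 'JJ': wn.ADJ, 'VB': wn.VERB, 'RB': wn.ADV}.get(penn_tag[:2], wn.NOUN)

    sentence = "The rabbits are eating in the garden."
    tokens = nltk.word_tokenize(sentence)   # split the sentence into tokens
    pos_tags = nltk.pos_tag(tokens)         # list of (token, Penn tag) pairs
    wl = WordNetLemmatizer()
    # Lemmatize each token using the POS derived from its tag.
    lemmas = [wl.lemmatize(token, pos=penn_to_wn(tag)) for token, tag in pos_tags]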
Then, if you print the results:
    >>> tokens
    ['The', 'rabbits', 'are', 'eating', 'in', 'the', 'garden', '.']
    >>> pos_tags
    [('The', 'DT'), ('rabbits', 'NNS'), ('are', 'VBP'), ('eating', 'VBG'), ('in', 'IN'), ('the', 'DT'), ('garden', 'NN'), ('.', '.')]
    >>> lemmas
    ['The', u'rabbit', u'be', u'eat', 'in', 'the', 'garden', '.']
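
To answer the "is it a noun or a verb" part of the question directly from the tagger output, one option is to bucket the Penn Treebank tags by their first two characters, the same trick penn_to_wn uses. The following is only a sketch: the names COARSE_BY_PREFIX and coarse_pos and the category labels are made up for illustration and are not part of NLTK, and the tagged tokens are hard-coded from the example sentence so the snippet runs without NLTK installed.

```python
# Coarse word classes keyed by the first two characters of a Penn Treebank tag.
# The labels are illustrative only; they are not defined by NLTK.
COARSE_BY_PREFIX = {
    "NN": "noun",       # NN, NNS, NNP, NNPS
    "VB": "verb",       # VB, VBD, VBG, VBN, VBP, VBZ
    "JJ": "adjective",  # JJ, JJR, JJS
    "RB": "adverb",     # RB, RBR, RBS
}


def coarse_pos(penn_tag):
    """Return a coarse word class for a Penn tag, or 'other' if unmapped."""
    return COARSE_BY_PREFIX.get(penn_tag[:2], "other")


# Tagged tokens for the example sentence, written with uppercase Penn tags.
pos_tags = [("The", "DT"), ("rabbits", "NNS"), ("are", "VBP"), ("eating", "VBG"),
            ("in", "IN"), ("the", "DT"), ("garden", "NN"), (".", ".")]

nouns = [token for token, tag in pos_tags if coarse_pos(tag) == "noun"]
verbs = [token for token, tag in pos_tags if coarse_pos(tag) == "verb"]

print(nouns)  # ['rabbits', 'garden']
print(verbs)  # ['are', 'eating']
```

Unlike penn_to_wn, which falls back to noun because the lemmatizer needs some POS, this sketch reports unmapped tags such as DT or IN as 'other' so that determiners and prepositions are not counted as nouns.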