mahout-user mailing list archives

From Joyce Babu
Subject Mahout for Keyword Extraction
Date Thu, 03 Feb 2011 13:21:34 GMT

I am new to Java and to machine learning. I am looking for a way to extract keywords
(names of people, organizations, places, etc.) from news stories, sorted by relevance. I found
several web services, such as OpenCalais, that provide a similar service, but they don't detect most
of my terms. I have a list of approved keywords and only need to detect terms from that list.
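
To make the requirement concrete: matching against a fixed keyword list can be sketched in plain Java, without Mahout or any machine learning. The following is only a minimal sketch under assumptions that are not stated in the message: "relevance" is taken to mean how often a keyword occurs in the story, matching is case-insensitive on whole words or phrases, and the class name KeywordMatcher and method rank are hypothetical, not part of any library.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of Mahout): finds approved keywords in a text
// and ranks them by occurrence count, an assumed stand-in for "relevance".
class KeywordMatcher {

    static List<String> rank(String text, List<String> approvedKeywords) {
        // LinkedHashMap keeps the approved-list order for keywords that are found.
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String keyword : approvedKeywords) {
            // Match the keyword literally, ignoring case, and only when it is
            // not embedded in a longer word (so "Apple" does not hit "Pineapple").
            Pattern pattern = Pattern.compile(
                "(?<![\\p{L}\\p{N}])" + Pattern.quote(keyword) + "(?![\\p{L}\\p{N}])",
                Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
            Matcher matcher = pattern.matcher(text);
            int occurrences = 0;
            while (matcher.find()) {
                occurrences++;
            }
            if (occurrences > 0) {
                counts.put(keyword, occurrences);
            }
        }
        List<String> found = new ArrayList<>(counts.keySet());
        // Most frequent first; the sort is stable, so ties keep approved-list order.
        found.sort((a, b) -> counts.get(b) - counts.get(a));
        return found;
    }
}
```

Calling KeywordMatcher.rank(storyText, approvedList) returns only the approved keywords that appear in the story, most frequent first. A sketch like this scans the text once per keyword, so it suits a modest list; it does not handle spelling variants or ambiguous names, which is where an NLP or classification approach would come in.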

I came across machine learning and got interested in the concept. I read somewhere that
Mahout's classification feature can be used to detect keywords by classifying terms
as keywords or non-keywords. I have spent the past 30 hours trying to learn Mahout but
haven't gotten anywhere. I don't want to spend more time learning it if Mahout is not
the right tool for my problem.

Can someone provide details on using Mahout for term extraction? Is it possible to do this
with little to moderate knowledge of Java? Is it overkill to use Mahout for this? Should
I go for an NLP solution instead?

