lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From app...@dsl.pipex.com
Subject Use of Lucene to store data from RSS feeds
Date Thu, 14 Oct 2010 14:17:43 GMT
Hello

I would like to store data retrieved hourly from RSS feeds in a database or in Lucene so that
the text can be easily
indexed for word frequencies.

I need to get the text from the title and description elements of RSS items.

Ideally, for each hourly retrieval from a given feed, I would add a row to a table in a dataset
made up of the
following columns:

feed_url, title_element_text, description_element_text, polling_date_time

>From this, I can look up any element in a feed and calculate keyword frequencies based
upon the length of time required.

This can be done as a database table and hashmaps used to calculate word frequencies. But
can I do this in Lucene to
this degree of granularity at all? If so, would each feed form a Lucene document or would
each 'row' from the
database table form one?

Can anyone advise?

Thanks

Martin O'Shea.
-- 



---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message