lucene-dev mailing list archives

From Chris Hostetter <hossman_luc...@fucit.org>
Subject Re: lucene - website example question
Date Fri, 02 May 2008 16:06:29 GMT

Hi Bruce, welcome to Lucene.

First off: your questions would be better suited for one of the *-user@lucene
lists ... not java-dev ( http://people.apache.org/~hossman/#java-dev ) 

The question becomes, which list would be a good place for you to start? 
...

: -i'm trying to gather data/words/terms from a number of
:  different test web sites to build a database of terms
:  for a test app
: -i'd like to put the resulting data into some sort of
:  database. is lucene sufficient for doing this, or will
:  i need some sort of additional toolset?

One thing I'm not clear on is whether you are looking for something to
handle the crawling of these websites and extracting the terms from various
file types (in which case you should start with nutch-user@lucene) or if
you want to do that yourself and have very specific control in your own
code over where the data comes from before it gets indexed (in which case
you should email java-user@lucene).

Once you have a lucene index built (either by nutch or by yourself 
using the Java APIs) you can write code to use that index in a variety of
ways - including extracting data about the set of known terms and 
frequencies.
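
To give a rough idea of what "known terms and frequencies" means, here is
a minimal plain-Java sketch. It deliberately does not use the Lucene API:
the class name TermStats, the method documentFrequencies, and the naive
lowercase-and-split tokenizer are all assumptions made for illustration.
It only shows the shape of the data (term -> number of documents
containing it) that an index lets you read back.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical illustration only; not part of Lucene or Nutch.
public class TermStats {

    // For each term, count how many documents contain it (document frequency).
    static SortedMap<String, Integer> documentFrequencies(List<String> docs) {
        SortedMap<String, Integer> df = new TreeMap<>();
        for (String doc : docs) {
            // Track terms already seen in this document so each counts once per document.
            Set<String> seen = new HashSet<>();
            // Naive tokenizer (an assumption): lowercase, split on non-alphanumerics.
            for (String token : doc.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
                if (!token.isEmpty() && seen.add(token)) {
                    df.merge(token, 1, Integer::sum);
                }
            }
        }
        return df;
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList(
                "Lucene builds an index",
                "Nutch crawls and builds a Lucene index");
        // Print one "term<TAB>documentFrequency" line per term, in sorted order.
        documentFrequencies(docs).forEach((term, n) -> System.out.println(term + "\t" + n));
    }
}
```

A real index would give you this from its own term dictionary rather than
by re-tokenizing text; java-user@lucene is the place to ask which API
calls fit your version.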

If you really want this data put into a relational database, I 
suspect you'd need to do that conversion process yourself -- I don't know
of any general purpose tools to do that.
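
One way that do-it-yourself conversion might look, sketched in plain Java
under stated assumptions: the term counts are already in a Map, and the
target is a CSV file that a database's bulk loader can import. The class
name TermCsvExport, the column names term and doc_freq, and the choice of
CSV over direct inserts are all hypothetical, not something Lucene
provides.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; not part of Lucene.
public class TermCsvExport {

    // Quote a field CSV-style, doubling any embedded double quotes.
    static String quote(String s) {
        return "\"" + s.replace("\"", "\"\"") + "\"";
    }

    // Turn term -> frequency pairs into CSV lines: a header row, then one row per term.
    static List<String> toCsvLines(Map<String, Integer> termFreqs) {
        List<String> lines = new ArrayList<>();
        lines.add("term,doc_freq"); // assumed column names
        // Copy into a TreeMap so rows come out in sorted term order.
        for (Map.Entry<String, Integer> e : new TreeMap<>(termFreqs).entrySet()) {
            lines.add(quote(e.getKey()) + "," + e.getValue());
        }
        return lines;
    }

    public static void main(String[] args) {
        Map<String, Integer> termFreqs = new LinkedHashMap<>();
        termFreqs.put("lucene", 2);
        termFreqs.put("index", 2);
        termFreqs.put("nutch", 1);
        toCsvLines(termFreqs).forEach(System.out::println);
    }
}
```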





-Hoss



