Hi Bruce, welcome to Lucene.
First off: your questions would be better suited for one of the *-user@lucene
lists ... not java-dev ( http://people.apache.org/~hossman/#java-dev )
The question becomes, which list would be a good place for you to start?
...
: -i'm trying to gather data/words/terms from a number of
: different test web sites to build a database of terms
: for a test app
: -i'd like to put the resulting data into some sort of
: database. is lucene sufficient for doing this, or will
: i need some sort of additional toolset?
One thing I'm not clear on is whether you are looking for something to
handle the crawling of these websites and extracting the terms from various
file types (in which case you should start with nutch-user@lucene) or if
you want to do that yourself and have very specific control in your own
code over where the data comes from before it gets indexed (in which case
you should email java-user@lucene).
Once you have a Lucene index built (either by Nutch or by yourself
using the Java APIs) you can write code to use that index in a variety of
ways - including extracting data about the set of known terms and
their frequencies.
If you really want this data put into a relational database, I
suspect you'd need to do that conversion process yourself -- I don't know
of any general purpose tools to do that.
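As a rough idea of what that conversion step could look like, here is a
minimal sketch that turns a term -> document-frequency map into CSV lines,
which most relational databases can bulk-load. Everything in it is an
assumption for illustration: the class and method names (TermCsvExport,
toCsvLines) and the column names (term, doc_freq) are made up, and the map
stands in for whatever your own code reads out of the index. It uses only
the JDK, not the Lucene API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: none of these names come from Lucene or from this thread.
class TermCsvExport {

    // Quote a value for CSV: wrap it in double quotes and double any embedded quotes.
    static String quote(String value) {
        return "\"" + value.replace("\"", "\"\"") + "\"";
    }

    // Turn a term -> document-frequency map into CSV lines with a header row.
    // The map is assumed to have been filled by your own code reading the index.
    static List<String> toCsvLines(Map<String, Integer> docFreqByTerm) {
        List<String> lines = new ArrayList<>();
        lines.add("term,doc_freq");
        // Copy into a TreeMap so the output order is sorted and repeatable.
        for (Map.Entry<String, Integer> e : new TreeMap<>(docFreqByTerm).entrySet()) {
            lines.add(quote(e.getKey()) + "," + e.getValue());
        }
        return lines;
    }

    public static void main(String[] args) {
        // Sample data only; real values would come from the index.
        Map<String, Integer> sample = new TreeMap<>();
        sample.put("lucene", 12);
        sample.put("nutch", 3);
        for (String line : toCsvLines(sample)) {
            System.out.println(line);
        }
    }
}
```

Going through a flat file keeps the export independent of any particular
database; if you would rather insert rows directly, the same loop could
feed a JDBC batch insert instead.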
-Hoss