lucene-java-user mailing list archives

From "Spencer, Dave" <d...@lumos.com>
Subject RE: Indexing Db Table -- Better way request
Date Sat, 09 Nov 2002 00:45:40 GMT
We have a number of internal systems here (content mgmt, bug db,
support email, CRM), all of which are PHP/MySQL combos. In all cases
Lucene is used for the indexing, and we have never seen any reason to
go to XML as an intermediate step. We've been at this for 6 months or
so.
The only hassle is that if the group that's doing the PHP/MySQL tweaks
the schema, they have to remember to modify the Lucene indexer so that,
say, it picks up the new columns. There's no way around this unless you
want to be very generic, in which case XML still doesn't give you
anything, since you could just as well use JDBC meta-data to get all
columns...
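
A minimal sketch of the "very generic" route mentioned above, using only
the standard java.sql interfaces. The class and method names
(GenericRowReader, currentRowAsFields) are made up for illustration and
appear nowhere in the thread; turning the resulting name/value pairs
into Lucene fields is left out because it depends on the Lucene version
in use.

```java
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.util.LinkedHashMap;
import java.util.Map;

public class GenericRowReader {
    /**
     * Copies the current row of rs into an ordered column-name -> value map.
     * Columns are discovered from JDBC metadata rather than hard-coded, so a
     * column added to the table shows up without changing this code.
     * Columns whose value is SQL NULL are skipped.
     */
    public static Map<String, String> currentRowAsFields(ResultSet rs) throws SQLException {
        ResultSetMetaData meta = rs.getMetaData();
        Map<String, String> fields = new LinkedHashMap<>();
        for (int i = 1; i <= meta.getColumnCount(); i++) { // JDBC columns are 1-based
            String value = rs.getString(i);
            if (value != null) {
                fields.put(meta.getColumnLabel(i), value);
            }
        }
        return fields;
    }
}
```

The caller would loop with rs.next() and hand each map to whatever
builds the index document. The trade-off is that every column is treated
the same way, so per-column decisions (stored or not, tokenized or not)
still need some configuration somewhere.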


-----Original Message-----
From: Michael Caughey [mailto:michael@caughey.com]
Sent: Friday, November 08, 2002 4:21 PM
To: Spencer, Dave; Lucene Users List
Subject: Re: Indexing Db Table -- Better way request


Converting straight to a document seemed to me the best answer as I
started to investigate.  Somewhere along the line I thought I remembered
seeing a suggestion that it was for some reason better to convert to XML
and then add it as an XML document.  I'd rather not have the hassle of
creating and then later parsing the XML.  I could not find the reference
again.  This in part was what I was hoping to hear.

Thanks,
Michael
----- Original Message -----
From: "Spencer, Dave" <dave@lumos.com>
To: "Lucene Users List" <lucene-user@jakarta.apache.org>
Cc: <michael@caughey.com>
Sent: Friday, November 08, 2002 6:59 PM
Subject: RE: Indexing Db Table -- Better way request


One small comment: what's the point of converting a row to XML?
What I think you want to do is convert a row to a Document and then
pass that off to IndexWriter.

-----Original Message-----
From: Caughey, Michael [mailto:mcaughey@trigon.com]
Sent: Friday, November 08, 2002 2:22 PM
To: 'lucene-user@jakarta.apache.org'
Cc: 'michael@caughey.com'
Subject: Indexing Db Table -- Better way request


Hello,

I'm new to Lucene and this group; if it is improper to send such a
message to this group I apologize.  I tried to do a reasonable amount of
up-front research before coming here.

I'm about to undertake a piece of my project where I've decided that
Lucene will be of use.  I have been researching, over the past two
weeks, ways to accomplish this.  I know I'll use an IndexWriter to write
the index to a file, but I'm having difficulty settling on how to
process the data to be indexed.

What I have is a table in a MySQL database called items.  I want to be
able to search on a couple of fields and have it return the ID:
Fields:
=========
Name VARCHAR (80)
Description TEXT
Location VARCHAR (80)
Qty int
ExpireDate Long YYYYMMDD
Category int
ListingPrice FLOAT(9,2)
Supplier int

Return
=========
ItemId int


On start up of the application every row in the database will be read.
After that I need to keep the table and the index in sync.  Data in the
columns can change, and rows can be added and removed.  I have a central
entity controller which is responsible for all access to that table.

I figured one approach which would work would be on start up to read
each row and build an XML document and submit it to the IndexWriter.
As inserts, deletes and updates occurred I could modify both Lucene and
the database.
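
The "modify both" idea can be sketched as a controller that forwards
every change to an index as well as to the table. Everything named here
(ItemSync, SearchIndex, InMemoryIndex, ItemController) is hypothetical
and is not a Lucene API; the in-memory map only stands in for a real
index so the sketch runs on its own, and the SQL side is left as
comments. The update path assumes the index has no in-place update, so
an update is a delete by ItemId followed by a re-add.

```java
import java.util.HashMap;
import java.util.Map;

public class ItemSync {
    /** Hypothetical stand-in for the Lucene-backed index; not a Lucene API. */
    interface SearchIndex {
        void add(int itemId, Map<String, String> fields);
        void delete(int itemId);
    }

    /** Toy in-memory implementation so the sketch runs without Lucene. */
    static class InMemoryIndex implements SearchIndex {
        final Map<Integer, Map<String, String>> docs = new HashMap<>();

        public void add(int itemId, Map<String, String> fields) {
            docs.put(itemId, fields);
        }

        public void delete(int itemId) {
            docs.remove(itemId);
        }
    }

    /** Central controller: each change to the items table is mirrored in the index. */
    static class ItemController {
        private final SearchIndex index;

        ItemController(SearchIndex index) {
            this.index = index;
        }

        void insert(int itemId, Map<String, String> fields) {
            // The INSERT into the items table would go here.
            index.add(itemId, fields);
        }

        void update(int itemId, Map<String, String> fields) {
            // The UPDATE of the items table would go here.
            index.delete(itemId);      // drop the stale entry first
            index.add(itemId, fields); // then index the new column values
        }

        void remove(int itemId) {
            // The DELETE from the items table would go here.
            index.delete(itemId);
        }
    }
}
```

Because the controller is the only path to the table, the index cannot
drift as long as both steps succeed together; what should happen when
one of them fails is not covered by this sketch.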

Seems simple enough, and may be the only way to handle it.  Before I did
it I wanted to make sure that there wasn't a better way.
Are there documents which can automatically read the table and build a
document?
Should I read the row and just build fields and construct a document?

Does anyone see any problems with storing it in memory versus writing it
to a file?  Or should I say, at what point would you consider writing it
to a file?  Would you base that on total document size?  I feel that a
file index will most likely be just fine.

Thanks in advance for any suggestions.






Michael Caughey