lucene-java-user mailing list archives

From "Namit Yadav" <>
Subject Index Rows as Documents? Help me design a solution
Date Tue, 25 Jul 2006 02:05:52 GMT
My question might be very easy for you Lucene experts. But after going
through the Lucene documentation / example, I haven't been able to
figure out how to solve this problem. I'll be really grateful if
someone can help me get a starting point here.

Our application tracks SMSes sent from a particular phone number. We
have gigs of logs that (let's say) look like this


Now our search will obviously be done on the basis of the phone
number. So we need indexing so that we can:

1. List SMSIDs of all the SMSes that a phone number had sent (each SMS
   message will have a globally unique ID)
2. List SomeData1, SomeData2, SomeData3 and SomeData4 for a given SMSID.
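
The two lookups can be pictured as two maps. Below is a minimal sketch in
plain Java, not Lucene code. The class name, method names and column order
are made up for illustration, since only the column names are given above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** In-memory illustration of the two lookups; this is not a Lucene index. */
class SmsLookupSketch {
    // phone number -> SMSIDs sent from it, in the order the rows were added
    private final Map<String, List<String>> smsIdsByPhone = new HashMap<>();
    // SMSID -> the four SomeData values logged for that message
    private final Map<String, String[]> dataBySmsId = new HashMap<>();

    /** Records one log row. The column order is an assumption. */
    void addRow(String phoneNumber, String smsId,
                String d1, String d2, String d3, String d4) {
        smsIdsByPhone.computeIfAbsent(phoneNumber, k -> new ArrayList<>()).add(smsId);
        dataBySmsId.put(smsId, new String[] {d1, d2, d3, d4});
    }

    /** Lookup 1: all SMSIDs for a phone number (empty list if none). */
    List<String> smsIdsFor(String phoneNumber) {
        return Collections.unmodifiableList(
                smsIdsByPhone.getOrDefault(phoneNumber, Collections.emptyList()));
    }

    /** Lookup 2: SomeData1..SomeData4 for an SMSID (null if unknown). */
    String[] dataFor(String smsId) {
        return dataBySmsId.get(smsId);
    }
}
```

Both lookups are exact matches on a single key (a phone number or an
SMSID), never free-text searches over the SomeData columns.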

How can I do this efficiently?

I wrote a sample piece of code where each row was a Document, and
PhoneNumber, SMSID and SomeData columns were Fields. The indexing was
taking much more than minutes for a 1 MB log file, so I realized that
I didn't do it right (You can guess how 'not' comfortable I am with
Lucene at present). I would expect to be able to index at least a of
GB of logs within 1 or 2 minutes.

Can someone please point me to the right examples, help me understand
what my Documents / Fields / Analyzers should be, or help me design a
solution?

Thanks in advance

ps. I just now got Lucene in Action. Is there any example (or similar
concept) explained in the book? From what I see, none of the examples
really help me much.
