lucene-java-user mailing list archives

From <raghavendra.k....@barclays.com>
Subject RE: New Lucene User
Date Tue, 18 Jun 2013 13:38:32 GMT
Heikki,

Thank you very much. I tried it out and the initial results look good.

However, I get "java.lang.OutOfMemoryError: Java heap space" when I search on a single TextField
across 70 million records. My code probably needs tuning.

I'll research more to figure it out. But this is a great start, thanks to everyone who provided
suggestions.

Regards,
Raghu


-----Original Message-----
From: heikki [mailto:tropicano@gmail.com] 
Sent: Monday, June 17, 2013 5:35 PM
To: java-user@lucene.apache.org
Subject: Re: New Lucene User

hi,

I think Lucene is an excellent option for you.

You don't need to export the data to a flat file first. You can just access your database
(in whatever way you normally like, e.g. using JDBC or Hibernate). You can do this for example
once a day, retrieving only modified records. For each record you retrieve, you create a so-called
Lucene Document. You add fields to these documents as you see fit -- for example, you want
to search in 20 of your 30 columns, so you could add fields containing the values from those
20 columns to the Lucene Document.
You give each Document to an IndexWriter, which adds it to the Lucene index. When you
search, you retrieve such documents, which you can then use to create a UI display for search
results.
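
To make the document/field idea above concrete, here is a minimal sketch in plain Java. It is a toy in-memory stand-in and deliberately does not call Lucene's API: the class ToyIndex, its methods, and the column names used with it are hypothetical and exist only for illustration. It shows one record becoming one indexed entry, only the chosen columns being made searchable, and a search returning the matching records.

```java
import java.util.*;

// Toy illustration only: this is NOT Lucene and none of these names come from Lucene.
final class ToyIndex {
    // term -> ids of the records that contain the term in a searchable column
    private final Map<String, Set<Integer>> postings = new HashMap<>();
    // id -> the full record, kept so that hits can be displayed later
    private final Map<Integer, Map<String, String>> stored = new HashMap<>();
    // the subset of columns to search (e.g. 20 of the 30)
    private final Set<String> searchableColumns;

    ToyIndex(Set<String> searchableColumns) {
        this.searchableColumns = searchableColumns;
    }

    // Loosely plays the role of handing a Document to an IndexWriter.
    void add(int id, Map<String, String> row) {
        stored.put(id, row);
        for (Map.Entry<String, String> col : row.entrySet()) {
            // Columns outside the searchable set are stored but never indexed.
            if (!searchableColumns.contains(col.getKey()) || col.getValue() == null) {
                continue;
            }
            // Crude analysis: lowercase, then split on non-word characters.
            for (String term : col.getValue().toLowerCase(Locale.ROOT).split("\\W+")) {
                if (!term.isEmpty()) {
                    postings.computeIfAbsent(term, t -> new TreeSet<>()).add(id);
                }
            }
        }
    }

    // Returns the ids of records whose searchable columns contain the term.
    Set<Integer> search(String term) {
        return postings.getOrDefault(term.toLowerCase(Locale.ROOT),
                                     Collections.<Integer>emptySet());
    }

    // Returns the stored record for a hit, e.g. to render it in a UI.
    Map<String, String> get(int id) {
        return stored.get(id);
    }
}
```

In Lucene itself, the column filter corresponds to choosing which fields you add to each Document, and the lookup is done through the index that the IndexWriter builds rather than through an in-memory map.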

Of course there's a lot more to say about this and I'd recommend you check online tutorials
or one of the Lucene books like *Lucene In Action* to learn more about how to use Lucene in
detail.

Kind regards
Heikki Doeleman


On Mon, Jun 17, 2013 at 11:03 PM, <raghavendra.k.rao@barclays.com> wrote:

> Hi,
>
> I need to perform full-text search in a new application. I came
> across Lucene and want to check whether it fits our needs.
>
> Requirement:
>
> I have a SQL Server database table with around 70 million records in it.
> It is not a live table and the data gets appended to it on a daily basis.
>
> The table has about 30 columns. The user will provide one string, and 
> this value has to be searched against 20 columns for each record. All 
> matching records need to be displayed in the UI.
>
> My Analysis
>
> Based on what I have read until now about Lucene, I believe I need to 
> convert my database table data into a flat file, generate indexes and 
> then perform the search.
>
> Questions
>
>
> - To begin with, is Lucene a good option for this kind of requirement?
>   Note: Let us ignore daily index generation and UI display for this
>   discussion.
>
> - Should the entire data of 70 million records exist in one flat file?
>
> - How do I define which fields (20 columns) should be searched among
>   the complete list (30 columns)?
>
> As I am just starting off, I may not even know about other 
> dependencies. I kindly request you to provide clarifications / 
> reference to an example that would suit my case.
>
> Please let me know if you have any questions.
>
> Thanks,
> Raghu
>
>
> _______________________________________________
>
> This message is for information purposes only, it is not a 
> recommendation, advice, offer or solicitation to buy or sell a product 
> or service nor an official confirmation of any transaction. It is 
> directed at persons who are professionals and is not intended for 
> retail customer use. Intended for recipient only. This message is
> subject to the terms at: www.barclays.com/emaildisclaimer.
>
> For important disclosures, please see:
> www.barclays.com/salesandtradingdisclaimer regarding market commentary 
> from Barclays Sales and/or Trading, who are active market 
> participants; and in respect of Barclays Research, including 
> disclosures relating to specific issuers, please see http://publicresearch.barclays.com.
>
> _______________________________________________
>
