lucene-java-user mailing list archives

From heikki <>
Subject Re: New Lucene User
Date Mon, 17 Jun 2013 21:34:38 GMT

I think Lucene is an excellent option for you.

You don't need to export the data to a flat file first. You can just access
your database (in whatever way you normally like, e.g. using JDBC or
Hibernate). You can do this for example once a day, retrieving only
modified records. For each record you retrieve, you create a so-called
Lucene Document. You add fields to these documents as you see fit -- for
example, you want to search in 20 of your 30 columns, so you could add
fields containing the values from those 20 columns to the Lucene Document.
You give each Document to an IndexWriter, which will add it to the Lucene
index. When you search, you retrieve such documents, which you can then use
to create a UI display for search results.
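
The steps above could look roughly like the sketch below. It is only an
illustration under several assumptions: it is written against Lucene 9.5 or
later (the 2013-era API differs in places), it uses an in-memory
ByteBuffersDirectory instead of an on-disk index, plain Maps stand in for the
rows you would read via JDBC or Hibernate, and the class name LuceneSketch and
the column names "id", "name" and "city" are made up. The search handles a
single word only.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;

public class LuceneSketch {
    // Hypothetical names for the columns that should be searchable
    // (the 20-of-30 subset in the original question).
    static final List<String> SEARCHABLE = List.of("name", "city");

    // Build one Lucene Document from one row. The Map is a stand-in
    // for a row read from the database.
    static Document toDocument(Map<String, String> row) {
        Document doc = new Document();
        // Primary key: indexed as one un-analyzed token and stored,
        // so a hit can be mapped back to the database record.
        doc.add(new StringField("id", row.get("id"), Field.Store.YES));
        for (String col : SEARCHABLE) {
            String value = row.get(col);
            if (value != null) {
                // Analyzed (tokenized, lowercased) full-text field.
                doc.add(new TextField(col, value, Field.Store.YES));
            }
        }
        return doc;
    }

    // Give each Document to an IndexWriter. updateDocument replaces any
    // existing document with the same id, so re-indexing a modified
    // record does not create a duplicate.
    static void index(Directory dir, List<Map<String, String>> rows) throws IOException {
        IndexWriterConfig cfg = new IndexWriterConfig(new StandardAnalyzer());
        try (IndexWriter writer = new IndexWriter(dir, cfg)) {
            for (Map<String, String> row : rows) {
                writer.updateDocument(new Term("id", row.get("id")), toDocument(row));
            }
        } // closing the writer commits the changes
    }

    // Search one word across all searchable fields; return matching ids.
    static List<String> search(Directory dir, String word) throws IOException {
        // StandardAnalyzer lowercased the indexed text, so lowercase the term too.
        String token = word.toLowerCase(Locale.ROOT);
        BooleanQuery.Builder query = new BooleanQuery.Builder();
        for (String col : SEARCHABLE) {
            query.add(new TermQuery(new Term(col, token)), BooleanClause.Occur.SHOULD);
        }
        List<String> ids = new ArrayList<>();
        try (DirectoryReader reader = DirectoryReader.open(dir)) {
            IndexSearcher searcher = new IndexSearcher(reader);
            for (ScoreDoc hit : searcher.search(query.build(), 10).scoreDocs) {
                ids.add(searcher.storedFields().document(hit.doc).get("id"));
            }
        }
        return ids;
    }

    // Index two made-up rows in memory and search them.
    static List<String> demo(String word) throws IOException {
        Directory dir = new ByteBuffersDirectory();
        index(dir, List.of(
            Map.of("id", "1", "name", "Acme Trading", "city", "London"),
            Map.of("id", "2", "name", "Globex", "city", "Paris")));
        return search(dir, word);
    }

    public static void main(String[] args) throws IOException {
        System.out.println(demo("London"));
    }
}
```

For a real index of this size you would presumably use an on-disk directory
(FSDirectory.open with a path) rather than the in-memory one, and parse the
user's input with a query parser instead of building term queries by hand.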

Of course there's a lot more to say about this and I'd recommend you check
online tutorials or one of the Lucene books like *Lucene In Action* to
learn more about how to use Lucene in detail.

Kind regards
Heikki Doeleman

On Mon, Jun 17, 2013 at 11:03 PM, <> wrote:

> Hi,
> I have a requirement to perform a full-text search in a new application
> and I came across Lucene and I want to check if it helps our cause.
> Requirement:
> I have a SQL Server database table with around 70 million records in it.
> It is not a live table and the data gets appended to it on a daily basis.
> The table has about 30 columns. The user will provide one string, and this
> value has to be searched against 20 columns for each record. All matching
> records need to be displayed in the UI.
> My Analysis
> Based on what I have read until now about Lucene, I believe I need to
> convert my database table data into a flat file, generate indexes and then
> perform the search.
> Questions
> - To begin with, is Lucene a good option for this kind of
> requirement? Note: Let us ignore daily index generation and UI display for
> this discussion.
> - Should the entire data of 70 million records exist in one flat
> file?
> - How do I define what fields (20 columns) should be searched
> among the complete list (30 columns)?
> As I am just starting off, I may not even know about other dependencies. I
> kindly request you to provide clarifications / reference to an example that
> would suit my case.
> Please let me know if you have any questions.
> Thanks,
> Raghu
