lucene-java-user mailing list archives

From findbestopensource <findbestopensou...@gmail.com>
Subject Re: Converting an existing index format to Lucene Index
Date Fri, 25 Feb 2011 07:05:08 GMT
Hello Lokendra,

You could do the updates frequently. Anyway, I think it is a one-time job.

My advice would be to do the insertions and updates in batches:
1. Parse your file and read 1000 lines at a time.
2. Do some aggregation on those lines, then insert / update with Lucene.
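
The two steps above could be sketched roughly as follows. This is only a minimal illustration, not a definitive implementation. It assumes the existing index is a text file with one tab-separated "word<TAB>fileName" pair per line (the original post does not say how the pairs are stored), and the names BatchAggregator, BatchSink and process are made up for this example. The Lucene call is left behind the BatchSink callback so that the batching logic stands on its own.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: groups <word, fileName> pairs per file, one batch of lines at a time. */
class BatchAggregator {

    /** Hypothetical callback; a Lucene-backed version would write one Document per map entry. */
    interface BatchSink {
        void flush(Map<String, List<String>> wordsByFile) throws IOException;
    }

    /**
     * Reads tab-separated "word<TAB>fileName" lines (an assumed format), aggregates
     * them by fileName, and hands each batch to the sink. Returns the number of batches.
     */
    static int process(BufferedReader in, int batchSize, BatchSink sink) throws IOException {
        Map<String, List<String>> wordsByFile = new LinkedHashMap<>();
        int linesInBatch = 0;
        int batches = 0;
        String line;
        while ((line = in.readLine()) != null) {
            String[] parts = line.split("\t", 2);
            if (parts.length < 2) {
                continue; // skip malformed lines
            }
            String word = parts[0].trim();
            String fileName = parts[1].trim();
            // Aggregation step: collect every word seen for this file in the current batch.
            wordsByFile.computeIfAbsent(fileName, k -> new ArrayList<>()).add(word);
            if (++linesInBatch == batchSize) {
                sink.flush(wordsByFile);
                wordsByFile = new LinkedHashMap<>(); // fresh map, so only one batch is held in memory
                linesInBatch = 0;
                batches++;
            }
        }
        if (linesInBatch > 0) { // flush the final, partially filled batch
            sink.flush(wordsByFile);
            batches++;
        }
        return batches;
    }
}
```

In a Lucene-backed sink, each map entry could become one Document written with IndexWriter.addDocument or IndexWriter.updateDocument. Note that updateDocument replaces the whole document, so if the pairs for one file are spread across several batches, the sink would have to merge in the words indexed earlier. Sorting the pairs by fileName beforehand would avoid that.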

Regards
Aditya
www.findbestopensource.com



On Fri, Feb 25, 2011 at 11:56 AM, Lokendra Singh <lsingh.969@gmail.com> wrote:

> Hi all,
>
> I am looking for some guidelines on directly converting an already existing
> index to a Lucene index.
> The index available to me is a set of <value1,value2> pairs, where each
> pair is:
> < word, fileName >
> i.e. a word as 'value1', and 'value2' being the fileName containing
> that word.
>
> A word might appear in several files, and the same file can contain
> multiple copies of a word. For example, the following index is possible:
> < "my", "file1" >
> < "you", "file2" >
> < "my", "file2" >
> < "my", "file1" >
>
> My actual problem is that the index available to me is very large,
> hence I am a bit reluctant to create a 'Document' object for each file,
> because for that I would have to read through all the pairs first and store
> them in memory. Or I would have to 'update' the 'Document' object of a
> particular file while iterating through the pairs of my index, and this
> 'update', again, is a costly operation.
>
> Please correct me if my understanding of Lucene is wrong, or suggest
> alternative ways.
>
> Regards
> Lokendra
>
