lucene-java-user mailing list archives

From KARTHIK SHIVAKUMAR <nskarthi...@gmail.com>
Subject Re: No subsearcher in Lucene 3.3?
Date Sat, 03 Sep 2011 05:22:33 GMT
Hi,

A long time ago I used to do the same thing.

I gave each merged index a unique name, so at run time, if the hit was
returned from 1STMERGER, the path associated with 1STMERGER was
used.


Likewise, you need not store a new column (field) just for this purpose.
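
One way to tell at run time which index a hit came from, as a rough sketch only: a MultiReader numbers its documents by laying its sub-indexes end to end, in the order they were passed in, so a hit's doc ID can be mapped back to a position using each sub-index's maxDoc(). The class and method names below are made up for illustration and are not Lucene API; the maxDoc values are sample numbers.

```java
// Hypothetical helper (not part of Lucene): maps a top-level doc ID from a
// MultiReader back to the position of the sub-index that holds it.
class SubIndexSketch {

    // maxDocs[i] is assumed to be the maxDoc() of the i-th IndexReader,
    // in the same order the readers were given to the MultiReader.
    static int subIndex(int docId, int[] maxDocs) {
        if (docId < 0) {
            throw new IllegalArgumentException("negative doc ID: " + docId);
        }
        int base = 0; // first top-level doc ID of the current sub-index
        for (int i = 0; i < maxDocs.length; i++) {
            if (docId < base + maxDocs[i]) {
                return i; // docId falls inside sub-index i
            }
            base += maxDocs[i];
        }
        throw new IllegalArgumentException("doc ID out of range: " + docId);
    }

    public static void main(String[] args) {
        // Sample sizes only: first index has 1000 docs, second has 250.
        int[] maxDocs = {1000, 250};
        System.out.println(subIndex(999, maxDocs));
        System.out.println(subIndex(1000, maxDocs));
    }
}
```

The returned position plays the role the old subsearcher number did: it is the key for looking up whatever is stored outside the index.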


1MERGER  =  /temp/MERGER1      so the final index search path is
C:/TEMP/MERGER1
2MERGER  =  /temp/MERGER2      so the final index search path is
D:/TEMP/MERGER2
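
A minimal sketch of that name-to-path lookup, under stated assumptions: the class, the method and the array order are hypothetical, and the second name is written as 2MERGER so that the keys stay unique. Only the two paths come from the lines above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical lookup table kept outside the index: unique index name -> path.
class IndexPathSketch {

    // Names in the same order the indexes are opened for the combined search
    // (assumption: position 0 is the first index, position 1 the second).
    static final String[] NAMES = {"1MERGER", "2MERGER"};

    static final Map<String, String> PATHS = new HashMap<String, String>();
    static {
        PATHS.put("1MERGER", "C:/TEMP/MERGER1");
        PATHS.put("2MERGER", "D:/TEMP/MERGER2");
    }

    // Returns the path for the index at the given position.
    static String pathFor(int position) {
        if (position < 0 || position >= NAMES.length) {
            throw new IllegalArgumentException("unknown index position: " + position);
        }
        return PATHS.get(NAMES[position]);
    }

    public static void main(String[] args) {
        System.out.println(pathFor(0));
        System.out.println(pathFor(1));
    }
}
```

Because the table lives outside the index, moving an index or its documents only means changing one entry, and nothing has to be re-indexed.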



Hope this helps.


With regards,
Karthik

On Tue, Aug 30, 2011 at 9:59 PM, Joe MA <mrjama@comcast.net> wrote:

>
> Thanks for the replies.  Here is why I need the subreader (or subsearcher
> in earlier Lucene versions):
>
> I have multiple collections of documents, say broken out by years (it's
> more complex than this, but this illustrates the use case):
>
> Collection1 >>>         D:/some folder/2009/*.pdf
> (lots of PDF files)
> Collection2 >>>         D:/another folder/2010/*.pdf
>  (lots of different PDF files)
>
> And so forth.  So in the example above, I would have two indices, one for
> each year.  When I index, I store the *relative* path of each document as
> a field, for example 'link:2009/file1.pdf' or 'link:2010/file1.pdf'.
> I do not store the full path to the files in the index.  This has a huge
> advantage because we can move the documents to another file system or server
> or path without rebuilding the index.  I store the required base path to
> the documents in each collection in a database, external to the collection.
> In the example above, Collection1 would have a base path of
> "D:/some folder/".  Therefore, to actually access a document referenced
> in a collection, you concatenate the base_path retrieved from the database
> with the "link" field retrieved from the collection.  I would think this is
> a very common approach.
>
> When searching a single collection, no problem.  But if I want to search
> the two collections at the same time, I need to know which collection the
> hit came from so I can retrieve the base_path from the database.  These
> base_paths can be different.  As mentioned, this was trivial in Lucene 1.x
> and 2.x as I just grabbed the subsearcher from the result, which would for
> example return a 1 or 2 indicating which of the two collections the result
> came from.  Then I can build the path to the file.  In other words,
> subsearcher gave me the foreign key I needed to map to additional external
> information associated with each index during a multisearch.  That is now
> gone in Lucene 3.3.
>
> I guess a really simple solution is just to store a new field with each
> document that uniquely identifies its collection.  So in the example above,
> I could create a new field "foreign_key_index" for each document, which
> would be "Collection1" or "Collection2" respectively.  This would surely
> work, but it would break backwards compatibility of my system and would
> require me to rebuild every collection.  It also seems pretty extensive for
> something so simple.
>
> If there is another way to do this, please advise.  Thanks in advance and
> much appreciated.
>
> - JMA
>
>
>
> -----Original Message-----
> From: Uwe Schindler [mailto:uwe@thetaphi.de]
> Sent: Monday, August 29, 2011 8:05 PM
> To: java-user@lucene.apache.org
> Subject: RE: No subsearcher in Lucene 3.3?
>
> Why do you need to know the subreader? If you want to get the document's
> stored fields, use the MultiReader.
>
> If you really want to know the subreader, use this:
>
> http://lucene.apache.org/java/3_3_0/api/core/org/apache/lucene/util/ReaderUtil.html#subReader(int, org.apache.lucene.index.IndexReader)
>
> But this is "somewhat slow", so don't use it in inner loops.
>
> Devon suggested:
> > If I'm understanding your question correctly, in the Collector, you are
> > told which IndexReader you are working with when the setNextReader method
> > is called. Hopefully that helps.
>
> This does not work as expected, because the Collector gets the lowest-level
> readers, which are in fact sub-sub-readers (each single IndexReader itself
> consists of several "SegmentReaders", unless you have optimized
> sub-indexes).
>
> Uwe
>
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: uwe@thetaphi.de
>
>
> > -----Original Message-----
> > From: Joseph MarkAnthony [mailto:mrjama@comcast.net]
> > Sent: Monday, August 29, 2011 8:54 PM
> > To: java-user@lucene.apache.org
> > Subject: No subsearcher in Lucene 3.3?
> >
> > Greetings,
> >     In the past (Lucene version 2.x) I successfully used
> > MultiSearcher.subsearcher() to identify the searchable within a
> > MultiSearcher to which a hit belonged.
> >
> > In moving to Lucene 3.3, MultiSearcher is now deprecated, and I am
> > trying to create a standard IndexSearcher over a MultiReader.  I
> > haven't gotten this to work yet but it appears to be the correct
> > approach.  However, I cannot find any corresponding "subsearcher"
> > method that could identify which subreader is the one that finds the hit.
> >
> > For example, it used to be straightforward:
> >
> > Create a MultiSearcher over several Searchables, and call
> > MultiSearcher.subsearcher to get the searchable that holds each search
> > hit.
> >
> > Now, I am creating an IndexSearcher over a MultiReader, which is created
> > over an array of IndexReaders.   So when I get a hit, what's the best way
> > to determine which of the several subReaders the hit came from?
> >
> > Thanks in advance,
> > JMA
> >
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-user-help@lucene.apache.org
>


-- 
*N.S.KARTHIK
R.M.S.COLONY
BEHIND BANK OF INDIA
R.M.V 2ND STAGE
BANGALORE
560094*
