lucene-java-user mailing list archives

From khanh-lam....@bnf.fr
Subject Re: Generate Lucene segments_N file
Date Wed, 10 Feb 2016 11:17:07 GMT
Hi,

Mike, thanks a lot for your help. I adapted your code and it works great!
Thanks for saving us weeks of work.

Here is my code, in case it helps someone else:


package org.apache.lucene.index;

import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;

import org.apache.lucene.codecs.Codec;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.IOContext;
import org.apache.lucene.store.SimpleFSDirectory;

// Lives in org.apache.lucene.index because SegmentInfos#commit is package-private.
public class GenSegmentInfo {
    public static void main(String[] args) throws IOException {
        Codec codec = Codec.getDefault();
        Path myPath = Paths.get("/tmp/index");

        try (Directory directory = new SimpleFSDirectory(myPath)) {
            // Run this once with an arbitrary segmentID value. Then, with a
            // Java debugger, put a breakpoint on
            // CodecUtil#checkIndexHeaderID(...) to get the right segment ID.
            byte[] segmentID = {88, 55, 58, 78, -21, -55, 102, 99,
                                123, 34, 85, -38, -70, -120, 102, -67};

            // Read the surviving _1rpt.si file.
            SegmentInfo info = codec.segmentInfoFormat().read(
                    directory, "_1rpt", segmentID, IOContext.READ);
            info.setCodec(codec);

            // Arguments: info, delCount, delGen, fieldInfosGen, docValuesGen.
            SegmentInfos infos = new SegmentInfos();
            SegmentCommitInfo commit = new SegmentCommitInfo(info, 1, -1, -1, -1);
            infos.add(commit);

            // Write a new segments_N file into the directory.
            infos.commit(directory);
        }
    }
}
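
As an alternative to the debugger step, the segment ID can probably be read straight from the header of the .si file. The sketch below is not part of the tool above and uses no Lucene classes. It assumes the header layout that, as far as I understand it, CodecUtil.writeIndexHeader produces in Lucene 5.x: a 4-byte big-endian magic number (0x3fd76c17), the codec name as a length-prefixed string, a 4-byte version, then the 16-byte ID. The class name SegmentIdReader and the default path are made up for illustration.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

public class SegmentIdReader {
    // Magic number assumed to start every Lucene codec header.
    static final int CODEC_MAGIC = 0x3fd76c17;

    // Reads the 16-byte ID from an index header, under the layout assumed above.
    static byte[] readSegmentId(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);

        int magic = data.readInt(); // big-endian int
        if (magic != CODEC_MAGIC) {
            throw new IOException("Not a codec header, magic=0x" + Integer.toHexString(magic));
        }

        // The name length is a vInt; it fits in one byte for names under 128 bytes.
        int nameLength = data.readUnsignedByte();
        if (nameLength >= 128) {
            throw new IOException("Codec name too long for this sketch");
        }
        data.readFully(new byte[nameLength]); // skip the codec name
        data.readInt();                       // skip the format version

        byte[] id = new byte[16];
        data.readFully(id);
        return id;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical default path; pass the real .si file as the first argument.
        String siFile = args.length > 0 ? args[0] : "/tmp/index/_1rpt.si";
        try (InputStream in = Files.newInputStream(Paths.get(siFile))) {
            // Prints signed byte values, ready to paste into a byte[] literal.
            System.out.println(Arrays.toString(readSegmentId(in)));
        }
    }
}
```

If the printed values differ from what the debugger shows at CodecUtil#checkIndexHeaderID(...), trust the debugger: the layout here is an assumption, not something confirmed in this thread.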


Regards,

Khanh-Lam Mai




From:    Michael McCandless <lucene@mikemccandless.com>
To:      Lucene Users <java-user@lucene.apache.org>, khanh-lam.mai@bnf.fr
Date:    10 Feb 2016 10:17
Subject: Re: Generate Lucene segments_N file



It'd be a challenge, but it is possible.  It's just software ;)

You need something like this to read a SegmentInfo from your sole .si
file, assuming you are on a recent 5.x release:

  SegmentInfo info = codec.segmentInfoFormat().read(
      directory, segName, segmentID, IOContext.READ);

To get codec, assuming you used the default codec for indexing, use:

  Codec codec = Codec.getDefault();

Then do something like this:

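  SegmentInfos infos = new SegmentInfos();
  // add() takes a SegmentCommitInfo: (info, delCount, delGen, fieldInfosGen, docValuesGen)
  infos.add(new SegmentCommitInfo(info, 0, -1, -1, -1));
  infos.commit(directory);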

The latter method is package private, so your tool must live in
org.apache.lucene.index package, or use break-out-of-jail magic with
Java's reflection APIs.
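
The reflection route might look roughly like the sketch below. It is generic Java and does not depend on Lucene: the Demo class is a stand-in invented for illustration, and the helper name invokeHidden is likewise hypothetical. For the case above, one would presumably pass the SegmentInfos instance, the method name "commit", and Directory.class as the single parameter type.

```java
import java.lang.reflect.Method;

public class HiddenMethodInvoker {
    // Looks up a non-public method declared on the target's class,
    // lifts the access check, and invokes it.
    static Object invokeHidden(Object target, String methodName,
                               Class<?>[] parameterTypes, Object... args)
            throws ReflectiveOperationException {
        Method method = target.getClass().getDeclaredMethod(methodName, parameterTypes);
        method.setAccessible(true); // bypass the normal visibility check
        return method.invoke(target, args);
    }

    // Stand-in for a class with a method that callers cannot reach normally.
    static class Demo {
        private String commit(String directory) {
            return "committed to " + directory;
        }
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        Object result = invokeHidden(
                new Demo(), "commit", new Class<?>[] {String.class}, "/tmp/index");
        System.out.println(result); // prints "committed to /tmp/index"
    }
}
```

Placing the tool in the org.apache.lucene.index package is simpler and avoids the reflection call entirely; the sketch only shows what the alternative involves.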

Then run CheckIndex on that ... if it fails, iterate with the above code!

Good luck,

Mike McCandless

http://blog.mikemccandless.com


On Tue, Feb 9, 2016 at 9:50 AM,  <khanh-lam.mai@bnf.fr> wrote:
> Hello,
>
> First, I don't know if this is the right mailing list to ask for help;
> if not, please accept my apologies for the inconvenience.
>
> While moving Lucene (5.3) index files from one server to another, I forgot
> to move the segments_N file (because I used the pattern *.*).
> Unfortunately I have erased the original folder, and I now have only these
> files in my directory:
>
> _1rpt.fdt
> _1rpt.fdx
> _1rpt.fnm
> _1rpt.nvd
> _1rpt.nvm
> _1rpt.si
> _1rpt_Lucene50_0.doc
> _1rpt_Lucene50_0.dvd
> _1rpt_Lucene50_0.dvm
> _1rpt_Lucene50_0.pos
> _1rpt_Lucene50_0.tim
> _1rpt_Lucene50_0.tip
> write.lock
>
> I am missing the segments_42u file, and without it I cannot even run
> org.apache.lucene.index.CheckIndex:
>
> Exception in thread "main" org.apache.lucene.index.IndexNotFoundException:
> no segments* file found in MMapDirectory@/solr-5.3.1/nodes/node1/core/data/index
> lockFactory=org.apache.lucene.store.NativeFSLockFactory@119d7047:
> files: [write.lock, _1rpt.fdt, _1rpt.fdx,
> _1rpt.fnm, _1rpt.nvd, _1rpt.nvm, _1rpt.si, _1rpt_Lucene50_0.doc,
> _1rpt_Lucene50_0.dvd, _1rpt_Lucene50_0.dvm, _1rpt_Lucene50_0.pos,
> _1rpt_Lucene50_0.tim, _1rpt_Lucene50_0.tip]
> at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:483)
> at org.apache.lucene.index.CheckIndex.doMain(CheckIndex.java:2354)
> at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:2237)
>
> The index is huge (> 800 GB) and would take weeks to rebuild.
> Is there a way to generate this missing segments_N file?
>
> Thanks a lot for your help.
>
>
> Khanh-Lam Mai
> khanh-lam.mai@bnf.fr
> Exhibition: De Rouge et de Noir. Les vases grecs de la collection de
> Luynes, until 1 March 2016, BnF Richelieu.
> Please consider the environment before printing.



