lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Honey George <honey_geo...@yahoo.com>
Subject RE: Restoring a corrupt index
Date Thu, 19 Aug 2004 08:38:20 GMT
This is what I did.

There are 2 classes in the lucene source which are not
public and therefore cannot be accessed from outside
the package. The classes are
1. org.apache.lucene.index.SegmentInfos
   - collection of segments
2. org.apache.lucene.index.SegmentInfo
   -represents a sigle segment

I took these two files and moved to a separate folder.
Then created a class with the following code fragment.

    public void displaySegments(String indexDir)
        throws Exception
    {
        Directory dir =
(Directory)FSDirectory.getDirectory(indexDir, false);
        SegmentInfos segments = new SegmentInfos();
        segments.read(dir);

        StringBuffer str = new StringBuffer();
        int size = segments.size();
        str.append("Index Dir = " + indexDir );
        str.append("\nTotal Number of Segments " +
size);
       
str.append("\n--------------------------------------");
        for(int i=0;i<size;i++)
        {
            str.append("\n");
            str.append((i+1) + ". ");
           
str.append(((SegmentInfo)segments.get(i)).name);
        }
       
str.append("\n--------------------------------------");

        System.out.println(str.toString());
    }


    public void deleteSegment(String indexDir, String
segmentName)
        throws Exception
    {
        Directory dir =
(Directory)FSDirectory.getDirectory(indexDir, false);
        SegmentInfos segments = new SegmentInfos();
        segments.read(dir);

        int size = segments.size();
        String name = null;
        boolean found = false;
        for(int i=0;i<size;i++)
        {
            name =
((SegmentInfo)segments.get(i)).name;
            if (segmentName.equals(name))
            {
                found = true;
                segments.remove(i);
                System.out.println("Deleted the
segment with name " + name
                    + "from the segments file");
                break;
            }
        }
        if (found)
        {
            segments.write(dir);
        }
        else
        {
            System.out.println("Invalid segment name:
" + segmentName);
        }
    }

Use the displaySegments() method to display the
segments and deleteSegment to delete the corrupt
segment.

Thanks,
  George

 --- Karthik N S <karthik@controlnet.co.in> wrote: 
> Hi Guys....
> 
> 
>    In Our Situation we would be indexing  Million &
> Millions of Information
> documents
> 
>   with  Huge Giga Bytes of Data Indexed  and 
> finally would be  put into a
> MERGED INDEX, Categorized accordingly.
> 
>   There may be a possibility of Corruption,  So
> Please do post  the code
> reffrals....
> 
> 
>  Thx
> Karthik
> 
> 
> -----Original Message-----
> From: Honey George [mailto:honey_george@yahoo.com]
> Sent: Wednesday, August 18, 2004 5:51 PM
> To: Lucene Users List
> Subject: Re: Restoring a corrupt index
> 
> 
> Thanks Erik, that worked. I was able to remove the
> corrupt index and now it looks like the index is OK.
> I
> was able to view the number of documents in the
> index.
> Before that I was getting the error,
> java.io.IOException: read past EOF
> 
> I am yet to find out how my index got corrupted.
> There
> is another thread going on about this topic,
>
http://www.mail-archive.com/lucene-user@jakarta.apache.org/msg03165.html
> 
> If anybody is facing similar problem and is
> interested
> in the code I can post it here.
> 
> Thanks,
>   George
> 
> 
> 
>  --- Erik Hatcher <erik@ehatchersolutions.com>
> wrote:
> > The details of the segments file (and all the
> > others) is freely
> > available here:
> >
> >
> >
>
http://jakarta.apache.org/lucene/docs/fileformats.html
> >
> > Also, there is Java code in Lucene, of course,
> that
> > manipulates the
> > segments file which could be leveraged (although
> > probably package
> > scoped and not easily usable in a standalone
> repair
> > tool).
> >
> > 	Erik
> >
> >
> > On Aug 18, 2004, at 6:50 AM, Honey George wrote:
> >
> > > Looks like problem is not with the hexeditor,
> even
> > in
> > > the ultraedit(i had access to a windows box) I
> am
> > > seeing the same display. The problem is I am not
> > able
> > > to identify where a record starts with just 1
> > record
> > > in the file.
> > >
> > > Need to try some alternate approach.
> > >
> > > Thanks,
> > >   George
> 
> 
> 
> 
> 
> 
>
___________________________________________________________ALL-NEW
> Yahoo!
> Messenger - all new features - even more fun! 
> http://uk.messenger.yahoo.com
> 
>
---------------------------------------------------------------------
> To unsubscribe, e-mail:
> lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail:
> lucene-user-help@jakarta.apache.org
> 
> 
>
---------------------------------------------------------------------
> To unsubscribe, e-mail:
> lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail:
> lucene-user-help@jakarta.apache.org
> 
>  


	
	
		
___________________________________________________________ALL-NEW Yahoo! Messenger - all
new features - even more fun!  http://uk.messenger.yahoo.com

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Mime
View raw message