lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Erick Erickson <erickerick...@gmail.com>
Subject Re: integrating RAMDirectory in FSDirectory or other way round
Date Tue, 12 Oct 2010 23:22:50 GMT
I'm not at all sure this is true. We all start by thinking "of course if the
entire
directory were in RAM it would be faster". But lots of work has gone into
making efficient use of RAM for things like caches etc. So it may not
make any difference.

But that doesn't mean you can't try. It's trivial to instantiate a
RAMdirectory
by calling the constructor with a Directory. See Ian's comment.

Best
Erick

On Tue, Oct 12, 2010 at 1:39 AM, <suman.holani@zapak.co.in> wrote:

>
> I am also having a similar requirement .But its other way round.
>
> Basically ,I have indexes in FSDirectory and which is transferred to
> another machine on regular basis. But now for the reason of faster
> searches, it would be better to copy the indexes onto RAM. (RAMDirectory).
>
> Not sure of how it can be achieved.
>
>
> TIA!
>
>
> regds,
> Suman
>
>
>
> On Tue, 12 Oct 2010 08:37:12 +0700, Yakob <jacobian@opensuse-id.org>
> wrote:
> > well actually I am doing a kind of a thesis regarding information
> > retrieval.and my tutor wanted me to be able to create a program that
> > firstly index a document in memory using RAMDirectory and then
> > flushing it to disk using FSDirectory periodically. I was having a
> > hard time implementing it in my source code. so that's why I am asking
> > what's the best way to put RAMDirectory methods in the source code
> > below? maybe you can give me an idea.eventhough I know it'll take
> > longer time to index due to the use of RAMDirectory and FSDirectory
> > but nonetheless I'd still be interested in how to integrate
> > RAMDirectory into FSDirectory.
> >
> > thanks
> >
> > PS : regarding the hijack thread, I thought because I was the one who
> > created it so I think I could hijack it.but thanks for the advice
> > though and now I am creating a new thread. :-)
> >
> > On 10/12/10, Erick Erickson <erickerickson@gmail.com> wrote:
> >> It's a good idea to start a new thread when asking a different
> question,
> >> see:
> >> http://people.apache.org/~hossman/#threadhijack
> >>
> >> <http://people.apache.org/~hossman/#threadhijack>I have to ask why you
> >> want
> >> to integrate the RAM directory. If you're using it
> >> to speed up indexing, you're probably making way more work for yourself
> >> than you need to. If you're trying to do something with Near Real Time,
> >> one
> >> suggestion is to just not bother. Add docs to the RAM directory AND
> your
> >> FSDirectory simultaneously. The data you index to FSDir won't be
> visible
> >> until you reopen the FSDir reader, so your flush could then be just
> >> reopen everything...
> >>
> >> Best
> >> Erick
> >>
> >> On Mon, Oct 11, 2010 at 1:34 PM, Yakob <jacobian@opensuse-id.org>
> wrote:
> >>
> >>> On 9/29/10, Erick Erickson <erickerickson@gmail.com> wrote:
> >>> > Nope, never used jNotify, so I don't have any code handy...
> >>> >
> >>> > Good luck!
> >>> > Erick
> >>> >
> >>>
> >>> so I did try JNotify but there is seems to be some bug in it that I
> >>> find it hards to integrate in my lucene source code.so I had to try a
> >>> looping option instead.
> >>>
> >>>
> >>>
>
> http://stackoverflow.com/questions/3840844/error-exception-access-violation-in-jnotify
> >>>
> >>> so anyway, I had another question now. I was trying to make a lucene
> >>> source code that can do indexing and store them first in a memory
> >>> using RAMDirectory and then flush this index in a memory into a disk
> >>> using FSDirectory. I had done some modifications of this code but to
> >>> no avail. maybe some of you can help me out a bit.
> >>> here is the source code again.
> >>>
> >>> import java.io.File;
> >>> import java.io.FileReader;
> >>> import java.io.IOException;
> >>> import org.apache.lucene.analysis.SimpleAnalyzer;
> >>> import org.apache.lucene.document.Document;
> >>> import org.apache.lucene.document.Field;
> >>> import org.apache.lucene.index.IndexWriter;
> >>> import org.apache.lucene.store.FSDirectory;
> >>>
> >>>
> >>> public class SimpleFileIndexer {
> >>>
> >>>
> >>>        public static void main(String[] args) throws Exception   {
> >>>
> >>>                 int i=0;
> >>>                while(i<10) {
> >>>                 File indexDir = new
> >>> File("C:/Users/Raden/Documents/lucene/LuceneHibernate/adi");
> >>>                File dataDir = new
> >>> File("C:/Users/Raden/Documents/lucene/LuceneHibernate/adi");
> >>>                String suffix = "txt";
> >>>
> >>>                SimpleFileIndexer indexer = new SimpleFileIndexer();
> >>>
> >>>                int numIndex = indexer.index(indexDir, dataDir,
> suffix);
> >>>
> >>>                System.out.println("Total files indexed " + numIndex);
> >>>                 i++;
> >>>                Thread.sleep(10000);
> >>>
> >>>                }
> >>>        }
> >>>
> >>>
> >>>
> >>>        private int index(File indexDir, File dataDir, String suffix)
> >>> throws
> >>> Exception {
> >>>
> >>>                IndexWriter indexWriter = new IndexWriter(
> >>>                                FSDirectory.open(indexDir),
> >>>                                new SimpleAnalyzer(),
> >>>                                true,
> >>>                                IndexWriter.MaxFieldLength.LIMITED);
> >>>                indexWriter.setUseCompoundFile(false);
> >>>
> >>>                indexDirectory(indexWriter, dataDir, suffix);
> >>>
> >>>                int numIndexed = indexWriter.maxDoc();
> >>>                indexWriter.optimize();
> >>>                indexWriter.close();
> >>>
> >>>                return numIndexed;
> >>>
> >>>        }
> >>>
> >>>        private void indexDirectory(IndexWriter indexWriter, File
> >>>        dataDir,
> >>> String suffix) throws IOException {
> >>>                File[] files = dataDir.listFiles();
> >>>                for (int i = 0; i < files.length; i++) {
> >>>                        File f = files[i];
> >>>                        if (f.isDirectory()) {
> >>>                                indexDirectory(indexWriter, f, suffix);
> >>>                        }
> >>>                        else {
> >>>                                indexFileWithIndexWriter(indexWriter,
> f,
> >>> suffix);
> >>>                        }
> >>>                }
> >>>        }
> >>>
> >>>        private void indexFileWithIndexWriter(IndexWriter indexWriter,
> >>>        File
> >>> f, String suffix) throws IOException {
> >>>                if (f.isHidden() || f.isDirectory() || !f.canRead() ||
> >>> !f.exists()) {
> >>>                        return;
> >>>                }
> >>>                if (suffix!=null && !f.getName().endsWith(suffix))
{
> >>>                        return;
> >>>                }
> >>>                System.out.println("Indexing file " +
> >>> f.getCanonicalPath());
> >>>
> >>>                Document doc = new Document();
> >>>                doc.add(new Field("contents", new FileReader(f)));
> >>>                doc.add(new Field("filename", f.getCanonicalPath(),
> >>> Field.Store.YES,
> >>> Field.Index.ANALYZED));
> >>>
> >>>                indexWriter.addDocument(doc);
> >>>        }
> >>>
> >>> }
> >>>
> >>> so what's the best way for me to integrate RAMDirectory in that source
> >>> code before putting them in FSDirectory. any help would be appreciated
> >>> though.
> >>> thanks
> >>>
> >>>
> >>> --
> >>> http://jacobian.web.id
> >>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message