lucene-dev mailing list archives

From Shai Erera <>
Subject FSDirectory.copy() impl might be dangerous
Date Thu, 24 Jun 2010 21:07:49 GMT

Today I ran into a weird exception from FSDir.copy(), and while
investigating it, I spotted a potential bug as well. So, the bug first:

FileChannel.transferFrom documents that it may copy fewer bytes than
requested, but we don't check the return value. So we need to fix the code
to copy in a loop until all bytes have been copied. That's an easy fix.
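The fix amounts to looping on transferFrom's return value until the whole
file has been transferred. A minimal, self-contained sketch (the class and
helper names are mine, not Lucene's):

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ChunkedCopy {
  // Hypothetical helper: copy src into dst, looping because
  // FileChannel.transferFrom may copy fewer bytes than requested.
  static long copy(Path src, Path dst) throws IOException {
    try (FileChannel in = FileChannel.open(src, StandardOpenOption.READ);
         FileChannel out = FileChannel.open(dst,
             StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
      long size = in.size();
      long written = 0;
      while (written < size) {
        // accumulate the actual number of bytes transferred each round
        written += out.transferFrom(in, written, size - written);
      }
      return written;
    }
  }

  public static void main(String[] args) throws IOException {
    Path src = Files.createTempFile("copy-src", ".bin");
    Path dst = Files.createTempFile("copy-dst", ".bin");
    byte[] data = new byte[1 << 20]; // 1 MB of zeros
    Files.write(src, data);
    long copied = copy(src, dst);
    System.out.println(copied == Files.size(dst) && copied == data.length);
  }
}
```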

Now for the dangerous part - I wanted to measure segment merging
performance, so I created two indexes, one with 10K docs and one with 100K
docs, both optimized. I then use the IndexWriter.addIndexes(Directory...)
method to merge 100 copies of the first into a new directory, and 10 copies
of the second into another (to create two 1M-doc indexes with different
numbers of segments). I then call optimize().

Surprisingly, when calling addIndexes() w/ the 100K-docs segments, I ran
into this exception (Java 1.6 -- Java 1.5's exception was cryptic):

Exception in thread "main" Map failed
    at org.apache.lucene.index.IndexWriter.addIndexes(
Caused by: java.lang.OutOfMemoryError: Map failed
    at Method)
    ... 7 more

I ran this on my laptop w/ 4GB RAM, so it's entirely possible there are
memory issues here. BUT - the segment size is only 300 MB, which is still
much, much less than my machine's RAM.

What worries me is not that particular test, but what will happen if
someone tries to addIndexes() segments that are 10GB or 100GB or even
more ... then it really won't matter how much RAM you have. So let's take
RAM availability out of the picture.

This API is dangerous: if someone tries to merge not-so-large segments on a
machine w/ not so much RAM, they'll hit an exception - and that didn't
happen before, because we used byte[] copies (which are slower).

I changed FSDir.copy() code to copy in chunks of 64MB:

        long numWritten = 0;
        long numToWrite = input.size();
        long bufSize = 1 << 26; // 64 MB
        while (numWritten < numToWrite) {
          // transferFrom may copy fewer bytes than requested, so accumulate
          numWritten += output.transferFrom(input, numWritten,
              Math.min(bufSize, numToWrite - numWritten));
        }

And the process completed successfully.

Obviously, 64MB may be too high for other systems, so I'm thinking we should
make it configurable; but still - chunking, using the same API, succeeds. I
guess it's just a "not so friendly" impl in Java's FileChannelImpl, but I
don't know if we can work around it. Maybe we can perf-test and pick a
smaller chunk size that is safe for all cases (and yields the same
performance as larger ones) ...
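Making the chunk size a parameter could look like this - a sketch only, with
names of my own invention (this is not Lucene's actual API), using a small
file and a small chunk just to exercise the loop:

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ConfigurableChunkCopy {
  // Hypothetical variant: chunk size passed in rather than hard-coded at
  // 64 MB, so callers can tune it to what their system tolerates.
  static long copy(Path src, Path dst, long chunkSize) throws IOException {
    try (FileChannel in = FileChannel.open(src, StandardOpenOption.READ);
         FileChannel out = FileChannel.open(dst,
             StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
      long size = in.size();
      long written = 0;
      while (written < size) {
        // never ask for more than one chunk, or more than remains
        written += out.transferFrom(in, written,
            Math.min(chunkSize, size - written));
      }
      return written;
    }
  }

  public static void main(String[] args) throws IOException {
    Path src = Files.createTempFile("src", ".bin");
    Path dst = Files.createTempFile("dst", ".bin");
    Files.write(src, new byte[300_000]); // 300 KB, copied in 64 KB chunks
    System.out.println(copy(src, dst, 1 << 16));
  }
}
```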

BTW, I don't have FileChannelImpl's source, but Mike found it here. It
doesn't look like the impl chunks anything ...

What do you think?

