lucene-java-user mailing list archives

From "João Rodrigues" <anar...@gmail.com>
Subject Re: Removing duplicate entries
Date Wed, 30 Apr 2008 16:03:26 GMT
>Probably something very like that, although you see none of that. Just
>doing a deleteDocument(term) does it all for you. And I learned long ago
>that the folks who write this kind of stuff can probably do it more
>efficiently
>than I can <G>.

And probably more efficiently than I can as well :) Thanks for the tip.
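
For anyone finding this in the archive, here is a minimal sketch of the delete-by-term idea from the quote above. It is a toy model in plain Java, not Lucene code: the class `DedupSketch` and its methods are hypothetical names made up for illustration, and a list of key/body pairs stands in for the index. In Lucene itself the corresponding work would be done by the delete-by-term call on the index, and none of the bookkeeping below would be visible to you.

```java
import java.util.ArrayList;
import java.util.List;

// Toy stand-in for an index (hypothetical, NOT the Lucene API):
// each "document" is a {key, body} pair, and the key plays the role of the term.
final class DedupSketch {
    private final List<String[]> docs = new ArrayList<>();

    // Plain add: allows duplicates, which is how they creep in.
    void add(String key, String body) {
        docs.add(new String[] {key, body});
    }

    // Remove every document whose key matches; returns how many were removed.
    int deleteByKey(String key) {
        int before = docs.size();
        docs.removeIf(d -> d[0].equals(key));
        return before - docs.size();
    }

    // Delete-then-add, so at most one document per key survives.
    void update(String key, String body) {
        deleteByKey(key);
        add(key, body);
    }

    int size() {
        return docs.size();
    }

    public static void main(String[] args) {
        DedupSketch index = new DedupSketch();
        index.add("doc-1", "first copy");
        index.add("doc-1", "second copy");   // duplicate key
        index.update("doc-1", "fresh copy"); // removes both, adds one
        System.out.println(index.size());
    }
}
```

The point of the sketch is only the ordering: delete everything matching the key first, then add the new copy, so re-indexing the same item never leaves a duplicate behind.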

>I ask because a lot of people mistakenly suppose that RAMdirs are faster
>because they have RAM in front <G>. The indexing process uses RAM
>implicitly and periodically flushes to FS. You can control this by various
>parameters on IndexWriter like setMaxMergeDocs, setMergeFactor etc. and I
>personally prefer letting that do the work and avoiding having to code
>merging the two.

>But your point about failure is valid, although you might want to search
>the mail archive for discussions on this as there's been work done to make
>this less of a concern....

Yep, I'm (was, thanks to you :P) one of those people. I'd already read
that, but since I had the code written, I was thinking of letting it go :)
But then again, since I'm getting a major memory leak, I believe I'll try
that option!
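
To picture what "uses RAM implicitly and periodically flushes to FS" means, here is a rough toy model in plain Java. It is an assumption-laden sketch, not Lucene's implementation: `BufferedWriterSketch` and its `maxBufferedDocs` threshold are hypothetical names, the "disk" is just another in-memory list, and merging (what setMergeFactor and setMaxMergeDocs influence) is not modelled at all.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of "buffer in RAM, flush when full" (hypothetical, NOT the Lucene API).
final class BufferedWriterSketch {
    private final int maxBufferedDocs;                       // assumed flush threshold
    private final List<String> ramBuffer = new ArrayList<>();
    private final List<List<String>> flushedSegments = new ArrayList<>();

    BufferedWriterSketch(int maxBufferedDocs) {
        if (maxBufferedDocs < 1) {
            throw new IllegalArgumentException("maxBufferedDocs must be >= 1");
        }
        this.maxBufferedDocs = maxBufferedDocs;
    }

    // Documents accumulate in memory; the writer flushes on its own when full.
    void addDocument(String doc) {
        ramBuffer.add(doc);
        if (ramBuffer.size() >= maxBufferedDocs) {
            flush();
        }
    }

    // Move the buffered documents into a new "on-disk" segment.
    void flush() {
        if (ramBuffer.isEmpty()) {
            return;
        }
        flushedSegments.add(new ArrayList<>(ramBuffer));
        ramBuffer.clear();
    }

    int bufferedCount() {
        return ramBuffer.size();
    }

    int segmentCount() {
        return flushedSegments.size();
    }

    public static void main(String[] args) {
        BufferedWriterSketch writer = new BufferedWriterSketch(3);
        for (int i = 0; i < 7; i++) {
            writer.addDocument("doc-" + i);
        }
        writer.flush(); // final flush, as you would do before closing
        System.out.println(writer.segmentCount() + " segments, "
                + writer.bufferedCount() + " buffered");
    }
}
```

Seen this way, a hand-rolled RAM index plus a manual merge into the on-disk one mostly duplicates what the writer already does, which is why I'm dropping my own version.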

Thanks for the help!
Best Regards,

-- 
João Rodrigues