lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ian Lea <ian....@gmail.com>
Subject Re: a faster way to addDocument and get the ID just added?
Date Thu, 31 Mar 2011 09:32:02 GMT
>> Subject: a faster way to addDocument and get the ID just added?

Might it be possible to come up with a version of
IndexWriter.addDocument() that returns the docid rather than void?
Answering that question is way out of my league, but it would
presumably be quick.


--
Ian.


On Thu, Mar 31, 2011 at 6:34 AM, Trejkaz <trejkaz@trypticon.org> wrote:
> On Wed, Mar 30, 2011 at 8:21 PM, Simon Willnauer
> <simon.willnauer@googlemail.com> wrote:
>> Before trunk (and I think
>> its in 3.1 also) merge only merged continuous segments so the actual
>> per-segment ID might change but the global document ID doesn't if you
>> only add documents. But this should not be considered a feature. In
>> upcoming version this does not work anymore since merges can now be
>> non-continuous.
>
> This myth was busted some time ago:
> https://issues.apache.org/jira/browse/LUCENE-2506?#comment-12935973
>
> Summary: selecting segments to merge is decided by MergePolicy, and a
> MergePolicy which does not upset ordering will be remain in existence.
>
>> Anyway, I strongly discourage to rely on lucene document IDs you
>> should not do this at all. Can't you use your own ID mechanism?
>
> This has pretty much already been covered in my reply to the previous
> person that suggested that solution, not to mention in the initial
> email which started the thread.
>
> Summary: the overheads are simply not acceptable.
>
> So far the only remotely helpful suggestion I have heard anywhere is
> to keep two gigantic int[] arrays in memory, mapping the IDs in each
> direction.  This would work if we had an infinite amount of memory to
> play with, but unfortunately we don't.  1 billion item indexes are
> expected to work, and we can't just tell everyone to buy 8 GB more RAM
> when we update to the next version of our app.  If we were a
> server-side app, *maybe* we could...
>
> TX
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message