lucene-java-user mailing list archives

From Antony Bowesman <>
Subject Re: Fields with the same name?? - Was Re: Payloads and tokenizers
Date Mon, 18 Aug 2008 23:15:39 GMT
Doron Cohen wrote:
> The API definitely doesn't promise this.
> AFAIK implementation wise it happens to be like this but I can be wrong and
> plus it might change in the future. It would make me nervous to rely on
> this.

I ran some tests and it 'seems' to work, but I agree: it also makes me nervous
to rely on empirical evidence for the design rather than a clearly documented API!

> Anyhow, for your need I can think of two options:
> Option 1:  just index the ownerID, do not store it, do not index or store
> accessID (unless you wish to search by it, in this case just index it). In
> addition store a dedicated mapping field that maps from ownerID to accessID.
> E.g. with serialization of HashMap or something thinner. At runtime retrieve
> this map from the document and it has all that information.

Hey that's an interesting idea!  I'd not considered storing the mapping, only 
re-creating it from fields at runtime.  I'll explore this.
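
For reference, here is a minimal sketch of what the "something thinner" than a serialized HashMap might look like. It uses only the JDK and does not touch Lucene. It assumes both ownerID and accessID are short strings, and the class name OwnerAccessCodec and its methods are hypothetical names made up for illustration. The resulting byte array is what would go into the stored mapping field.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: packs an ownerID -> accessID map into a compact byte array. */
final class OwnerAccessCodec {

    /** Writes an entry count followed by (ownerID, accessID) string pairs. */
    static byte[] encode(Map<String, String> ownerToAccess) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(ownerToAccess.size());
            for (Map.Entry<String, String> e : ownerToAccess.entrySet()) {
                out.writeUTF(e.getKey());   // ownerID (assumed to be a short string)
                out.writeUTF(e.getValue()); // accessID (assumed to be a short string)
            }
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            // Not expected for in-memory streams with short strings.
            throw new IllegalStateException(e);
        }
    }

    /** Reads back the map written by encode(), preserving entry order. */
    static Map<String, String> decode(byte[] data) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
            int count = in.readInt();
            Map<String, String> result = new LinkedHashMap<String, String>();
            for (int i = 0; i < count; i++) {
                String ownerId = in.readUTF();
                String accessId = in.readUTF();
                result.put(ownerId, accessId);
            }
            return result;
        } catch (IOException e) {
            throw new IllegalArgumentException("Malformed mapping bytes", e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> mapping = new LinkedHashMap<String, String>();
        mapping.put("owner-1", "access-A");
        mapping.put("owner-2", "access-B");
        byte[] packed = encode(mapping);
        System.out.println(decode(packed).equals(mapping));
    }
}
```

Compared with default Java serialization of a HashMap, this avoids the class metadata overhead, which may matter when the field is stored on every document.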

> Option 2: as you describe above, just index the ownerID with accessID as
> payload, and then for the hitting docid of interest use termPositions to get
> the payload, i.e. something like:
>     TermPositions tp = reader.termPositions(new Term("ownerID", oid));
>     tp.skipTo(docid);
>     tp.nextPosition();
>     if (tp.isPayloadAvailable()) {
>       byte [] accessIDBytes = tp.getPayload(...);
>       ...

Yes, I was playing with this technique yesterday.  It's not easy to determine 
the performance implications of this method.  I will be using caches, but my 
volumes are potentially so large that I may never be able to cache everything 
(perhaps 500M Docs), so this has to be very quick.
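
As a rough illustration of the payload side of Option 2, the sketch below shows only the byte encoding, not the Lucene calls. It assumes the accessID fits in a 32-bit int, which may not hold for real data, and the class name AccessIdPayload is hypothetical. A fixed 4-byte payload keeps the per-position cost small and makes decoding a handful of shifts.

```java
/** Hypothetical helper: fixed-width big-endian encoding of an int accessID for use as a payload. */
final class AccessIdPayload {

    /** Encodes the accessID as 4 big-endian bytes. */
    static byte[] toBytes(int accessId) {
        return new byte[] {
            (byte) (accessId >>> 24),
            (byte) (accessId >>> 16),
            (byte) (accessId >>> 8),
            (byte) accessId
        };
    }

    /** Decodes 4 big-endian bytes starting at offset back into the accessID. */
    static int fromBytes(byte[] data, int offset) {
        return ((data[offset] & 0xFF) << 24)
             | ((data[offset + 1] & 0xFF) << 16)
             | ((data[offset + 2] & 0xFF) << 8)
             | (data[offset + 3] & 0xFF);
    }

    public static void main(String[] args) {
        int accessId = 123456789;
        byte[] payload = toBytes(accessId);
        System.out.println(fromBytes(payload, 0) == accessId);
    }
}
```

The offset parameter is there because the bytes read back at search time may sit inside a larger reusable buffer rather than in an array of exactly four bytes.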

I'll play with both approaches and see which works best.

Thanks for your time, Doron. I appreciate your valuable insight.
