lucene-dev mailing list archives

From Mark Miller
Subject Re: New Token API was Re: Payloads and TrieRangeQuery
Date Mon, 15 Jun 2009 23:04:43 GMT
Grant Ingersoll wrote:
> On Jun 14, 2009, at 8:05 PM, Michael Busch wrote:
>> I'd be happy to discuss other API proposals that anybody brings up 
>> here, that have the same advantages and are more intuitive. We could 
>> also beef up the documentation and give a better example about how to 
>> convert a stream/filter from the old to the new API; a constructive 
>> suggestion that Uwe made at the ApacheCon.
> More questions:
> 1. What about Highlighter and MoreLikeThis?  They have not been 
> converted.  Also, what are they going to do if the attributes they 
> need are not available?  Caveat emptor?
> 2. Same for TermVectors.  What if the user specifies with positions 
> and offsets, but the analyzer doesn't produce them?  Caveat emptor? 
> (BTW, this is also true for the new omit TF stuff)
> 3. Also, what about the case where one might have attributes that are 
> meant for downstream TokenFilters, but not necessarily for indexing? 
>  Offsets and type come to mind.  Is it the case now that those 
> attributes are not automatically added to the index?   If they are 
> ignored now, what if I want to add them?  I admit, I'm having a hard 
> time finding the code that specifically loops over the Attributes.  I 
> recall seeing it, but can no longer find it.
> Also, can we add something like an AttributeTermQuery?  Seems like it 
> could work similar to the BoostingTermQuery.
> I'm sure more will come to me.
> -Grant
If you are using a CachingTokenFilter and you pass it to something that 
hasn't been upgraded to the new API (say, 
MemoryIndex#addField(String fieldName, TokenStream stream, float boost)), 
and you are trying to use the new API yourself, you will get an exception 
when you read the tokens from the CachingTokenFilter a second time. The 
first pass went through the old API, so it is the old API's tokens that 
got cached rather than the new API's, and when you then try to use the 
new API, kak :( .
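
To make the failure mode concrete, here is a toy sketch in plain Java. 
None of these classes are Lucene's: ToyCachingFilter, OldToken, Snapshot 
and legacyConsumer are made-up stand-ins, and the exception Lucene 
actually throws may well be a different one. The only point is that the 
cache ends up in the format of whichever API read the stream first:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Toy model only: every class here is a hypothetical stand-in, not Lucene code.
public class CachingMismatchSketch {

    // What the old API hands out: one object per token.
    static final class OldToken {
        final String term;
        OldToken(String term) { this.term = term; }
    }

    // What the new API would cache: a snapshot of the attribute state.
    static final class Snapshot {
        final String term;
        Snapshot(String term) { this.term = term; }
    }

    // Caches whatever the first consumer pulled, in that consumer's format.
    static final class ToyCachingFilter {
        private final List<String> source;
        private List<Object> cache;        // null until the first pass
        private Iterator<Object> replay;
        String termAttribute;              // filled by incrementToken()

        ToyCachingFilter(List<String> source) { this.source = source; }

        // Old API: the first call fills the cache with OldToken objects.
        OldToken next() {
            if (cache == null) {
                cache = new ArrayList<Object>();
                for (String s : source) cache.add(new OldToken(s));
                replay = cache.iterator();
            }
            return replay.hasNext() ? (OldToken) replay.next() : null;
        }

        // New API: the first call fills the cache with Snapshot objects.
        boolean incrementToken() {
            if (cache == null) {
                cache = new ArrayList<Object>();
                for (String s : source) cache.add(new Snapshot(s));
                replay = cache.iterator();
            }
            if (!replay.hasNext()) return false;
            // Throws ClassCastException if the old API filled the cache.
            termAttribute = ((Snapshot) replay.next()).term;
            return true;
        }

        // Rewind so the cached tokens can be read again.
        void reset() {
            if (cache != null) replay = cache.iterator();
        }
    }

    // Stand-in for a consumer that has not been upgraded (old API only).
    static int legacyConsumer(ToyCachingFilter stream) {
        int count = 0;
        while (stream.next() != null) count++;
        return count;
    }

    // Runs the scenario; returns true if the second read blew up.
    static boolean secondReadFails() {
        ToyCachingFilter filter =
            new ToyCachingFilter(Arrays.asList("quick", "brown", "fox"));
        legacyConsumer(filter);   // first pass, old API, fills the cache
        filter.reset();
        try {
            while (filter.incrementToken()) { /* second pass, new API */ }
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("second read failed: " + secondReadFails());
    }
}
```

In this toy, reading twice through the same API works fine; it is only 
the old-then-new mix that breaks, which is the situation you land in 
when the first reader is code you don't control.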

We can obviously fix anything internal, but not external.

- Mark