lucene-dev mailing list archives

From Wolfgang Hoschek <>
Subject Re: [Performance] Streaming main memory indexing of single strings
Date Wed, 20 Apr 2005 18:26:49 GMT
On Apr 20, 2005, at 9:22 AM, Erik Hatcher wrote:

> On Apr 20, 2005, at 12:11 PM, Wolfgang Hoschek wrote:
>> By the way, by now I have a version against 1.4.3 that is 10-100 
>> times faster (i.e. 30000 - 200000 index+query steps/sec) than the 
>> simplistic RAMDirectory approach, depending on the nature of the 
>> input data and query. From some preliminary testing it returns 
>> exactly what RAMDirectory returns.
> Awesome.  Using the basic StringIndexReader I sent?

Yep, it's loosely based on the empty skeleton you sent.

> I've been fiddling with it a bit more to get other query types.  I'll 
> add it to the contrib area when it's a bit more robust.

Perhaps we could merge up once I'm ready and put that into the contrib 
area? My version now supports tokenization with any analyzer and it 
supports any arbitrary Lucene query. I might make the API for adding 
terms a little more general, perhaps allowing arbitrary Document 
objects if that's what other folks really need...
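
As a rough illustration of what indexing a single string in main memory can look like, the sketch below maps each term to its token positions. It is not the implementation described above: the class name SingleStringIndex, its methods, and the regex tokenizer (a stand-in for a Lucene analyzer) are all hypothetical, and it answers only simple term lookups rather than arbitrary Lucene queries.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical sketch of a one-string, in-memory index; not the code from this thread. */
final class SingleStringIndex {
    // term -> token positions within the indexed string
    private final Map<String, List<Integer>> postings = new HashMap<>();

    SingleStringIndex(String text) {
        // Assumption: lowercase and split on runs of non-letter/non-digit characters.
        // A real analyzer would be pluggable; this is only a stand-in.
        String[] tokens = text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+");
        int position = 0;
        for (String token : tokens) {
            if (token.isEmpty()) {
                continue; // split() can yield a leading empty token
            }
            postings.computeIfAbsent(token, k -> new ArrayList<>()).add(position++);
        }
    }

    /** Number of times the term occurs in the indexed string. */
    int termFrequency(String term) {
        List<Integer> positions = postings.get(term.toLowerCase(Locale.ROOT));
        return positions == null ? 0 : positions.size();
    }

    /** True if every given term occurs at least once (a conjunctive term query). */
    boolean containsAll(String... terms) {
        for (String term : terms) {
            if (termFrequency(term) == 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        SingleStringIndex index =
                new SingleStringIndex("The quick brown fox jumps over the lazy dog");
        System.out.println(index.termFrequency("the"));
        System.out.println(index.containsAll("fox", "dog"));
    }
}
```

Because the index is built once per string and then discarded, a plain hash map avoids the segment and file bookkeeping that a general-purpose directory implementation has to do.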

>> As an aside, is there any work going on to potentially support prefix 
>> (and infix) wildcard queries à la "*fish"?
> WildcardQuery supports wildcard characters anywhere in the string.  
> QueryParser itself restricts expressions that have leading wildcards 
> from being accepted.

Any particular reason for this restriction? Is this simply a current 
parser limitation or something inherent?

> QueryParser supports wildcard characters in the middle of strings no 
> problem though.  Are you seeing otherwise?

I meant an infix query such as "*fish*".
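
To make the difference between "fish*", "*fish" and "*fish*" concrete, here is a minimal sketch of expanding a wildcard pattern against a sorted term dictionary. It is self-contained and does not use Lucene: the class WildcardSketch and its methods are hypothetical, and the TreeSet merely stands in for an index's sorted term list. The point it illustrates is that a literal prefix lets the scan start at that prefix and stop early, while a pattern with a leading wildcard has an empty prefix and so visits every term, which is one plausible reason a parser might reject such patterns.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

/** Hypothetical sketch of wildcard expansion over a sorted term dictionary. */
final class WildcardSketch {

    /** '*' matches any run of characters (including none), '?' matches exactly one. */
    static boolean matches(String pattern, String text) {
        int p = 0, t = 0, star = -1, mark = 0;
        while (t < text.length()) {
            if (p < pattern.length() && pattern.charAt(p) == '*') {
                star = p++;   // remember the star and try matching zero characters
                mark = t;
            } else if (p < pattern.length()
                    && (pattern.charAt(p) == '?' || pattern.charAt(p) == text.charAt(t))) {
                p++;
                t++;
            } else if (star >= 0) {
                p = star + 1; // backtrack: let the last star absorb one more character
                t = ++mark;
            } else {
                return false;
            }
        }
        while (p < pattern.length() && pattern.charAt(p) == '*') {
            p++;              // trailing stars match the empty remainder
        }
        return p == pattern.length();
    }

    /** Returns all dictionary terms matching the pattern, in sorted order. */
    static List<String> expand(String pattern, SortedSet<String> terms) {
        int firstWildcard = 0;
        while (firstWildcard < pattern.length()
                && pattern.charAt(firstWildcard) != '*'
                && pattern.charAt(firstWildcard) != '?') {
            firstWildcard++;
        }
        String prefix = pattern.substring(0, firstWildcard);
        List<String> hits = new ArrayList<>();
        // With an empty prefix (leading wildcard) this loop visits the whole dictionary.
        for (String term : terms.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break;        // sorted order: no later term can share the prefix
            }
            if (matches(pattern, term)) {
                hits.add(term);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        SortedSet<String> terms = new TreeSet<>(Arrays.asList(
                "catfish", "dog", "fish", "fisherman", "fishing",
                "goldfish", "selfishness", "shellfish"));
        System.out.println(expand("fish*", terms));
        System.out.println(expand("*fish", terms));
        System.out.println(expand("*fish*", terms));
    }
}
```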


Wolfgang Hoschek                  |   email:
Distributed Systems Department    |   phone: (415)-533-7610
Berkeley Laboratory               |
