lucene-java-user mailing list archives

From Mathias Dahl <>
Subject Lucene vs Glimpse
Date Mon, 04 Feb 2013 18:01:32 GMT

I have hacked together a small web front end to the Glimpse text
indexing engine (see for information). I am
very happy with how Glimpse indexes and searches data. If I understand
it correctly, it combines an index with searching directly in the
files themselves, the way grep and similar tools do. The problem is
that I discovered it is not open source, and now that I want to extend
its use from private to company-wide I will run into license problems
and costs.

So, I decided to try out Lucene. I tried the examples and changed them
a bit to use another analyzer. But when I started to think about it I
realized that I will not be able to build something like Glimpse. At
least not easily.

Why? I will try to explain:

As stated above, Glimpse uses a combination of index and in-file
search. This makes it very powerful in the sense that I can get hits
for things that are not necessarily indexed as terms. Let's say
I have a file with this content:


With Glimpse, and without telling it how to index the content, I can
find the above file using a search string like "foo" or "bar" but
also, and this is important, using

Another example:

We have a lot of PL/SQL source code, and often you can find code like this:


Here too, Glimpse is almost magic since it combines index and normal
search. I can find the file above using "My_Nice_API" or

In a sense I can have the cake and eat it too.
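
To make the "index plus normal search" idea concrete, here is a minimal sketch in Java of a two-stage search. It is only my reading of the approach described above, not how Glimpse is actually implemented. The class name TwoStageSearch, the in-memory map standing in for files on disk, and the file names and contents are all made up for illustration. Stage 1 uses a coarse word index to pick candidate files; stage 2 scans only those candidates for the literal search string, punctuation included.

```java
import java.util.*;
import java.util.regex.*;

// A toy two-stage search: a coarse word index narrows the candidates,
// then a grep-like scan confirms the literal query in each candidate.
// Hypothetical sketch; file contents are kept in memory for brevity.
class TwoStageSearch {
    private static final Pattern WORD = Pattern.compile("[A-Za-z0-9]+");

    // word (lower-cased) -> names of the files that contain it
    private final Map<String, Set<String>> index = new HashMap<>();
    // file name -> content (stands in for reading the file from disk)
    private final Map<String, String> files = new HashMap<>();

    void add(String name, String content) {
        files.put(name, content);
        Matcher m = WORD.matcher(content);
        while (m.find()) {
            index.computeIfAbsent(m.group().toLowerCase(), k -> new TreeSet<>()).add(name);
        }
    }

    List<String> search(String query) {
        // Stage 1: intersect the file sets of every word in the query.
        Set<String> candidates = null;
        Matcher m = WORD.matcher(query);
        while (m.find()) {
            Set<String> hits = index.getOrDefault(m.group().toLowerCase(), Collections.emptySet());
            if (candidates == null) {
                candidates = new TreeSet<>(hits);
            } else {
                candidates.retainAll(hits);
            }
        }
        if (candidates == null) {
            // The query contains no indexable word: fall back to scanning everything.
            candidates = new TreeSet<>(files.keySet());
        }

        // Stage 2: scan the candidates for the literal query string.
        String needle = query.toLowerCase();
        List<String> result = new ArrayList<>();
        for (String name : candidates) {
            if (files.get(name).toLowerCase().contains(needle)) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        TwoStageSearch s = new TwoStageSearch();
        s.add("api.sql", "My_Nice_API.Do_Something(42);");   // made-up content
        s.add("notes.txt", "The nice API can do a lot.");     // made-up content
        System.out.println(s.search("_API.Do"));
    }
}
```

The index never stores "_" or ".", yet a query such as "_API.Do" can still match, because the punctuation is only checked in stage 2 against the file text. One limitation of this sketch: stage 1 looks up whole words, so a query that starts or ends in the middle of a word would find no candidates.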

If I want to do similar "free" search stuff with Lucene I think I have
to create analyzers for the different kinds of source code files, with
fields for this and that. Quite an undertaking.

Does anyone understand my point here, and am I correct that it would
be hard to implement something as "free" as Glimpse? I am not
trying to criticize, just to understand how Lucene (and Glimpse) work.

Oh, yes, Glimpse has one big drawback: it only supports search strings
up to 32 characters.


