lucene-java-user mailing list archives

From Mathias Dahl <mathias.d...@gmail.com>
Subject Lucene vs Glimpse
Date Mon, 04 Feb 2013 18:01:32 GMT
Hi,

I have hacked together a small web front end to the Glimpse text
indexing engine (see http://webglimpse.net/ for information). I am
very happy with how Glimpse indexes and searches data. If I understand
it correctly, it combines an index with searching directly in the files
themselves, the way grep and similar tools do. The problem is that I
discovered it is not open source, and now that I want to extend its use
from private to company-wide, I will run into license problems/costs.

So, I decided to try out Lucene. I tried the examples and changed them
a bit to use another analyzer. But when I started to think about it I
realized that I will not be able to build something like Glimpse. At
least not easily.

Why? I will try to explain:

As stated above, Glimpse uses a combination of index and in-file
search. This makes it very powerful in the sense that I can get hits
for things that are not necessarily indexed as terms. Let's say
I have a file with this content:

...
import foo.bar.baz;
...

With Glimpse, and without telling it how to index the content, I can
find the above file using a search string like "foo" or "bar" but
also, and this is important, using "foo.bar.baz".

Another example:

We have a lot of PL/SQL source code, and often you can find code like this:

...
My_Nice_API.Some_Method
...

Here too, Glimpse is almost magic since it combines index and normal
search. I can find the file above using "My_Nice_API" or
"My_Nice_API.Some_Method".

In a sense, I can have my cake and eat it too.
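
To illustrate what I mean by combining an index with a normal search,
here is a rough sketch in plain Java. It is only my guess at the idea,
not how Glimpse is actually implemented. The class TwoStageSearch and
its methods are names I made up, and the file contents are kept in
memory only to keep the example short.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only: a coarse word index picks candidate
// files, then a grep-like substring scan confirms the literal query.
final class TwoStageSearch {
    // word -> names of the files that contain that word
    private final Map<String, Set<String>> index = new HashMap<>();
    // file name -> content (a real tool would re-read the file from disk)
    private final Map<String, String> files = new LinkedHashMap<>();

    // Split on anything that is not a letter, digit or underscore.
    private static String[] words(String text) {
        return text.toLowerCase(Locale.ROOT).split("[^a-z0-9_]+");
    }

    void add(String name, String content) {
        files.put(name, content);
        for (String word : words(content)) {
            if (!word.isEmpty()) {
                index.computeIfAbsent(word, k -> new TreeSet<>()).add(name);
            }
        }
    }

    List<String> search(String query) {
        // Stage 1: intersect the index entries of every word in the query.
        Set<String> candidates = null;
        for (String word : words(query)) {
            if (word.isEmpty()) continue;
            Set<String> hits = index.getOrDefault(word, Collections.emptySet());
            if (candidates == null) {
                candidates = new TreeSet<>(hits);
            } else {
                candidates.retainAll(hits);
            }
        }
        List<String> result = new ArrayList<>();
        if (candidates == null) return result;

        // Stage 2: scan only the candidates for the query as written,
        // punctuation included (case-insensitive).
        String needle = query.toLowerCase(Locale.ROOT);
        for (String name : candidates) {
            if (files.get(name).toLowerCase(Locale.ROOT).contains(needle)) {
                result.add(name);
            }
        }
        return result;
    }
}
```

With this, a file containing "import foo.bar.baz;" is found by "foo"
and by "foo.bar.baz", while a file containing "import foo.baz.bar;"
passes the index stage for "foo.bar.baz" but is rejected by the scan.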

If I want to do similar "free" search stuff with Lucene, I think I have
to create analyzers for the different kinds of source code files, with
fields for this and that. Quite an undertaking.
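
For comparison, the kind of tokenizing I imagine such an analyzer would
need looks roughly like the sketch below. It is plain Java and does not
use the Lucene API. CompoundTokens is a made-up name, and the rule
(emit the whole dotted identifier plus each of its parts) is just my
assumption about what would make both kinds of query match.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only, not a Lucene Analyzer.
final class CompoundTokens {
    // Word characters (letters, digits, underscore) joined by single dots,
    // e.g. foo.bar.baz or My_Nice_API.Some_Method.
    private static final Pattern CHAIN = Pattern.compile("\\w+(?:\\.\\w+)*");

    static List<String> tokens(String text) {
        List<String> out = new ArrayList<>();
        Matcher m = CHAIN.matcher(text);
        while (m.find()) {
            String chain = m.group().toLowerCase(Locale.ROOT);
            out.add(chain); // the identifier as written
            if (chain.indexOf('.') >= 0) {
                // also emit each dot-separated part as its own term
                out.addAll(Arrays.asList(chain.split("\\.")));
            }
        }
        return out;
    }
}
```

Note that this emits "foo.bar.baz" and its single parts, but not
partial chains such as "foo.bar", so it is still less "free" than a
direct search in the file.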

Does anyone understand my point here, and am I correct that it would
be hard to implement something as "free" as with Glimpse? I am not
trying to criticize, just to understand how Lucene (and Glimpse) work.

Oh, yes, Glimpse has one big drawback: it only supports search strings
up to 32 characters.

Thanks!

/Mathias
