lucene-java-user mailing list archives

From "Eric Jain" <Eric.J...@isb-sib.ch>
Subject Re: taxonomy with lucene
Date Tue, 23 Sep 2003 08:01:36 GMT
> Has anyone tried building taxonomies in Lucene? Any idea what is the
> likely approach to be taken?

I'm storing data with a hierarchical classification in a Lucene index,
if that is what you mean.

The approach is very simple. Every document has a field for a unique
identifier, a field for the identifier of its immediate parent, and a
field for those of all ancestors. This allows you to write queries such
as "name:human ancestor:2759" to find organisms that have "human" in
their name, and are Eukaryotes (but not, say, viruses).
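
To make the layout concrete, here is a minimal sketch in plain Java. It
does not use the Lucene API; it only models the three fields and treats
both clauses of the example query as required. The class names, the
sample documents, and every identifier other than 2759 are assumptions
made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Plain-Java stand-in for an indexed document (hypothetical, not Lucene).
class TaxonDoc {
    final String id;             // unique identifier
    final String parent;         // identifier of the immediate parent
    final Set<String> ancestors; // identifiers of all ancestors
    final String name;

    TaxonDoc(String id, String parent, List<String> ancestors, String name) {
        this.id = id;
        this.parent = parent;
        this.ancestors = new HashSet<>(ancestors);
        this.name = name;
    }
}

class AncestorQuerySketch {
    // Roughly "name:<term> ancestor:<ancestorId>" with both clauses required.
    static List<String> search(List<TaxonDoc> docs, String term, String ancestorId) {
        List<String> hits = new ArrayList<>();
        for (TaxonDoc d : docs) {
            // Crude tokenization: lower-case, split on anything that is not a letter.
            List<String> tokens =
                Arrays.asList(d.name.toLowerCase(Locale.ROOT).split("[^a-z]+"));
            if (tokens.contains(term) && d.ancestors.contains(ancestorId)) {
                hits.add(d.id);
            }
        }
        return hits;
    }

    // Illustrative data only; the ancestor lists are heavily abbreviated.
    static List<TaxonDoc> sampleDocs() {
        return Arrays.asList(
            new TaxonDoc("9606", "9605", Arrays.asList("2759", "9605"), "Human"),
            new TaxonDoc("v1", "10239", Arrays.asList("10239"), "Human adenovirus"),
            new TaxonDoc("10090", "10088", Arrays.asList("2759", "10088"), "House mouse"));
    }

    public static void main(String[] args) {
        // Only the eukaryote whose name contains "human" is returned.
        System.out.println(search(sampleDocs(), "human", "2759"));
    }
}
```

The point of the ancestor field is that the "is a Eukaryote" test becomes
a single term lookup instead of a walk up the tree at query time.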

This approach also allows you to efficiently display search results in a
tree, even for very large result sets, as long as the hierarchy doesn't
get too flat.
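
The message does not say how the tree is built, so the following is only
one plausible reading, sketched in plain Java rather than with the Lucene
API: expand one node at a time by looking up its children through the
parent field, then count the hits under each child through the ancestor
field. All names and the sample hierarchy are assumptions.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for an indexed document (not Lucene).
class TreeNodeDoc {
    final String id;
    final String parent;         // null for the root
    final Set<String> ancestors;

    TreeNodeDoc(String id, String parent, String... ancestors) {
        this.id = id;
        this.parent = parent;
        this.ancestors = new HashSet<>(Arrays.asList(ancestors));
    }
}

class ResultTreeSketch {
    // For one node, return each child that is a hit or contains hits,
    // mapped to the number of hits in that child's subtree.
    static Map<String, Integer> expand(List<TreeNodeDoc> index,
                                       Set<String> hitIds, String nodeId) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (TreeNodeDoc child : index) {
            if (!nodeId.equals(child.parent)) continue; // like "parent:<nodeId>"
            int count = 0;
            for (TreeNodeDoc d : index) {
                if (!hitIds.contains(d.id)) continue;
                // like "id:<child> OR ancestor:<child>" restricted to the hits
                if (d.id.equals(child.id) || d.ancestors.contains(child.id)) count++;
            }
            if (count > 0) counts.put(child.id, count);
        }
        return counts;
    }

    // Tiny made-up hierarchy: 1 -> {2 -> {4, 5}, 3 -> {6}}
    static List<TreeNodeDoc> sampleIndex() {
        return Arrays.asList(
            new TreeNodeDoc("1", null),
            new TreeNodeDoc("2", "1", "1"),
            new TreeNodeDoc("3", "1", "1"),
            new TreeNodeDoc("4", "2", "1", "2"),
            new TreeNodeDoc("5", "2", "1", "2"),
            new TreeNodeDoc("6", "3", "1", "3"));
    }

    public static void main(String[] args) {
        Set<String> hits = new HashSet<>(Arrays.asList("3", "4", "5"));
        System.out.println(expand(sampleIndex(), hits, "1"));
        System.out.println(expand(sampleIndex(), hits, "2"));
    }
}
```

Under this reading, the work per expanded node grows with the number of
its children, which would explain why a very flat hierarchy is a problem.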

One drawback of this approach is that incremental updates are not
possible, or at least very complicated: the ancestor field duplicates
information, so a change to one node affects every document below it.
You must also be careful about the order in which you add documents to
the index (parent before child).
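
A rough plain-Java sketch of both constraints follows. It assumes, which
the message does not state, that a child's ancestor list is derived from
its parent's already stored list; that would be one reason the parent has
to come first. The class and method names are hypothetical and nothing
here is Lucene API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class OrderedIndexSketch {
    // id -> ancestor identifiers, root first (stands in for the ancestor field)
    private final Map<String, List<String>> ancestorsById = new LinkedHashMap<>();

    // Add a document; its ancestors are the parent's ancestors plus the parent.
    void add(String id, String parentId) {
        List<String> ancestors = new ArrayList<>();
        if (parentId != null) {
            List<String> parentAncestors = ancestorsById.get(parentId);
            if (parentAncestors == null) {
                throw new IllegalStateException(
                    "parent " + parentId + " must be added before " + id);
            }
            ancestors.addAll(parentAncestors);
            ancestors.add(parentId);
        }
        ancestorsById.put(id, ancestors);
    }

    List<String> ancestorsOf(String id) {
        return ancestorsById.get(id);
    }

    // Documents whose ancestor field mentions nodeId; all of them would
    // need to be rewritten if nodeId were moved or removed.
    List<String> affectedByMove(String nodeId) {
        List<String> affected = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : ancestorsById.entrySet()) {
            if (e.getValue().contains(nodeId)) affected.add(e.getKey());
        }
        return affected;
    }

    public static void main(String[] args) {
        OrderedIndexSketch index = new OrderedIndexSketch();
        index.add("1", null);
        index.add("2", "1");
        index.add("3", "2");
        System.out.println(index.ancestorsOf("3"));
        System.out.println(index.affectedByMove("2"));
    }
}
```

Moving a node high in the hierarchy touches its whole subtree, which is
the duplicated information that makes incremental updates awkward.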

Let me know if you are interested in any further details.

--
Eric Jain

