lucene-java-user mailing list archives

From "Joe Attardi" <>
Subject More IP/MAC indexing questions
Date Wed, 01 Aug 2007 15:31:50 GMT
Hi again, everyone. First of all, I want to thank everyone for their
extremely helpful replies so far.
Also, I just started reading the book "Lucene in Action" last night. So far
it's an awesome book, so a big thanks to the authors.

Anyhow, on to my question. As I've mentioned in several of my previous
messages, I am indexing different pieces of information about servers - in
particular, my question is about indexing the IP address and MAC address.

Using the StandardAnalyzer, an IP is kept as a single token (""),
and a MAC is broken up into one token per octet ("00", "17", "fd", "14",
"d3", "2a"). Many searches will be for partial IPs or MACs ("192.168",
"00:17:fd", etc).
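
To make the two token layouts concrete, here is a rough plain-Java sketch of
how a partial search would have to work against each one. It does not use any
Lucene classes; the class name AddressTokens, its methods, and the sample IP
192.168.0.1 are made up for illustration. Only the MAC tokens and the partial
queries come from the description above.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Lucene: models the two token layouts.
class AddressTokens {

    // Single-token layout: the whole address is kept as one term.
    static List<String> singleToken(String address) {
        return Collections.singletonList(address.toLowerCase());
    }

    // Per-octet layout: split on '.' or ':' so each octet is its own term.
    static List<String> perOctet(String address) {
        return Arrays.asList(address.toLowerCase().split("[.:]"));
    }

    // Partial search on a single token: a string prefix test on the term.
    static boolean prefixMatch(List<String> tokens, String partial) {
        String p = partial.toLowerCase();
        for (String t : tokens) {
            if (t.startsWith(p)) {
                return true;
            }
        }
        return false;
    }

    // Partial search on per-octet tokens: the query octets must equal the
    // leading octets of the address, in order.
    static boolean leadingOctetsMatch(List<String> tokens, String partial) {
        List<String> q = perOctet(partial);
        if (q.size() > tokens.size()) {
            return false;
        }
        return tokens.subList(0, q.size()).equals(q);
    }

    public static void main(String[] args) {
        String ip = "192.168.0.1";          // assumed sample address
        String mac = "00:17:fd:14:d3:2a";   // MAC from the message

        System.out.println(singleToken(ip));
        System.out.println(perOctet(mac));
        System.out.println(prefixMatch(singleToken(ip), "192.168"));
        System.out.println(leadingOctetsMatch(perOctet(mac), "00:17:fd"));
    }
}
```

In this sketch the single-token case needs a prefix comparison against every
candidate term, while the per-octet case needs an ordered match over several
small terms. Which of the two is cheaper inside a real index is exactly what
the question below asks.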

Is either of these ways of indexing the addresses (single token vs
per-octet tokens) more efficient than the other when indexing large
numbers of them?

Joe Attardi
