lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Apostolis Xekoukoulotakis <xekou...@gmail.com>
Subject tries and spatial search
Date Sat, 22 Dec 2012 15:56:16 GMT
I just found out about the blocktree implementation and how it is used to
increase the speed of prefix search.

Have you tried to use it for spatial search?
I will explain to you how i think it will work. Please correct me if I am
wrong.

A radix tree is a structure that you can use to efficiently search terms
with a specific prefix.
Spatial searches are prefix searches. Lets see why.

Lets say we have the surface of the earth and we use as coordinate system
the Geographic coordinate system<
http://en.wikipedia.org/wiki/Geographic_coordinate_system>
 .

We then create a grid of 4 square surfaces and create an id for each one.
...
We split each square surface into 4 square surfaces with an id.

When we want to point to a location, its uid would be of this form:
idlv1idlv2idlv3...

Each id can be encoded with 4 bits.
The surface of the earth (510million kms) would then require 29 bits to
have the precision of 1m^2.

How do we encode arbitrary closed surfaces?
Here is what measure theory does: check page 7. theorem 1.4 figure
3<http://press.princeton.edu/chapters/s8008.pdf>
This theorem says that any surface can be described by a countable amout of
squares.
Lucene can only handle finite number of terms.
We thus  put an upper limit on the number of squares that represent our
surface. The ids of those squares are the terms that need to be searched.


In conclusion, searching documents inside an arbitrary surface means
creating the trie of that surface and then looking through the inverted
index trie for those terms.

The scoring that we will implement will be the surface area of the common
terms,(ie the surface area of the intersection of those surfaces)



*So 2 questions, *
*1)Will this work in general?*
*2) I want to avoid writing new code.*
*
*
* Is the blocktree implementation much different than a radix tree that it
wouldnt not make it work?*
*Does blocktree accept arbitrary byte or bit prefixes?*
*Which classes would I use if all the above answers are positive?*



-- 


Sincerely yours,

     Apostolis Xekoukoulotakis

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message