lucene-general mailing list archives

From Ted Dunning <ted.dunn...@gmail.com>
Subject Re: Using Lucene to index OSM nodes (400M latitude/longitude points)
Date Wed, 24 Jun 2009 06:06:56 GMT
On Tue, Jun 23, 2009 at 8:52 PM, Kelly Jones <kelly.terry.jones@gmail.com> wrote:

> Can Lucene index the openstreetmap.org (OSM) node db (400M
> latitude/longitude pairs), and then find the 20 nodes closest to a
> given latitude/longitude?
>

Probably.  (isn't that definitive!)
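
As a rough illustration of the "20 closest nodes" question, here is a sketch in plain Python rather than Lucene. It assumes the index can answer a latitude range and a longitude range query; the hypothetical `box_candidates` function stands in for that with a linear scan. The function names, the starting box size, and the doubling strategy are all assumptions made for the example. The result is approximate: a node just outside the box can be slightly closer than one in a corner of it, and wrap-around at the 180 degree meridian is ignored.

```python
import heapq
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def box_candidates(nodes, lat, lon, half_deg):
    """Hypothetical stand-in for two numeric range queries against an index."""
    return [(nlat, nlon) for nlat, nlon in nodes
            if abs(nlat - lat) <= half_deg and abs(nlon - lon) <= half_deg]


def nearest_nodes(nodes, lat, lon, k=20, half_deg=0.01, max_half_deg=180.0):
    """Grow a bounding box until it holds at least k nodes, then rank by distance.

    Approximate: only nodes inside the final box are considered.
    """
    while True:
        hits = box_candidates(nodes, lat, lon, half_deg)
        if len(hits) >= k or half_deg >= max_half_deg:
            break
        half_deg *= 2  # too few candidates, so widen the box and retry
    return heapq.nsmallest(k, hits, key=lambda n: haversine_km(lat, lon, n[0], n[1]))
```

The point of the box step is that the expensive distance computation only runs on the few candidates that survive the cheap range filter, not on all 400M nodes.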

> % Can Lucene index numerical data and understand that 16 is close to
>  15, but far away from 160000?
>

As of the recent versions, yes.
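
To make the idea concrete, here is a minimal sketch in plain Python (not Lucene code) of one common way to make numbers range-searchable: map each float to an integer key whose ordering matches numeric ordering, keep the keys sorted, and answer a range query with two binary searches. The names `sortable_key`, `build_index`, and `range_search` are invented for this example, and the sorted list is only a stand-in for an ordered term dictionary.

```python
import bisect
import struct


def sortable_key(value):
    """Map a float to an unsigned 64-bit int whose order matches numeric order."""
    (bits,) = struct.unpack(">q", struct.pack(">d", value))
    if bits < 0:
        # Negative floats sort backwards as raw bits; flip the lower 63 bits.
        bits ^= 0x7FFFFFFFFFFFFFFF
    return bits + (1 << 63)  # shift the signed range into the unsigned range


def build_index(values):
    """Sorted (key, value) pairs, standing in for an ordered term dictionary."""
    return sorted((sortable_key(v), v) for v in values)


def range_search(index, low, high):
    """Return every value v with low <= v <= high, using key comparisons only."""
    lo = bisect.bisect_left(index, (sortable_key(low), float("-inf")))
    hi = bisect.bisect_right(index, (sortable_key(high), float("inf")))
    return [v for _, v in index[lo:hi]]
```

With values 15, 16, and 160000 indexed this way, a query for the range 10 to 20 returns 15 and 16 and skips 160000, which is the sense in which the index "understands" that 16 is close to 15.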


> % Is Lucene reasonably fast indexing 400M floating point pairs?
>

Should be, though I have no experience with this kind of indexing.  You will
need a good deal of memory, or else a sharding system like Katta.


>
> % After Lucene creates the 400M index, can it return search results
>  reasonably fast?
>

With the right level of parallelism, absolutely.
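
A sketch of what that parallelism could look like, in plain Python and under loud assumptions: each shard is just an in-memory list here, `search_shard` and `search_all` are hypothetical names, and threads stand in for what would more likely be separate processes or machines. The pattern is to ask every shard for its own top k and then merge those short lists.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def search_shard(shard, score, k):
    """Top-k documents of one shard, lowest score (e.g. distance) first."""
    return heapq.nsmallest(k, shard, key=score)


def search_all(shards, score, k=20, workers=4):
    """Query every shard concurrently, then merge the per-shard top-k lists."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(lambda s: search_shard(s, score, k), shards))
    # Each partial list is already sorted by score, so a k-way merge suffices.
    return list(islice(heapq.merge(*partials, key=score), k))
```

Because each shard returns at most k results, the merge step handles at most k times the number of shards items regardless of how large the full index is.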


> % Is there a guide/tutorial that shows how to use Lucene to index
>  numerical data (I'm using Plucene, but I'll settle for any sort of
>  guide)?
>

Not really.  Efficient numerical search is relatively new in Lucene:

See the slide shows on Michael Busch's LinkedIn profile:
http://www.linkedin.com/profile?viewProfile=&key=12809985&authToken=-hrn&authType=name

Also, see here:
http://wiki.apache.org/lucene-java/SearchNumericalFields



-- 
Ted Dunning, CTO
DeepDyve

111 West Evelyn Ave. Ste. 202
Sunnyvale, CA 94086
http://www.deepdyve.com
858-414-0013 (m)
408-773-0220 (fax)
