lucene-java-user mailing list archives

From Assad Jarrahian <jar...@gmail.com>
Subject Lucene and database + location
Date Sun, 30 Oct 2005 23:10:04 GMT
Hi All,
My apologies if this question was asked before. I looked through the
archives and could not find an answer to what I was looking for.

background:
Let's say I store documents for a user. Each document has metadata
associated with it: a DocID, a creatorName, a class (private or public),
and a location.
The db (in my case postgres) consists of a table whose columns map to the
above.
My question is twofold.

1) Lucene and database:
When building a database, it makes sense to index a few columns
(similar to Field.Keyword in Lucene). In this case I could index 'class'
(for fast retrieval of all public documents). Now let's say I introduce a
search for documents containing a certain keyword (as in: search all
public + user documents for a certain keyword). I bring Lucene into the
picture and then I am stumped, because I will have to maintain a Lucene
index on the 'class' field as well as on the text of the document. That
means I have two indexes for the same thing: one on the postgres column
(find me all of a user's documents, with no keyword filter) and one in
Lucene.
This does not make sense to me. Does it mean that if one uses Lucene, one
would throw out all the db indexes besides primary key and foreign key
constraints and use Lucene for everything else?
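
To make the combined query concrete (public or user-owned documents that also contain a keyword), here is a minimal sketch of a toy inverted index in plain Java. It is not Lucene's API; the class ToyIndex, its methods, and the "field:value" term format are hypothetical names chosen for illustration. It only shows that an untokenized metadata value and the words of the text can live as terms in one index, so the query becomes set operations on postings.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeSet;

// Toy inverted index for illustration only; a stand-in, not Lucene's API.
final class ToyIndex {
    // term ("field:value") -> sorted set of docIDs containing that term
    private final Map<String, TreeSet<Integer>> postings = new HashMap<>();

    /** Indexes one document: metadata values and text words all become terms. */
    void add(int docId, String creatorName, String docClass, String text) {
        post("creatorName:" + creatorName, docId);   // stored whole, like a keyword field
        post("class:" + docClass, docId);            // stored whole, like a keyword field
        for (String word : text.toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) post("text:" + word, docId);  // tokenized text
        }
    }

    private void post(String term, int docId) {
        postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
    }

    private TreeSet<Integer> docs(String term) {
        return postings.getOrDefault(term, new TreeSet<>());
    }

    /** Evaluates (class:public OR creatorName:user) AND text:keyword. */
    TreeSet<Integer> visibleWithKeyword(String user, String keyword) {
        TreeSet<Integer> visible = new TreeSet<>(docs("class:public"));
        visible.addAll(docs("creatorName:" + user));            // OR
        visible.retainAll(docs("text:" + keyword.toLowerCase())); // AND
        return visible;
    }
}
```

Whether the database index on 'class' is still worth keeping depends on whether queries without a keyword continue to go to postgres; the sketch does not settle that.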

2) Lucene and location:
I know this has been discussed before, but I am not getting a good handle
on it. Let's say I want a query that says: find me all relevant documents
(matching certain search criteria) within my proximity. So the query would
look like
<search criteria on text in doc> & currentLocation

The goal would be to find relevant documents and rank them by distance
(within some radius or equivalent 2-D space). Furthermore, if there are
not enough documents, the radius should expand.
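
The expanding-radius behaviour can be sketched in plain Java as follows. This is only an illustration under assumptions that are not in the original question: locations are planar (x, y) points, distance is Euclidean, the radius doubles on each retry, and the keyword hits are already in memory. The names RadiusSearch, Hit, and nearest are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper; not part of Lucene or PostGIS.
final class RadiusSearch {
    /** A keyword hit with an assumed planar (x, y) location. */
    static final class Hit {
        final int docId;
        final double x, y;
        Hit(int docId, double x, double y) { this.docId = docId; this.x = x; this.y = y; }
    }

    static double distance(Hit h, double cx, double cy) {
        return Math.hypot(h.x - cx, h.y - cy);
    }

    /**
     * Returns the docIDs of hits within a radius of (cx, cy), nearest first.
     * The radius starts at startRadius and doubles (an assumed policy) until
     * at least minHits are found or maxRadius is reached.
     */
    static List<Integer> nearest(List<Hit> keywordHits, double cx, double cy,
                                 double startRadius, double maxRadius, int minHits) {
        if (startRadius <= 0) throw new IllegalArgumentException("startRadius must be > 0");
        double radius = Math.min(startRadius, maxRadius);
        while (true) {
            final double r = radius;
            List<Hit> inside = new ArrayList<>();
            for (Hit h : keywordHits) {
                if (distance(h, cx, cy) <= r) inside.add(h);
            }
            if (inside.size() >= minHits || r >= maxRadius) {
                // Rank by distance only; text relevance is ignored in this sketch.
                inside.sort(Comparator.comparingDouble(h -> distance(h, cx, cy)));
                List<Integer> ids = new ArrayList<>();
                for (Hit h : inside) ids.add(h.docId);
                return ids;
            }
            radius = Math.min(r * 2, maxRadius);  // not enough hits: widen the circle
        }
    }
}
```

Ranking here uses distance alone; blending distance with a text relevance score would need a weighting that the question leaves open.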

What would be a good way of implementing this? I was going to use PostGIS
to get all docIDs within a radius, then use Lucene keyword search to
generate ALL relevant docIDs, and then intersect the two lists. Something
tells me this is not going to be a good idea.
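
For reference, the intersection step of that plan could look like the following minimal sketch. It assumes both systems have already returned their docIDs as in-memory lists of integers; the class and method names are hypothetical. The cost concern is visible in the signature: both full lists must be materialized before anything can be discarded.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper illustrating the "intersect the two lists" step.
final class DocIdIntersection {
    /**
     * Keeps the order of keywordIds (assumed to be relevance order from the
     * text search) and drops every id that is not in nearbyIds (assumed to
     * come from the spatial query).
     */
    static List<Integer> intersect(List<Integer> keywordIds, List<Integer> nearbyIds) {
        Set<Integer> nearby = new HashSet<>(nearbyIds);  // O(1) membership checks
        List<Integer> result = new ArrayList<>();
        for (Integer id : keywordIds) {
            if (nearby.contains(id)) result.add(id);
        }
        return result;
    }
}
```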

Any comments would be much appreciated.
-assad
