lucene-dev mailing list archives

From Mark Harwood <markharw...@yahoo.co.uk>
Subject Re: Document links
Date Mon, 08 Nov 2010 22:43:57 GMT
What about if we define an id field (like in solr)?


Last time I floated the idea of supporting primary keys as a core concept in Lucene (in the
context of helping doc updates, not linking), there were objections along the lines of "Lucene
shouldn't try to be a database".


On 8 Nov 2010, at 20:47, Ryan McKinley <ryantxu@gmail.com> wrote:

On Mon, Nov 8, 2010 at 2:52 PM, mark harwood <markharw00d@yahoo.co.uk> wrote:
I came to the conclusion that the transient meaning of document ids is too
deeply ingrained in Lucene's design to use them to underpin any reliable
linking.

What about if we define an id field (like in solr)?

Whatever does the traversal would need to make a Map<id,docID>, but
that is still better than needing to do a query for each link.
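
A minimal sketch of the Map<id,docID> idea, in plain Java rather than against
the Lucene API. SegmentSketch and IdMapSketch are hypothetical names made up
for illustration, and the sketch assumes each segment numbers its documents
from zero, so a global docID is taken to be the segment's base offset plus the
local docID. The map is built in one pass, after which following a link is a
lookup instead of a query.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for an index segment: position i in `ids` is the
// segment-local docID of the document whose id field holds ids.get(i).
class SegmentSketch {
    final int docBase;       // offset of this segment in the global docID space
    final List<String> ids;  // id field value, indexed by local docID

    SegmentSketch(int docBase, List<String> ids) {
        this.docBase = docBase;
        this.ids = ids;
    }
}

class IdMapSketch {
    // One pass over all segments: id field value -> global docID.
    static Map<String, Integer> build(List<SegmentSketch> segments) {
        Map<String, Integer> idToDoc = new HashMap<>();
        for (SegmentSketch seg : segments) {
            for (int local = 0; local < seg.ids.size(); local++) {
                idToDoc.put(seg.ids.get(local), seg.docBase + local);
            }
        }
        return idToDoc;
    }

    public static void main(String[] args) {
        List<SegmentSketch> segments = new ArrayList<>();
        segments.add(new SegmentSketch(0, Arrays.asList("a", "b", "c")));
        segments.add(new SegmentSketch(3, Arrays.asList("d", "e")));

        Map<String, Integer> idToDoc = build(segments);
        // Resolving a link to the document with id "d" is now a map lookup.
        System.out.println(idToDoc.get("d"));
    }
}
```

The map is only valid for the index snapshot it was built from, so it would
have to be rebuilt whenever the reader is reopened.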


While it might work for relatively static indexes, any index with a reasonable
number of updates or deletes will invalidate any stored document references in
ways which are very hard to track. Lucene's compaction shuffles IDs without
taking care to preserve identity, unlike graph DBs like Neo4j (see "recycling
IDs" here: http://goo.gl/5UbJi )
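
To make the failure mode concrete, here is a toy model in plain Java, not
Lucene code. CompactionSketch and compact() are hypothetical names, and the
model assumes only that a merge drops deleted documents and renumbers the
survivors densely from zero. A stored docID stays in range after the merge but
silently points at a different document.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CompactionSketch {
    // Simplified model of a merge: deleted documents are dropped and the
    // survivors are renumbered from zero, keeping their relative order.
    static List<String> compact(List<String> docs, boolean[] deleted) {
        List<String> merged = new ArrayList<>();
        for (int docId = 0; docId < docs.size(); docId++) {
            if (!deleted[docId]) {
                merged.add(docs.get(docId));
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList("a", "b", "c", "d");
        int storedLink = 2;  // a link to "c", stored as a docID
        boolean[] deleted = {false, true, false, false};  // delete "b"

        List<String> merged = compact(docs, deleted);
        // The stored docID is still in range but now names another document.
        System.out.println(docs.get(storedLink) + " -> " + merged.get(storedLink));
    }
}
```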


oh ya -- and it is even more awkward since each subreader often reuses
the same docId

ryan

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

