lucene-dev mailing list archives

From: Paul Elschot
Subject: Re: Document links
Date: Tue, 21 Sep 2010 16:25:31 GMT

When the (primary) key values are provided by the user,
one could use additional small documents that store/index only
these relations, and update them whenever the relations change.

Wouldn't that be sufficient?
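
A minimal sketch of the idea above, using plain in-memory maps instead of a real index. Everything in it is an assumption made for illustration: the class name LinkDocumentStore, its methods, and the example keys are hypothetical, and none of it is Lucene API. Each relation lives in its own small record keyed by user-supplied primary keys, so adding or removing a relation never touches the documents it connects.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/**
 * In-memory stand-in for an index of small "link documents".
 * Hypothetical names throughout; this is not Lucene API.
 */
class LinkDocumentStore {
    // Stand-ins for two indexed fields on each link document: "from" and "to".
    private final Map<String, Set<String>> byFrom = new HashMap<>();
    private final Map<String, Set<String>> byTo = new HashMap<>();

    /** Adds one link document for the relation fromKey -> toKey. */
    void addLink(String fromKey, String toKey) {
        byFrom.computeIfAbsent(fromKey, k -> new TreeSet<>()).add(toKey);
        byTo.computeIfAbsent(toKey, k -> new TreeSet<>()).add(fromKey);
    }

    /** Deletes the link document for fromKey -> toKey, if it exists. */
    void removeLink(String fromKey, String toKey) {
        Set<String> targets = byFrom.get(fromKey);
        if (targets != null) {
            targets.remove(toKey);
        }
        Set<String> sources = byTo.get(toKey);
        if (sources != null) {
            sources.remove(fromKey);
        }
    }

    /** Keys of all documents that the given document links to. */
    Set<String> outboundKeys(String key) {
        return Collections.unmodifiableSet(
                byFrom.getOrDefault(key, Collections.<String>emptySet()));
    }

    /** Keys of all documents that link to the given document. */
    Set<String> inboundKeys(String key) {
        return Collections.unmodifiableSet(
                byTo.getOrDefault(key, Collections.<String>emptySet()));
    }
}
```

Following a link here costs one lookup by key per hop, which is the kind of foreign-key dereference the quoted message below hopes to avoid; in exchange, nothing needs adjusting when internal doc ids change.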

Paul Elschot

On Tuesday 21 September 2010 00:35:02, mark harwood wrote:
> I've been looking at Graph Databases recently (Neo4j, OrientDB, InfiniteGraph)
> as a faster alternative to relational stores. I notice they either embed Lucene
> for indexing node properties or (in the case of OrientDB) are talking about
> doing this.
> I think their fundamental performance advantage over relational stores is that 
> they don't have to de-reference foreign keys in a b-tree index to get from a 
> source node to a target node. Instead they use internally-generated IDs to act 
> like pointers with more-or-less direct references between nodes/vertices. As a
> result they can follow links very quickly. This got me thinking: could Lucene
> adopt the idea of creating equally fast links between documents, using
> Lucene doc IDs?
> Maybe the user API would look something like this...
>     indexWriter.addLink(fromDocId, toDocId);
>     DocIdSet inbound = reader.getInboundLinks(docId);
>     DocIdSet outbound = reader.getOutboundLinks(docId);
> Internally a new index file structure would be needed to record link info. Any 
> recorded links that connect documents from different segments would need careful 
> adjustment of referenced link IDs when segments merge and Lucene doc IDs are 
> shuffled.
> As well as handling typical graphs (social networks, web data) this could 
> potentially be used to support tagging operations where apps could create "tag" 
> documents and then link them to existing documents that are being tagged without 
> having to update the target doc. There are probably a ton of applications for 
> this stuff.
> I see the Graph DBs busy recreating transactional support, indexes, segment
> merging etc., and it seems to me that Lucene has a pretty good head start with
> this stuff.
> Anyone else think this might be an area worth exploring?
> Cheers
> Mark
> ---------------------------------------------------------------------
> To unsubscribe, e-mail:
> For additional commands, e-mail:
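
The link API proposed in the quoted message, together with the ID adjustment it says is needed on merge, can be sketched in plain Java as follows. This is an illustration under stated assumptions, not an existing Lucene feature: the class DocLinkGraph and its remap method are hypothetical, links are held in memory rather than in a new index file, and the doc map is assumed to be an array that gives the new ID for every old ID, with a negative value marking a deleted document.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

/** Hypothetical sketch of links stored by doc ID; not Lucene API. */
class DocLinkGraph {
    private final Map<Integer, SortedSet<Integer>> outbound = new HashMap<>();
    private final Map<Integer, SortedSet<Integer>> inbound = new HashMap<>();

    /** Records a directed link between two doc IDs. */
    void addLink(int fromDocId, int toDocId) {
        outbound.computeIfAbsent(fromDocId, k -> new TreeSet<>()).add(toDocId);
        inbound.computeIfAbsent(toDocId, k -> new TreeSet<>()).add(fromDocId);
    }

    /** Doc IDs that the given document links to, in ascending order. */
    SortedSet<Integer> getOutboundLinks(int docId) {
        return new TreeSet<>(
                outbound.getOrDefault(docId, Collections.<Integer>emptySortedSet()));
    }

    /** Doc IDs that link to the given document, in ascending order. */
    SortedSet<Integer> getInboundLinks(int docId) {
        return new TreeSet<>(
                inbound.getOrDefault(docId, Collections.<Integer>emptySortedSet()));
    }

    /**
     * Returns a copy in which every doc ID is rewritten through docMap
     * (index = old ID, value = new ID), as a merge that renumbers documents
     * would require. Assumes docMap covers every ID used in a link.
     * Links that touch a deleted document (negative new ID) are dropped.
     */
    DocLinkGraph remap(int[] docMap) {
        DocLinkGraph merged = new DocLinkGraph();
        for (Map.Entry<Integer, SortedSet<Integer>> entry : outbound.entrySet()) {
            int newFrom = docMap[entry.getKey()];
            if (newFrom < 0) {
                continue; // source document was deleted
            }
            for (int oldTo : entry.getValue()) {
                int newTo = docMap[oldTo];
                if (newTo >= 0) {
                    merged.addLink(newFrom, newTo);
                }
            }
        }
        return merged;
    }
}
```

The sketch rebuilds the whole link structure on every remap, which keeps it short; a real implementation would presumably rewrite only the links of the segments being merged.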
