lucene-dev mailing list archives

From mark harwood <markharw...@yahoo.co.uk>
Subject Re: Document links
Date Tue, 21 Sep 2010 16:30:08 GMT
>>Wouldn't that be sufficient?

Not for some apps. I tried playing the "Kevin Bacon" game against a Lucene index
of IMDB data, using actorID and movieID keys.
The difference between that and Neo4j on the same data and query is night and
day. The graph databases are really onto something when resolving a relationship
doesn't first require an index to look up endpoints.
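
To make the contrast concrete, here is a minimal sketch of link-following where each node holds direct references to its neighbours by internally generated integer id, so a hop needs no key lookup in an index. The class `LinkGraph` and its methods are hypothetical names for illustration only; this is not Lucene or Neo4j code, and it assumes the whole graph fits in memory.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical in-memory link store using dense int ids (not a Lucene API). */
public class LinkGraph {
    // neighbours.get(id) lists the ids directly linked to that node.
    private final List<List<Integer>> neighbours = new ArrayList<>();

    /** Adds a node (an actor or a movie) and returns its generated id. */
    public int addNode() {
        neighbours.add(new ArrayList<>());
        return neighbours.size() - 1;
    }

    /** Records an undirected link, e.g. actor <-> movie. */
    public void addLink(int a, int b) {
        neighbours.get(a).add(b);
        neighbours.get(b).add(a);
    }

    /** Breadth-first search: hop count from source to target, or -1 if unreachable. */
    public int hops(int source, int target) {
        int[] dist = new int[neighbours.size()];
        Arrays.fill(dist, -1);
        dist[source] = 0;
        ArrayDeque<Integer> queue = new ArrayDeque<>();
        queue.add(source);
        while (!queue.isEmpty()) {
            int node = queue.poll();
            if (node == target) {
                return dist[node];
            }
            // Following a link is a list read by id, with no key-to-id index lookup.
            for (int next : neighbours.get(node)) {
                if (dist[next] == -1) {
                    dist[next] = dist[node] + 1;
                    queue.add(next);
                }
            }
        }
        return -1;
    }
}
```

In an actor/movie graph, two actors who share a film are two hops apart (actor, movie, actor), so a "Bacon number" would be half the hop count returned by `hops`.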





----- Original Message ----
From: Paul Elschot <paul.elschot@xs4all.nl>
To: dev@lucene.apache.org
Sent: Tue, 21 September, 2010 17:25:31
Subject: Re: Document links

When the (primary) key values are provided by the user,
one could use additional small documents to only store/index
these relations whenever they change.

Wouldn't that be sufficient?

Regards,
Paul Elschot



Op dinsdag 21 september 2010 00:35:02 schreef mark harwood:
> I've been looking at Graph Databases recently (neo4j, OrientDB, InfiniteGraph)
> as a faster alternative to relational stores. I notice they either embed Lucene
> for indexing node properties or (in the case of OrientDB) are talking about
> doing this.
> 
> I think their fundamental performance advantage over relational stores is that
> they don't have to de-reference foreign keys in a b-tree index to get from a
> source node to a target node. Instead they use internally-generated IDs to act
> like pointers with more-or-less direct references between nodes/vertexes. As a
> result they can follow links very quickly. This got me thinking: could Lucene
> adopt the idea of creating links between documents that were equally fast using
> Lucene doc ids?
> 
> Maybe the user API would look something like this...
> 
>     indexWriter.addLink(fromDocId, toDocId);
>     DocIdSet inbound = reader.getInboundLinks(docId);
>     DocIdSet outbound = reader.getOutboundLinks(docId);
> 
> 
> Internally a new index file structure would be needed to record link info. Any
> recorded links that connect documents from different segments would need careful
> adjustment of referenced link IDs when segments merge and Lucene doc IDs are
> shuffled.
> 
> As well as handling typical graphs (social networks, web data) this could
> potentially be used to support tagging operations where apps could create "tag"
> documents and then link them to existing documents that are being tagged without
> having to update the target doc. There are probably a ton of applications for
> this stuff.
> 
> I see the Graph DBs busy recreating transactional support, indexes, segment 
> merging etc and it seems to me that Lucene has a pretty good head start with 
> this stuff.
> Anyone else think this might be an area worth exploring?
> 
> Cheers
> Mark
> 
> 
>      
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: dev-help@lucene.apache.org
> 
> 
> 
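
The quoted proposal notes that recorded link IDs would need adjusting when segments merge and doc IDs are shuffled. A rough sketch of that adjustment follows, assuming the merge supplies an old-to-new doc ID mapping with -1 marking deleted documents. `LinkRemapper`, the pair representation of a link, and the choice to drop links that touch deleted documents are all assumptions made for illustration; none of this is an existing Lucene API or file format.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: rewrite link endpoints after a merge renumbers doc ids. */
public class LinkRemapper {
    /**
     * @param links  pairs {fromDocId, toDocId} expressed in pre-merge doc ids
     * @param docMap docMap[oldId] is the post-merge id, or -1 if the doc was deleted
     * @return links in post-merge ids; links touching a deleted doc are dropped
     */
    public static List<int[]> remap(List<int[]> links, int[] docMap) {
        List<int[]> remapped = new ArrayList<>();
        for (int[] link : links) {
            int from = docMap[link[0]];
            int to = docMap[link[1]];
            // Assumption: a link is meaningless once either endpoint is gone.
            if (from >= 0 && to >= 0) {
                remapped.add(new int[] {from, to});
            }
        }
        return remapped;
    }
}
```

For example, with `docMap = {0, -1, 1, 2}` (old doc 1 deleted), a link from old doc 0 to old doc 2 becomes a link from 0 to 1, while a link from old doc 1 to old doc 3 is discarded.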
