lucene-dev mailing list archives

From "Kevin Watters (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-7543) Create GraphQuery that allows graph traversal as a query operator.
Date Tue, 19 May 2015 14:41:00 GMT

    [ https://issues.apache.org/jira/browse/SOLR-7543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550544#comment-14550544 ]

Kevin Watters commented on SOLR-7543:
-------------------------------------

[~yseeley@gmail.com], right now it builds against 4.x. If I submit a patch, should it be done
for trunk, or is a 4.x branch OK? I'm just finishing up the unit tests; either way, I hope
to have a patch submitted by the end of the week.

> Create GraphQuery that allows graph traversal as a query operator.
> ------------------------------------------------------------------
>
>                 Key: SOLR-7543
>                 URL: https://issues.apache.org/jira/browse/SOLR-7543
>             Project: Solr
>          Issue Type: New Feature
>          Components: search
>            Reporter: Kevin Watters
>            Priority: Minor
>
> I have a GraphQuery that I implemented a long time back that allows a user to specify
> a "startQuery" to identify which documents to start graph traversal from.  It then gathers
> up the edge ids for those documents and optionally applies an additional filter.  The query
> is then re-executed repeatedly until no new edge ids are identified.  I am currently hosting
> this code at https://github.com/kwatters/solrgraph and I would like to work with the community
> to get some feedback and ultimately get it committed back in as a Lucene query.
> Here's a bit more of a description of the parameters for the query / graph traversal:
> q - the initial start query that identifies the universe of documents to start traversal
>     from.
> fromField - the name of the field that contains the node id.
> toField - the name of the field that contains the edge id(s).
> traversalFilter - an additional query that can be supplied to limit the scope
>     of graph traversal to just the edges that satisfy the traversalFilter query.
> maxDepth - integer specifying how deep the breadth-first search should go.
> returnStartNodes - boolean that determines whether the documents that matched the original "q"
>     should be returned as part of the graph.
> onlyLeafNodes - boolean that filters the graph query to return only documents/nodes that
>     have no edges.
> We identify a set of documents with "q", which can be any arbitrary Lucene query.  The query
> collects the values in the fromField, creates an OR query with those values, optionally applies
> an additional constraint from the "traversalFilter", and walks the result set until no new edges
> are detected.  Traversal can also be stopped N hops away, as defined by maxDepth.  This is a BFS
> (breadth-first search) algorithm.  Cycle detection is done by not revisiting the same document
> for edge extraction.
> This query operator does not keep track of how you arrived at the document, but only
> that the traversal did arrive at the document.
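
The traversal described in the quoted issue can be sketched over an in-memory
document store. This is only an illustration of the breadth-first loop, not the
code from the solrgraph repository or any Lucene/Solr API. The function name,
the dict-based "index", and the predicate-style start query and filter are all
assumptions made for the example. The two fields are named by role
(collect_field is the field whose values are gathered from the current
frontier, match_field is the field the OR query is run against), so the caller
decides how fromField and toField map onto them.

```python
from typing import Callable, Dict, List, Optional, Set

Doc = Dict[str, object]
Predicate = Callable[[Doc], bool]


def _values(doc: Doc, field: str) -> List[object]:
    """Return the field's values as a list (handles missing, single, and multi-valued)."""
    value = doc.get(field)
    if value is None:
        return []
    if isinstance(value, (list, tuple, set)):
        return list(value)
    return [value]


def graph_query(
    docs: Dict[str, Doc],                 # hypothetical in-memory index: doc key -> document
    start: Predicate,                     # stands in for "q"
    collect_field: str,                   # field whose values are gathered from the frontier
    match_field: str,                     # field the OR query is matched against
    traversal_filter: Optional[Predicate] = None,
    max_depth: Optional[int] = None,      # None means "no hop limit" (an assumption)
    return_start_nodes: bool = True,
    only_leaf_nodes: bool = False,
) -> Set[str]:
    start_keys = {key for key, doc in docs.items() if start(doc)}
    visited = set(start_keys)             # documents already used for edge extraction
    frontier = set(start_keys)
    seen_values: Set[object] = set()
    depth = 0

    while frontier and (max_depth is None or depth < max_depth):
        # Gather the values from the frontier documents, keeping only new ones.
        new_values: Set[object] = set()
        for key in frontier:
            new_values.update(_values(docs[key], collect_field))
        new_values -= seen_values
        if not new_values:
            break                         # no new edges were detected
        seen_values |= new_values

        # "OR query": any document whose match_field holds one of the new values.
        matched = {
            key
            for key, doc in docs.items()
            if new_values.intersection(_values(doc, match_field))
            and (traversal_filter is None or traversal_filter(doc))
        }
        # Cycle detection: never revisit a document for edge extraction.
        frontier = matched - visited
        visited |= frontier
        depth += 1

    result = set(visited)
    if not return_start_nodes:
        result -= start_keys
    if only_leaf_nodes:
        # Treat a leaf as a document with no values in collect_field (an assumption).
        result = {key for key in result if not _values(docs[key], collect_field)}
    return result


# Made-up sample data: a -> b, a -> c, b -> d, c -> a (a cycle), e is unconnected.
DOCS: Dict[str, Doc] = {
    "a": {"id": "a", "edges": ["b", "c"]},
    "b": {"id": "b", "edges": ["d"]},
    "c": {"id": "c", "edges": ["a"]},
    "d": {"id": "d", "edges": []},
    "e": {"id": "e", "edges": []},
}

reachable = graph_query(DOCS, lambda doc: doc["id"] == "a", "edges", "id")
```

With this sample data, a full traversal from "a" reaches a, b, c and d. The
c -> a cycle does not cause a loop because "a" was already visited. As in the
description above, the result is only the set of documents reached; the path
taken to each one is not recorded.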



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


