manifoldcf-dev mailing list archives

From "Markus Schuch (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CONNECTORS-1379) TinkerPop Output Connector
Date Sat, 25 Feb 2017 21:42:44 GMT

    [ https://issues.apache.org/jira/browse/CONNECTORS-1379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15884388#comment-15884388 ]

Markus Schuch edited comment on CONNECTORS-1379 at 2/25/17 9:42 PM:
--------------------------------------------------------------------

I've learned that the TinkerPop Java API is not built for batch-writing large graphs. Its
focus is processing on top of existing graphs.

Writing graphs per se is possible, but only as an embedded (in-memory) graph that can be persisted
to a file (GraphML, GraphSON, ...) when finished (no sessions/transactions, hard to handle
in a multi-threaded environment, does not scale well).
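
For illustration only, here is a minimal sketch of the "build in memory, persist to a file when
finished" pattern described above. It deliberately uses plain Java instead of the TinkerPop API,
so the names GraphMlSketch, DocVertex and the "uri" property are hypothetical and only stand in
for documents emitted as vertices. The output is a small GraphML file.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: collect vertices in memory, write GraphML once at the end. */
class GraphMlSketch {

    /** Illustrative stand-in for a document emitted as a vertex (not a ManifoldCF class). */
    static final class DocVertex {
        final String id;
        final String uri;

        DocVertex(String id, String uri) {
            this.id = id;
            this.uri = uri;
        }
    }

    // The whole graph lives in memory; nothing is persisted until writeTo() is called.
    private final List<DocVertex> vertices = new ArrayList<>();

    // Coarse locking is the only protection here, since there are no transactions.
    synchronized void addVertex(String id, String uri) {
        vertices.add(new DocVertex(id, uri));
    }

    /** Serializes all collected vertices as one GraphML document. */
    synchronized String toGraphMl() {
        StringBuilder sb = new StringBuilder();
        sb.append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        sb.append("<graphml xmlns=\"http://graphml.graphdrawing.org/xmlns\">\n");
        sb.append("  <key id=\"uri\" for=\"node\" attr.name=\"uri\" attr.type=\"string\"/>\n");
        sb.append("  <graph id=\"G\" edgedefault=\"directed\">\n");
        for (DocVertex v : vertices) {
            sb.append("    <node id=\"").append(escape(v.id)).append("\">")
              .append("<data key=\"uri\">").append(escape(v.uri)).append("</data>")
              .append("</node>\n");
        }
        sb.append("  </graph>\n");
        sb.append("</graphml>\n");
        return sb.toString();
    }

    /** Persists the graph in one step, after all vertices have been added. */
    void writeTo(Path file) throws IOException {
        Files.write(file, toGraphMl().getBytes(StandardCharsets.UTF_8));
    }

    // Escapes the XML special characters; "&" must be replaced first.
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    public static void main(String[] args) throws IOException {
        GraphMlSketch graph = new GraphMlSketch();
        graph.addVertex("doc-1", "http://example.org/a?x=1&y=2");
        graph.addVertex("doc-2", "http://example.org/b");
        Path out = Files.createTempFile("staged-graph", ".graphml");
        graph.writeTo(out);
        System.out.println("wrote " + out);
    }
}
```

The sketch also shows the limitation: every vertex has to fit in memory until the single write at
the end, and concurrent writers can only be serialized by a lock.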

There is no real graph-database-agnostic Java API for writing to remote graph databases (Neo4j,
Titan, ...).
Gremlin Server only provides database-agnostic access and processing.

One would have to select a specific graph database and implement an output connector specifically
for that.

We will explore this topic in a new direction: output to a large-scale document store (e.g.
MongoDB) as a staging repository. Graph processing can be done on top of that by importing documents
from there.

So I am closing this ticket.


was (Author: schuchm):
I've learned that the TinkerPop Java API is not built for batch-writing large graphs. Its
focus is processing on top of existing graphs.

Writing graphs per se is possible, but only as an embedded (in-memory) graph that can be persisted
to a file (GraphML, GraphSON, ...) when finished (no sessions/transactions, hard to handle
in a multi-threaded environment, does not scale well).

There is no real graph-database-agnostic Java API for writing to remote graph databases (Neo4j,
Titan, ...).
Gremlin Server only provides database-agnostic access and processing.

One would have to select a specific graph database and implement an output connector specifically
for that.

So I am closing this ticket.

> TinkerPop Output Connector
> --------------------------
>
>                 Key: CONNECTORS-1379
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1379
>             Project: ManifoldCF
>          Issue Type: New Feature
>            Reporter: Markus Schuch
>            Assignee: Markus Schuch
>
> An output connector for a https://tinkerpop.apache.org/ graph database.
> Emits {{RepositoryDocuments}} as vertices.
> *Background*
> We are experimenting with pushing docs to TinkerPop instead of pushing to Solr directly.
> This is very experimental.
> Development will start here: https://github.com/dbsystel/manifoldcf/tree/CONNECTORS-1397
> and will then be committed to ManifoldCF if something good comes out of it.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
