incubator-clerezza-dev mailing list archives

From "Rupert Westenthaler (JIRA)" <j...@apache.org>
Subject [jira] Commented: (CLEREZZA-395) bnodes mapping in JenaGraphAdaptor should not keep growing with every parsing of rdf files
Date Mon, 17 Jan 2011 14:37:44 GMT

    [ https://issues.apache.org/jira/browse/CLEREZZA-395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12982653#action_12982653 ]

Rupert Westenthaler commented on CLEREZZA-395:
----------------------------------------------

Hi Reto, all

What do you mean by "store" and "transfer"?
 (1) persistent storage (e.g. Jena TDB) and export (e.g. RDF/XML serialization), or also
 (2) storage and CRUD operations while working with several MGraphs at the API level?
I completely agree with (1), but I am unsure about (2): I understand the potential danger,
but I have also relied on this kind of thing a lot over the past years (e.g. by using
http://www.openrdf.org/doc/sesame2/api/index.html?org/openrdf/repository/util/RDFInserter.html
which preserves BNode IDs).

Let me point out that the operations described in (2) are possible with the current implementation.
Here is a small example of what I am referring to (written here in the text editor, so no
guarantee that it would compile):

MGraph graph1 = new SimpleMGraph();
MGraph graph2 = new SimpleMGraph();
//I think it should even work with a Jena graph, because of the BidiMap providing mappings for BNodes

//Because a BNode can be created without a graph, a BNode has nothing like a context
BNode rupertInfo = new BNode();
BNode retoInfo = new BNode();
String FOAF = "http://xmlns.com/foaf/0.1/"; //FOAF namespace
UriRef name = new UriRef(FOAF+"name");
UriRef knows = new UriRef(FOAF+"knows");

//add operations do not create new instances of BNode ... so there is still no context
//(PlainLiteral is an interface, so the literals are created via PlainLiteralImpl)
graph1.add(new TripleImpl(rupertInfo, name, new PlainLiteralImpl("Rupert Westenthaler")));
graph1.add(new TripleImpl(rupertInfo, knows, retoInfo));

graph2.add(new TripleImpl(retoInfo, name, new PlainLiteralImpl("Reto Bachmann-Gmur")));
//"rupertInfo" is now in two graphs (2 contexts?)
graph2.add(new TripleImpl(retoInfo, knows, rupertInfo));

//So now let's have some fun with the BNodes
//search for all knows in graph1 -> OK (because within the same context)
Iterator<Triple> rupertsFriends = graph1.filter(rupertInfo, knows, null);
//query for all information of the results (BNodes) in graph2 -> NOT OK?!
while(rupertsFriends.hasNext()){
  //filter expects a NonLiteral subject, hence the cast
  NonLiteral friendBNode = (NonLiteral) rupertsFriends.next().getObject();
  Iterator<Triple> friendInfos = graph2.filter(friendBNode, null, null);
  //add them to the BNode in graph1 -> NOT OK?!
  while(friendInfos.hasNext()){
    //OK, this would not work because it changes graph1 during the iteration
    //over rupertsFriends, but it shows the principle
    graph1.add(friendInfos.next());
  }
}

This works because
 - BNode does not override equals, and the equals implementation of java.lang.Object
   compares references
 - one instance of a BNode is shared between the two graphs
 - the performAdd method (at least the one in SimpleTripleCollection) does not create new
   instances for added BNodes
So if the goal is to completely avoid sharing BNodes between graph instances, one would
need to change the current implementation.
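
The three points above can be reproduced without Clerezza. The following is only a minimal
sketch: DemoBNode and SharedBNodeDemo are hypothetical names, and the two "graphs" are plain
HashMaps standing in for MGraph instances. It shows that a node obtained from one container
can be used as a lookup key in another one as long as both hold the same instance, and that
a freshly created node never matches.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for BNode: no state and no equals/hashCode override,
// so the reference comparison inherited from java.lang.Object applies.
final class DemoBNode {}

class SharedBNodeDemo {
    /** Looks up in "graph2" a node obtained from "graph1"; works because it is the same instance. */
    static String friendNameAcrossGraphs() {
        DemoBNode rupertInfo = new DemoBNode();
        DemoBNode retoInfo = new DemoBNode();

        // "graph1": rupertInfo knows retoInfo
        Map<DemoBNode, DemoBNode> graph1Knows = new HashMap<>();
        graph1Knows.put(rupertInfo, retoInfo);

        // "graph2": retoInfo has a name
        Map<DemoBNode, String> graph2Names = new HashMap<>();
        graph2Names.put(retoInfo, "Reto Bachmann-Gmur");

        // The node found in graph1 is the very instance stored in graph2.
        DemoBNode friend = graph1Knows.get(rupertInfo);
        return graph2Names.get(friend);
    }

    /** A newly created node is never equal to an existing one, so the lookup misses. */
    static boolean freshNodeIsFound() {
        Map<DemoBNode, String> graph2Names = new HashMap<>();
        graph2Names.put(new DemoBNode(), "Reto Bachmann-Gmur");
        return graph2Names.containsKey(new DemoBNode());
    }

    public static void main(String[] args) {
        System.out.println(friendNameAcrossGraphs());
        System.out.println(freshNodeIsFound());
    }
}
```

Avoiding the sharing would therefore require the add path to copy or replace the incoming
node, which is the implementation change referred to above.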

In conclusion, I would like to point out that adding an ID to BNode, as suggested in my first
comment, would not change anything from a technical perspective. However, I clearly understand
that adding a constructor like BNode(String bNodeID) to the public API would encourage wrong
usage of BNodes by users, which might cause a lot of trouble if they are not aware of the
consequences.
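
To illustrate the kind of wrong usage meant here, consider the following sketch. LabeledBNode
is a hypothetical class and not part of the Clerezza API, and it assumes that equality would
be based on the ID, which is only one possible design. Under that assumption, two unrelated
documents that both happen to use the label "b0" would end up with the same node once their
triples are put together.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical ID-carrying blank node; NOT part of the Clerezza API.
// Assumption for this sketch: equality and hash code are derived from the ID.
final class LabeledBNode {
    private final String id;

    LabeledBNode(String id) {
        this.id = id;
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof LabeledBNode && ((LabeledBNode) other).id.equals(id);
    }

    @Override
    public int hashCode() {
        return id.hashCode();
    }
}

class LabeledBNodeDemo {
    /** Two nodes from unrelated documents that share a label collapse into one. */
    static int nodesAfterMergeById() {
        Set<Object> merged = new HashSet<>();
        merged.add(new LabeledBNode("b0")); // label used in document A
        merged.add(new LabeledBNode("b0")); // same label, unrelated document B
        return merged.size();
    }

    /** With reference equality the two nodes stay distinct. */
    static int nodesAfterMergeByReference() {
        Set<Object> merged = new HashSet<>();
        merged.add(new Object()); // node from document A
        merged.add(new Object()); // node from document B
        return merged.size();
    }

    public static void main(String[] args) {
        System.out.println(nodesAfterMergeById());
        System.out.println(nodesAfterMergeByReference());
    }
}
```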

best
Rupert Westenthaler 

> bnodes mapping in JenaGraphAdaptor should not keep growing with every parsing of rdf files
> ------------------------------------------------------------------------------------------
>
>                 Key: CLEREZZA-395
>                 URL: https://issues.apache.org/jira/browse/CLEREZZA-395
>             Project: Clerezza
>          Issue Type: Improvement
>            Reporter: Hasan
>            Assignee: Hasan
>
> With every parsing of rdf files free memory is getting less.
> The problem seems to lie in the JenaGraphAdaptor class
> It has a member:
> final BidiMap<BNode, Node> tria2JenaBNodes = new BidiMapImpl<BNode, Node>();
> which grows each time a serialized graph gets parsed.
> My experiments with my test data show
> At the end of the 1st parsing: Size of tria2JenaBNodes = 87200
> At the end of the 2nd parsing: Size of tria2JenaBNodes = 130800
> At the end of the 3rd parsing: Size of tria2JenaBNodes = 174400
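
The sizes quoted above increase by 43600 entries between consecutive parsings
(130800 - 87200 = 174400 - 130800 = 43600). The following sketch only illustrates that
growth pattern and does not reproduce JenaGraphAdaptor: GrowingMappingDemo and its methods
are hypothetical, and a plain HashMap stands in for the BidiMap. A mapping held for the
lifetime of the object keeps every entry ever added, while a mapping scoped to one parse can
be garbage collected afterwards. Whether a per-parse scope is acceptable for the adaptor is
not established here.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for an adaptor that keeps one mapping for its whole lifetime.
class GrowingMappingDemo {
    // Long-lived map that is never cleared; plays the role of tria2JenaBNodes in this sketch.
    private final Map<Object, Object> retained = new HashMap<>();

    /** Simulates one parse creating bnodeCount fresh blank nodes; returns the retained size. */
    int parseRetaining(int bnodeCount) {
        for (int i = 0; i < bnodeCount; i++) {
            retained.put(new Object(), new Object());
        }
        return retained.size();
    }

    /** Same parse, but the mapping is local to the call and unreachable afterwards. */
    static int parseScoped(int bnodeCount) {
        Map<Object, Object> perParse = new HashMap<>();
        for (int i = 0; i < bnodeCount; i++) {
            perParse.put(new Object(), new Object());
        }
        return perParse.size();
    }

    public static void main(String[] args) {
        GrowingMappingDemo adaptor = new GrowingMappingDemo();
        System.out.println(adaptor.parseRetaining(100)); // size after first parse
        System.out.println(adaptor.parseRetaining(100)); // size keeps growing
        System.out.println(parseScoped(100));            // size never exceeds one parse
    }
}
```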

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

