incubator-clerezza-dev mailing list archives

From Reto Bachmann-Gmür (JIRA) <j...@apache.org>
Subject [jira] Closed: (CLEREZZA-366) Parsing of RDF Data loads everything into memory
Date Sun, 02 Jan 2011 13:42:45 GMT

     [ https://issues.apache.org/jira/browse/CLEREZZA-366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Reto Bachmann-Gmür closed CLEREZZA-366.
---------------------------------------

    Resolution: Fixed

> Parsing of RDF Data loads everything into memory
> ------------------------------------------------
>
>                 Key: CLEREZZA-366
>                 URL: https://issues.apache.org/jira/browse/CLEREZZA-366
>             Project: Clerezza
>          Issue Type: Improvement
>            Reporter: Rupert Westenthaler
>            Assignee: Reto Bachmann-Gmür
>
> The API of org.apache.clerezza.rdf.core.serializedform.ParsingProvider does not allow
> the caller to pass in the target MGraph into which the RDF data from the InputStream
> should be loaded. Implementations therefore need to create their own MGraph instances.
> The org.apache.clerezza.rdf.jena.parser.JenaParserProvider, for example, creates an
> instance of SimpleMGraph to store the parsed data.
> This design does not allow parsed RDF data to be "streamed" directly into its final
> destination; instead it forces everything to be loaded into an intermediate graph.
> This is a problem when importing big datasets, especially because the intermediate
> graph is kept in memory.
> Currently one would use:
> TCProvider provider;  // e.g. a TdbTcProvider instance
> // e.g. loading a dump of dbPedia.org
> MGraph veryBigGraph = provider.createMGraph("http://dbPedia.org");
> // loads everything into memory and then adds everything to the TDB store
> veryBigGraph.addAll(parser.parse(is, format, null));
> A possible solution would be to add a second ParsingProvider.parse(..) method that
> accepts an existing MGraph instance.
> This would allow the above code fragment to be refactored like this:
> TCProvider provider;  // e.g. a TdbTcProvider instance
> // e.g. loading a dump of dbPedia.org
> MGraph veryBigGraph = provider.createMGraph("http://dbPedia.org");
> // loads everything directly into the passed MGraph
> parser.parse(is, veryBigGraph, format, null);
> best
> Rupert Westenthaler 
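
The difference between the two approaches in the issue description can be sketched in plain Java without any Clerezza dependency. Everything below is an assumption made for illustration: the class StreamingParseSketch, the methods parseToNewGraph and parseInto, and the use of one String per triple are hypothetical stand-ins, not the actual ParsingProvider API or the signature that was eventually committed.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;

// Hypothetical sketch: a "triple" is just one non-empty line of text here.
final class StreamingParseSketch {

    // Old style: the parser allocates its own intermediate collection,
    // so the whole input sits in memory before the caller sees any of it.
    public static List<String> parseToNewGraph(Reader in) throws IOException {
        List<String> intermediate = new ArrayList<>();
        parseInto(intermediate, in);
        return intermediate;
    }

    // Proposed style: the caller supplies the target, and each triple is
    // added as soon as it is read. A disk-backed target would therefore
    // never require a full in-memory copy.
    public static void parseInto(Collection<String> target, Reader in) throws IOException {
        BufferedReader reader = new BufferedReader(in);
        String line;
        while ((line = reader.readLine()) != null) {
            String triple = line.trim();
            if (!triple.isEmpty()) {
                target.add(triple);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        String data = "<a> <p> <b> .\n<b> <p> <c> .\n";

        // Old style: parse into an intermediate list, then copy it over.
        Collection<String> store1 = new LinkedHashSet<>();
        store1.addAll(parseToNewGraph(new StringReader(data)));

        // Proposed style: parse straight into the final destination.
        Collection<String> store2 = new LinkedHashSet<>();
        parseInto(store2, new StringReader(data));

        System.out.println(store1.equals(store2));
    }
}
```

Both variants leave the same triples in the store; the only difference is whether a complete intermediate copy exists at any point, which is what matters for a dump the size of dbPedia.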

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

