incubator-clerezza-dev mailing list archives

From Reto Bachmann-Gmuer <reto.bachm...@trialox.org>
Subject Re: commons.rdf
Date Tue, 09 Nov 2010 18:54:25 GMT
On Tue, Nov 9, 2010 at 1:07 PM, Andy Seaborne
<andy.seaborne@epimorphics.com> wrote:
>
>
> On 09/11/10 07:13, Reto Bachmann-Gmuer wrote:
>>
>> On the incubator mailing list, a commons project around the
>> semantic-oriented projects has been suggested as a possibility.
>>
>> I'm wondering which parts of Clerezza could be moved to such a project,
>> thinking of:
>> - core graph/mgraph api
>> - serializers
>> - graph isomorphism code (to be improved there)
>>
>> Obviously it only makes sense to move things if somebody wants to use
>> these things without using Clerezza.
>
>>
>> Reto
>>
>
> Does that mean these pieces of code work with other systems?
They are all based on the interfaces in the org.apache.clerezza.rdf.core
package, and the goal is to provide wrappers for systems that do not
natively expose these interfaces (as we do for Jena and Sesame).
>
> I'm curious:
> How does the graph isomorphism code compare to Jena's?
CLEREZZA-67 was closed with the comment
"The current algorithm is highly inefficient in some situations, but as
with most real-world graphs it's reasonably fast, I suggest this to be
solved in a separate issue: CLEREZZA-81."

CLEREZZA-81 is described as
"The current GraphMatcher used in AbstractGraph.equals is efficient
when it can map all bnodes of the two Graphs by computing hashes on
them. When the hashes cannot be refined further, it simply tries all
permutations of the bnodes with the same hash. In some situations
this latter brute-force fallback is terribly inefficient, for example
if the compared graphs contain circles of bnodes connected with the
same property. In this and similar cases we should switch back to
hash-code based matching after randomly equating just two nodes of the
two graphs."

I have a vague recollection of it being massively slower than Jena on
such bnode circles while being slightly faster where the hash-based
matching succeeds.
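
To make the behaviour described in CLEREZZA-81 concrete, here is a minimal
sketch of hash-based bnode matching with a brute-force fallback. It is not
the Clerezza GraphMatcher: the class and method names are hypothetical,
triples are plain lists of three strings, and a label starting with "_:" is
assumed to denote a bnode. It also omits the proposed improvement of
re-hashing after equating two nodes.

```java
import java.util.*;

// Hypothetical, simplified sketch; NOT the Clerezza GraphMatcher.
final class BNodeMatchSketch {
    // A triple is a list [subject, predicate, object]; "_:" marks a bnode (assumption).
    static List<String> t(String s, String p, String o) { return Arrays.asList(s, p, o); }

    static boolean isBNode(String term) { return term.startsWith("_:"); }

    // Ground terms contribute themselves, bnodes contribute their current hash.
    static Object key(String term, Map<String, Integer> h) {
        return isBNode(term) ? (Object) h.get(term) : term;
    }

    // Compute a hash per bnode by repeatedly refining it from its neighbourhood.
    static Map<String, Integer> hashes(Set<List<String>> g) {
        Map<String, Integer> h = new HashMap<>();
        for (List<String> tr : g) {
            if (isBNode(tr.get(0))) h.put(tr.get(0), 0);
            if (isBNode(tr.get(2))) h.put(tr.get(2), 0);
        }
        for (int round = 0; round < h.size(); round++) {
            Map<String, Integer> next = new HashMap<>();
            for (String b : h.keySet()) {
                List<Integer> parts = new ArrayList<>();
                for (List<String> tr : g) {
                    if (tr.get(0).equals(b)) parts.add(Objects.hash("out", tr.get(1), key(tr.get(2), h)));
                    if (tr.get(2).equals(b)) parts.add(Objects.hash("in", tr.get(1), key(tr.get(0), h)));
                }
                Collections.sort(parts); // order of triples must not matter
                next.put(b, parts.hashCode());
            }
            h = next;
        }
        return h;
    }

    static boolean isomorphic(Set<List<String>> a, Set<List<String>> b) {
        if (a.size() != b.size()) return false;
        Map<String, Integer> ha = hashes(a), hb = hashes(b);
        if (ha.size() != hb.size()) return false;
        return assign(new ArrayList<>(ha.keySet()), 0, new HashMap<>(), new HashSet<>(), ha, hb, a, b);
    }

    // Fallback: try every assignment of bnodes that share a hash (factorial in the worst case).
    static boolean assign(List<String> todo, int i, Map<String, String> map, Set<String> used,
            Map<String, Integer> ha, Map<String, Integer> hb,
            Set<List<String>> a, Set<List<String>> b) {
        if (i == todo.size()) return rename(a, map).equals(b);
        String node = todo.get(i);
        for (String cand : hb.keySet()) {
            if (used.contains(cand) || !hb.get(cand).equals(ha.get(node))) continue;
            map.put(node, cand);
            used.add(cand);
            if (assign(todo, i + 1, map, used, ha, hb, a, b)) return true;
            map.remove(node);
            used.remove(cand);
        }
        return false;
    }

    // Apply a bnode mapping to every triple of g.
    static Set<List<String>> rename(Set<List<String>> g, Map<String, String> map) {
        Set<List<String>> out = new HashSet<>();
        for (List<String> tr : g)
            out.add(t(map.getOrDefault(tr.get(0), tr.get(0)), tr.get(1),
                      map.getOrDefault(tr.get(2), tr.get(2))));
        return out;
    }
}
```

In a circle of bnodes joined by the same property every node ends up with
the same hash, so the hashes separate nothing and the whole comparison
falls to the permutation search. That is the case where the fallback gets
expensive as the circle grows.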

Looking at the parser API at
http://incubator.apache.org/clerezza/mvn-site/org.apache.clerezza.rdf.core/apidocs/index.html?org/apache/clerezza/rdf/core/serializedform/package-summary.html
I think that it might be better not to require ParsingProviders to
return Graphs but to allow any TripleCollection, which doesn't carry
the requirements on the equals and hashCode methods that (immutable)
Graphs do. Currently it's hard for an implementation to provide correct
implementations of these methods without depending on more stuff.
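
As a rough illustration of that idea, the sketch below shows a parser that
only promises a plain collection of triples. The interface and class names
are hypothetical stand-ins, not the Clerezza interfaces, and the one-triple-
per-line input format is a toy invented for this example.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// Hypothetical, simplified stand-ins; these are NOT the Clerezza interfaces.
final class ParserSketch {
    // The parser only promises "some collection of triples"; it makes no
    // promise about isomorphism-aware equals() or hashCode().
    interface TripleParser {
        Collection<List<String>> parse(InputStream in, String formatIdentifier) throws IOException;
    }

    // Toy format (assumption): one "subject predicate object" triple per line.
    static final class ToyLineParser implements TripleParser {
        @Override
        public Collection<List<String>> parse(InputStream in, String formatIdentifier) throws IOException {
            List<List<String>> triples = new ArrayList<>();
            BufferedReader reader = new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
            String line;
            while ((line = reader.readLine()) != null) {
                String[] parts = line.trim().split("\\s+", 3);
                if (parts.length == 3) triples.add(Arrays.asList(parts)); // skip blank or short lines
            }
            return triples;
        }
    }

    // Convenience wrapper for trying the parser on an in-memory string.
    static Collection<List<String>> parseString(String text) {
        try {
            return new ToyLineParser().parse(
                    new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8)), "text/x-toy");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A caller that needs Graph semantics would wrap the returned collection
itself, so the parser needs no graph-matching code of its own.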

Reto.
>
>        Andy
>
