commons-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (COMMONSRDF-55) Stream of Jena quads use wrong IRI for default graph
Date Mon, 06 Feb 2017 15:08:41 GMT

    [ https://issues.apache.org/jira/browse/COMMONSRDF-55?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15854179#comment-15854179
] 

ASF GitHub Bot commented on COMMONSRDF-55:
------------------------------------------

Github user stain commented on the issue:

    https://github.com/apache/commons-rdf/pull/32
  
    I guess it's a question of where we put the "inconsistency" barrier. We can probably assume
that in the odd case that `urn:x-arq:DefaultGraph` appears literally in a non-Jena `IRI` or
a non-Jena `Quad`, it must have leaked out of Jena somehow, and it will be treated as a real
IRI. It would magically become the default graph only if such a quad is added to a Jena dataset.
    
    That would mean we let Commons RDF construction by component of a Jena-based quad preserve
`g` just as in other implementations. 
    
    With option **(2)** above we would add JenaRDF-specific recognition of the magic IRI if
it happens to be backed by a Jena `Node` (which might even be because it was made from a string).
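    
    As a rough sketch of what option (2) amounts to: the check only fires when the graph
name is both Jena-backed and equal to the magic IRI. Everything below (`GraphNameSketch`,
`Iri`, `jenaBacked`, `graphNameFor`) is a hypothetical stand-in for illustration, not the
Commons RDF or Jena API.
    
    ```java
    import java.util.Optional;

    /** Hypothetical stand-ins for illustration; NOT the Commons RDF or Jena API. */
    class GraphNameSketch {
        static final String MAGIC = "urn:x-arq:DefaultGraph";

        /** Minimal IRI model: a string plus a flag for whether a Jena Node backs it. */
        static final class Iri {
            final String value;
            final boolean jenaBacked;

            Iri(String value, boolean jenaBacked) {
                this.value = value;
                this.jenaBacked = jenaBacked;
            }
        }

        /** Option (2): only a Jena-backed magic IRI collapses to the default graph. */
        static Optional<Iri> graphNameFor(Iri g) {
            if (g.jenaBacked && MAGIC.equals(g.value)) {
                return Optional.empty();
            }
            return Optional.of(g); // any other graph name is preserved as given
        }
    }
    ```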

    
    It would probably be cleaner in Commons RDF for a Quad to magically change only on insertion
into a Jena-backed Dataset, rather than when the Quad is made with a particular back-end - e.g. you
add one quad, but a slightly different one comes back out, which will not be `.equals()` to the
inserted one.  This is not very different from stores with inferred rules or blank-node adaptations.
 (Commons RDF Graph/Dataset contracts do not require the exact triple/quad to be returned
back again.)
    
    So I think that would be the semantically cleanest solution, where each `RDF` implementation
behaves the same, but each `Dataset` has slight variations.
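    
    To make the insertion-time barrier concrete, here is a minimal sketch of a Jena-like
dataset that normalizes the magic IRI only when a quad is added. `InsertBarrierSketch` is
hypothetical and tracks graph names as plain strings; it is not how `JenaDatasetImpl` is
implemented.
    
    ```java
    import java.util.ArrayList;
    import java.util.List;
    import java.util.Optional;

    /** Hypothetical stand-in for a Jena-backed dataset; tracks graph names only. */
    class InsertBarrierSketch {
        static final String MAGIC = "urn:x-arq:DefaultGraph";

        private final List<Optional<String>> graphNames = new ArrayList<>();

        /** On insertion, the magic IRI becomes the default graph (empty Optional). */
        void add(Optional<String> graphName) {
            graphNames.add(graphName.filter(g -> !MAGIC.equals(g)));
        }

        /** Graph names as stored, which may differ from what was inserted. */
        List<Optional<String>> graphNames() {
            return graphNames;
        }
    }
    ```
    
    With this shape, a quad added with the magic IRI as its graph name comes back with an
empty graph name, so it is not equal to what was inserted, which the Graph/Dataset contracts
allow.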
    
    However, it is not given that a `Quad` made with `JenaRDF` will be added to a Jena-based
`Dataset`, but that is probably most likely. It is not given that a `Node` that is `urn:x-arq:DefaultGraph`
was picked from the constant `Node.defaultNode`, but it is likely. It is not given that a
literal Graph IRI `urn:x-arq:DefaultGraph` has leaked from Jena's `Node.defaultNode`, but it
is likely.
    
    
    Therefore the most pragmatic choice for Commons RDF users, if semantically slightly unclean,
would be option (2), as @ajs6f says. It means there would be only this inconsistency barrier:
    
    ```java
    RDF simple = new SimpleRDF();
    RDF jena = new JenaRDF();
    
    IRI defaultS = simple.createIRI("urn:x-arq:DefaultGraph");
    IRI defaultJ = jena.createIRI("urn:x-arq:DefaultGraph"); // or jena.asRDFTerm(Node.defaultGraph)
    assertEquals(defaultS, defaultJ);
    
    IRI ex = jena.createIRI("http://example.com/");
    
    // q1: the graph name is a non-Jena IRI, so it is preserved
    Quad q1 = jena.createQuad(defaultS, ex, ex, ex);
    assertTrue(q1.getGraphName().isPresent());
    assertEquals(defaultS, q1.getGraphName().get()); // as-is
    // q2: the graph name is a Jena-backed IRI, so it is recognized as the default graph
    Quad q2 = jena.createQuad(defaultJ, ex, ex, ex);
    assertFalse(q2.getGraphName().isPresent()); // INCONSISTENT with q1
    assertFalse(q1.equals(q2)); // INCONSISTENT
    ```
    
    (Adding either `q1` or `q2` to a Jena-backed Dataset would convert it to the q2 form,
with `Optional.empty()` on retrieval; adding them to any non-Jena Dataset implementation
would yield what look like two different quads.)
    
    This will technically break the [SHOULD contract](https://github.com/apache/commons-rdf/blob/0.3.0-incubating/api/src/main/java/org/apache/commons/rdf/api/RDF.java#L234)
of `RDF.createQuad()` which says the parameters should be preserved. 
    
    >     * The returned Quad SHOULD have a {@link Quad#getGraphName()} that is equal
    >     * to the provided graphName, a {@link Quad#getSubject()} that is equal to
    >     * the provided subject, a {@link Quad#getPredicate()} that is equal to the
    >     * provided predicate, and a {@link Quad#getObject()} that is equal to the
    >     * provided object.
    
    but I think this is a valid breaking of SHOULD, particularly if we do it only on "our
own" Jena-backed IRIs.
    



> Stream of Jena quads use wrong IRI for default graph
> ----------------------------------------------------
>
>                 Key: COMMONSRDF-55
>                 URL: https://issues.apache.org/jira/browse/COMMONSRDF-55
>             Project: Apache Commons RDF
>          Issue Type: Bug
>          Components: jena
>    Affects Versions: 0.3.0
>            Reporter: Stian Soiland-Reyes
>            Assignee: Stian Soiland-Reyes
>             Fix For: 1.0.0
>
>
> See https://travis-ci.org/apache/commons-rdf/builds/195548479
> {code}
> org.apache.commons.rdf.jena.DatasetJenaTest
> streamLanguageTagsCaseInsensitive(org.apache.commons.rdf.jena.DatasetJenaTest)  Time elapsed: 0.012 sec  <<< FAILURE!
> java.lang.AssertionError: expected:<<http://example.com/s1> <http://example.com/greeting> "Hello"@EN-GB <urn:x-arq:DefaultGraph>.> but was:<<http://example.com/s1> <http://example.com/greeting> "Hello"@en-GB .>
> {code}
> Jena uses the IRI `<urn:x-arq:DefaultGraph>` internally to represent the default graph within datasets - we need to recognize that on the way out of a `JenaDatasetImpl.stream()` and possibly in the `asQuad(JenaQuad)` converter and replace it with `Optional.empty()` so the default graph appears the same across implementations.
> The `AbstractDatasetTest` should be augmented to do more tests on the default graph, including `.stream()`, `.iterate()`, `.contains()` and `.remove()`.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
