Message-ID: <50C376C6.1040402@innovationengineering.eu>
Date: Sat, 08 Dec 2012 18:20:06 +0100
From: Giuseppe Miscione
To: clerezza-dev@incubator.apache.org
Subject: Re: Bug in SPARQL query serialization

Exactly, I was only suggesting to allocate LinkedHashSets instead of HashSets.
I tried this solution on my version of the code, using LinkedHashSets in these classes:

 * org.apache.clerezza.rdf.core.sparql.query.impl.SimpleBasicGraphPattern
 * org.apache.clerezza.rdf.core.sparql.query.impl.SimpleConstructQuery
 * org.apache.clerezza.rdf.core.sparql.query.impl.SimpleDataSet
 * org.apache.clerezza.rdf.core.sparql.query.impl.SimpleGroupGraphPattern

With this change, the test case that I suggested doesn't fail anymore. On the other hand, I checked the generated strings, and the FILTER and OPTIONAL clauses are not in the original order.
This is the original query:

PREFIX mo: <...project.eu/ontologies/market_ontology.owl#>
PREFIX list: <http://jena.hpl.hp.com/ARQ/list#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?property ?range ?property_description ?subproperty ?subproperty_description
WHERE
{
    ?property a owl:ObjectProperty .
    FILTER (?property != owl:bottomObjectProperty) .
    {
        {
            ?property rdfs:domain ?superclass .
            mo:Company rdfs:subClassOf ?superclass .
        }
        UNION
        {
            ?property rdfs:domain ?dunion .
            ?dunion owl:unionOf ?dlist .
            ?dlist list:member ?superclass .
            mo:Company rdfs:subClassOf ?superclass .
        }
    }
    {
        {
            ?property rdfs:range ?superrange .
            ?range rdfs:subClassOf ?superrange .
            FILTER (!isBlank(?range)) .
        }
        UNION
        {
            ?property rdfs:range ?range .
            FILTER (!isBlank(?range)) .
        }
    } .
    FILTER (?range != owl:Nothing) .
    OPTIONAL { ?somesub rdfs:subClassOf ?range . FILTER(?somesub != owl:Nothing && ?somesub != ?range)}
    FILTER (!bound(?somesub)) .
    OPTIONAL
    {
        ?subproperty rdfs:subPropertyOf ?property .
        FILTER(?subproperty != owl:bottomObjectProperty && ?subproperty != ?property)
        OPTIONAL { ?subproperty dc:description ?subproperty_description . }
    }
    OPTIONAL { ?property dc:description ?property_description . }
}

And this is the serialized string:

SELECT ?property ?range ?property_description ?subproperty ?subproperty_description
WHERE
{
    ?property .
    {
        {
            ?property ?superclass .
            ?superclass .
        }
        UNION
        {
            ?property ?dunion .
            ?dunion ?dlist .
            ?dlist ?superclass .
            ?superclass .
        }
    }
    {
        {
            ?property ?superrange .
            ?range ?superrange .
            FILTER (! (isBLANK(?range)))
        }
        UNION
        {
            ?property ?range .
            FILTER (! (isBLANK(?range)))
        }
    }
    OPTIONAL
    {
        ?somesub ?range .
        FILTER (((?somesub) != ()) && ((?somesub) != (?range)))
    }
    OPTIONAL
    {
        ?subproperty ?property .
        OPTIONAL { ?subproperty ?subproperty_description . }
        FILTER (((?subproperty) != ()) && ((?subproperty) != (?property)))
    }
    OPTIONAL { ?property ?property_description . }
    FILTER ((?property) != ())
    FILTER ((?range) != ())
    FILTER (! (BOUND(?somesub)))
}

I ran this version of the serialized query against the reasoner-powered graph that raised the problem, and it worked fine.

On 08/12/2012 17:54, Reto Bachmann-Gmür wrote:
> Ok, so you're suggesting not to change any interface but simply the
> implementation to preserve the order. If the order has no relevance by the
> SPARQL spec then I would prefer that solution.
>
> Cheers,
> Reto
>
> On Sat, Dec 8, 2012 at 5:42 PM, Giuseppe Miscione <
> g.miscione@innovationengineering.eu> wrote:
>
>> Hi Reto,
>> in revision 1353713 SimpleBasicGraphPattern contains this code:
>>
>> public SimpleBasicGraphPattern(Set<TriplePattern> triplePatterns) {
>>     this.triplePatterns = (triplePatterns == null)
>>             ? new HashSet<TriplePattern>()
>>             : triplePatterns;
>> }
>>
>> This implementation uses a HashSet, which will mess up the order of the
>> added elements. By simply allocating a LinkedHashSet you'll keep the Set
>> logic (no duplicates) and you'll preserve the order in which the elements
>> are added to the set. BasicGraphPattern won't be affected at all: it will
>> continue to expose a Set, but the underlying LinkedHashSet implementation
>> will keep the order of the triple patterns.
>> I don't know why Sets were used before (maybe to avoid the presence of a
>> duplicate triple pattern in the same graph pattern?), but the solution
>> implemented by Hasan completely changed the interface of BasicGraphPattern,
>> deprecating a method and adding an equivalent one.
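
A minimal sketch of the LinkedHashSet approach described above, not the actual Clerezza code. OrderedPatternHolder is a hypothetical stand-in for a class such as SimpleBasicGraphPattern, and plain Strings stand in for TriplePattern objects. Unlike the constructor quoted above, this sketch copies the argument instead of storing it, which is an assumption made only to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a graph pattern class; String replaces TriplePattern.
final class OrderedPatternHolder {

    private final Set<String> triplePatterns;

    public OrderedPatternHolder(Set<String> triplePatterns) {
        // LinkedHashSet keeps Set semantics (no duplicates) and iterates
        // in insertion order, which a plain HashSet does not guarantee.
        this.triplePatterns = (triplePatterns == null)
                ? new LinkedHashSet<String>()
                : new LinkedHashSet<String>(triplePatterns);
    }

    // The public type is still Set, so callers do not see the implementation.
    public Set<String> getTriplePatterns() {
        return Collections.unmodifiableSet(triplePatterns);
    }

    public static void main(String[] args) {
        // The producer (here: the caller) must also use an ordered set,
        // otherwise the order is already lost before the copy is made.
        Set<String> parsed = new LinkedHashSet<String>(Arrays.asList(
                "?property a owl:ObjectProperty .",
                "?property rdfs:domain ?superclass .",
                "mo:Company rdfs:subClassOf ?superclass ."));

        OrderedPatternHolder holder = new OrderedPatternHolder(parsed);
        List<String> inOrder = new ArrayList<String>(holder.getTriplePatterns());
        for (String pattern : inOrder) {
            System.out.println(pattern);
        }
    }
}
```

Copying into a LinkedHashSet only preserves the iteration order of whatever set is passed in, so every place that builds the set (the parser included) would need to allocate an ordered set as well. That is consistent with the change touching several classes rather than one.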
>>
>> On 08/12/2012 17:13, Reto Bachmann-Gmür wrote:
>>
>>> Hi Giuseppe and Hasan
>>>
>>> If the order of the patterns is relevant then we should model this as a
>>> list. Using LinkedHashSet in BasicGraphPattern would tie this to a
>>> particular implementation. As far as I know the order of the clauses has
>>> no relevance by the SPARQL spec (like the order of triples in a graph).
>>> But we could maybe change our implementation so that it no longer
>>> supports querying by queries described as object trees but only as
>>> strings; the parsing necessary for the fastlane could be limited to
>>> detecting the type of query (to parse the result in the right way) and
>>> the graphs against which the query is directed.
>>>
>>> Cheers,
>>> Reto
>>>
>>> On Sat, Dec 8, 2012 at 3:45 PM, Giuseppe Miscione <
>>> g.miscione@innovationengineering.eu> wrote:
>>>
>>>> Hi Hasan,
>>>> I had a look at the code changes that you've made.
>>>> I saw that you introduced, in the objects produced by the parser,
>>>> methods that now work with Lists, and that you've deprecated the
>>>> methods working with Sets. Now, I have a personal consideration:
>>>> wouldn't it be easier to restore the old code and use LinkedHashSets
>>>> instead of HashSets, without changing the class interfaces so much by
>>>> introducing deprecated methods?
>>>>
>>>> On 08/12/2012 14:27, Hasan Hasan wrote:
>>>>
>>>>> Hi Giuseppe
>>>>>
>>>>> I have resolved the issue CLEREZZA-725, which reflects the problem
>>>>> you raised.
>>>>>
>>>>> Kind regards
>>>>> Hasan
>>>>>
>>>>> On Tue, Dec 4, 2012 at 11:34 PM, Hasan Hasan wrote:
>>>>>
>>>>>> Thanks Giuseppe
>>>>>>
>>>>>> I'll try the test as soon as I have time during this week.
>>>>>>
>>>>>> Cheers
>>>>>> Hasan
>>>>>>
>>>>>> On Mon, Dec 3, 2012 at 3:13 PM, Giuseppe Miscione <
>>>>>> g.miscione@innovationengineering.eu> wrote:
>>>>>>
>>>>>>> Hi Hasan,
>>>>>>> I prepared a JUnit test method that clarifies the problem:
>>>>>>>
>>>>>>> @Test
>>>>>>> public void testParseMultipleTimes() throws Exception {
>>>>>>>     String queryString =
>>>>>>>         "PREFIX mo: <...project.eu/ontologies/market_ontology.owl#>\n" +
>>>>>>>         "PREFIX list: <http://jena.hpl.hp.com/ARQ/list#>\n" +
>>>>>>>         "PREFIX owl: <http://www.w3.org/2002/07/owl#>\n" +
>>>>>>>         "PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>\n" +
>>>>>>>         "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n" +
>>>>>>>         "PREFIX dc: <http://purl.org/dc/elements/1.1/>\n" +
>>>>>>>         "SELECT ?property ?range ?property_description ?subproperty ?subproperty_description\n" +
>>>>>>>         "WHERE {\n" +
>>>>>>>         "  ?property a owl:ObjectProperty .\n" +
>>>>>>>         "  FILTER (?property != owl:bottomObjectProperty) .\n" +
>>>>>>>         "  {\n" +
>>>>>>>         "    {\n" +
>>>>>>>         "      ?property rdfs:domain ?superclass .\n" +
>>>>>>>         "      mo:Company rdfs:subClassOf ?superclass .\n" +
>>>>>>>         "    }\n" +
>>>>>>>         "    UNION\n" +
>>>>>>>         "    {\n" +
>>>>>>>         "      ?property rdfs:domain ?dunion .\n" +
>>>>>>>         "      ?dunion owl:unionOf ?dlist .\n" +
>>>>>>>         "      ?dlist list:member ?superclass .\n" +
>>>>>>>         "      mo:Company rdfs:subClassOf ?superclass .\n" +
>>>>>>>         "    }\n" +
>>>>>>>         "  }\n" +
>>>>>>>         "  {\n" +
>>>>>>>         "    {\n" +
>>>>>>>         "      ?property rdfs:range ?superrange .\n" +
>>>>>>>         "      ?range rdfs:subClassOf ?superrange .\n" +
>>>>>>>         "      FILTER (!isBlank(?range)) .\n" +
>>>>>>>         "    }\n" +
>>>>>>>         "    UNION\n" +
>>>>>>>         "    {\n" +
>>>>>>>         "      ?property rdfs:range ?range .\n" +
>>>>>>>         "      FILTER (!isBlank(?range)) .\n" +
>>>>>>>         "    }\n" +
>>>>>>>         "  } .\n" +
>>>>>>>         "  FILTER (?range != owl:Nothing) .\n" +
>>>>>>>         "  OPTIONAL { ?somesub rdfs:subClassOf ?range . FILTER(?somesub != owl:Nothing && ?somesub != ?range)}\n" +
>>>>>>>         "  FILTER (!bound(?somesub)) .\n" +
>>>>>>>         "  OPTIONAL {\n" +
>>>>>>>         "    ?subproperty rdfs:subPropertyOf ?property .\n" +
>>>>>>>         "    FILTER(?subproperty != owl:bottomObjectProperty && ?subproperty != ?property)\n" +
>>>>>>>         "    OPTIONAL { ?subproperty dc:description ?subproperty_description . }\n" +
>>>>>>>         "  }\n" +
>>>>>>>         "  OPTIONAL { ?property dc:description ?property_description . }\n" +
>>>>>>>         "} ";
>>>>>>>
>>>>>>>     Query query1 = QueryParser.getInstance().parse(queryString);
>>>>>>>     System.out.println(query1.toString());
>>>>>>>
>>>>>>>     System.out.println("-----------------------");
>>>>>>>
>>>>>>>     Thread.sleep(5000L);
>>>>>>>
>>>>>>>     Query query2 = QueryParser.getInstance().parse(queryString);
>>>>>>>     System.out.println(query2.toString());
>>>>>>>
>>>>>>>     Assert.assertEquals(query1.toString(), query2.toString());
>>>>>>> }
>>>>>>>
>>>>>>> By separating the two parse() calls with a 5-second sleep, you'll see
>>>>>>> that the two parsed objects produce different strings. Without the
>>>>>>> Thread.sleep() call the test method doesn't fail, so I think there's
>>>>>>> something time-related in the javacc parser that mixes up the parsed
>>>>>>> statements.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Giuseppe
>>>>>>>
>>>>>>> On 03/12/2012 10:25, Giuseppe Miscione wrote:
>>>>>>>
>>>>>>>> Hi Hasan,
>>>>>>>> this is the query on which I was working:
>>>>>>>>
>>>>>>>> PREFIX mo: <...project.eu/ontologies/market_ontology.owl#>
>>>>>>>> PREFIX list: <http://jena.hpl.hp.com/ARQ/list#>
>>>>>>>> PREFIX owl: <http://www.w3.org/2002/07/owl#>
>>>>>>>> PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
>>>>>>>> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
>>>>>>>> PREFIX dc: <http://purl.org/dc/elements/1.1/>
>>>>>>>> SELECT ?property ?range ?property_description ?subproperty ?subproperty_description
>>>>>>>> WHERE {
>>>>>>>>   ?property a owl:ObjectProperty .
>>>>>>>>   FILTER (?property != owl:bottomObjectProperty) .
>>>>>>>>   {
>>>>>>>>     {
>>>>>>>>       ?property rdfs:domain ?superclass .
>>>>>>>>       mo:Company rdfs:subClassOf ?superclass .
>>>>>>>>     }
>>>>>>>>     UNION
>>>>>>>>     {
>>>>>>>>       ?property rdfs:domain ?dunion .
>>>>>>>>       ?dunion owl:unionOf ?dlist .
>>>>>>>>       ?dlist list:member ?superclass .
>>>>>>>>       mo:Company rdfs:subClassOf ?superclass .
>>>>>>>>     }
>>>>>>>>   }
>>>>>>>>   {
>>>>>>>>     {
>>>>>>>>       ?property rdfs:range ?superrange .
>>>>>>>>       ?range rdfs:subClassOf ?superrange .
>>>>>>>>       FILTER (!isBlank(?range)) .
>>>>>>>>     }
>>>>>>>>     UNION
>>>>>>>>     {
>>>>>>>>       ?property rdfs:range ?range .
>>>>>>>>       FILTER (!isBlank(?range)) .
>>>>>>>>     }
>>>>>>>>   } .
>>>>>>>>   FILTER (?range != owl:Nothing) .
>>>>>>>>   OPTIONAL { ?somesub rdfs:subClassOf ?range . FILTER(?somesub != owl:Nothing && ?somesub != ?range)}
>>>>>>>>   FILTER (!bound(?somesub)) .
>>>>>>>>   OPTIONAL {
>>>>>>>>     ?subproperty rdfs:subPropertyOf ?property .
>>>>>>>>     FILTER(?subproperty != owl:bottomObjectProperty && ?subproperty != ?property)
>>>>>>>>     OPTIONAL { ?subproperty dc:description ?subproperty_description . }
>>>>>>>>   }
>>>>>>>>   OPTIONAL { ?property dc:description ?property_description . }
>>>>>>>> }
>>>>>>>>
>>>>>>>> On 03/12/2012 07:53, Hasan Hasan wrote:
>>>>>>>>
>>>>>>>>> Hi Giuseppe
>>>>>>>>>
>>>>>>>>> Can you please provide an example of the query that you use and
>>>>>>>>> that I can reproduce easily?
>>>>>>>>> I will try to take some time this week to have a look.
>>>>>>>>>
>>>>>>>>> Kind regards
>>>>>>>>> Hasan
>>>>>>>>>
>>>>>>>>> On Fri, Nov 30, 2012 at 5:36 PM, Giuseppe Miscione <
>>>>>>>>> g.miscione@innovationengineering.eu> wrote:
>>>>>>>>>
>>>>>>>>>> Hi all,
>>>>>>>>>>
>>>>>>>>>> I found a bug in the SPARQL query execution chain, specifically in
>>>>>>>>>> the serialization of org.apache.clerezza.rdf.core.sparql.query.Query
>>>>>>>>>> objects performed by the methods of
>>>>>>>>>> org.apache.clerezza.rdf.core.sparql.query.impl.SimpleStringQuerySerializer.
>>>>>>>>>> The problem comes from the fact that the javacc objects used for
>>>>>>>>>> mapping triple patterns are not listed in the same order as in the
>>>>>>>>>> original query string. SimpleStringQuerySerializer serializes
>>>>>>>>>> patterns into the output string in the order returned by the javacc
>>>>>>>>>> parser, and so the output string won't always be equivalent to the
>>>>>>>>>> source one. Moreover, parsing the same query string multiple times
>>>>>>>>>> will result in different output strings.
>>>>>>>>>>
>>>>>>>>>> This problem is even more annoying when executing (as in my case)
>>>>>>>>>> queries on graphs enhanced with the Pellet reasoner, because it has
>>>>>>>>>> obvious difficulties in inferring relations if the order of triple
>>>>>>>>>> patterns in the query is not the provided one.
>>>>>>>>>>
>>>>>>>>>> I solved the problem in my environment by simply saving the
>>>>>>>>>> original string into the parsed Query object and then making
>>>>>>>>>> SimpleStringQuerySerializer return this string, without any
>>>>>>>>>> processing.
>>>>>>>>>>
>>>>>>>>>> Can anyone take a look at the serializer to find a possibly better
>>>>>>>>>> solution to avoid this weird behaviour?
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>> Giuseppe Miscione
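
The point that two parses can be equal as sets of patterns yet serialize to different strings can be illustrated with a small sketch. It assumes a hypothetical serialize() helper that simply joins patterns in iteration order; it is not SimpleStringQuerySerializer, and the pattern strings are made up for the example.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

final class SerializationOrderDemo {

    // Hypothetical serializer: writes patterns in the set's iteration order.
    public static String serialize(Set<String> patterns) {
        return String.join(" ", patterns);
    }

    // Builds a set that iterates in the order the patterns are given.
    public static Set<String> orderedSet(String... patterns) {
        return new LinkedHashSet<String>(Arrays.asList(patterns));
    }

    public static void main(String[] args) {
        Set<String> first = orderedSet("?s a ?type .", "?s ?p ?o .");
        Set<String> second = orderedSet("?s ?p ?o .", "?s a ?type .");

        // Set equality ignores iteration order, so this prints true.
        System.out.println(first.equals(second));

        // The serialized strings follow iteration order, so this prints false.
        System.out.println(serialize(first).equals(serialize(second)));
    }
}
```

This is why a test that compares toString() output is sensitive to iteration order even when the underlying pattern sets are equal, and why a string-based serializer needs a stable order to produce repeatable output.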