stanbol-dev mailing list archives

From Andrea Giovanni Nuzzolese <nuzzo...@cs.unibo.it>
Subject Re: SPARQL serialization of FieldQuery
Date Mon, 13 Jun 2011 17:30:08 GMT
Hi Rupert, all

my impression is that the limit in the CONSTRUCT should apply only at the level of the constraints
that the user has provided in the FieldQuery.
To be as clear as possible, I will again take my previous query example, in which I want
to find at most 3 owl:sameAs entities of <http://www4.wiwiss.fu-berlin.de/factbook/resource/United_States>

{
 "selected": ["http:\/\/www.w3.org\/2000\/01\/rdf-schema#label", "http:\/\/www.w3.org\/1999\/02\/22-rdf-syntax-ns#type"],

 "offset": "0", 
 "limit": "3", 
 "constraints": [ {"type": "reference", "field": "http:\/\/www.w3.org\/2002\/07\/owl#sameAs",
"value": "http:\/\/www4.wiwiss.fu-berlin.de\/factbook\/resource\/United_States"}] 
}

To select at most the requested number of entities, the query should be converted into a CONSTRUCT
like the following:

CONSTRUCT {
 ?id <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?v_1 .
 ?id <http://www.w3.org/2000/01/rdf-schema#label> ?v_2 .
 <http://www.iks-project.eu/ontology/rick/query/QueryResultSet>
 <http://www.iks-project.eu/ontology/rick/query/queryResult> ?id .
}
WHERE {
  {
    SELECT ?id
    WHERE { ?id <http://www.w3.org/2002/07/owl#sameAs> <http://www4.wiwiss.fu-berlin.de/factbook/resource/United_States> . }
    LIMIT 3
  }
  OPTIONAL { ?id <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?v_1 . } .
  OPTIONAL { ?id <http://www.w3.org/2000/01/rdf-schema#label> ?v_2 . }
}
ORDER BY DESC ( <LONG::IRI_RANK> (?id) )

In this way the limit works on the number of entities to fetch and not the number of solutions.
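
As a rough illustration of the rewrite described above, here is a minimal Python sketch that serializes a FieldQuery-like dictionary into a CONSTRUCT whose LIMIT/OFFSET sit inside a subquery on ?id. It is not the Entityhub's actual (Java) serializer: the names build_construct and FIELD_QUERY are hypothetical, only "reference" constraints are handled, the ?v_N variables are numbered in the order the selected fields are listed, and the ORDER BY on <LONG::IRI_RANK> is left out of the sketch.

```python
# Hypothetical helper, not part of Stanbol: serializes a FieldQuery-like dict
# so that LIMIT/OFFSET apply to entities (?id) rather than to solutions.
RESULT_SET = "http://www.iks-project.eu/ontology/rick/query/QueryResultSet"
QUERY_RESULT = "http://www.iks-project.eu/ontology/rick/query/queryResult"

# The example FieldQuery from this thread, as a plain Python dict.
FIELD_QUERY = {
    "selected": [
        "http://www.w3.org/2000/01/rdf-schema#label",
        "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
    ],
    "offset": "0",
    "limit": "3",
    "constraints": [
        {
            "type": "reference",
            "field": "http://www.w3.org/2002/07/owl#sameAs",
            "value": "http://www4.wiwiss.fu-berlin.de/factbook/resource/United_States",
        }
    ],
}


def build_construct(field_query):
    """Return a CONSTRUCT query string with the limit applied to ?id only."""
    # Assumption: only "reference" constraints are supported in this sketch.
    constraints = []
    for c in field_query["constraints"]:
        if c["type"] != "reference":
            raise ValueError("unsupported constraint type: %s" % c["type"])
        constraints.append("?id <%s> <%s> ." % (c["field"], c["value"]))

    selected = field_query["selected"]
    limit = int(field_query["limit"])
    offset = int(field_query.get("offset", 0))

    lines = ["CONSTRUCT {"]
    for i, field in enumerate(selected, 1):
        lines.append("  ?id <%s> ?v_%d ." % (field, i))
    lines.append("  <%s> <%s> ?id ." % (RESULT_SET, QUERY_RESULT))
    lines += ["}", "WHERE {", "  {", "    SELECT ?id", "    WHERE {"]
    lines += ["      " + c for c in constraints]
    lines.append("    }")
    # LIMIT/OFFSET sit inside the subquery, so they count entities.
    lines.append("    LIMIT %d" % limit)
    if offset > 0:
        lines.append("    OFFSET %d" % offset)
    lines.append("  }")
    # The OPTIONALs are joined outside the subquery, so several values per
    # entity add solutions without consuming the limit.
    for i, field in enumerate(selected, 1):
        lines.append("  OPTIONAL { ?id <%s> ?v_%d . }" % (field, i))
    lines.append("}")
    return "\n".join(lines)


print(build_construct(FIELD_QUERY))
```

The point of the design is simply where the LIMIT is emitted: inside the subquery it bounds the ?id bindings, while the OPTIONAL patterns outside can still return every type and label of those entities.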

Best.

--
Andrea Giovanni Nuzzolese
Semantic Technology Laboratory (STLab)
Institute for Cognitive Science and Technology (ISTC)
National Research Council  (CNR)
Via Nomentana 56, Roma - Italy

On Jun 13, 2011, at 6:47 PM, Rupert Westenthaler wrote:

> Hi Andrea, all
> 
> While working on this I discovered something else:
> 
> The "LIMIT" of SPARQL is based on solutions. For the Entityhub, its
> semantics are that it defines the number of entities to select.
> 
> When selecting multiple variables there is the possibility (actually
> it is very likely) that there are multiple solutions/entity. Therefore
> it can no longer be predicted how many entities are selected by a
> SPARQL query. This also affects CONSTRUCT queries as it will restrict
> the numbers of triples added to the resulting graph.
> 
> Is there any possibility to work around this problem?
> 
> best
> Rupert Westenthaler
> 
> On Thu, Jun 9, 2011 at 3:57 PM, Andrea Giovanni Nuzzolese
> <nuzzoles@cs.unibo.it> wrote:
>> The issue is STANBOL-222.
>> 
>> Best.
>> Andrea
>> 
>> On Jun 9, 2011, at 9:54 AM, Rupert Westenthaler wrote:
>> 
>>> Hi
>>> 
>>> Thx a lot. I was aware of the fact that optional constraints do cause
>>> this problem, but I did not know that rewriting the query like this
>>> can solve this problem.
>>> I will try to change the generation of SPARQL queries tomorrow on the
>>> way back from Berlin to Salzburg.
>>> 
>>> would be nice if you could create a JIRA issue for that
>>> 
>>> best
>>> Rupert
>>> 
>>> On Wed, Jun 8, 2011 at 7:20 PM, Andrea Giovanni Nuzzolese
>>> <nuzzoles@cs.unibo.it> wrote:
>>>> Hi,
>>>> I am trying to use the query service provided by the EntityHub.
>>>> The JSON field query I wrote is the following
>>>> {
>>>> "selected": ["http:\/\/www.w3.org\/2000\/01\/rdf-schema#label",
>>>> "http:\/\/www.w3.org\/1999\/02\/22-rdf-syntax-ns#type"],
>>>> "offset": "0",
>>>> "limit": "3",
>>>> "constraints": [
>>>> {
>>>> "type": "reference",
>>>> "field": "http:\/\/www.w3.org\/2002\/07\/owl#sameAs",
>>>> "value":
>>>> "http:\/\/www4.wiwiss.fu-berlin.de\/factbook\/resource\/United_States"
>>>> }]
>>>> }
>>>> and aims to find all the owl:sameAs entities of the entity
>>>> <http://www4.wiwiss.fu-berlin.de/factbook/resource/United_States> in
>>>> DBpedia.
>>>> But instead of the graph with the expected results I always receive the
>>>> error
>>>> javax.ws.rs.WebApplicationException:
>>>> org.apache.stanbol.entityhub.servicesapi.site.ReferencedSiteException:
>>>> Unable execute Query on remote site http://dbpedia.org/sparql
>>>> The sparql construct serialized by the EntityHub is the following
>>>> CONSTRUCT {
>>>> ?id <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?v_1 .
>>>> ?id <http://www.w3.org/2000/01/rdf-schema#label> ?v_2 .
>>>> <http://www.iks-project.eu/ontology/rick/query/QueryResultSet>
>>>> <http://www.iks-project.eu/ontology/rick/query/queryResult> ?id .
>>>> }
>>>> WHERE {
>>>> { {?id <http://www.w3.org/2002/07/owl#sameAs>
>>>> <http://www4.wiwiss.fu-berlin.de/factbook/resource/United_States> }}
>>>> { OPTIONAL { ?id <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?v_1 . } }
>>>> { OPTIONAL { ?id <http://www.w3.org/2000/01/rdf-schema#label> ?v_2 . } }
>>>> }
>>>> ORDER BY DESC ( <LONG::IRI_RANK> (?id) )  LIMIT 3
>>>> If you try to run this construct directly on the SPARQL endpoint of dbpedia
>>>> the response is always "Virtuoso 42000 Error The estimated execution time 0
>>>> (sec) exceeds the limit of 1500 (sec)."
>>>> It seems that the problem is in the graph patterns containing OPTIONAL, as
>>>> they are bound to the whole default graph ({ OPTIONAL { triple pattern } } is
>>>> equivalent to { {} OPTIONAL { triple pattern } }).
>>>> In fact, the query works properly both if you merge the three graph
>>>> patterns into one graph pattern, as follows
>>>> ...
>>>> WHERE {
>>>> ?id <http://www.w3.org/2002/07/owl#sameAs>
>>>> <http://www4.wiwiss.fu-berlin.de/factbook/resource/United_States> .
>>>> OPTIONAL { ?id <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?v_1 . }
>>>> OPTIONAL { ?id <http://www.w3.org/2000/01/rdf-schema#label> ?v_2 . }
>>>> }
>>>> ...
>>>> 
>>>> and if you bind the optional to the global solution of the WHERE clause,
>>>> i.e.
>>>> ...
>>>> WHERE {
>>>> {{?id <http://www.w3.org/2002/07/owl#sameAs>
>>>> <http://www4.wiwiss.fu-berlin.de/factbook/resource/United_States> . }}
>>>> OPTIONAL { ?id <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?v_1 . } .
>>>> OPTIONAL { ?id <http://www.w3.org/2000/01/rdf-schema#label> ?v_2 . }
>>>> }
>>>> ...
>>>> 
>>>> 
>>>> Regards.
>>>> --
>>>> Andrea Giovanni Nuzzolese
>>>> Semantic Technology Laboratory (STLab)
>>>> Institute for Cognitive Science and Technology (ISTC)
>>>> National Research Council  (CNR)
>>>> Via Nomentana 56, Roma - Italy
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>> 
>>> 
>>> 
>>> --
>>> | Rupert Westenthaler             rupert.westenthaler@gmail.com
>>> | Bodenlehenstraße 11                             ++43-699-11108907
>>> | A-5500 Bischofshofen
>> 
>> --
>> Andrea Giovanni Nuzzolese
>> Semantic Technology Laboratory (STLab)
>> Institute for Cognitive Science and Technology (ISTC)
>> National Research Council  (CNR)
>> Via Nomentana 56, Roma - Italy
>> 
>> 
>> 
>> 
>> 
>> 
>> 
> 
> 
> 
> -- 
> | Rupert Westenthaler             rupert.westenthaler@gmail.com
> | Bodenlehenstraße 11                             ++43-699-11108907
> | A-5500 Bischofshofen








