stanbol-dev mailing list archives

From Andrea Giovanni Nuzzolese <nuzzo...@cs.unibo.it>
Subject Re: SPARQL serialization of FieldQuery
Date Tue, 14 Jun 2011 08:47:04 GMT
Hi,

I fear that right now subqueries are the only workaround for this sort of problem with
a single query.

A possible solution in SPARQL 1.0 is to split the query into two separate queries:
(i) select the right number of entities according to the limit
(ii) construct the graph with the required information related to the selected entities
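
The two steps above can be sketched as follows. This is only a rough illustration in Python that builds the two query strings; the function names are hypothetical and are not part of the Entityhub. It assumes a single reference constraint, leaves out the QueryResultSet triple and the ORDER BY, and repeats the constraint in step (ii) so that ?id is bound before the FILTER (SPARQL 1.0 has no VALUES clause).

```python
def select_ids_query(field, value, limit, offset=0):
    """Step (i): select at most `limit` entities matching the constraint."""
    return (
        "SELECT DISTINCT ?id\n"
        "WHERE { ?id <%s> <%s> . }\n"
        "LIMIT %d OFFSET %d" % (field, value, limit, offset)
    )


def construct_query(ids, field, value, selected):
    """Step (ii): construct the graph for the entities returned by step (i)."""
    if not ids:
        raise ValueError("step (i) returned no entities, nothing to construct")
    numbered = list(enumerate(selected, 1))
    template = "\n".join("  ?id <%s> ?v_%d ." % (f, i) for i, f in numbered)
    optionals = "\n".join(
        "  OPTIONAL { ?id <%s> ?v_%d . }" % (f, i) for i, f in numbered
    )
    # SPARQL 1.0 has no VALUES clause, so the selected ids go into a FILTER.
    id_filter = " || ".join("?id = <%s>" % uri for uri in ids)
    return (
        "CONSTRUCT {\n%s\n}\n"
        "WHERE {\n"
        "  ?id <%s> <%s> .\n"
        "%s\n"
        "  FILTER ( %s )\n"
        "}" % (template, field, value, optionals, id_filter)
    )
```

The ids passed to construct_query are whatever the endpoint returned for the first query; if that result is empty, the second query is simply not sent.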

Just to detail the problem: since 2009 there has been a proposal for a new feature providing an
object-oriented-like limit [1].
I do not know of any concrete implementation of that sort of limit.

Regards.

--
Andrea

[1] http://www.w3.org/2009/sparql/wiki/Feature:LimitPerResource


On Jun 13, 2011, at 10:54 PM, Rupert Westenthaler wrote:

> Hi
> 
> Is there also a possibility to do a similar thing by using SPARQL 1.0?
> It is already a pain to adapt SPARQL 1.0 queries to implementation
> specific features. I guess that is even worse with SPARQL 1.1.
> 
> regards
> Rupert
> 
> 
> On Mon, Jun 13, 2011 at 7:45 PM, Andrea Giovanni Nuzzolese
> <nuzzoles@cs.unibo.it> wrote:
>> Hi Rupert, all
>> 
>> I want to add just one consideration that I missed in the previous email.
>> Subqueries are only part of SPARQL 1.1, but both Jena [1] (with syntaxARQ) and Virtuoso
>> support them, as you can easily see by querying DBpedia.
>> 
>> Best.
>> 
>> --
>> Andrea Giovanni Nuzzolese
>> Semantic Technology Laboratory (STLab)
>> Institute for Cognitive Science and Technology (ISTC)
>> National Research Council  (CNR)
>> Via Nomentana 56, Roma - Italy
>> 
>> 
>> [1] http://jena.sourceforge.net/ARQ/sub-select.html
>> 
>> 
>> On Jun 13, 2011, at 7:30 PM, Andrea Giovanni Nuzzolese wrote:
>> 
>>> Hi Rupert, all
>>> 
>>> my impression is that the limit in the construct should work only at the level
>>> of the constraints that the user has provided in the FieldQuery.
>>> To be as clear as possible, I take again my previous query example, in
>>> which I want to find at most 3 sameAs entities of <http://www4.wiwiss.fu-berlin.de/factbook/resource/United_States>
>>> 
>>> {
>>> "selected": ["http:\/\/www.w3.org\/2000\/01\/rdf-schema#label", "http:\/\/www.w3.org\/1999\/02\/22-rdf-syntax-ns#type"],
>>> "offset": "0",
>>> "limit": "3",
>>> "constraints": [ {"type": "reference", "field": "http:\/\/www.w3.org\/2002\/07\/owl#sameAs",
>>> "value": "http:\/\/www4.wiwiss.fu-berlin.de\/factbook\/resource\/United_States"}]
>>> }
>>> 
>>> To select at most the right number of entities, the query should be converted
>>> into a construct like the following
>>> 
>>> CONSTRUCT {
>>> ?id <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?v_1 .
>>> ?id <http://www.w3.org/2000/01/rdf-schema#label> ?v_2 .
>>> <http://www.iks-project.eu/ontology/rick/query/QueryResultSet>
>>> <http://www.iks-project.eu/ontology/rick/query/queryResult> ?id .
>>> }
>>> WHERE {
>>> {
>>>   SELECT ?id
>>>   WHERE  { ?id <http://www.w3.org/2002/07/owl#sameAs> <http://www4.wiwiss.fu-berlin.de/factbook/resource/United_States> }
>>>   LIMIT 3
>>> }
>>> OPTIONAL { ?id <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?v_1 . } .
>>> OPTIONAL { ?id <http://www.w3.org/2000/01/rdf-schema#label> ?v_2 . }
>>> }
>>> ORDER BY DESC ( <LONG::IRI_RANK> (?id) )
>>> 
>>> In this way the limit works on the number of entities to fetch and not the number
>>> of solutions.
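
A serialization along these lines could look roughly like the sketch below. It is written in Python only for illustration and is not the Entityhub serializer: the function name is hypothetical, only "reference" constraints are handled, the Virtuoso-specific ORDER BY is left out, and DISTINCT is added defensively. Variables are numbered in the order of the "selected" list.

```python
RESULT_SET = "http://www.iks-project.eu/ontology/rick/query/QueryResultSet"
QUERY_RESULT = "http://www.iks-project.eu/ontology/rick/query/queryResult"


def field_query_to_construct(fq):
    """Serialize a parsed FieldQuery dict into a CONSTRUCT with a subquery."""
    patterns = []
    for constraint in fq["constraints"]:
        if constraint["type"] != "reference":
            raise ValueError("this sketch only handles reference constraints")
        patterns.append(
            "?id <%s> <%s> ." % (constraint["field"], constraint["value"])
        )
    numbered = list(enumerate(fq["selected"], 1))

    lines = ["CONSTRUCT {"]
    for i, field in numbered:
        lines.append("  ?id <%s> ?v_%d ." % (field, i))
    lines.append("  <%s> <%s> ?id ." % (RESULT_SET, QUERY_RESULT))
    lines.append("}")
    lines.append("WHERE {")
    # The subquery limits the number of entities, not the number of solutions.
    lines.append("  {")
    lines.append("    SELECT DISTINCT ?id")
    lines.append("    WHERE { %s }" % " ".join(patterns))
    lines.append(
        "    LIMIT %d OFFSET %d" % (int(fq["limit"]), int(fq.get("offset", 0)))
    )
    lines.append("  }")
    # The OPTIONALs are attached to the solutions of the subquery.
    for i, field in numbered:
        lines.append("  OPTIONAL { ?id <%s> ?v_%d . }" % (field, i))
    lines.append("}")
    return "\n".join(lines)
```

The input is the JSON field query after parsing, so the "\/" escapes are already resolved and "limit" and "offset" may still be strings.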
>>> 
>>> Best.
>>> 
>>> --
>>> Andrea Giovanni Nuzzolese
>>> Semantic Technology Laboratory (STLab)
>>> Institute for Cognitive Science and Technology (ISTC)
>>> National Research Council  (CNR)
>>> Via Nomentana 56, Roma - Italy
>>> 
>>> On Jun 13, 2011, at 6:47 PM, Rupert Westenthaler wrote:
>>> 
>>>> Hi Andrea, all
>>>> 
>>>> While working on this I discovered something else:
>>>> 
>>>> The "LIMIT" of SPARQL is based on solutions. For the Entityhub its
>>>> semantics is that it defines the number of entities to select.
>>>> 
>>>> When selecting multiple variables there is the possibility (actually
>>>> it is very likely) that there are multiple solutions per entity. Therefore
>>>> it can no longer be predicted how many entities are selected by a
>>>> SPARQL query. This also affects CONSTRUCT queries as it will restrict
>>>> the number of triples added to the resulting graph.
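
The difference between the two meanings of the limit can be illustrated with a small in-memory sketch. The rows below are made-up solutions (entity, type, label), not real DBpedia data, and the function names are hypothetical.

```python
# Made-up solutions: entity e1 has two types and two labels, so it alone
# accounts for four solutions.
solutions = [
    ("e1", "TypeA", "label (en)"),
    ("e1", "TypeA", "label (de)"),
    ("e1", "TypeB", "label (en)"),
    ("e1", "TypeB", "label (de)"),
    ("e2", "TypeA", "label"),
    ("e3", "TypeA", "label"),
]


def limit_solutions(rows, limit):
    """What SPARQL LIMIT does: cut the list of solutions."""
    return rows[:limit]


def limit_entities(rows, limit):
    """What the Entityhub expects: keep all solutions of the first `limit` entities."""
    ids = []
    for row in rows:
        if row[0] not in ids:
            ids.append(row[0])
    kept = set(ids[:limit])
    return [row for row in rows if row[0] in kept]


def entity_count(rows):
    return len({row[0] for row in rows})
```

With a limit of 3, limit_solutions returns three rows that all belong to e1, while limit_entities returns all six rows covering e1, e2 and e3.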
>>>> 
>>>> Is there any possibility to work around this problem?
>>>> 
>>>> best
>>>> Rupert Westenthaler
>>>> 
>>>> On Thu, Jun 9, 2011 at 3:57 PM, Andrea Giovanni Nuzzolese
>>>> <nuzzoles@cs.unibo.it> wrote:
>>>>> The issue is STANBOL-222.
>>>>> 
>>>>> Best.
>>>>> Andrea
>>>>> 
>>>>> On Jun 9, 2011, at 9:54 AM, Rupert Westenthaler wrote:
>>>>> 
>>>>>> Hi
>>>>>> 
>>>>>> Thx a lot. I was aware of the fact that optional constraints do cause
>>>>>> this problem, but I did not know that rewriting the query like this
>>>>>> can solve this problem.
>>>>>> I will try to change the generation of SPARQL queries tomorrow on the
>>>>>> way back from Berlin to Salzburg.
>>>>>> 
>>>>>> would be nice if you could create a JIRA issue for that
>>>>>> 
>>>>>> best
>>>>>> Rupert
>>>>>> 
>>>>>> On Wed, Jun 8, 2011 at 7:20 PM, Andrea Giovanni Nuzzolese
>>>>>> <nuzzoles@cs.unibo.it> wrote:
>>>>>>> Hi,
>>>>>>> I am trying to use the query service provided by the EntityHub.
>>>>>>> The JSON field query I wrote is the following
>>>>>>> {
>>>>>>> "selected": ["http:\/\/www.w3.org\/2000\/01\/rdf-schema#label",
>>>>>>> "http:\/\/www.w3.org\/1999\/02\/22-rdf-syntax-ns#type"],
>>>>>>> "offset": "0",
>>>>>>> "limit": "3",
>>>>>>> "constraints": [
>>>>>>> {
>>>>>>> "type": "reference",
>>>>>>> "field": "http:\/\/www.w3.org\/2002\/07\/owl#sameAs",
>>>>>>> "value":
>>>>>>> "http:\/\/www4.wiwiss.fu-berlin.de\/factbook\/resource\/United_States"
>>>>>>> }]
>>>>>>> }
>>>>>>> and aims to find all the owl:sameAs entities of the entity
>>>>>>> <http://www4.wiwiss.fu-berlin.de/factbook/resource/United_States> in
>>>>>>> DBpedia.
>>>>>>> But instead of the graph with the expected results I always receive the
>>>>>>> error
>>>>>>> javax.ws.rs.WebApplicationException:
>>>>>>> org.apache.stanbol.entityhub.servicesapi.site.ReferencedSiteException:
>>>>>>> Unable execute Query on remote site http://dbpedia.org/sparql
>>>>>>> The sparql construct serialized by the EntityHub is the following
>>>>>>> CONSTRUCT {
>>>>>>> ?id <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?v_1 .
>>>>>>> ?id <http://www.w3.org/2000/01/rdf-schema#label> ?v_2 .
>>>>>>> <http://www.iks-project.eu/ontology/rick/query/QueryResultSet>
>>>>>>> <http://www.iks-project.eu/ontology/rick/query/queryResult> ?id .
>>>>>>> }
>>>>>>> WHERE {
>>>>>>> { {?id <http://www.w3.org/2002/07/owl#sameAs>
>>>>>>> <http://www4.wiwiss.fu-berlin.de/factbook/resource/United_States> }}
>>>>>>> { OPTIONAL { ?id <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?v_1 . } }
>>>>>>> { OPTIONAL { ?id <http://www.w3.org/2000/01/rdf-schema#label> ?v_2 . } }
>>>>>>> }
>>>>>>> ORDER BY DESC ( <LONG::IRI_RANK> (?id) )  LIMIT 3
>>>>>>> If you try to run this construct directly on the SPARQL endpoint of DBpedia
>>>>>>> the response is always "Virtuoso 42000 Error The estimated execution time 0
>>>>>>> (sec) exceeds the limit of 1500 (sec)."
>>>>>>> It seems that the problem is in the graph patterns containing OPTIONAL, as
>>>>>>> they are bound to the whole default graph ( {OPTIONAL{triple pattern} } is
>>>>>>> equivalent to { {} OPTIONAL {triple pattern} } ).
>>>>>>> In fact the query works properly both if you merge the three graph
>>>>>>> patterns into one graph pattern as follows
>>>>>>> ...
>>>>>>> WHERE {
>>>>>>> ?id <http://www.w3.org/2002/07/owl#sameAs>
>>>>>>> <http://www4.wiwiss.fu-berlin.de/factbook/resource/United_States> .
>>>>>>> OPTIONAL { ?id <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?v_1 . }
>>>>>>> OPTIONAL { ?id <http://www.w3.org/2000/01/rdf-schema#label> ?v_2 . }
>>>>>>> }
>>>>>>> ...
>>>>>>> 
>>>>>>> and if you bind the optional to the global solution of the WHERE clause,
>>>>>>> i.e.
>>>>>>> ...
>>>>>>> WHERE {
>>>>>>> {{?id <http://www.w3.org/2002/07/owl#sameAs>
>>>>>>> <http://www4.wiwiss.fu-berlin.de/factbook/resource/United_States> . }}
>>>>>>> OPTIONAL { ?id <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?v_1 . } .
>>>>>>> OPTIONAL { ?id <http://www.w3.org/2000/01/rdf-schema#label> ?v_2 . }
>>>>>>> }
>>>>>>> ...
>>>>>>> 
>>>>>>> 
>>>>>>> Regards.
>>>>>>> --
>>>>>>> Andrea Giovanni Nuzzolese
>>>>>>> Semantic Technology Laboratory (STLab)
>>>>>>> Institute for Cognitive Science and Technology (ISTC)
>>>>>>> National Research Council  (CNR)
>>>>>>> Via Nomentana 56, Roma - Italy
>>>>>>> 
>>>>>> --
>>>>>> | Rupert Westenthaler             rupert.westenthaler@gmail.com
>>>>>> | Bodenlehenstraße 11                             ++43-699-11108907
>>>>>> | A-5500 Bischofshofen
>>>>> 
>>>>> --
>>>>> Andrea Giovanni Nuzzolese
>>>>> Semantic Technology Laboratory (STLab)
>>>>> Institute for Cognitive Science and Technology (ISTC)
>>>>> National Research Council  (CNR)
>>>>> Via Nomentana 56, Roma - Italy
>>>>> 
>>>> --
>>>> | Rupert Westenthaler             rupert.westenthaler@gmail.com
>>>> | Bodenlehenstraße 11                             ++43-699-11108907
>>>> | A-5500 Bischofshofen
>>> 
> -- 
> | Rupert Westenthaler             rupert.westenthaler@gmail.com
> | Bodenlehenstraße 11                             ++43-699-11108907
> | A-5500 Bischofshofen








