In-Reply-To: <4DF63295.9000602@apache.org>
References: <4DF5EF53.8090908@4sengines.com> <4DF63295.9000602@apache.org>
Date: Mon, 13 Jun 2011 18:38:55 +0200
Subject: Re: Entityhub : Can't retrieve entity with a #
From: Rupert Westenthaler
To: stanbol-dev@incubator.apache.org, florent@apache.org

On Mon, Jun 13, 2011 at 5:53 PM, Florent André wrote:
> Yep,
>
> And to continue with the "# case", I observe this "strange" thing:
>
> When requesting with # or %23, I always get good metadata values, but
> - with #: the representation field is not good
> - with %23: the representation field is ok
>

For ReferencedSites, the metadata is generated automatically from the
metadata defined for the site (e.g. copyright, attribution, cache
status ...). "foaf:Document" is used as the rdf:type of the metadata,
and the "dc:subject" relation is currently used to link the metadata to
the entity. I have already changed this to "entityhub:about" in my
local version, because "dc:subject" caused problems with entities that
also define this property.

> (Note: I use a fully cached referenced site created with the indexing utility.)
>
> Request and answer details:
>
> A) When requested with #
> $ curl
> "http://localhost:8080/entityhub/sites/entity?id=http://www.test.fr/terminology#entity_gradient_1306341921902"
> {
>     "id": "http:\/\/www.test.fr\/terminology",
>     "site": "gasoil",
>     "representation": {"id": "http:\/\/www.test.fr\/terminology"},
>     "metadata": {
>         "id": "http:\/\/www.test.fr\/terminology.meta",
>         "http:\/\/www.iks-project.eu\/ontology\/rick\/model\/isChached": [{
>             "type": "value",
>             "value": "true"
>         }],
>         "http:\/\/www.w3.org\/1999\/02\/22-rdf-syntax-ns#type": [{
>             "type": "reference",
>             "value": "http:\/\/xmlns.com\/foaf\/0.1\/Document"
>         }],
>         "http:\/\/purl.org\/dc\/terms\/subject": [{
>             "type": "reference",
>             "value": "http:\/\/www.test.fr\/terminology"
>         }]
>     }
> }
>

As noted in my first response, everything after the '#' gets ignored.
This request therefore returns the entity with the id
"http://www.test.fr/terminology". It looks like this entity actually
exists but does not define any data, most likely because this URI is
referenced in your SKOS file and is therefore returned by the triple
store as an "entity" during indexing.

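To illustrate the difference between the two requests, here is a
minimal Java sketch, assuming a client that builds the request URL
itself. The class and method names (EntityIdEncoding, entityUrl) are
hypothetical and not part of the Entityhub; only the endpoint URL and
the entity id are taken from this thread. It uses java.net.URI to show
that an unencoded '#' starts the fragment, while an id encoded with
URLEncoder stays completely inside the "id" query parameter.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only (not Stanbol code).
final class EntityIdEncoding {

    // Endpoint as used in the curl requests of this thread.
    static final String ENDPOINT = "http://localhost:8080/entityhub/sites/entity";

    // Builds the request URL with the entity id URL-encoded, so that a
    // '#' inside the id is sent as "%23" instead of starting a fragment.
    static String entityUrl(String entityId) {
        return ENDPOINT + "?id=" + URLEncoder.encode(entityId, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String id = "http://www.test.fr/terminology#entity_gradient_1306341921902";

        // Unencoded: everything after '#' is parsed as the fragment,
        // so the "id" parameter is cut off before the '#'.
        URI raw = URI.create(ENDPOINT + "?id=" + id);
        System.out.println(raw.getQuery());    // id=http://www.test.fr/terminology
        System.out.println(raw.getFragment()); // entity_gradient_1306341921902

        // Encoded: there is no fragment and the query decodes to the full id.
        URI encoded = URI.create(entityUrl(id));
        System.out.println(encoded.getQuery());
        System.out.println(encoded.getFragment()); // null
    }
}
```

Note that URLEncoder.encode(String, Charset) requires Java 10 or later;
on older versions URLEncoder.encode(id, "UTF-8") does the same.
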
>
> ============
> B) When requested with %23
>
> $ curl
> "http://localhost:8080/entityhub/sites/entity?id=http://www.test.f/terminology%23entity_gradient_1306341921902"
> {
>     "id": "http:\/\/www.test.fr\/terminology#entity_gradient_1306341921902",
>     "site": "gasoil",
>     "representation": {
>         "id": "http:\/\/www.test.fr\/terminology#entity_gradient_1306341921902",
>         "http:\/\/www.w3.org\/2004\/02\/skos\/core#broader": [{
>             "type": "reference",
>             "value": "http:\/\/www.test.fr\/terminology#entity_operateur_mathematique_1306341918995"
>         }],
>         "http:\/\/www.w3.org\/1999\/02\/22-rdf-syntax-ns#type": [{
>             "type": "reference",
>             "value": "http:\/\/www.w3.org\/2004\/02\/skos\/core#Concept"
>         }],
>         "http:\/\/www.w3.org\/2004\/02\/skos\/core#inScheme": [{
>             "type": "reference",
>             "value": "http:\/\/www.test.fr\/terminology#space_mathematiques_1306341820765"
>         }],
>         "http:\/\/www.w3.org\/2000\/01\/rdf-schema#label": [{
>             "type": "text",
>             "value": "GRADIENT"
>         }],
>         "http:\/\/www.w3.org\/2004\/02\/skos\/core#prefLabel": [{
>             "type": "text",
>             "value": "GRADIENT"
>         }]
>     },
>     "metadata": {
>         "id": "http:\/\/www.test.fr\/terminology#entity_gradient_1306341921902.meta",
>         "http:\/\/www.iks-project.eu\/ontology\/rick\/model\/isChached": [{
>             "type": "value",
>             "value": "true"
>         }],
>         "http:\/\/www.w3.org\/1999\/02\/22-rdf-syntax-ns#type": [{
>             "type": "reference",
>             "value": "http:\/\/xmlns.com\/foaf\/0.1\/Document"
>         }],
>         "http:\/\/purl.org\/dc\/terms\/subject": [{
>             "type": "reference",
>             "value": "http:\/\/www.edf.fr\/terminology#entity_gradient_1306341921902"
>         }]
>     }
> }

This is the actual entity as requested.

BTW: Rather than using the ReferencedSiteManager

    http://localhost:8080/entityhub/sites/entity?id={id}

it would be better to use the ReferencedSite directly

    http://localhost:8080/entityhub/site/{siteId}/entity?id={id}

because if you have other ReferencedSites that do not define entity
prefixes, the request would actually be sent to more than one site
before it is answered. If you know which site holds the entity you are
looking for, it is always better to query that site directly.

best
Rupert Westenthaler

>
>
> ++
>
>
> On 06/13/2011 03:54 PM, Rupert Westenthaler wrote:
>>
>> Hi florent
>>
>> Using a '#' in the URI has the disadvantage that browsers will not
>> send the part behind the hash to the server, because they assume that
>> they need to download the whole document and navigate to the anchor
>> within the document.
>>
>> Using curl (or javascript) I think the full URL should be sent to the
>> server (I was not able to find good information about this, but at
>> least "curl -v" says that it sends the whole URL to the server).
>> However, on the server side Jersey also does not provide the #{anchor}
>> part of the URL.
>> Sending
>>>
>>>
>>> "http://localhost:8080/entityhub/sites/entity?id=http://www.test.fr/terminology#entity_gradient_1306341921902"
>>
>> will pass only "http://www.test.fr/terminology" to a method annotated
>> with
>>
>>     @GET
>>     @Path("/entity")
>>     public Response getEntity(@QueryParam(value = "id") String id) {
>>         // get the Entity
>>         ...
>>
>> URL encoding the '#' to '%23' causes Jersey to pass
>> "http://www.test.fr/terminology#entity_gradient_1306341921902".
>>
>> In this case the query for an entity with this ID is correctly passed
>> to the ReferencedSite ('#' not '%23'). So if you send '%23' and the
>> indexed Entity uses '#' it should work as long as Entities are cached
>> locally. If a remote service is used, then the same problem with the
>> '#' reappears for the remote service.
>>
>> To test on my side I have done the following:
>> * renamed the Entities of the IPTC worldregions from
>> "http://cv.iptc.org/newscodes/worldregion/r001" to
>> "http://cv.iptc.org/newscodes/worldregion#r001"
>> * indexed the IPTC using the indexing tools
>> * installed the index to the entityhub
>> * curl -v
>> "http://localhost:8080/entityhub/sites/entity?id=http://cv.iptc.org/newscodes/worldregion%23r001"
>>
>> Assuming that
>>>
>>> curl
>>>
>>> "http://localhost:8080/entityhub/sites/entity?id=http://www.test.fr/terminology%23space_mathematiques_1306341820765"
>>> ==>  answer is
>>> Entity with ID
>>> 'http://www.test.fr/terminology#space_mathematiques_1306341820765' not
>>> found
>>> an any referenced site
>>>
>> happened on a referenced site with a full cache (e.g. as created by
>> the Indexing Utility), I was not able to reproduce the error. If the
>> referenced site uses a remote service to dereference entity ids (e.g.
>> the Cool URI) this might happen. In this case I suggest testing the
>> remote service directly.
>>
>> best
>> Rupert Westenthaler
>>
>>
>> On Mon, Jun 13, 2011 at 1:06 PM, florent andré
>>  wrote:
>>>
>>> Hi Rupert,
>>> Hope you are fine.
>>>
>>> I have another problem...
>>> In my SKOS, entities are identified by a #, like this:
>>>
>>>  rdf:about="http://www.test.fr/terminology#entity_gradient_1306341921902">
>>>    rdf:resource="http://www.test.fr/terminology#entity_operateur_mathematique_1306341918995"/>
>>>    GRADIENT
>>>    rdf:resource="http://www.test.fr/terminology#space_mathematiques_1306341820765"/>
>>>
>>> And I can't manage to find the entity with the entity endpoint.
>>>
>>> * With the # char:
>>> curl
>>>
>>> "http://localhost:8080/entityhub/sites/entity?id=http://www.test.fr/terminology#entity_gradient_1306341921902"
>>> ==>  answer is
>>> Entity with ID 'http://www.test.fr/terminology' not found an any
>>> referenced site
>>>
>>> ==>  the part after the # is removed
>>>
>>> * With the # replaced by %23 (the urlencoded equivalent):
>>> curl
>>>
>>> "http://localhost:8080/entityhub/sites/entity?id=http://www.test.fr/terminology%23space_mathematiques_1306341820765"
>>> ==>  answer is
>>> Entity with ID
>>> 'http://www.test.fr/terminology#space_mathematiques_1306341820765' not
>>> found an any referenced site
>>>
>>> ==>  the whole id is kept, but still not found...
>>> The result is the same if I urlencode the whole entity id.
>>>
>>> Is this related to a bug, or am I doing something wrong?
>>>
>>> Thanks.
>>> ++
>>>
>>>
>>
>>
>>
>

-- 
| Rupert Westenthaler             rupert.westenthaler@gmail.com
| Bodenlehenstraße 11                          ++43-699-11108907
| A-5500 Bischofshofen