jena-users mailing list archives

From Heidi McClure <heidi.mccl...@issinc.com>
Subject RE: Creating Spatial Lucene index from existing TDB data store
Date Wed, 26 Feb 2014 16:20:35 GMT
Thanks Andy - I'll look at the jena.spatialindexer for creating an index from existing TDB
data.  

In case it helps others, below is the config I used to start a TDB-backed store with both
text and spatial indexes configured.  I used this command to start it:

fuseki-server --config=C:/jena/jena-fuseki-1.0.0/config-text-spatial-myTDBStore.ttl /ds

And I added the JTS classes to my fuseki-server.jar.

config-text-spatial-myTDBStore.ttl is a modified version of the config-text.ttl from the Jena
documentation; the text-spatial version contains:

## Example of a TDB dataset with text and spatial indexes published using Fuseki

@prefix :        <#> .
@prefix fuseki:  <http://jena.apache.org/fuseki#> .
@prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
@prefix tdb:     <http://jena.hpl.hp.com/2008/tdb#> .
@prefix ja:      <http://jena.hpl.hp.com/2005/11/Assembler#> .
@prefix text:    <http://jena.apache.org/text#> .
@prefix spatial: <http://jena.apache.org/spatial#> .
@prefix geosparql: <http://www.opengis.net/ont/geosparql#> .

[] rdf:type fuseki:Server ;
   # Timeout - server-wide default: milliseconds.
   # Format 1: "1000" -- 1 second timeout
   # Format 2: "10000,60000" -- 10s timeout to first result, then 60s timeout for the
   # rest of the query.
   # See java doc for ARQ.queryTimeout
   # ja:context [ ja:cxtName "arq:queryTimeout" ;  ja:cxtValue "10000" ] ;
   # ja:loadClass "your.code.Class" ;

   fuseki:services (
     <#service_text_tdb>
     <#service_spatial_tdb>
   ) .

# TDB
[] ja:loadClass "com.hp.hpl.jena.tdb.TDB" .
tdb:DatasetTDB  rdfs:subClassOf  ja:RDFDataset .
tdb:GraphTDB    rdfs:subClassOf  ja:Model .

# Text
[] ja:loadClass "org.apache.jena.query.text.TextQuery" .
text:TextDataset      rdfs:subClassOf   ja:RDFDataset .
#text:TextIndexSolr    rdfs:subClassOf   text:TextIndex .
text:TextIndexLucene  rdfs:subClassOf   text:TextIndex .

# Spatial
[] ja:loadClass "org.apache.jena.query.spatial.SpatialQuery" .
spatial:SpatialDataset  rdfs:subClassOf  ja:RDFDataset .
#spatial:SpatialIndexSolr  rdfs:subClassOf  spatial:SpatialIndex .
spatial:SpatialIndexLucene  rdfs:subClassOf   spatial:SpatialIndex .

## ---------------------------------------------------------------

<#service_text_tdb> rdf:type fuseki:Service ;
    rdfs:label                      "TDB/text service" ;
    fuseki:name                     "ds" ;
    fuseki:serviceQuery             "query" ;
    fuseki:serviceQuery             "sparql" ;
    fuseki:serviceUpdate            "update" ;
    fuseki:serviceUpload            "upload" ;
    fuseki:serviceReadGraphStore    "get" ;
    fuseki:serviceReadWriteGraphStore    "data" ;
    fuseki:dataset                  <#text_dataset> ;
    .

<#text_dataset> rdf:type     text:TextDataset ;
    text:dataset   <#dataset> ;
    ##text:index   <#indexSolr> ;
    text:index     <#indexTextLucene> ;
    .

<#dataset> rdf:type      tdb:DatasetTDB ;
    tdb:location "Data/myTDBStore" ;
    ##tdb:unionDefaultGraph true ;
    .

<#indexSolr> a text:TextIndexSolr ;
    #text:server <http://localhost:8983/solr/COLLECTION> ;
    text:server <embedded:SolrARQ> ;
    text:entityMap <#entMap> ;
    .

<#indexTextLucene> a text:TextIndexLucene ;
    text:directory <file:Data/myTDBStore_text_index> ;
    ##text:directory "mem" ;
    text:entityMap <#entMap> ;
    .

<#entMap> a text:EntityMap ;
    text:entityField      "uri" ;
    text:defaultField     "text" ;        ## Should be defined in the text:map.
    text:map (
         # rdfs:label            
         [ text:field "text" ; text:predicate rdfs:label ]
         ) .

##---------------------------------------------------------------
<#service_spatial_tdb> rdf:type fuseki:Service ;
    rdfs:label                      "TDB/spatial service" ;
    fuseki:name                     "ds" ;
    fuseki:serviceQuery             "query" ;
    fuseki:serviceQuery             "sparql" ;
    fuseki:serviceUpdate            "update" ;
    fuseki:serviceUpload            "upload" ;
    fuseki:serviceReadGraphStore    "get" ;
    fuseki:serviceReadWriteGraphStore    "data" ;
    fuseki:dataset                  :spatial_dataset ;
    .

:spatial_dataset rdf:type     spatial:SpatialDataset ;
    spatial:dataset   <#dataset> ;
    ##spatial:index   <#indexSolr> ;
    spatial:index     <#indexSpatialLucene> ;
    .

<#indexSpatialLucene> a spatial:SpatialIndexLucene ;
    spatial:directory <file:Data/myTDBStore_spatial_index> ;
    #spatial:directory "mem" ;
    spatial:definition <#definition> ;
    .

<#definition> a spatial:EntityDefinition ;
    spatial:entityField      "uri" ;
    spatial:geoField     "geo" ;
    # custom geo predicates for 1) Latitude/Longitude Format
    spatial:hasSpatialPredicatePairs (
         [ spatial:latitude :latitude_1 ; spatial:longitude :longitude_1 ]
         [ spatial:latitude :latitude_2 ; spatial:longitude :longitude_2 ]
         ) ;
    # custom geo predicates for 2) Well Known Text (WKT) Literal
    spatial:hasWKTPredicates (:wkt_1 :wkt_2 geosparql:asWKT) ;
    # custom SpatialContextFactory for 2) Well Known Text (WKT) Literal
    spatial:spatialContextFactory
         "com.spatial4j.core.context.jts.JtsSpatialContextFactory"
    .
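
For anyone testing the spatial side of this config, a query along these lines should exercise the index once it is populated.  It is only a sketch: it assumes the spatial:nearby property function described in the Jena spatial documentation, and the centre point and 10 km radius are made-up values chosen close to the sample event quoted further down in this thread.

```sparql
PREFIX spatial: <http://jena.apache.org/spatial#>

# Find indexed resources within 10 km of an illustrative point.
# Argument order is latitude, longitude, radius, units.
# The coordinates and radius are assumptions, not values from the config.
SELECT ?event
WHERE {
  ?event spatial:nearby (40.06 1.75 10.0 'km') .
}
```

Note that WKT points are written as "POINT(longitude latitude)", so the order of the two numbers is reversed relative to the property function arguments.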

-----Original Message-----
From: Andy Seaborne [mailto:andy@apache.org] 
Sent: Wednesday, February 26, 2014 7:35 AM
To: users@jena.apache.org
Subject: Re: Creating Spatial Lucene index from existing TDB data store

On 25/02/14 19:41, Heidi McClure wrote:
> I have an existing TDB data store that I would like to create a spatial index for.
> Are there utilities or APIs to do this?

jena.spatialindexer creates an index - but also when you load data into a spatial dataset,
it gets indexed automatically when configured ...
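
A sketch of what that invocation might look like for the config discussed in this thread.  It assumes the Fuseki jar bundles the jena-spatial classes (plus the added JTS classes) and that the indexer accepts a --desc option pointing at an assembler file that describes the spatial dataset, as the documentation suggests.  The jar and file paths are placeholders.  A TDB store should only be opened by one JVM at a time, so stop Fuseki before running it.

```shell
# Placeholder paths (assumptions); adjust to the local install.
FUSEKI_JAR="fuseki-server.jar"
CONFIG="config-text-spatial-myTDBStore.ttl"

# Build the spatial index from data already in the TDB store.
java -cp "$FUSEKI_JAR" jena.spatialindexer --desc="$CONFIG"
```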


> I have successfully followed the examples for reading in from .ttl files and creating
> the TDB and spatial index data at the same time.
>
> My TDB has GeoSPARQL nodes in it like:
>
> <http://issinc.com/events#event_901112240019> <http://www.opengis.net/ont/geosparql#asWKT>
> "POINT(1.7488388 40.05863)"^^<http://www.opengis.net/ont/geosparql#wktLiteral>

You'll need to configure the JTS library; see:

http://jena.apache.org/documentation/query/spatial-query.html#supported-geo-data-for-indexing-and-querying

(and I'm going on the documentation here...)

You will need to configure the EntityDefinition to correspond to the data.

	Andy
>
> thanks,
> -heidi
>
>
>

