incubator-jena-dev mailing list archives

From Andy Seaborne
Subject Re: JenaPerf and datasets...
Date Sat, 15 Oct 2011 16:26:08 GMT
On 11/10/11 08:12, Paolo Castagna wrote:
> Hi Andy,
> are you planning to put a few datasets in SVN together with the queries in JenaPerf?
> I saw a data directory for LUBM but no data in it:
>  From a user perspective it would be great to just do:
>    svn co
>    cd JenaPerf
>    ./run
> Installing any of LUBM, BSBM or SP2B (although not incredibly complicated) isn't trivial.

LUBM: The generator and test driver code is GPL.  The queries I have are 
taken from the published paper, translated by me to SPARQL, so they can 
be distributed.  Data can be generated.

BSBM: The queries are actually templates and instantiated at runtime 
using a configuration file which is generated when the data is 
generated.  Generating data isn't just creating RDF triples.
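
To make the templating step concrete, here is a minimal sketch of instantiating a query template at runtime.  It is not the bsbmtools code: the class name TemplateSketch, the %Name% placeholder syntax and the example IRI are all assumptions for illustration, and a real driver would take its bindings from the configuration file written by the data generator.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of JenaPerf or bsbmtools.
class TemplateSketch {
    // Assumed placeholder syntax: a name wrapped in percent signs, e.g. %ProductType%.
    private static final Pattern PLACEHOLDER = Pattern.compile("%(\\w+)%");

    // Replace every placeholder in the template with its bound value.
    // Fails fast if the template mentions a placeholder that has no binding.
    static String instantiate(String template, Map<String, String> bindings) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String name = m.group(1);
            String value = bindings.get(name);
            if (value == null) {
                throw new IllegalArgumentException("No binding for placeholder: " + name);
            }
            // quoteReplacement stops '$' and '\' in the value being treated as group references.
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String template = "SELECT ?product WHERE { ?product a %ProductType% } LIMIT 10";
        // Example binding only; the IRI is made up.
        String query = instantiate(template,
                Map.of("ProductType", "<http://example.org/ProductType1>"));
        System.out.println(query);
    }
}
```

Plain string substitution keeps the sketch short; it does no escaping of the substituted values, so it only suits bindings that are already valid SPARQL terms.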

The query templates exist in the code base (bsbmtools on SF).  I have 
been talking to the creators and the license has changed from GPL to AL 
(thanks guys).  So it will be possible to include queries from the 
codebase - the templating will have to be written.  (the license change 
affects JenaPerf because it is redistributing, unlike downloading and 

SP2B is published under BSD.


>  From a community and project perspective, it's quite good and helpful
> to have a standard set of datasets. Although, I realize that if datasets
> are not small, it might take a while to download them.
> Can we use .gz datasets with JenaPerf?
> We could also include small-medium size dataset together with JenaPerf
> and have a separate checkout/download for larger ones.
> What do you think?
> Paolo
