jena-users mailing list archives

From Andy Seaborne <andy.seabo...@epimorphics.com>
Subject Re: tdb loading speeds
Date Tue, 01 Feb 2011 10:46:37 GMT


On 31/01/11 22:44, Benson Margulies wrote:
> On Mon, Jan 31, 2011 at 5:13 PM, Andy Seaborne
> <andy.seaborne@epimorphics.com>  wrote:
>>
>>
>> On 31/01/11 19:36, Benson Margulies wrote:
>>>
>>> This call:
>>>
>>>   TDBLoader.load(tdbGraph, uncompressed, false);
>>>
>>> on a stream of nquads, seems to be considerably faster than the
>>> tdbloader2 command.
>>
>> How much data?
>
> 633574

I found the crossover, when executing standalone tdbloader1, was in
the 800K to 1M range, so that looks likely given java classloading foo.

>> TDBLoader.load is calling the core of tdbloader1
>
> I had trouble working out how to get 'tdbloader' to eat a file of nquads.

TDBLoader.load(DatasetGraphTDB, String)

and it guesses the syntax from the string filename (.nq).
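
As a rough illustration of what guessing from the filename amounts to, here is a minimal standalone sketch. It is not the TDB code: the class SyntaxGuess, the method guessSyntax, the extension table and the "unknown" fallback are all hypothetical, chosen only to show the idea.

```java
import java.util.Locale;

public class SyntaxGuess {
    // Hypothetical helper, not part of Jena/TDB: map a filename
    // extension to the name of an RDF syntax.
    static String guessSyntax(String filename) {
        String name = filename.toLowerCase(Locale.ROOT);
        int dot = name.lastIndexOf('.');
        // No dot at all means there is no extension to go on.
        String ext = (dot >= 0) ? name.substring(dot + 1) : "";
        switch (ext) {
            case "nq":   return "N-Quads";
            case "nt":   return "N-Triples";
            case "ttl":  return "Turtle";
            case "n3":   return "N3";
            case "trig": return "TriG";
            case "rdf":
            case "owl":
            case "xml":  return "RDF/XML";
            default:     return "unknown"; // assumed fallback for this sketch
        }
    }

    public static void main(String[] args) {
        System.out.println(guessSyntax("dump.nq")); // prints N-Quads
    }
}
```

With a scheme like this, a file of nquads that is not named *.nq gets misidentified, which is the case an explicit syntax argument would cover.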

Maybe it should expose

TDBLoader.load(DatasetGraphTDB, String, Lang)

as well -- added to my ToDo (which is not a short list :-()

	Andy

>>
>> tdbloader2 is comparable or slower than tdbloader1 for 100K's of triples
>> because it's exec'ing external processes.  What's more, TDBLoader.load does
>> not incur all the classloading overhead because it's already been done.
>>
>>         Andy
>>
