incubator-jena-dev mailing list archives

From Paolo Castagna
Subject Re: Blank nodes and MapReduce
Date Tue, 28 Jun 2011 13:37:44 GMT

Andy Seaborne wrote:
>>   public Node create(String label) {
>>       return Node.createAnon(new AnonId(filename + "-" + label)) ;
>>   }
> The way I thought was to allocate a UUID per parser run (or any other
> sufficiently large random number), xor the label into the UUID to
> produce the bNode label.  This is a non-localised label allocation scheme.

Hi Andy,
I am not sure this would work with MapReduce, as files are split into multiple
chunks and different machines can process splits from the same file.
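
To make the concern concrete, here is a minimal sketch of a per-run scheme in
plain Java (no Jena types). The class name PerRunLabels is hypothetical, and
hashing the label to 128 bits before the xor is an assumption: the quoted
suggestion does not say how the label is combined with the UUID.

```java
import java.nio.charset.StandardCharsets;
import java.util.UUID;

// Hypothetical sketch: one instance stands for one "parser run".
class PerRunLabels {
    // Random seed drawn once per run, as in the quoted suggestion.
    private final UUID seed = UUID.randomUUID();

    // Assumption: the label is first hashed to 128 bits (name-based UUID),
    // then xor-ed with the per-run seed.
    String label(String label) {
        UUID h = UUID.nameUUIDFromBytes(label.getBytes(StandardCharsets.UTF_8));
        UUID mixed = new UUID(
                seed.getMostSignificantBits() ^ h.getMostSignificantBits(),
                seed.getLeastSignificantBits() ^ h.getLeastSignificantBits());
        return mixed.toString();
    }
}
```

Within one PerRunLabels instance, label("bnode1") always returns the same
string. Two instances, standing for two map tasks reading splits of the same
file, draw different seeds and so return different strings for "bnode1".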

Let's say I have this file, split into two chunks:


  <foo:bar> <foo:p> _:bnode1 .      split 1
  _:bnode1 <foo:q> "1" .


  _:bnode1 <foo:r> "2" .            split 2


I need to ensure the 'bnode1' label in split 1 and 2 refers to the same blank
node even if the splits are parsed separately. However, the same 'bnode1' label
from a different file must represent a different blank node. In practice, with
MapReduce, I cannot assume that a file is parsed in a single "parser run".
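
The requirement above can be sketched in plain Java as a label that depends
only on the filename and the original label, never on the parser run. This is
an illustration, not Jena code: the class FileScopedLabels, the method
blankNodeId and the example filenames are hypothetical, and hashing the
combined key is just one option (the plain concatenation from the quoted
create() method would serve the same purpose).

```java
import java.nio.charset.StandardCharsets;
import java.util.UUID;

// Hypothetical sketch: deterministic, file-scoped blank node labels.
class FileScopedLabels {
    // Same (filename, label) pair -> same id on every machine and in every
    // split; the key mirrors the filename + "-" + label idea quoted above.
    static String blankNodeId(String filename, String label) {
        String key = filename + "-" + label;
        // Name-based UUID: a deterministic hash of the key, no random state.
        return UUID.nameUUIDFromBytes(key.getBytes(StandardCharsets.UTF_8)).toString();
    }
}
```

With this, blankNodeId("data/a.nt", "bnode1") gives the same string whether it
is computed while parsing split 1 or split 2, while
blankNodeId("data/b.nt", "bnode1") gives a different one. The resulting string
could then be used as the label for the blank node.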

>> Therefore, I would like to have my own
>> LabelToNode implementation with an Allocator<String, Node> which takes into
>> account the filename (or a hash of it) when it creates a new blank node.
>> But the LabelToNode constructor is private.
>> Could we make it protected?
> Now public.



>> Or, alternatively, how can I construct a LabelToNode object which will be
>> using my MapReduceAllocator?
> LabelToNode createUseLabelAsGiven()
>     Andy
