apex-dev mailing list archives

From Pramod Immaneni <pra...@datatorrent.com>
Subject Block reading and data locality
Date Mon, 09 May 2016 21:32:50 GMT
The file splitter and block reader combination allows multiple partitions to
read files in parallel by dividing the files into blocks. Does anyone have
ideas on how to make the block readers data local to the blocks they are
reading?

I think we will need to spawn block readers on all nodes where the blocks
are present and route the block meta information to the appropriate block
reader. If the readers are reading multiple files, this could mean spawning
readers on all the nodes in the cluster.
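
A rough sketch of the routing idea, assuming each block's metadata carries the
list of hosts holding a replica (in HDFS this could come from
FileSystem.getFileBlockLocations) and that one reader runs per listed host.
BlockMeta and route_blocks are hypothetical names for illustration only, not
the existing Apex classes, and the least-loaded tie-break is just one possible
policy:

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockMeta:
    """Hypothetical block descriptor, not the Apex Malhar class."""
    file_path: str
    offset: int
    length: int
    hosts: tuple  # nodes holding a replica of this block


def route_blocks(blocks, reader_hosts):
    """Assign each block to a reader host, preferring hosts with a replica."""
    if not reader_hosts:
        raise ValueError("at least one reader host is required")
    assignments = defaultdict(list)
    load = {host: 0 for host in reader_hosts}
    for block in blocks:
        # Readers that run on a node holding a replica of this block.
        local = [h for h in block.hosts if h in load]
        # Fall back to any reader when no reader is data local.
        candidates = local or list(load)
        # Pick the least-loaded candidate to spread the work.
        target = min(candidates, key=lambda h: load[h])
        assignments[target].append(block)
        load[target] += 1
    return dict(assignments)
```

With readers on n1, n2 and n3, a block replicated on (n1, n2) goes to one of
those two readers, while a block whose replicas are all on nodes without a
reader falls back to whichever reader has the least work.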

