We are in the early stages of thinking about a project that needs to store data that will be accessed by Hadoop. One of our concerns is the latency of HDFS, as our use case is not about reading all the data, and hence we will need custom RecordReaders etc.

I've seen a couple of comments that you shouldn't put large chunks into a value - however 'large' is not well defined for the range of people using these solutions ;-)

Does anyone have a rough rule of thumb for how big a single value can be before we are outside sanity?
Franc Carter | Systems architect | Sirca Ltd

franc.carter@sirca.org.au | www.sirca.org.au

Tel: +61 2 9236 9118

Level 9, 80 Clarence St, Sydney NSW 2000

PO Box H58, Australia Square, Sydney NSW 1215