accumulo-user mailing list archives

From "Terry P." <>
Subject How to pre-split a table for UUID rowkeys
Date Fri, 02 Aug 2013 21:41:38 GMT
Greetings folks,
I have a somewhat atypical Accumulo use case: Accumulo is the backend
data store for a search index, providing fault tolerance should the index
get corrupted.  Max docs stored in Accumulo will be under 1 billion at full

The search index is used to "find" the data a user is interested in; the
document is then retrieved from Accumulo by its RowKey, which comes from
the search index.  The RowKey is a java.util.UUID string with the '-'
dashes stripped out.
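
To make the RowKey format concrete, here is a minimal sketch of how such a key could be built. The class and method names (RowKeys, toRowKey) are hypothetical and used only for illustration; the only library call is java.util.UUID, whose toString() yields lowercase hex digits separated by dashes.

```java
import java.util.UUID;

public class RowKeys {
    // Hypothetical helper: turns a UUID into a 32-character lowercase
    // hex row key by removing the four '-' separators.
    public static String toRowKey(UUID id) {
        return id.toString().replace("-", "");
    }

    public static void main(String[] args) {
        // Prints a random 32-character hex key; the value differs per run.
        System.out.println(toRowKey(UUID.randomUUID()));
    }
}
```

Keys built this way sort lexicographically as hex strings, which is the ordering any split points would be compared against.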

I have a 3 node cluster and as a quick test have ingested 5 million 1K
documents into it, yet they all went to a single TabletServer.  I was kind
of surprised -- I knew this would be the case for a row key using a
monotonically increasing number, but I thought with a UUID type rowkey the
entries would have been spread across the TabletServers at least some, even
without pre-splitting the table.

Clearly my understanding of how Accumulo spreads the data out is lacking.
Can anyone shed more light on it?  And possibly recommend a table split
strategy for a 3-node cluster such as I have described?
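
For illustration only, one possible shape for such a split strategy is sketched below. It assumes the UUIDs are random, so the leading hex characters of the row keys are roughly uniformly distributed; under that assumption, split points spaced evenly over the two-character hex prefix range 00..ff would divide the key space into roughly equal tablets. The class name HexSplits, the method splitPoints, and the tablet count are all hypothetical choices, not something taken from this thread.

```java
import java.util.ArrayList;
import java.util.List;

public class HexSplits {
    // Returns (numTablets - 1) two-character lowercase hex split points
    // that divide the prefix space 00..ff into numTablets ranges of
    // nearly equal width. Assumes row keys are lowercase hex strings.
    public static List<String> splitPoints(int numTablets) {
        if (numTablets < 2 || numTablets > 256) {
            throw new IllegalArgumentException("numTablets must be in 2..256");
        }
        List<String> splits = new ArrayList<>();
        for (int i = 1; i < numTablets; i++) {
            int boundary = i * 256 / numTablets; // integer division
            splits.add(String.format("%02x", boundary));
        }
        return splits;
    }

    public static void main(String[] args) {
        // Example: 4 tablets need 3 split points.
        System.out.println(splitPoints(4)); // prints [40, 80, c0]
    }
}
```

The resulting strings could then be supplied as split points when the table is created, for example through TableOperations.addSplits in the Accumulo client API or the addsplits shell command. How many tablets per TabletServer is appropriate is left open here.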

Many thanks in advance,
