accumulo-user mailing list archives

From "Dickson, Matt MR"
Subject RE: Removing splits [SEC=UNCLASSIFIED]
Date Wed, 10 Apr 2013 22:27:13 GMT

Thanks for all the replies on this.

Based on the feedback, particularly considering the high number of splits maintainable per
server, I'll leave the splits in place.  I'm not keen on merging tablets due to its impact
on query performance.

Thanks again.


From: Eric Newton
Sent: Tuesday, 9 April 2013 11:57
Subject: Re: Removing splits [SEC=UNCLASSIFIED]

Is there a maximum number of splits a table can have?

There are a few theoretical limits to the number of tablets you can have.

1) a row cannot be split over tablets: if you only have a billion rows, you can only have
a billion tablets
2) tablet servers track some bits of overhead about a tablet in memory: typically this is
only 1-2K per tablet, so a one-gigabyte JVM would only be able to hold 500K-1M tablets per
tablet server
3) there's a limit to the number of files/directories that can be stored in your NameNode.
 More tablets tend to create more files and directories.
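
The estimate in point 2 can be checked with quick arithmetic. This is only a rough sketch: it assumes the entire 1 GiB heap is available for per-tablet bookkeeping, which a real tablet server would not allow, and the function name max_tablets is made up for illustration.

```python
# Rough ceiling on tablets per tablet server, from per-tablet heap overhead.
# Assumption: the whole heap is spent on tablet bookkeeping (an upper bound only).

def max_tablets(heap_bytes: int, overhead_per_tablet_bytes: int) -> int:
    """Return how many tablets fit if each costs a fixed amount of heap."""
    return heap_bytes // overhead_per_tablet_bytes

HEAP = 1024 ** 3  # a one-gigabyte JVM heap

low = max_tablets(HEAP, 2 * 1024)   # 2K of overhead per tablet
high = max_tablets(HEAP, 1 * 1024)  # 1K of overhead per tablet

print(low, high)  # 524288 1048576, i.e. roughly 500K-1M tablets
```

In practice the usable figure would be lower, since the heap is shared with caches and in-memory maps.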

Performance is likely to be poor at these limits, and it would not be helpful to approach
them.

I have seen stable clusters with over 500K tablets.

  How can splits be removed once they are no longer required? I can't see any command in the

With version 1.4, you can merge tablets together.  In the shell, you can merge ranges, or
have the shell merge ranges based on size.

With version 1.5, you will be able to merge METADATA tablets together.


