hbase-issues mailing list archives

From "Andrew Purtell (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HBASE-2007) handle overly large column family in one row
Date Tue, 25 May 2010 18:48:38 GMT

     [ https://issues.apache.org/jira/browse/HBASE-2007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HBASE-2007:

    Priority: Minor  (was: Major)

> handle overly large column family in one row
> --------------------------------------------
>                 Key: HBASE-2007
>                 URL: https://issues.apache.org/jira/browse/HBASE-2007
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Andrew Purtell
>            Priority: Minor
> From a team in TM:
> {quote}
> I tried to create the large column in one row and one family. The value of each column
has only 10 bytes. When I created the 804226850th column, the region server was crashed. I
find that all columns of one row and one family will in the same region. The region server
will crash, because this region is too large. And then I want to reboot HBase, after several
minutes, the region servers will crash one by one.
> If one row and one family cannot split, then no matter how many machines are in HBase
system, the capacity of HBase will be limited by one machine. I want to know whether this
problem is a bug. If the column quantity in one row and one family is limited, can you tell
me the safe range? 
> {quote}
> Currently a row cannot be split. So an individual row can expand only to some finite
limit constrained by the region server capability. 
> I am impressed that a row was able to successfully contain 804,226,849 columns. 
> The HBase storage capability goals are currently "billions of rows, millions of columns,
thousands of tables". A test involving hundreds of millions of columns is very challenging.

> Most important, HBase should not accept input beyond some limit which produces a cascading failure.
> I think we also do want to have the architectural discussion about rows that must span
region servers due to immensity. 
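
To give a rough sense of the scale in the report above, here is a minimal Python sketch that estimates how many bytes the reported row would occupy in a single region, followed by an illustrative input guard. It is a back-of-the-envelope model, not HBase code. The fixed per-cell overhead of 20 bytes assumes the usual KeyValue layout (key length, value length, row length, family length, timestamp, type). The row key, family, and qualifier lengths are assumptions, since the report gives only the 10-byte value size. `check_row_limit`, `RowTooLargeError`, and the 1 GiB limit are hypothetical names and values chosen for illustration.

```python
# Fixed per-cell bytes, assuming the KeyValue layout:
# key length (4) + value length (4) + row length (2) + family length (1)
# + timestamp (8) + type (1).
KEYVALUE_FIXED_OVERHEAD = 4 + 4 + 2 + 1 + 8 + 1


def estimated_row_bytes(num_columns, value_len, row_len, family_len, qualifier_len):
    """Estimate the uncompressed size of one row that holds num_columns cells."""
    per_cell = (KEYVALUE_FIXED_OVERHEAD + row_len + family_len
                + qualifier_len + value_len)
    return num_columns * per_cell


class RowTooLargeError(Exception):
    """Hypothetical error raised when a write would push a row past the limit."""


def check_row_limit(current_row_bytes, incoming_bytes, max_row_bytes):
    """Hypothetical guard: reject the write instead of letting the row grow unbounded."""
    projected = current_row_bytes + incoming_bytes
    if projected > max_row_bytes:
        raise RowTooLargeError(
            f"row would grow to {projected} bytes, limit is {max_row_bytes}")
    return projected


if __name__ == "__main__":
    columns = 804_226_849  # last column the report says was stored successfully
    values_only = columns * 10  # 10-byte values, as stated in the report
    # Assumed lengths: 8-byte row key, 1-byte family name, 9-byte qualifier.
    with_keys = estimated_row_bytes(columns, 10, 8, 1, 9)
    print(f"values only:   {values_only / 1e9:.2f} GB")
    print(f"with key data: {with_keys / 1e9:.2f} GB")

    limit = 1 << 30  # hypothetical 1 GiB per-row limit
    try:
        check_row_limit(with_keys, 48, limit)
    except RowTooLargeError as err:
        print("rejected:", err)
```

Even the values alone come to roughly 8 GB in one unsplittable row, and the per-cell key data multiplies that several times under these assumptions. A guard of this kind is one way to turn the crash into a rejected write; where such a check would live and what limit is appropriate are open questions for the discussion proposed above.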

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
