hbase-issues mailing list archives

From "Andrew Purtell (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-2007) handle overly large column family in one row
Date Wed, 21 Apr 2010 06:42:53 GMT

    [ https://issues.apache.org/jira/browse/HBASE-2007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12859237#action_12859237 ]

Andrew Purtell commented on HBASE-2007:
---------------------------------------

Right, this issue was about possibly preventing people from doing dumb things like that, which
can bring down a region server (RS), by putting up a gate when a row goes over the limit. But I
see this as something minor: good to have.
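
As an illustration of the kind of gate meant here, the following is a minimal sketch, not
HBase code. The class name ColumnCountGate, the exception RowTooWideError, and the parameter
max_columns_per_family are all hypothetical, and the limit is assumed to be a simple count of
distinct columns per row and family.

```python
from collections import defaultdict


class RowTooWideError(Exception):
    """Raised when a write would push a row/family past the configured limit."""


class ColumnCountGate:
    """Hypothetical admission check for writes; not part of any HBase API."""

    def __init__(self, max_columns_per_family):
        self.max_columns = max_columns_per_family
        # Maps (row, family) to the set of qualifiers already accepted.
        self._seen = defaultdict(set)

    def check_put(self, row, family, qualifier):
        """Accept the write, or raise RowTooWideError if it adds a column over the limit."""
        columns = self._seen[(row, family)]
        is_new_column = qualifier not in columns
        if is_new_column and len(columns) >= self.max_columns:
            raise RowTooWideError(
                f"row {row!r}, family {family!r} already holds "
                f"{len(columns)} columns (limit {self.max_columns})"
            )
        columns.add(qualifier)


if __name__ == "__main__":
    gate = ColumnCountGate(max_columns_per_family=2)
    gate.check_put(b"row1", b"f", b"c1")
    gate.check_put(b"row1", b"f", b"c2")
    try:
        gate.check_put(b"row1", b"f", b"c3")
    except RowTooWideError as err:
        print("rejected:", err)
```

Keeping every qualifier in memory would not scale to the column counts discussed in this
issue; a real gate would more likely track an approximate size or count per row. The sketch
only shows the reject-before-accept behavior.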



> handle overly large column family in one row
> --------------------------------------------
>
>                 Key: HBASE-2007
>                 URL: https://issues.apache.org/jira/browse/HBASE-2007
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: Andrew Purtell
>
> From a team in TM:
> {quote}
> I tried to create a large number of columns in one row and one family. The value of each
> column has only 10 bytes. When I created the 804226850th column, the region server crashed.
> I find that all columns of one row and one family will be in the same region. The region
> server will crash because this region is too large. Then, when I restart HBase, after
> several minutes the region servers crash one by one.
> If one row and one family cannot be split, then no matter how many machines are in the HBase
> system, the capacity of HBase will be limited by one machine. I want to know whether this
> problem is a bug. If the number of columns in one row and one family is limited, can you
> tell me the safe range?
> {quote}
> Currently a row cannot be split. So an individual row can expand only to some finite
> limit constrained by the region server capability.
> I am impressed that a row was able to successfully contain 804,226,849 columns.
> The HBase storage capability goals are currently "billions of rows, millions of columns,
> thousands of tables". A test involving hundreds of millions of columns is very challenging.
>
> Most important, HBase should not accept input beyond some limit which produces a cascading
> failure.
> I think we also do want to have the architectural discussion about rows that must span
> region servers due to immensity.
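
For a rough sense of scale, the sketch below estimates how much data the single row in the
report held. The column count (804,226,849) and the 10-byte value size come from the report.
The row key, family, and qualifier lengths are assumptions chosen for illustration, and the
20-byte fixed per-cell overhead assumes the usual KeyValue layout (key length, value length,
row length, family length, timestamp, type). Compression and store file overhead are ignored.

```python
# Assumed fixed per-cell overhead in bytes:
# key length (4) + value length (4) + row length (2)
# + family length (1) + timestamp (8) + type (1)
FIXED_OVERHEAD = 4 + 4 + 2 + 1 + 8 + 1


def estimate_row_bytes(num_columns, value_len, row_len, family_len, qualifier_len):
    """Estimate the uncompressed size of one row whose cells all have the same shape."""
    per_cell = FIXED_OVERHEAD + row_len + family_len + qualifier_len + value_len
    return num_columns * per_cell


COLUMNS = 804_226_849   # columns the row held before the crash (from the report)
VALUE_LEN = 10          # bytes per value (from the report)

# Values alone, with no key material at all.
values_only = COLUMNS * VALUE_LEN

# With assumed key sizes: 8-byte row key, 1-byte family name, 9-byte qualifier.
with_keys = estimate_row_bytes(COLUMNS, VALUE_LEN, row_len=8, family_len=1, qualifier_len=9)

if __name__ == "__main__":
    gib = 1024 ** 3
    print(f"values only: {values_only / gib:.2f} GiB")
    print(f"with keys:   {with_keys / gib:.2f} GiB")
```

Even the values alone come to about 8 GB, and because every cell repeats the row key, family,
and qualifier, the full row is several times larger under these assumptions. All of it has to
live in one region on one server.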

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

