hbase-user mailing list archives

From "Taylor, Ronald C" <ronald.tay...@pnl.gov>
Subject RE: How do you increase the max cell size in Hbase?
Date Sun, 03 Oct 2010 06:35:22 GMT
 
FYI - there is nothing in the string itself that would cause an error. It's just a concatenated
list of integer values, with colon separators between the numbers. I've checked it visually
- and, yep, the string is indeed a simple ASCII list of integers with colon delimiters. Nothing
weird in it. So the problem is something in regard to the length - not the string contents.

Ron

-----Original Message-----
From: Taylor, Ronald C 
Sent: Saturday, October 02, 2010 11:25 PM
To: 'Ryan Rawson'; user@hbase.apache.org
Cc: Taylor, Ronald C
Subject: RE: How do you increase the max cell size in Hbase?

 
Ryan,

I just tried chopping the string to a max of 5 Meg, down from about 12 Meg at its largest,
and the insertions appear to work fine. I do a scan afterwards and the fields and their contents
appear to be all there. So - it would appear that I'm violating *some* length limit, somewhere,
when I use the full 12 Meg string.

Here's the relevant code for the version that worked, with the replacement of the full string
with the 5 Meg string instead:

         rowID = CURRENT_SPECIES_ID_ABBREV + "_" + RNASEQ_RUNID + "_chromo_pos_strand";
         p = new Put(Bytes.toBytes(rowID));

         chromo_pos_strand_counts = buf.toString();
         
         System.out.println("size of chromo_pos_strand_counts = " + chromo_pos_strand_counts.length());
         System.out.println("writing chromo positive strand data to table ...\n\n");
         
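         // Workaround under test: keep only the first 5,000,000 characters.
         // Passing the full string (the commented-out call below) fails with
         // "KeyValue size too large".
         temp = chromo_pos_strand_counts.substring(0,5000000);
         // p.add(Bytes.toBytes(colFamily),
         //       Bytes.toBytes("Chromo_Positive_Strand_rnaSeq_Counts"),Bytes.toBytes(chromo_pos_strand_counts));
         p.add(Bytes.toBytes(colFamily),Bytes.toBytes("Chromo_Positive_Strand_rnaSeq_Counts"),Bytes.toBytes(temp));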

         p.add(Bytes.toBytes(colFamily),Bytes.toBytes("Strand"),Bytes.toBytes("positive"));
         p.add(Bytes.toBytes(colFamily),Bytes.toBytes("Source"),Bytes.toBytes("chromosome"));
         p.add(Bytes.toBytes(colFamily),Bytes.toBytes("Species"),Bytes.toBytes(CURRENT_SPECIES_ID));
         p.add(Bytes.toBytes(colFamily),Bytes.toBytes("SpeciesAbbrev"),Bytes.toBytes(CURRENT_SPECIES_ID_ABBREV));
         p.add(Bytes.toBytes(colFamily),Bytes.toBytes("RunID"),Bytes.toBytes(RNASEQ_RUNID));
         rnaSeqCountTable.put(p);
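
Since substring(0,5000000) simply drops everything past the first 5 Meg, one possible workaround is to split the full string into pieces and store each piece under its own column qualifier. The following is only a rough sketch of that idea. The class name CellChunker, the methods split and qualifierFor, and the "_part" suffix are all hypothetical, and the HBase p.add(...) calls are left out so the block runs without HBase on the classpath. It assumes the string is plain ASCII, as described above, so one character is one byte.

```java
import java.util.ArrayList;
import java.util.List;

class CellChunker {
    /** Splits value into consecutive pieces of at most maxChars characters each. */
    static List<String> split(String value, int maxChars) {
        if (maxChars <= 0) {
            throw new IllegalArgumentException("maxChars must be positive");
        }
        List<String> chunks = new ArrayList<>();
        for (int start = 0; start < value.length(); start += maxChars) {
            int end = Math.min(value.length(), start + maxChars);
            chunks.add(value.substring(start, end));
        }
        return chunks;
    }

    /** Hypothetical naming scheme: base qualifier plus the chunk index. */
    static String qualifierFor(String base, int index) {
        return base + "_part" + index;
    }

    public static void main(String[] args) {
        // Small stand-in for the real colon-separated counts string.
        StringBuilder buf = new StringBuilder();
        for (int i = 0; i < 100; i++) {
            buf.append(i).append(':');
        }
        List<String> chunks = split(buf.toString(), 64);
        for (int i = 0; i < chunks.size(); i++) {
            // In the real code, each chunk would go into its own p.add(...) call.
            System.out.println(qualifierFor("Chromo_Positive_Strand_rnaSeq_Counts", i)
                    + " -> " + chunks.get(i).length() + " chars");
        }
    }
}
```

Reading the value back would then mean fetching the parts in index order and concatenating them. Whether that is preferable to raising a size limit depends on how the data is queried.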


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

Going back to using the full string, I get this error msg:

size of chromo_pos_strand_counts = 12897748
writing chromo positive strand data to table ...

Exception java.lang.IllegalArgumentException: KeyValue size too large in   write_rnaSeqCountData_Into_rnaSeqCountTable()

e = 'java.lang.IllegalArgumentException: KeyValue size too large'

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

Any thoughts?

Ron

___________________________________________
Ronald Taylor, Ph.D.
Computational Biology & Bioinformatics Group Pacific Northwest National Laboratory
902 Battelle Boulevard
P.O. Box 999, Mail Stop J4-33
Richland, WA  99352 USA
Office:  509-372-6568
Email: ronald.taylor@pnl.gov
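
One candidate explanation, which is an assumption and not something confirmed in this thread: the HBase client checks each KeyValue against the hbase.client.keyvalue.maxsize setting before sending it, and the shipped default is, as far as I know, 10485760 bytes (10 MB). That would fit the numbers reported here, since 5,000,000 is below that value and 12,897,748 is above it. The sketch below only does the arithmetic. KeyValueSizeCheck and exceedsLimit are hypothetical names, and the comparison ignores the few extra bytes of key overhead that a real KeyValue carries.

```java
class KeyValueSizeCheck {
    // ASSUMPTION: default of hbase.client.keyvalue.maxsize is 10 MB (10485760 bytes).
    // Confirm against the hbase-default.xml shipped with your HBase version.
    static final long ASSUMED_MAX_KEYVALUE_BYTES = 10L * 1024 * 1024;

    // Returns true when a cell of the given size would be rejected under the limit.
    // Treating a non-positive limit as "no check" is also an assumption.
    static boolean exceedsLimit(long cellBytes, long maxBytes) {
        return maxBytes > 0 && cellBytes > maxBytes;
    }

    public static void main(String[] args) {
        long fullString = 12897748L; // size printed in the failing run
        long truncated = 5000000L;   // size that inserted cleanly
        System.out.println("full string rejected: "
                + exceedsLimit(fullString, ASSUMED_MAX_KEYVALUE_BYTES));
        System.out.println("5 Meg string rejected: "
                + exceedsLimit(truncated, ASSUMED_MAX_KEYVALUE_BYTES));
    }
}
```

If this is indeed the cause, raising hbase.client.keyvalue.maxsize in the client's hbase-site.xml should lift the limit, but that has not been tested against this setup.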

-----Original Message-----
From: Ryan Rawson [mailto:ryanobjc@gmail.com]
Sent: Saturday, October 02, 2010 11:07 PM
To: Taylor, Ronald C
Cc: user@hbase.apache.org
Subject: Re: How do you increase the max cell size in Hbase?

Hey,

The max sizes of these things are determined by how many bits we use to specify lengths in
the file format.  Changing them isn't an option because the file format depends on those
widths and they are fixed in the code.

However you shouldn't be hitting limits based on the data you told us... perhaps if you could
paste the exception backtrace it might indicate something useful.

-ryan
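
To make the format limits concrete, here is a small sketch of the arithmetic, assuming (as described in this thread) a 2-byte signed field for the row length and 4-byte signed fields for the key and value lengths. Note that the JDK has no Short.MAX_LENGTH constant. Short.MAX_VALUE is presumably what is meant. LengthLimits and fitsFormat are hypothetical names used only for illustration.

```java
class LengthLimits {
    // Row length in a 2-byte signed field (assumption taken from the thread).
    static final long MAX_ROW_BYTES = Short.MAX_VALUE;      // 32767
    // Key and value lengths in 4-byte signed fields, roughly 2 GB each.
    static final long MAX_VALUE_BYTES = Integer.MAX_VALUE;  // 2147483647

    // True when both sizes fit within the assumed file-format limits.
    static boolean fitsFormat(long rowBytes, long valueBytes) {
        return rowBytes >= 0 && rowBytes <= MAX_ROW_BYTES
                && valueBytes >= 0 && valueBytes <= MAX_VALUE_BYTES;
    }

    public static void main(String[] args) {
        long rowBytes = 50L;         // "the key is only 50 chars or so"
        long valueBytes = 12897748L; // size printed in the failing run
        System.out.println("fits file-format limits: " + fitsFormat(rowBytes, valueBytes));
    }
}
```

Under these assumptions a 50-byte row key with a 12,897,748-byte value is well inside the format limits, so the exception would have to come from some other, smaller limit.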

On Sat, Oct 2, 2010 at 10:56 PM, Taylor, Ronald C <ronald.taylor@pnl.gov> wrote:
>
> Hi Ryan,
>
> Well, the key is only 50 chars or so, and the data in the cell is 
> about 12 Meg, in one string. So - obviously this is way less than the
> 2 Gb limit for each. Does this have anything to do with the row 
> length, which is supposed to be kept to less than, from what you say 
> below,
>    Short.MAX_LENGTH ?
>
> I do not see MAX_LENGTH being set anywhere in the conf files (just did a grep), so -
> what is the default row max length, and where would I reset it, if needed?
>
> And if that's *not* the problem, any other possibilities for an error 
> msg of
>  KeyValue size too large
>
>  being given on a rather prosaic Put insertion?
>
> Ron
>
> -----Original Message-----
> From: Ryan Rawson [mailto:ryanobjc@gmail.com]
> Sent: Saturday, October 02, 2010 10:42 PM
> To: user@hbase.apache.org
> Cc: Taylor, Ronald C
> Subject: Re: How do you increase the max cell size in Hbase?
>
> Hey,
>
> The limits are due to code/data limits, e.g. how many bits of space we use to indicate
> lengths and such.  This is something like ~2 GB for the "key" part and the "value" part each.
> Furthermore the row can only be Short.MAX_LENGTH.
>
> There are specific exceptions for each one of these in trunk/0.89, do you have a specific
> exception text?
>
> -ryan
>
> On Sat, Oct 2, 2010 at 10:31 PM, Taylor, Ronald C <ronald.taylor@pnl.gov> wrote:
>>
>> Hello,
>>
>> I would like to increase the max cell size in one of my Hbase tables.
>> Just got an error msg when trying to insert something about 12 Meg in 
>> size that said
>>
>>  KeyValue size too large
>>
>> I presume that I'm using the default cell max size at present - I cannot find anything
>> regarding cell size in the conf files, so the default setting must be in use.
>>
>> How do I increase the max size allowed? For example, if I want to allow a string
>> of up to 20 Meg in size, which conf file do I change, and what is the precise wording?
>>
>> I googled around and found a note on MAX_LENGTH, but it is unclear 
>> how to set it and where. Do I do something like
>>
>>  MAX_LENGTH=20
>>
>>  in the conf//hbase-env.sh  file?
>>
>> And, if that works, do I need to restart Hbase and then recreate all the tables in
>> which I want to use the new max cell size?
>>
>>  Cheers,
>>  Ron
>>
>> ___________________________________________
>> Ronald Taylor, Ph.D.
>> Computational Biology & Bioinformatics Group Pacific Northwest 
>> National Laboratory
>> 902 Battelle Boulevard
>> P.O. Box 999, Mail Stop J4-33
>> Richland, WA  99352 USA
>> Office:  509-372-6568
>> Email: ronald.taylor@pnl.gov
>>
>>
>>
>
