hive-user mailing list archives

From Ray Duong <ray.du...@gmail.com>
Subject Re: Hive-Hbase with large number of columns
Date Tue, 15 Jun 2010 22:37:15 GMT
Yes, I think I might have to do that. I was trying to avoid multiple HBase
scans with one big table.

BTW, would it be better to store the SERDE properties per column rather than
at the table level, to avoid the 767- or 4000-byte limitation?

Thanks again,
-ray


On Tue, Jun 15, 2010 at 2:42 PM, Edward Capriolo <edlinuxguru@gmail.com> wrote:

>
>
> On Tue, Jun 15, 2010 at 5:04 PM, Ray Duong <ray.duong@gmail.com> wrote:
>
>> Thanks for all the help.
>>
>> -ray
>>
>>
>> On Tue, Jun 15, 2010 at 1:26 PM, Carl Steinbach <carl@cloudera.com> wrote:
>>
>>> Hi Ray,
>>>
>>> 4000 bytes is the maximum VARCHAR size allowed on Oracle 9i/10g/11g. As
>>> far as I know this is the smallest maximum VARCHAR size out of the databases
>>> we currently try to support (MySQL, Oracle, Derby, etc).
>>>
>>> Carl
>>>
>>>
>>> On Tue, Jun 15, 2010 at 1:15 PM, Ray Duong <ray.duong@gmail.com> wrote:
>>>
>>>> Thanks John/Carl,
>>>>
>>>> Yep, there does seem to be a limit at 767 bytes.  I see that the patch for
>>>> HIVE-1364 raises it to 4000 bytes.  I'm using Derby; do you know if there
>>>> is a limit beyond 4000 bytes?
>>>>
>>>> -ray
>>>>
>>>> Error:
>>>> Caused by: ERROR 22001: A truncation error was encountered trying to
>>>> shrink VARCHAR
>>>> 'segment:ITC_10#ITC_1001,segment:CITC_10#ITC_1001,segment:ITC&' to length
>>>> 767.
>>>>
>>>>
>>>>
>>>>
>>>> On Tue, Jun 15, 2010 at 12:26 PM, Carl Steinbach <carl@cloudera.com> wrote:
>>>>
>>>>> Hi Ray,
>>>>>
>>>>> There is currently a 767 byte size limit on SERDEPROPERTIES values
>>>>> (see http://issues.apache.org/jira/browse/HIVE-1364). It's possible
>>>>> that you're bumping into this limitation (assuming you abbreviated the
>>>>> column names in your example).
>>>>>
>>>>>
>>>>> On Tue, Jun 15, 2010 at 12:03 PM, John Sichi <jsichi@facebook.com> wrote:
>>>>>
>>>>>> That exception is coming from the metastore (trying to write the table
>>>>>> definition).  Could you dig down into the Hive logs to see if you can
>>>>>> get the underlying cause?
>>>>>>
>>>>>> You can get the logs to spew on the console by adding "-hiveconf
>>>>>> hive.root.logger=DEBUG,console" to your Hive CLI invocation.
>>>>>>
>>>>>> JVS
>>>>>>
>>>>>> On Jun 15, 2010, at 11:57 AM, Ray Duong wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I'm trying to map an HBase table in Hive that contains a large number of
>>>>>> columns.  Since HBase is designed for wide tables, does the Hive/HBase
>>>>>> integration have any set limit on the number of columns it can map in
>>>>>> one table?  I seem to hit a limit at 10 columns.
>>>>>>
>>>>>> Thanks,
>>>>>> -ray
>>>>>>
>>>>>> create external table hbase_t1
>>>>>> (
>>>>>> key string,
>>>>>> f1_a string,
>>>>>> f2_a string,
>>>>>> f1_b string,
>>>>>> f2_b string,
>>>>>> ...
>>>>>> ...
>>>>>> f1_m string,
>>>>>> f2_m string,
>>>>>>
>>>>>>  )
>>>>>>  STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
>>>>>>  WITH SERDEPROPERTIES ("hbase.columns.mapping" =
>>>>>> ":key,f1:a,f2:a,f1:b,f2:b,f1:c,f2:c,f1:d,f2:d,f1:e,f2:e,f1:f,f2:f,f1:g,f2:g,f1:h,f2:h,f1:i,f2:i,f1:j,f2:j,f1:k,f2:k,f1:l,f2:l,f1:m,f2:m"
>>>>>> )
>>>>>>  TBLPROPERTIES("hbase.table.name" = "t1");
>>>>>>
>>>>>> Error Message:
>>>>>>
>>>>>> FAILED: Error in metadata: javax.jdo.JDODataStoreException: Put
>>>>>> request failed : INSERT INTO `SERDE_PARAMS`
>>>>>> (`PARAM_VALUE`,`SERDE_ID`,`PARAM_KEY`) VALUES (?,?,?)
>>>>>> NestedThrowables:
>>>>>> org.datanucleus.store.mapped.exceptions.MappedDatastoreException:
>>>>>> INSERT INTO `SERDE_PARAMS` (`PARAM_VALUE`,`SERDE_ID`,`PARAM_KEY`)
>>>>>> VALUES (?,?,?)
>>>>>> FAILED: Execution Error, return code 1 from
>>>>>> org.apache.hadoop.hive.ql.exec.DDLTask
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
> You have probably thought of this, but in the short term you can create two
> vertically partitioned tables and do a 1-to-1 join on their key.
> Edward
>
>
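Edward's workaround can be sketched roughly as follows. This is an illustrative example, not code from the thread: the table and column names are hypothetical, and it assumes each Hive table maps only a subset of the column families so that no single SERDEPROPERTIES value exceeds the metastore's VARCHAR limit.

```sql
-- Two external tables over the same HBase table, each mapping one
-- column family, keeping each "hbase.columns.mapping" value short.
CREATE EXTERNAL TABLE hbase_t1_f1 (key string, a string, b string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,f1:a,f1:b")
TBLPROPERTIES ("hbase.table.name" = "t1");

CREATE EXTERNAL TABLE hbase_t1_f2 (key string, a string, b string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,f2:a,f2:b")
TBLPROPERTIES ("hbase.table.name" = "t1");

-- 1-to-1 join on the row key to reassemble the wide row.
SELECT t1.key, t1.a AS f1_a, t2.a AS f2_a
FROM hbase_t1_f1 t1
JOIN hbase_t1_f2 t2 ON (t1.key = t2.key);
```

The trade-off is the one Ray notes above: the join costs an extra HBase scan compared with a single wide table.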
