hive-user mailing list archives

From Elliot West <tea...@gmail.com>
Subject Re: VARCHAR or STRING fields in Hive
Date Mon, 16 Jan 2017 17:00:54 GMT
Internally, it looks as though Hive simply represents CHAR/VARCHAR values
using a Java String, so I would not expect a significant change in
execution performance. The Hive JIRA suggests that these types were added
for 'support for more SQL-compliant behavior, such as SQL string comparison
semantics, max length, etc.' rather than for performance reasons.

   - https://issues.apache.org/jira/browse/HIVE-4844
   - https://issues.apache.org/jira/browse/HIVE-5191
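
As a rough illustration of the semantics those tickets mention, here is a
minimal Python sketch. It is a model only, not Hive's code, and the helper
names to_varchar, to_char and char_equals are hypothetical. It assumes the
documented behavior: a value longer than the declared length is truncated,
CHAR values are space-padded to a fixed length, and trailing spaces are not
significant when CHAR values are compared.

```python
def to_varchar(value: str, max_length: int) -> str:
    """Model VARCHAR(max_length): keep at most max_length characters."""
    return value[:max_length]


def to_char(value: str, length: int) -> str:
    """Model CHAR(length): truncate, then pad with spaces to a fixed length."""
    return value[:length].ljust(length)


def char_equals(left: str, right: str) -> bool:
    """Model CHAR comparison: trailing spaces are ignored."""
    return left.rstrip(" ") == right.rstrip(" ")


# Hypothetical sample values, chosen only for illustration.
print(repr(to_varchar("hello world", 5)))               # 'hello'
print(repr(to_char("ab", 4)))                           # 'ab  '
print(char_equals(to_char("ab", 4), to_char("ab", 6)))  # True
```

None of this touches how the value is held in memory, which is why I would
not expect it to matter for execution speed.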

In terms of storage, I expect it depends on the underlying file format and
the types that these values are encoded to. Parquet does appear to support
specific encoding of both CHAR and VARCHAR; however, I'm skeptical that
there would be significant storage efficiencies gained by using the CHAR
types over STRING for comparable values. I'd be keen to hear otherwise.

   - https://issues.apache.org/jira/browse/HIVE-7735
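
To make the reasoning concrete, here is a small Python sketch under one
stated assumption: that the value ends up as a length-prefixed UTF-8 byte
array, as in Parquet's plain encoding of BYTE_ARRAY (a 4-byte little-endian
length followed by the bytes). I have not verified exactly what Hive writes
for each type, and plain_encoded_size and to_varchar are hypothetical
helpers, not Hive or Parquet APIs.

```python
import struct


def plain_encoded_size(value: str) -> int:
    """Bytes used by a 4-byte length prefix plus the UTF-8 payload."""
    data = value.encode("utf-8")
    prefix = struct.pack("<I", len(data))  # little-endian unsigned 32-bit length
    return len(prefix) + len(data)


def to_varchar(value: str, max_length: int) -> str:
    """Model VARCHAR(max_length): keep at most max_length characters."""
    return value[:max_length]


city = "London"  # hypothetical sample value
as_string = plain_encoded_size(city)
as_varchar = plain_encoded_size(to_varchar(city, 30))

# Under this assumption the declared maximum does not change the stored size.
print(as_string, as_varchar)  # 10 10
```

Real files also apply dictionary encoding and compression, so this is only
an indication, but it is the reason I doubt the declared length saves space
for values that already fit.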

Thanks,

Elliot.

On 16 January 2017 at 15:37, Mich Talebzadeh <mich.talebzadeh@gmail.com>
wrote:

>
> Coming from a DBMS background, I tend to treat the columns in Hive like
> those of an RDBMS table. For example, if a table is created in Hive as
> Parquet, I will use VARCHAR(30) for a column that is defined as VARCHAR(30)
> at source. If a column is defined as TEXT in the RDBMS table, I use STRING
> in Hive, which has a max size of 2GB, I believe.
>
> My view is that it is more efficient storage-wise to have Hive table
> columns created as VARCHAR as opposed to STRING.
>
> I have not really seen any performance difference between using VARCHAR
> and STRING. However, I believe there is a reason why Hive has VARCHAR as
> well as STRING.
>
> What is the thread view on this?
>
> Thanks
>
>
> Dr Mich Talebzadeh
>
>
>
> LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
>
>
> http://talebzadehmich.wordpress.com
>
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
>
>
