orc-user mailing list archives

From "Owen O'Malley" <owen.omal...@gmail.com>
Subject Re: Hive - Json Serde - ORC
Date Wed, 06 Dec 2017 19:22:52 GMT
I agree this is mostly a Hive question. However, I'll take a pass at it.

You can make Hive tables where different partitions have different storage.
Typically, this happens when you start with one format and switch to a new
one and the new partitions get stored in the new format. There isn't
support for mixing two different file formats in the same partition.
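
As a rough illustration of per-partition storage, the HiveQL below is a minimal sketch. The table name Tbl and the partition column dt are placeholders, not taken from the original poster's schema.

```sql
-- Hypothetical table and partition names, for illustration only.
-- Change the table-level default: partitions added from now on are written as ORC.
-- Existing partitions keep the format recorded in their own metadata.
ALTER TABLE Tbl SET FILEFORMAT ORC;

-- A single partition's recorded format can also be changed explicitly.
-- This only updates metadata; it does not rewrite the files already there.
ALTER TABLE Tbl PARTITION (dt='2017-12-02') SET FILEFORMAT ORC;
```

Because these statements touch metadata only, every file inside a given partition still has to match the one format recorded for that partition.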

Typically, you create two tables Tbl and Tbl_staging where Tbl is stored as
ORC and Tbl_staging is stored as JSON. Put the files into Tbl_staging and
then use insert to copy and translate the data.
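
A minimal sketch of that two-table pattern follows. The columns id and payload and the HDFS path are hypothetical; the partition columns are borrowed from the path in the error message below, and the SerDe class is the one named in the original question. It assumes the JSON SerDe jar is already on Hive's classpath.

```sql
-- Staging table: plain JSON text files, read through the JSON SerDe.
-- The partition values arrive as ordinary fields in the JSON.
CREATE TABLE Tbl_staging (
  id STRING,                  -- hypothetical column
  payload STRING,             -- hypothetical column
  data_posicao_short STRING,
  veitimezone STRING
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
STORED AS TEXTFILE;

-- Final table: ORC only. No ROW FORMAT SERDE clause here,
-- since STORED AS ORC already selects the ORC SerDe.
CREATE TABLE Tbl (
  id STRING,
  payload STRING
)
PARTITIONED BY (data_posicao_short STRING, veitimezone STRING)
STORED AS ORC;

-- Move the raw JSON files into the staging table (hypothetical path).
LOAD DATA INPATH '/tmp/json_input' INTO TABLE Tbl_staging;

-- Let Hive derive the partitions from the selected values.
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- Copy and translate: rows are parsed as JSON and rewritten as ORC.
-- Partition columns must come last in the SELECT list.
INSERT INTO TABLE Tbl PARTITION (data_posicao_short, veitimezone)
SELECT id, payload, data_posicao_short, veitimezone
FROM Tbl_staging;
```

The point of the split is that the JSON SerDe appears only on the staging table, so the files under Tbl are always real ORC files.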

.. Owen

On Wed, Dec 6, 2017 at 4:04 AM, kaducangica . <kaducangica@gmail.com> wrote:

> Hi all,
> I have a very complex JSON document that I need to insert into a Hive
> table. A JSON example is attached.
> First I read a JSON file with Spark to do some data processing, and
> then I write to a stage table with no SerDe, no compression, and no
> special format.
> Then I do an INSERT/SELECT into the "jsonTable" (create table attached)
> with no problems. This table uses a JSON SerDe (org.openx.data.jsonserde.JsonSerDe)
> and the ORC format, and is also partitioned by date and timezone.
> The problem is that after all this, every time I try to run a
> simple "select * from jsonTable" query I get this error message:
> "Failed with exception java.io.IOException:java.io.IOException: Error
> reading file: hdfs://ip-xxx-xxx-xxx-xxx.sa-east-1.compute.internal:8020/user/hive/warehouse/jsonTable/data_posicao_short=2017-12-02/veitimezone=America-Sao_Paulo/000000_0"
> Actually, I do not know whether it is possible to use a SerDe, ORC, and
> partitioning in the same table.
> Could someone help me?
> Thanks in advance.
> Best regards
> Carlos.
