hive-user mailing list archives

From: Jörn Franke <>
Subject: Re: Handling LZO files
Date: Thu, 03 Dec 2015 13:32:37 GMT

How many nodes and cores, and how much memory do you have?
Which Hive version?

Do you have the option to use Tez as the execution engine?
Usually I use external tables only for reading the raw data and inserting it into a table in ORC
or Parquet format for analytics.
Those formats are much more performant than JSON or any other text-based format.
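
A minimal sketch of that approach, assuming a hypothetical external table json_events is already defined over the JSON files (the table, column, and ORC table names are made up for illustration):

    -- Prefer Tez if it is available on the cluster
    set hive.execution.engine=tez;

    -- Managed table in ORC format for the analytics workload
    CREATE TABLE events_orc (
      user_id    STRING,
      event_type STRING,
      event_time TIMESTAMP
    )
    STORED AS ORC;

    -- One-time load from the external JSON table into ORC
    INSERT OVERWRITE TABLE events_orc
    SELECT user_id, event_type, event_time
    FROM json_events;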

> On 03 Dec 2015, at 14:20, Harsha HN <> wrote:
> Hi,
> We have LZO-compressed JSON files in our HDFS locations. I am creating an "External"
> table on the data in HDFS for the purpose of analytics.
> There are 3 LZO-compressed part files of sizes 229.16 MB, 705.79 MB, and 157.61 MB respectively,
> along with their index files.
> When I run a count(*) query on the table, I observe only 10 mappers, which causes performance issues.
> I even tried the following (aiming for a 30 MB split):
> 1) set mapreduce.input.fileinputformat.split.maxsize=31457280;
> 2) set dfs.blocksize=31457280;
> But I am still getting only 10 mappers.
> Could you please guide me in fixing this?
> Thanks,
> Sree Harsha
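
For the mapper count specifically: indexed .lzo files are only splittable when the table uses the LZO-aware input format from the hadoop-lzo project; with the default text input format the index files are ignored. Note also that dfs.blocksize applies only to newly written files, so changing it does not re-split data already in HDFS. A hedged sketch (the table name, column, and LOCATION below are hypothetical placeholders):

    -- Hypothetical external table over the LZO-compressed JSON files.
    -- DeprecatedLzoTextInputFormat (from hadoop-lzo) consults the .index
    -- files, so each .lzo file can be split across several mappers.
    CREATE EXTERNAL TABLE json_events (
      json_line STRING
    )
    STORED AS
      INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat'
      OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
    LOCATION '/path/to/lzo/json/files';

    -- With splittable input, the split-size setting takes effect
    set mapreduce.input.fileinputformat.split.maxsize=31457280;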
