incubator-hcatalog-user mailing list archives

From Alan Gates
Subject Re: When to use HCatReader/HCatWriter API ?
Date Tue, 04 Dec 2012 01:56:27 GMT
The intended use case for the HCatReader and HCatWriter APIs is a system that wants to
read from or write to HCat/Hive in parallel but wants to drive that parallelism itself.
If, for example, you had a parallel data processing system in which each worker node
needed to pull part of a Hive partition, this would be the appropriate interface.
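
As a rough illustration of that pattern, here is a minimal, self-contained Java sketch.
It does not use the HCatalog classes: Split, planSplits, readSplit and readAll are
hypothetical stand-ins, and the "table" is just an in-memory list. The point is only the
shape: one side plans the splits, and the client system decides how many workers read
them and when.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-ins only; none of these types are HCatalog classes.
class ParallelReadSketch {

    /** One unit of work: the slice of the table that a single worker reads. */
    static final class Split {
        final int id;
        final List<String> records;

        Split(int id, List<String> records) {
            this.id = id;
            this.records = records;
        }
    }

    /** Planning side: divides the table into splits of at most splitSize records. */
    static List<Split> planSplits(List<String> table, int splitSize) {
        List<Split> splits = new ArrayList<>();
        for (int start = 0; start < table.size(); start += splitSize) {
            int end = Math.min(start + splitSize, table.size());
            splits.add(new Split(splits.size(), new ArrayList<>(table.subList(start, end))));
        }
        return splits;
    }

    /** Worker side: reads only the records of the split it was handed. */
    static List<String> readSplit(Split split) {
        return new ArrayList<>(split.records);
    }

    /** The client drives the parallelism: one task per split, on its own thread pool. */
    static List<String> readAll(List<String> table, int splitSize, int workers) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<List<String>>> pending = new ArrayList<>();
            for (final Split split : planSplits(table, splitSize)) {
                Callable<List<String>> task = () -> readSplit(split);
                pending.add(pool.submit(task));
            }
            // Collect results in split order, so the output order is deterministic.
            List<String> out = new ArrayList<>();
            for (Future<List<String>> f : pending) {
                out.addAll(f.get());
            }
            return out;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> table = Arrays.asList("r1", "r2", "r3", "r4", "r5");
        System.out.println(readAll(table, 2, 3));
    }
}
```

In a real deployment the workers would be separate nodes rather than threads, and the
splits would be handed to them by whatever mechanism the client system already uses.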

The advantage of this API over using HDFS directly is abstraction.  The client need not know
the file location or file format, and need not change if either of them changes.


On Dec 2, 2012, at 10:15 PM, 大西高史 wrote:

> Dear All,
> I’m trying to use HCatalog 0.4.0.
> In the javadoc, I've found the HCatReader/HCatWriter API and the online
> documentation, too (following url).
> I’ve read the documentation, but have not figured out the intended use of this
> API…
> Does anyone know what’s the purpose of this API ?
> When to use this API instead of HDFS API or “hadoop dfs” command?
> Regards,
> -Takashi
