hive-user mailing list archives

From Edward Capriolo <edlinuxg...@gmail.com>
Subject Re: Hive production layout suggestions
Date Sun, 13 May 2012 14:28:54 GMT
 Xiaobo,

I believe you misunderstand some basic parts of hive.

1) You do not need to run the metastore server. It is an optional
component. Many people have each client connect to the metastore
database directly over JDBC, and this allows multiple users to use
Hive concurrently without having separate installs.

2) The CLI does not have an embedded Hive server.

3) Hive servers can handle more than one connection at once, but they
have a few subtle concurrency issues that are being worked on.
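
As a rough illustration of point 1, a client-side hive-site.xml that
talks to the metastore database directly over JDBC might look something
like the sketch below. The javax.jdo.option.* property names are the
standard Hive metastore connection settings; the MySQL backend, host
name, database name, user, and password are all placeholders assumed
for the example, not values from this thread.

```xml
<configuration>
  <!-- Assumption: a MySQL-backed metastore on a hypothetical host. -->
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://metastore-db.example.com:3306/hive_metastore</value>
  </property>
  <!-- JDBC driver class for the assumed MySQL backend. -->
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <!-- Placeholder credentials; replace with your own. -->
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hive</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>CHANGE_ME</value>
  </property>
</configuration>
```

With a file along these lines on each machine, every CLI session reads
and writes the same metastore database, so no separate metastore server
process is involved. The JDBC driver jar for the chosen database has to
be on Hive's classpath.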

On Sun, May 13, 2012 at 8:36 AM, Xiaobo Gu <guxiaobo1982@gmail.com> wrote:
> Hi,
>
> To let multiple users share a single Hive instance, we know that we
> should use stand alone metastore services, but what about cli (and
> other clients) and hiveserver services, what's the best practice for
> the server layout for a production Hive instance?
>
> 1. I think hive metastore, hwi, and hiveserver services are all hadoop
> clients, they should be running on servers which are not part of the
> Hadoop cluster, so we should prepare a dedicated server for them, or
> one server for each service, this is dependent on workloads.
> 2. For cli users, because cli has embedded hiveserver, which can
> connect to metastore service directly, we can install hive clis on
> their workstations, with the same Hadoop/Hive binaries and
> configuration files on their workstations.
> 3. For JDBC and ODBC clients, because they must connect to a
> hiveserver, which can only handle one query at a time, so we must
> start one hiveserver service for each client, only the JDBC,ODBC
> driver is needed on the client, no Hive or Hadoop binaries are needed
> on them.
>
> Do I miss anything?
>
> Regards,
>
> Xiaobo Gu
