ignite-user mailing list archives

From "Ivan V." <iveselovs...@gridgain.com>
Subject Re: HDP, Hive + Ignite
Date Mon, 24 Apr 2017 16:44:23 GMT
Hi, Aloha,
First of all, Hadoop Accelerator consists of two independent parts, each of
which can be used without the other: (1) IGFS and (2) the map-reduce
execution engine.

IGFS is not used in your case because the default file system in your cluster
is still hdfs://.... (specified by the global property "fs.default.name").
The 2 properties you set (*.igfs.impl=..) define the IGFS implementation
classes, but they come into play only when a URI with the igfs:// scheme is
encountered. Setting fs.default.name=igfs://myhost:10500/ for the whole
cluster is not as good an idea as it may appear, because the HDFS daemons
(namenode, datanode) cannot run with such a property value, and you probably
still need HDFS as the underlying (secondary) file system.
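
To illustrate the point above, a core-site layout along these lines keeps
HDFS as the default file system while still registering the IGFS classes for
igfs:// URIs. This is only a sketch: the two *.igfs.impl values are the ones
from your message, and the hdfs:// value is left elided and stands for
whatever your cluster already has.

```properties
# Custom core-site (sketch; the default file system value is elided on purpose)
fs.default.name=hdfs://...
# Used only when a path carries the igfs:// scheme
fs.igfs.impl=org.apache.ignite.hadoop.fs.v1.IgniteHadoopFileSystem
fs.AbstractFileSystem.igfs.impl=org.apache.ignite.hadoop.fs.v2.IgniteHadoopFileSystem
```

With such a layout, a path without a scheme (for example /user/hive) resolves
against HDFS, and only an explicit igfs:// URI goes through IGFS.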

So, to use IGFS you should either use an explicit URI with the igfs:// scheme,
as you do in your example above ("hadoop fs -ls igfs:///user/hive"), or try to
instruct Hive to use IGFS as its default file system, like this:
hive-1.2/bin/beeline \
--hiveconf fs.default.name=igfs://myhost:10500/ \
--hiveconf hive.rpc.query.plan=true \
--hiveconf mapreduce.framework.name=ignite \
--hiveconf mapreduce.jobtracker.address=myhost:11211 -u jdbc:hive2://

Also, in order to use the Ignite map-reduce engine with Hive in HDP 2.4+, the
Hive execution engine (property "hive.execution.engine") should be explicitly
set to "mr", because the default value there is different.
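
As a rough sketch, assuming you keep managing Hive settings through Ambari as
in your message, the Custom hive-site section could carry the engine setting
next to the property you already added:

```properties
# Hive, Custom hive-site (sketch)
hive.rpc.query.plan=true
# Set explicitly so that Hive submits map-reduce jobs
hive.execution.engine=mr
```

The same property can instead be passed per session by adding
"--hiveconf hive.execution.engine=mr" to the beeline command above.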

On Mon, Apr 24, 2017 at 3:09 PM, <aloha@74.ru> wrote:

> Hi,
> I have an HDP 2.6 cluster (highly available, 8 nodes) and would like to try
> using Hive+Orc+Tez with Ignite. I guess I should use IGFS as a cache layer
> for HDFS.
> I installed Hadoop Accelerator 1.9 on all cluster nodes and run one
> Ignite node on every cluster node.
> I added these settings using Ambari and then restarted HDFS, MapReduce,
> Yarn, Hive.
> HDFS, add 2 new properties to Custom core-site
> fs.igfs.impl=org.apache.ignite.hadoop.fs.v1.IgniteHadoopFileSystem
> fs.AbstractFileSystem.igfs.impl=org.apache.ignite.hadoop.fs.v2.IgniteHadoopFileSystem
> Mapred, Custom mapred-site
> mapreduce.framework.name=ignite
> mapreduce.jobtracker.address=dev-nn1:11211
> Hive, Custom hive-site
> hive.rpc.query.plan=true
> Now I can get access to HDFS through IGFS
> hadoop fs -ls igfs:///user/hive
> Found 3 items
> drwx------  - hive hdfs          0 2017-04-19 21:00 igfs:///user/hive/.Trash
> drwxr-xr-x  - hive hdfs          0 2017-04-19 10:07 igfs:///user/hive/.hiveJars
> drwx------  - hive hdfs          0 2017-04-22 14:27 igfs:///user/hive/.staging
> I thought that Hive would read data from HDFS the first time and then read
> the same data from IGFS.
> But when I run Hive (cli or beeline) it still reads data from HDFS (I
> tried a few times); in igniteVisor "Avg. free heap" remains the same
> before/during/after running a query (about 80%).
> What is wrong? Maybe I should load data into IGFS manually for every query?
