impala-dev mailing list archives

From Vincent Tran <vtt...@cloudera.com>
Subject bin/remote_data_load.py fails to load on full CDH cluster
Date Wed, 24 Jan 2018 04:16:30 GMT
While loading data onto a CDH cluster, I noticed that
bin/remote_data_load.py requires the cluster to contain exactly the
following services and nothing else:
HDFS,YARN,HIVE,IMPALA,MAPREDUCE,KUDU,HBASE,ZOOKEEPER

It also requires every service on the cluster to be in the STARTED state.

Both requirements are in line with the original prerequisites from IMPALA-4031
<https://issues.apache.org/jira/browse/IMPALA-4031?focusedCommentId=15920517&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15920517>

These requirements appear too strict to me, though. There shouldn't be any
reason why the script can't continue on a cluster with extra services as
long as REQUIRED_SERVICES.issubset(services.keys()) holds, should there?
I patched the script locally to allow it to continue, and so far I have not
observed any problems. Are there any caveats to having extraneous services?
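
To make the difference concrete, here is a minimal sketch of the strict check next to the relaxed one. It is not the script's actual code: the function names and the shape of `services` (assumed to be a dict keyed by service type) are hypothetical. Only `REQUIRED_SERVICES`, the service list, and the `issubset` call come from the message above.

```python
# Service types listed in the message above.
REQUIRED_SERVICES = frozenset(
    ["HDFS", "YARN", "HIVE", "IMPALA", "MAPREDUCE", "KUDU", "HBASE", "ZOOKEEPER"]
)


def has_exact_services(services):
    """Strict check: the cluster has the required services and nothing else.

    `services` is assumed to be a dict keyed by service type (hypothetical shape).
    """
    return set(services.keys()) == REQUIRED_SERVICES


def has_required_services(services):
    """Relaxed check: every required service is present; extras are tolerated."""
    return REQUIRED_SERVICES.issubset(services.keys())


# Hypothetical cluster with one extra service (SPARK) on top of the required set.
cluster = {name: "STARTED" for name in REQUIRED_SERVICES}
cluster["SPARK"] = "STARTED"
```

With this hypothetical `cluster`, the strict check rejects the cluster because of the extra service, while the relaxed check accepts it. A cluster missing any required service fails both.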

Vincent
