impala-dev mailing list archives

From Amos Bird <amosb...@gmail.com>
Subject Re: test errors in local env
Date Sun, 18 Sep 2016 15:13:45 GMT

> On Fri, Sep 16, 2016 at 9:06 PM, Amos Bird <amosbird@gmail.com> wrote:
>
>>
>> Hi there,
>>
>> I followed the wiki
>> https://cwiki.apache.org/confluence/display/IMPALA/How+to+load+and+run+Impala+tests
>> carefully but still have some problems in my local env.
>>
>> 1. I need to manually execute "hdfs dfs -mkdir /test-warehouse/emptytable"
>> to get rid of an FE test error.
>>
>>
> Ideally, you should not have to do this. Could you tell me what errors you
> encountered? Sounds like there may be a test or data loading bug we should
> fix.

The error is:

TestLoadData(com.cloudera.impala.analysis.AnalyzeStmtsTest)  Time elapsed: 0.033 sec  <<< FAILURE!
java.lang.AssertionError: got error:
INPATH location 'hdfs://localhost:20500/test-warehouse/emptytable' does not exist.
expected:
INPATH location 'hdfs://localhost:20500/test-warehouse/emptytable' contains no visible files.
  at org.junit.Assert.fail(Assert.java:88)
  at org.junit.Assert.assertTrue(Assert.java:41)
  at com.cloudera.impala.common.FrontendTestBase.AnalysisError(FrontendTestBase.java:312)
  at com.cloudera.impala.common.FrontendTestBase.AnalysisError(FrontendTestBase.java:292)
  at com.cloudera.impala.analysis.AnalyzeStmtsTest.TestLoadData(AnalyzeStmtsTest.java:2860)

>>
>> 2. I have authz-policy.ini in HDFS, but I still get authorization errors.
>>
>> TestSelect[0](com.cloudera.impala.analysis.AuthorizationTest)  Time elapsed: 0.333 sec  <<< FAILURE!
>> java.lang.AssertionError: got error:
>> User 'amos' does not have privileges to execute 'SELECT' on: default.nodb
>> expected:
>> User 'amos' does not have privileges to execute 'SELECT' on: nodb.alltypes
>>   at org.junit.Assert.fail(Assert.java:88)
>>   at org.junit.Assert.assertTrue(Assert.java:41)
>>   at com.cloudera.impala.analysis.AuthorizationTest.AuthzError(AuthorizationTest.java:2220)
>>   at com.cloudera.impala.analysis.AuthorizationTest.AuthzError(AuthorizationTest.java:2203)
>>   at com.cloudera.impala.analysis.AuthorizationTest.AuthzError(AuthorizationTest.java:2197)
>>   at com.cloudera.impala.analysis.AuthorizationTest.TestSelect(AuthorizationTest.java:512)
>>
>> TestSelect[1](com.cloudera.impala.analysis.AuthorizationTest)  Time elapsed: 0.324 sec  <<< FAILURE!
>> java.lang.AssertionError: got error:
>> User 'amos' does not have privileges to execute 'SELECT' on: default.nodb
>> expected:
>> User 'amos' does not have privileges to execute 'SELECT' on: nodb.alltypes
>>   at org.junit.Assert.fail(Assert.java:88)
>>   at org.junit.Assert.assertTrue(Assert.java:41)
>>   at com.cloudera.impala.analysis.AuthorizationTest.AuthzError(AuthorizationTest.java:2220)
>>   at com.cloudera.impala.analysis.AuthorizationTest.AuthzError(AuthorizationTest.java:2203)
>>   at com.cloudera.impala.analysis.AuthorizationTest.AuthzError(AuthorizationTest.java:2197)
>>   at com.cloudera.impala.analysis.AuthorizationTest.TestSelect(AuthorizationTest.java:512)
>>
>>
>> Results :
>>
>> Failed tests:
>>   AuthorizationTest.TestSelect:512->AuthzError:2197->AuthzError:2203->AuthzError:2220 got error:
>> User 'amos' does not have privileges to execute 'SELECT' on: default.nodb
>> expected:
>> User 'amos' does not have privileges to execute 'SELECT' on: nodb.alltypes
>>   AuthorizationTest.TestSelect:512->AuthzError:2197->AuthzError:2203->AuthzError:2220 got error:
>> User 'amos' does not have privileges to execute 'SELECT' on: default.nodb
>> expected:
>> User 'amos' does not have privileges to execute 'SELECT' on: nodb.alltypes
>>
>>
>>
> Strange. In this test, we register two authorization requests, and it seems
> like those are not checked in the expected order. However, that should not
> be possible because we store them in a LinkedHashSet.
> Could you dig into this a little further to see if you can figure out why
> the order is wrong?
>
> This is where we register the authorization requests:
> https://github.com/cloudera/Impala/blob/cdh5-trunk/fe/src/main/java/com/cloudera/impala/analysis/Analyzer.java#L544
>
> This is where we check the authorization requests:
> https://github.com/cloudera/Impala/blob/cdh5-trunk/fe/src/main/java/com/cloudera/impala/analysis/AnalysisContext.java#L391
>
>

I tried directly executing "select 1 from nodb.alltypes" in
impala-shell, leading to this error:
ERROR: AnalysisException: Could not resolve table reference: 'nodb.alltypes'

How can I reproduce the authorization tests in impala-shell so I can
debug them?


>
>>
>>
>> 3. For end-to-end tests, I encountered two kinds of errors
>>
>>   a) connection refused.
>>
>>   SET sync_ddl=False;
>> -- executing against localhost:21000
>> DROP DATABASE `test_drop_cleans_hdfs_dirs_fdfd4f8` CASCADE;
>>
>> ___________________ ERROR at setup of TestLoadData.test_load[exec_option: {'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0, 'batch_size': 0, 'num_nodes': 0} | table_format: text/none] ___________________
>> [gw5] linux2 -- Python 2.6.6 /home/amos/incubator-impala/bin/../infra/python/env/bin/python
>> metadata/test_load.py:77: in setup_method
>>     "{0}/{1}/100101.txt".format(STAGING_PATH, i))
>> util/hdfs_util.py:122: in copy
>>     data = self.read_file(src)
>> ../infra/python/env/lib/python2.6/site-packages/pywebhdfs/webhdfs.py:183: in read_file
>>     response = requests.get(uri, allow_redirects=True)
>> ../infra/python/env/lib/python2.6/site-packages/requests/api.py:69: in get
>>     return request('get', url, params=params, **kwargs)
>> ../infra/python/env/lib/python2.6/site-packages/requests/api.py:50: in request
>>     response = session.request(method=method, url=url, **kwargs)
>> ../infra/python/env/lib/python2.6/site-packages/requests/sessions.py:465: in request
>>     resp = self.send(prep, **send_kwargs)
>> ../infra/python/env/lib/python2.6/site-packages/requests/sessions.py:594: in send
>>     history = [resp for resp in gen] if allow_redirects else []
>> ../infra/python/env/lib/python2.6/site-packages/requests/sessions.py:196: in resolve_redirects
>>     **adapter_kwargs
>> ../infra/python/env/lib/python2.6/site-packages/requests/sessions.py:573: in send
>>     r = adapter.send(request, **kwargs)
>> ../infra/python/env/lib/python2.6/site-packages/requests/adapters.py:415: in send
>>     raise ConnectionError(err, request=request)
>> E   ConnectionError: ('Connection aborted.', error(111, 'Connection refused'))
>>
>>
> The connection refused issue is very bizarre. One thing that I noticed is
> that your Python does not seem to match what we use (Python 2.7.3).
> Could you re-run infra/python/bootstrap_virtualenv.py and see if you get
> the expected version into infra/python/env/local/bin?
>
> Alternatively, maybe there's a problem with your /etc/hosts? You can try
> searching online for WebHdfs and /etc/hosts
>
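
The two checks suggested above (which Python the test environment runs, and whether the local host name and ports are reachable) can be sketched roughly as follows. This is only an illustration using the standard library. The function names are hypothetical, and the host and port to probe are assumptions that depend on the local minicluster configuration, which this thread does not state. The traceback fails inside resolve_redirects, which may mean the redirected request (WebHDFS reads are normally redirected to a datanode) was the one refused, so the redirect target is worth probing as well.

```python
import socket
import sys


def interpreter_info():
    """Return (version_tuple, executable_path) for the running interpreter."""
    return tuple(sys.version_info[:3]), sys.executable


def version_matches(expected, actual=None):
    """Compare an expected (major, minor[, micro]) tuple with a version.

    Only as many components as `expected` provides are compared, so (2, 7)
    matches any 2.7.x release. `actual` defaults to the running interpreter.
    """
    if actual is None:
        actual = tuple(sys.version_info[:3])
    return tuple(actual[:len(expected)]) == tuple(expected)


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        sock = socket.create_connection((host, port), timeout)
    except (socket.error, socket.timeout):
        return False
    sock.close()
    return True


def resolve(hostname):
    """Return the IPv4 address the hostname resolves to, or None on failure."""
    try:
        return socket.gethostbyname(hostname)
    except socket.error:
        return None
```

Run with infra/python/env/bin/python, interpreter_info() shows which interpreter the virtualenv actually provides. resolve(socket.gethostname()) shows what the machine's own host name maps to via /etc/hosts or DNS, and can_connect(host, port) can then be pointed at whichever WebHDFS address the failing request was redirected to.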

Well, I found this 'find_py26.py' file under deps. Is that normal?
[amos@nobida143 incubator-impala]$ ls infra/python/deps/
download_requirements  find_py26.py  pip_download.py  requirements.txt
[amos@nobida143 incubator-impala]$ cat infra/python/deps/download_requirements
#!/bin/bash

# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements.  See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership.  The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License.  You may obtain a copy of the License at
#
#   http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied.  See the License for the
# specific language governing permissions and limitations
# under the License.

set -euo pipefail

DIR="$(dirname "$0")"

pushd "$DIR"
PY26="$(./find_py26.py)"
# Directly download packages listed in requirements.txt, but don't install them.
"$PY26" pip_download.py
# For virtualenv, other scripts rely on the .tar.gz package (not a .whl package).
"$PY26" pip_download.py virtualenv 13.1.0
# kudu-python is downloaded separately because pip install attempts to execute a
# setup.py subcommand for kudu-python that can fail even if the download succeeds.
"$PY26" pip_download.py kudu-python 0.1.1
popd



>>   b) stats do not match
>>
>> [gw4] linux2 -- Python 2.6.6 /home/amos/incubator-impala/bin/../infra/python/env/bin/python
>> metadata/test_metadata_query_statements.py:67: in test_show_stats
>>     self.run_test_case('QueryTest/show-stats', vector, "functional")
>> common/impala_test_suite.py:342: in run_test_case
>>     self.__verify_results_and_errors(vector, test_section, result, use_db)
>> common/impala_test_suite.py:234: in __verify_results_and_errors
>>     replace_filenames_with_placeholder)
>> common/test_result_verifier.py:398: in verify_raw_results
>>     VERIFIER_MAP[verifier](expected, actual)
>> common/test_result_verifier.py:231: in verify_query_result_is_equal
>>     assert expected_results == actual_results
>>
>> ...
>>
>>    -- executing against localhost:21000
>> show column stats alltypes_clone;
>>
>> MainThread: Comparing QueryTestResults (expected vs actual):
>> 'bigint_col','BIGINT',10,-1,8,8 == 'bigint_col','BIGINT',10,-1,8,8
>> 'bool_col','BOOLEAN',2,-1,1,1 == 'bool_col','BOOLEAN',2,-1,1,1
>> 'date_string_col','STRING',736,-1,8,8 == 'date_string_col','STRING',736,-1,8,8
>> 'double_col','DOUBLE',-1,-1,8,8 == 'double_col','DOUBLE',-1,-1,8,8
>> 'float_col','FLOAT',10,-1,4,4 == 'float_col','FLOAT',10,-1,4,4
>> 'id','INT',7505,-1,4,4 == 'id','INT',7505,-1,4,4
>> 'int_col','INT',-1,-1,4,4 == 'int_col','INT',-1,-1,4,4
>> 'month','INT',12,0,4,4 == 'month','INT',12,0,4,4
>> 'smallint_col','SMALLINT',10,-1,2,2 == 'smallint_col','SMALLINT',10,-1,2,2
>> 'string_col','STRING',10,-1,-1,-1 == 'string_col','STRING',10,-1,-1,-1
>> 'timestamp_col','TIMESTAMP',7554,-1,16,16 != 'timestamp_col','TIMESTAMP',7552,-1,16,16
>> 'tinyint_col','TINYINT',10,-1,1,1 == 'tinyint_col','TINYINT',10,-1,1,1
>> 'year','INT',2,0,4,4 == 'year','INT',2,0,4,4
>>
>>
> Very strange. Can you run compute stats on functional.alltypes and
> confirm that the NDV for timestamp_col is 7552 in your setup?

Yes.
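
For longer comparison dumps, the mismatching rows can be picked out mechanically. The following is a minimal sketch, assuming the "expected == actual" / "expected != actual" line layout shown in the verifier output above and simple comma-separated rows. The helper names are hypothetical and are not part of the Impala test framework.

```python
def mismatched_rows(lines):
    """Return (expected, actual) pairs for rows the verifier joined with '!='."""
    pairs = []
    for line in lines:
        if " != " in line:
            expected, actual = line.split(" != ", 1)
            pairs.append((expected.strip(), actual.strip()))
    return pairs


def differing_fields(expected, actual):
    """Return (index, expected_value, actual_value) for each differing field.

    Assumes no commas occur inside quoted values, which holds for the
    column stats rows shown in this thread.
    """
    exp_fields = expected.split(",")
    act_fields = actual.split(",")
    return [(i, e, a)
            for i, (e, a) in enumerate(zip(exp_fields, act_fields))
            if e != a]


# Two rows copied from the comparison output in this thread, one line each.
log_lines = [
    "'id','INT',7505,-1,4,4 == 'id','INT',7505,-1,4,4",
    "'timestamp_col','TIMESTAMP',7554,-1,16,16 != "
    "'timestamp_col','TIMESTAMP',7552,-1,16,16",
]

for expected, actual in mismatched_rows(log_lines):
    print(differing_fields(expected, actual))  # prints [(2, '7554', '7552')]
```

Here the only difference is the third field of the timestamp_col row, the distinct-value count (NDV) discussed above: the test expects 7554 and this setup reports 7552.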

>
>
>
>> I'm using CentOS 6.8 final. I have no idea what is going wrong. Any help is
>> much appreciated!
>
>
>
>
>>
>> Best regards,
>> Amos
>>

