impala-user mailing list archives

From Geetika Gupta <geetika.gu...@knoldus.in>
Subject Re: Problem in Querying data on Impala Cluster
Date Wed, 25 Apr 2018 12:54:32 GMT
Hi Community,

We are now seeing the following status in the query details tab of the
executor UI:

*Status: File
'hdfs://hadoop-master:54311/opt/spark-2.1.0-bin-hadoop2.7/spark-warehouse/parquet500.db/customer/part-00010-12db8b7a-02ae-463f-8af6-667e62dca833.snappy.parquet'
has an incompatible Parquet schema for column
'parquet500.customer.c_acctbal'. Column type: DECIMAL(15,2), Parquet
schema: optional int64 C_ACCTBAL [i:5 d:1 r:0]*

We created this table through Spark, using Parquet as the file format.
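
A possible explanation for the int64 in the status above, offered as a sketch rather than a confirmed diagnosis: Spark's default (non-legacy) Parquet writer chooses the physical type of a decimal column from its precision, so DECIMAL(15,2) lands in the int64 range. The helper below is hypothetical (it is not part of Spark or Impala) and only mirrors that precision-to-type rule. Whether a given Impala release accepts int64-backed decimals is an assumption to check against your version.

```python
# Hypothetical helper: mirrors the rule Spark's Parquet writer uses to pick
# the physical type for a DECIMAL(precision, scale) column.
# Not part of Spark or Impala; for illustration only.

MAX_PRECISION = 38  # Spark's maximum decimal precision


def parquet_physical_type(precision: int, legacy_format: bool) -> str:
    """Return the Parquet physical type used for a decimal of this precision.

    legacy_format corresponds to the Spark setting
    spark.sql.parquet.writeLegacyFormat (default: false).
    """
    if not 1 <= precision <= MAX_PRECISION:
        raise ValueError(f"precision must be in 1..{MAX_PRECISION}")
    if legacy_format:
        # Legacy mode always writes decimals as fixed-length byte arrays.
        return "fixed_len_byte_array"
    if precision <= 9:
        return "int32"
    if precision <= 18:
        return "int64"
    return "fixed_len_byte_array"


# c_acctbal is DECIMAL(15,2); with the default writer it is stored as int64,
# which matches the "optional int64 C_ACCTBAL" in the status message.
print(parquet_physical_type(15, legacy_format=False))  # int64
print(parquet_physical_type(15, legacy_format=True))   # fixed_len_byte_array

# Assumption: if rewriting the table is an option, the Spark session could set
#   spark.conf.set("spark.sql.parquet.writeLegacyFormat", "true")
# before writing, so decimals are stored as fixed_len_byte_array instead.
```

If this is the cause, files already written with the default setting would still carry the int64 encoding and would need to be rewritten.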




On Wed, Apr 25, 2018 at 5:32 PM, Geetika Gupta <geetika.gupta@knoldus.in>
wrote:

> Thanks Sailesh Mukil.
>
> It resolved the issue.
> But now we are encountering some other error:
>
> *I0425 17:20:35.209463  7481 Frontend.java:987] Analyzing query: select
> c_custkey, c_name, sum(l_extendedprice * (1 - l_discount)) as revenue,
> c_acctbal, n_name, c_address, c_phone, c_comment from lineitem, orders,
> customer, nation where o_custkey=c_custkey and l_orderkey = o_orderkey and
> c_nationkey = n_nationkey group by c_custkey, c_name, c_acctbal, c_phone,
> n_name, c_address, c_comment order by revenue desc limit 20*
> *I0425 17:20:35.214584  7481 Frontend.java:999] Analysis finished.*
> *I0425 17:20:36.308864  1333 coordinator.cc:783] Release admission control
> resources for query_id=73443760aff6bec9:4f2fcc1e00000000*
> *I0425 17:20:36.313413 10127 query-state.cc:288] Cancelling fragment
> instances as directed by the coordinator. Returned status:
> ReportExecStatus(): Received report for unknown query ID (probably closed
> or cancelled): 73443760aff6bec9:4f2fcc1e00000000*
>
> These are the logs from the *impalad* process on that machine. We are
> encountering this error only for some of the queries.
>
>
> On Wed, Apr 25, 2018 at 11:59 AM, Sailesh Mukil <sailesh@cloudera.com>
> wrote:
>
>> Hi Geetika,
>>
>> It looks like you're using unencrypted connections that don't fall under
>> the local subnet or a private network, which means you're potentially
>> trying to send unencrypted data over a public network between nodes.
>>
>> We explicitly disallow these kinds of connections by default. However, if
>> you still feel like you want to go ahead with this configuration, or that
>> the above explanation is a mistake, this might help you:
>> https://github.com/apache/impala/blob/6f2ebadf8d119b1486f54b911ba3c7ecc1921d55/be/src/kudu/rpc/server_negotiation.cc#L70-L80
>>
>> You can set the 'trusted_subnets' startup flag to whitelist the subnet
>> that your Impala nodes' IP addresses fall under.
>>
>> I hope this helps.
>>
>> - Sailesh
>>
>> On Tue, Apr 24, 2018 at 11:12 PM, Geetika Gupta <geetika.gupta@knoldus.in>
>> wrote:
>>
>>> Hello Community,
>>>
>>> We were trying to query Parquet data stored in HDFS through an Impala
>>> cluster, but when we execute our query, the following error appears in the
>>> *impalad* logs of the machine:
>>>
>>> *W0424 18:32:40.928611  7655 negotiation.cc:306] Unauthorized connection
>>> attempt: Server connection negotiation failed: server connection from
>>> <node_ip_of_different_machine>:43448: unencrypted connections from publicly
>>> routable IPs are prohibited. See --trusted_subnets flag for more
>>> information.: <node_ip_of_different_machine>:43448*
>>>
>>> We encounter this problem only when the Impala cluster has multiple
>>> nodes. It works fine on a single machine.
>>>
>>>
>>>
>>> --
>>> Regards,
>>> Geetika Gupta
>>>
>>
>>
>
>
> --
> Regards,
> Geetika Gupta
>



-- 
Regards,
Geetika Gupta
