hawq-dev mailing list archives

From Gagan Brahmi <gaganbra...@gmail.com>
Subject Re: HAWQ YARN RPC Errors
Date Sat, 14 May 2016 17:24:04 GMT
Hi Wen,

Please find attached the logs, which contain a few occurrences of the
error.


Regards,
Gagan Brahmi

On Thu, May 12, 2016 at 7:33 PM, Wen Lin <wlin@pivotal.io> wrote:
> Hi, Gagan,
>
> This looks like a sync failure between the QD and the Resource Manager.
> It is not related to libyarn's RPC.
> Could you please attach the master's log file? Thanks!
>
> On Fri, May 13, 2016 at 12:58 AM, Gagan Brahmi <gaganbrahmi@gmail.com>
> wrote:
>
>> Hi Team,
>>
>> Do we have some recommended tuning for the RPC warning/errors
>> encountered intermittently?
>>
>> The error which is seen is the following:
>>
>> WARNING:  Sync RPC framework (inet) finds exception raised.
>> ERROR:  failed to return resource to resource manager, failed to
>> receive content (pquery.c:991)
>>
>> This error, however, disappears when we retry the query. In some
>> cases the query has to be retried more than once.
>>
>> The error appears to be raised when COMM2RM_CLIENT_FAIL_RECV is encountered.
>>
>> The setup uses YARN as the resource manager, and the following is the
>> yarn-client configuration in use:
>>
>> <configuration>
>>
>>     <property>
>>       <name>hadoop.security.authentication</name>
>>       <value>kerberos</value>
>>     </property>
>>
>>     <property>
>>       <name>rpc.client.connect.retry</name>
>>       <value>10</value>
>>     </property>
>>
>>     <property>
>>       <name>rpc.client.connect.tcpnodelay</name>
>>       <value>true</value>
>>     </property>
>>
>>     <property>
>>       <name>rpc.client.connect.timeout</name>
>>       <value>600000</value>
>>     </property>
>>
>>     <property>
>>       <name>rpc.client.max.idle</name>
>>       <value>10000</value>
>>     </property>
>>
>>     <property>
>>       <name>rpc.client.ping.interval</name>
>>       <value>10000</value>
>>     </property>
>>
>>     <property>
>>       <name>rpc.client.read.timeout</name>
>>       <value>3600000</value>
>>     </property>
>>
>>     <property>
>>       <name>rpc.client.socket.linger.timeout</name>
>>       <value>-1</value>
>>     </property>
>>
>>     <property>
>>       <name>rpc.client.timeout</name>
>>       <value>3600000</value>
>>     </property>
>>
>>     <property>
>>       <name>rpc.client.write.timeout</name>
>>       <value>3600000</value>
>>     </property>
>>
>>     <property>
>>       <name>yarn.client.failover.max.attempts</name>
>>       <value>15</value>
>>     </property>
>>
>> </configuration>
>>
>> I would appreciate some recommendations.
>>
>>
>> Regards,
>> Gagan Brahmi
>>

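For anyone reading the quoted yarn-client configuration, a minimal sketch that lists the timeout-related settings in seconds may make the values easier to compare. It assumes the values are in milliseconds (the thread does not state the unit), embeds only a subset of the quoted properties, and uses hypothetical helper names (load_properties, describe_ms) that are not part of HAWQ or libyarn.

```python
import xml.etree.ElementTree as ET

# Subset of the yarn-client configuration quoted in the thread.
YARN_CLIENT_XML = """
<configuration>
  <property><name>rpc.client.connect.retry</name><value>10</value></property>
  <property><name>rpc.client.connect.timeout</name><value>600000</value></property>
  <property><name>rpc.client.max.idle</name><value>10000</value></property>
  <property><name>rpc.client.ping.interval</name><value>10000</value></property>
  <property><name>rpc.client.read.timeout</name><value>3600000</value></property>
  <property><name>rpc.client.socket.linger.timeout</name><value>-1</value></property>
  <property><name>rpc.client.timeout</name><value>3600000</value></property>
  <property><name>rpc.client.write.timeout</name><value>3600000</value></property>
</configuration>
"""


def load_properties(xml_text):
    """Return {name: value} for every <property> in a Hadoop-style config."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None and value is not None:
            props[name.strip()] = value.strip()
    return props


def describe_ms(value):
    """Format a count assumed to be milliseconds; negatives are shown as-is."""
    ms = int(value)
    if ms < 0:
        return f"{ms} (negative, shown as-is)"
    return f"{ms} ms = {ms / 1000:g} s"


props = load_properties(YARN_CLIENT_XML)
for name in sorted(props):
    # Only time-like settings are listed; counters such as connect.retry are skipped.
    if name.endswith((".timeout", ".interval", ".idle")):
        print(f"{name}: {describe_ms(props[name])}")
```

To check a real file instead of the embedded subset, read it and pass its text to load_properties; the path to yarn-client.xml depends on the installation.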