hawq-user mailing list archives

From Gagan Brahmi <gaganbra...@gmail.com>
Subject Re: HAWQ YARN RPC Errors
Date Tue, 17 May 2016 03:08:53 GMT
It is a 152 KB file.

I have renamed the file as hawq_master_rm_error.txt. Please find it attached.


Regards,
Gagan Brahmi

On Mon, May 16, 2016 at 7:59 PM, Wen Lin <wlin@pivotal.io> wrote:
> Hi, Gagan,
>
> Where is the log? There is no attachment in your email.
>
> Thanks!
>
> On Sun, May 15, 2016 at 1:24 AM, Gagan Brahmi <gaganbrahmi@gmail.com> wrote:
>
>> Hi Wen,
>>
>> Please find attached the logs, which contain a few instances of the
>> error.
>>
>>
>> Regards,
>> Gagan Brahmi
>>
>> On Thu, May 12, 2016 at 7:33 PM, Wen Lin <wlin@pivotal.io> wrote:
>> > Hi, Gagan,
>> >
>> > It looks like a sync failure between the QD and the Resource Manager, not
>> > something related to libyarn's RPC.
>> > Could you attach the master's log file? Thanks!
>> >
>> > On Fri, May 13, 2016 at 12:58 AM, Gagan Brahmi <gaganbrahmi@gmail.com>
>> > wrote:
>> >
>> >> Hi Team,
>> >>
>> >> Do we have some recommended tuning for the RPC warning/errors
>> >> encountered intermittently?
>> >>
>> >> The error which is seen is the following:
>> >>
>> >> WARNING:  Sync RPC framework (inet) finds exception raised.
>> >> ERROR:  failed to return resource to resource manager, failed to
>> >> receive content (pquery.c:991)
>> >>
>> >> However, this error disappears when we retry the query. In some
>> >> cases the query has to be retried more than once.
>> >>
>> >> The error appears to be raised when COMM2RM_CLIENT_FAIL_RECV is
>> >> encountered.
>> >>
>> >> The setup uses YARN as the resource manager, and the following is the
>> >> yarn-client configuration in use:
>> >>
>> >> <configuration>
>> >>
>> >>     <property>
>> >>       <name>hadoop.security.authentication</name>
>> >>       <value>kerberos</value>
>> >>     </property>
>> >>
>> >>     <property>
>> >>       <name>rpc.client.connect.retry</name>
>> >>       <value>10</value>
>> >>     </property>
>> >>
>> >>     <property>
>> >>       <name>rpc.client.connect.tcpnodelay</name>
>> >>       <value>true</value>
>> >>     </property>
>> >>
>> >>     <property>
>> >>       <name>rpc.client.connect.timeout</name>
>> >>       <value>600000</value>
>> >>     </property>
>> >>
>> >>     <property>
>> >>       <name>rpc.client.max.idle</name>
>> >>       <value>10000</value>
>> >>     </property>
>> >>
>> >>     <property>
>> >>       <name>rpc.client.ping.interval</name>
>> >>       <value>10000</value>
>> >>     </property>
>> >>
>> >>     <property>
>> >>       <name>rpc.client.read.timeout</name>
>> >>       <value>3600000</value>
>> >>     </property>
>> >>
>> >>     <property>
>> >>       <name>rpc.client.socket.linger.timeout</name>
>> >>       <value>-1</value>
>> >>     </property>
>> >>
>> >>     <property>
>> >>       <name>rpc.client.timeout</name>
>> >>       <value>3600000</value>
>> >>     </property>
>> >>
>> >>     <property>
>> >>       <name>rpc.client.write.timeout</name>
>> >>       <value>3600000</value>
>> >>     </property>
>> >>
>> >>     <property>
>> >>       <name>yarn.client.failover.max.attempts</name>
>> >>       <value>15</value>
>> >>     </property>
>> >>
>> >> </configuration>
>> >>
>> >> I would appreciate some recommendations.
>> >>
>> >>
>> >> Regards,
>> >> Gagan Brahmi
>> >>
>>
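
For anyone reading the quoted yarn-client configuration, here is a minimal sketch that pulls out the timeout-style settings and prints them as durations. It assumes the rpc.client.* values are in milliseconds (my understanding of libyarn, not something stated in this thread), and the function names are hypothetical. It only makes the settings easier to read; per Wen's reply, the failure appears to be unrelated to libyarn's RPC, so this is not a tuning recommendation.

```python
import xml.etree.ElementTree as ET

# Excerpt of the yarn-client configuration quoted in this thread.
YARN_CLIENT_XML = """
<configuration>
  <property><name>rpc.client.connect.timeout</name><value>600000</value></property>
  <property><name>rpc.client.max.idle</name><value>10000</value></property>
  <property><name>rpc.client.ping.interval</name><value>10000</value></property>
  <property><name>rpc.client.read.timeout</name><value>3600000</value></property>
  <property><name>rpc.client.socket.linger.timeout</name><value>-1</value></property>
  <property><name>rpc.client.timeout</name><value>3600000</value></property>
  <property><name>rpc.client.write.timeout</name><value>3600000</value></property>
</configuration>
"""


def load_properties(xml_text):
    """Return a {name: value} dict from Hadoop-style <property> elements."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None and value is not None:
            props[name.strip()] = value.strip()
    return props


def format_ms(ms):
    """Render a millisecond count as a short duration (assumes ms units)."""
    seconds, rem_ms = divmod(ms, 1000)
    if rem_ms:
        return f"{ms} ms"
    if seconds and seconds % 3600 == 0:
        return f"{seconds // 3600} h"
    if seconds and seconds % 60 == 0:
        return f"{seconds // 60} min"
    return f"{seconds} s"


def summarize_timeouts(props):
    """Pick timeout-like keys and format them; negative values are skipped."""
    summary = {}
    for name, value in props.items():
        if not name.endswith(("timeout", "max.idle", "ping.interval")):
            continue
        ms = int(value)
        if ms < 0:
            # A negative value (e.g. linger timeout -1) is treated here as
            # a sentinel rather than a duration; that is an assumption.
            continue
        summary[name] = format_ms(ms)
    return summary


if __name__ == "__main__":
    for key, pretty in sorted(summarize_timeouts(load_properties(YARN_CLIENT_XML)).items()):
        print(f"{key}: {pretty}")
```

Under the millisecond assumption, the read, write and overall RPC timeouts in the quoted configuration each come to one hour, and the connect timeout to ten minutes.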
