ambari-dev mailing list archives

From "Andrew Onischuk" <aonis...@hortonworks.com>
Subject Re: Review Request 42296: HistoryServer upgrade times out when /app-logs is too large
Date Thu, 14 Jan 2016 20:45:41 GMT


> On Jan. 14, 2016, 8:27 p.m., Jayush Luniya wrote:
> > ambari-common/src/main/python/resource_management/libraries/providers/hdfs_resource.py, line 374
> > <https://reviews.apache.org/r/42296/diff/1/?file=1196621#file1196621line374>
> >
> >     _fill_directories_list is already making multiple WebHDFS LISTSTATUS calls. We are recursively collecting the list of directories in the path, which is itself slow.
> >     
> >     We should use GETCONTENTSUMMARY to get the count of sub-directories in a directory
> >     
> >      curl -sS -L  -X GET 'http://jay-2.c.pramod-thangali.internal:50070/webhdfs/v1/app-logs?op=GETCONTENTSUMMARY&user.name=hdfs'
> >     {"ContentSummary":{"directoryCount":18,"fileCount":15,"length":144288,"quota":-1,"spaceConsumed":432864,"spaceQuota":-1}}

So what if we know how many subdirectories there are? Most often the bottleneck is the file count.

Also, I don't think this is a good idea, since it adds one more call and slows down the
normal scenarios (which we execute many times in different places).
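
For reference, a minimal sketch of how the GETCONTENTSUMMARY response quoted above could be read. The function names and the max_entries threshold are hypothetical and are not taken from hdfs_resource.py; the sample body is the one from the curl output in the review comment.

```python
import json


def parse_content_summary(body):
    """Return (directory_count, file_count) from a GETCONTENTSUMMARY response body."""
    summary = json.loads(body)["ContentSummary"]
    return summary["directoryCount"], summary["fileCount"]


def is_too_large(body, max_entries=100000):
    """Hypothetical check: True when directories plus files exceed max_entries.

    max_entries is an illustrative threshold, not a value used by Ambari.
    """
    directory_count, file_count = parse_content_summary(body)
    return directory_count + file_count > max_entries


# Response body copied from the review comment above.
sample = (
    '{"ContentSummary":{"directoryCount":18,"fileCount":15,"length":144288,'
    '"quota":-1,"spaceConsumed":432864,"spaceQuota":-1}}'
)

directory_count, file_count = parse_content_summary(sample)
```

The same response reports both the directory count and the file count, but obtaining it still costs the one extra WebHDFS call discussed above.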


- Andrew


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/42296/#review114559
-----------------------------------------------------------


On Jan. 14, 2016, 10:33 a.m., Andrew Onischuk wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/42296/
> -----------------------------------------------------------
> 
> (Updated Jan. 14, 2016, 10:33 a.m.)
> 
> 
> Review request for Ambari and Myroslav Papirkovskyy.
> 
> 
> Bugs: AMBARI-14660
>     https://issues.apache.org/jira/browse/AMBARI-14660
> 
> 
> Repository: ambari
> 
> 
> Description
> -------
> 
> **Issue: Upgrading History Server timed out**
> 
>   * On investigation, it was noticed that we do some curl operations over WebHDFS on /app-logs HDFS folder.
>   * Since the /app-logs folder was too large the script was taking too long (>1200 secs).
>   * Retry of the upgrade step succeeded.
> 
> 
> Diffs
> -----
> 
>   ambari-common/src/main/python/resource_management/libraries/providers/hdfs_resource.py 59a04ece
> 
> Diff: https://reviews.apache.org/r/42296/diff/
> 
> 
> Testing
> -------
> 
> mvn clean test
> 
> 
> Thanks,
> 
> Andrew Onischuk
> 
>

