hive-dev mailing list archives

From Vihang Karajgaonkar <vih...@cloudera.com>
Subject Re: Review Request 58936: HIVE-16143 : Improve msck repair batching
Date Fri, 12 May 2017 21:35:56 GMT


> On May 12, 2017, 2:18 p.m., Aihua Xu wrote:
> > common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
> > Lines 3357-3361 (patched)
> > <https://reviews.apache.org/r/58936/diff/3/?file=1714847#file1714847line3357>
> >
> >     Vihang and Sahil,
> >     
> >     Typically, what would cause the batch to fail? Is it because the batch could
> >     be too large?
> >     
> >     Right now, we are hard-coding decayingFactor to 2. I have another thought: maybe,
> >     given the number of retries, we could calculate the decayingFactor so that the last
> >     retry always processes one partition at a time, just like what we are doing now.
> >     So given batch size 100 and 4 retries: 100, 66, 33, 1?
> >     
> >     What do you think?

The batch could fail when the network is flaky or when the processing time of the batch exceeds
the socket timeout of the metastore client. This could be more common with cloud-based datastores
like S3. I think what you are proposing is a linearly decaying batch size, which may work fine
for smaller batch sizes but may not converge quickly if the batch size is (mis)configured
to be much higher or is left at the default value of 0. E.g., consider numPartitions = 10,000
and maxRetries = 10: the batch sizes with your approach would be 10k, 9k, 8k, 7k, ..., which
may all be too high. If we decay exponentially, the batches will be 10k, 5k, 2.5k, 1.25k, ...,
which is more likely to succeed.
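
To make the comparison concrete, here is a minimal sketch of the two schedules. It is
not the code in the patch: the class name BatchDecaySketch and the methods linear() and
exponential() are hypothetical, and the linear step of initial / maxRetries is an
assumption chosen to match the 10k, 9k, 8k, ... example above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of RetryUtilities or the HIVE-16143 patch.
class BatchDecaySketch {

    // Linear decay: shrink the batch by a fixed step on every retry.
    // Assumption: step = initial / maxRetries (at least 1).
    static List<Integer> linear(int initial, int maxRetries) {
        List<Integer> sizes = new ArrayList<>();
        int step = Math.max(1, initial / maxRetries);
        int size = initial;
        for (int attempt = 0; attempt < maxRetries && size > 0; attempt++) {
            sizes.add(size);
            size -= step;
        }
        return sizes;
    }

    // Exponential decay: divide the batch size by decayingFactor on every retry.
    static List<Integer> exponential(int initial, int decayingFactor, int maxRetries) {
        List<Integer> sizes = new ArrayList<>();
        int size = initial;
        for (int attempt = 0; attempt < maxRetries && size > 0; attempt++) {
            sizes.add(size);
            size /= decayingFactor; // integer division, so sizes round down
        }
        return sizes;
    }

    public static void main(String[] args) {
        System.out.println("linear:      " + linear(10_000, 10));
        System.out.println("exponential: " + exponential(10_000, 2, 10));
    }
}
```

With numPartitions = 10,000 and maxRetries = 10, the linear schedule is still at 7,000
on the fourth attempt, while the exponential one with a factor of 2 is already at 1,250.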


- Vihang


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/58936/#review174792
-----------------------------------------------------------


On May 12, 2017, 9:35 p.m., Vihang Karajgaonkar wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/58936/
> -----------------------------------------------------------
> 
> (Updated May 12, 2017, 9:35 p.m.)
> 
> 
> Review request for hive, Aihua Xu, Sergio Pena, and Sahil Takiar.
> 
> 
> Bugs: HIVE-16143
>     https://issues.apache.org/jira/browse/HIVE-16143
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> HIVE-16143 : Improve msck repair batching
> 
> 
> Diffs
> -----
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java d3ea824c21f2fbf98177cb12a18019416f36a3f9
>   common/src/java/org/apache/hive/common/util/RetryUtilities.java PRE-CREATION
>   common/src/test/org/apache/hive/common/util/TestRetryUtilities.java PRE-CREATION
>   itests/hive-blobstore/src/test/queries/clientpositive/create_like.q 38f384e4c547d3c93d510b89fccfbc2b8e2cba09
>   itests/hive-blobstore/src/test/results/clientpositive/create_like.q.out 0d362a716291637404a3859fe81068594d82c9e0
>   itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestUtil.java 2ae1eacb68cef6990ae3f2050af0bed7c8e9843f
>   ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 917e565f28b2c9aaea18033ea3b6b20fa41fcd0a
>   ql/src/test/org/apache/hadoop/hive/ql/exec/TestMsckCreatePartitionsInBatches.java PRE-CREATION
>   ql/src/test/queries/clientpositive/msck_repair_0.q 22542331621ca4ce5277c2f46a4264b7540a4d1e
>   ql/src/test/queries/clientpositive/msck_repair_1.q ea596cbbd2d4c230f2b5afbe379fc1e8836b6fbd
>   ql/src/test/queries/clientpositive/msck_repair_2.q d8338211e970ebac68a7471ee0960ccf2d51cba3
>   ql/src/test/queries/clientpositive/msck_repair_3.q fdefca121a2de361dbd19e7ef34fb220e1733ed2
>   ql/src/test/queries/clientpositive/msck_repair_batchsize.q e56e97ac36a6544f3e20478fdb0e8fa783a857ef
>   ql/src/test/results/clientpositive/msck_repair_0.q.out 2e0d9dc423071ebbd9a55606f196cf7752e27b1a
>   ql/src/test/results/clientpositive/msck_repair_1.q.out 3f2fe75b194f1248bd5c073dd7db6b71b2ffc2ba
>   ql/src/test/results/clientpositive/msck_repair_2.q.out 3f2fe75b194f1248bd5c073dd7db6b71b2ffc2ba
>   ql/src/test/results/clientpositive/msck_repair_3.q.out 3f2fe75b194f1248bd5c073dd7db6b71b2ffc2ba
>   ql/src/test/results/clientpositive/msck_repair_batchsize.q.out ba99024163a1f2c59d59e9ed7ea276c154c99d24
>   ql/src/test/results/clientpositive/repair.q.out c1834640a35500c521a904a115a718c94546df10
> 
> 
> Diff: https://reviews.apache.org/r/58936/diff/4/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Vihang Karajgaonkar
> 
>

