phoenix-issues mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] karanmehta93 commented on issue #419: PHOENIX-4009 Run UPDATE STATISTICS command by using MR integration on…
Date Fri, 11 Jan 2019 18:50:14 GMT
karanmehta93 commented on issue #419: PHOENIX-4009 Run UPDATE STATISTICS command by using MR
integration on…
URL: https://github.com/apache/phoenix/pull/419#issuecomment-453618632
 
 
   > There are always some jobs (might be a few in a day) failing, because few mappers
in a job continuously failing and even retries can't get over the issue which causes the whole
job to fail -- this is the case I'm talking about, and it happens more frequently when some
bad thing happen in the cluster. 
   
   I understand your concern, and I agree that it can happen often. As you already pointed
out, the simplest way to handle it is to rerun the whole job (immediately or at certain intervals)
and hope that it eventually succeeds. If it does not, we can raise appropriate alerts through the
monitoring infrastructure.
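   
   As a rough illustration of that retry-then-alert approach, here is a minimal sketch. `JobRetrier`,
`runWithRetries`, and their parameters are hypothetical names chosen for this example and are not
part of Phoenix or this PR; the job is modeled as a plain `BooleanSupplier` that reports success or failure.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper (not part of Phoenix): reruns a whole job until it
// succeeds or the attempt budget is used up.
final class JobRetrier {
    private JobRetrier() {}

    /**
     * Runs the job up to maxAttempts times, waiting intervalMillis between
     * attempts. Returns the 1-based attempt number that succeeded, or -1 if
     * every attempt failed (the caller would raise an alert in that case).
     */
    static int runWithRetries(BooleanSupplier job, int maxAttempts, long intervalMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (job.getAsBoolean()) {
                return attempt;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(intervalMillis);
                } catch (InterruptedException e) {
                    // Preserve the interrupt flag and stop retrying.
                    Thread.currentThread().interrupt();
                    return -1;
                }
            }
        }
        return -1;
    }
}
```

   A caller would treat a return value of -1 as the signal to raise an alert.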
   
   > In this case, I want the retry job to skip the regions whose stats have already been
updated and only do minimal work, so it wouldn't worsen the bad situation in the cluster and
we can easily catch up to avoid missing SLA. As the current phase, I'm ok to proceed without
this skip check.
   
   I understand the idea. Determining which regions' data is missing from the SYSTEM.STATS table
is not possible (as part of this code), since the snapshot might have changed between the two
jobs. 
   A better way (in my understanding) to implement this feature would be a wrapper class
for this tool that is aware of the job ID and other details of the previous job. It can
ensure that the job runs on the same snapshot every time and that mappers are spawned only where
needed (or, even if mappers are launched, most of them are no-ops). At this point, I feel that we
should skip it; however, feel free to add this as a potential enhancement to PHOENIX-5091. 
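   
   To make the wrapper idea concrete, here is a minimal sketch of the state such a class might
keep. Everything in it is an assumption for illustration: `StatsJobTracker`, its methods, and the
use of plain strings for job IDs, snapshot names, and region names are hypothetical and do not
correspond to existing Phoenix or MapReduce APIs.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical wrapper state (not part of Phoenix): pins one snapshot for all
// attempts and remembers which regions earlier attempts already covered.
final class StatsJobTracker {
    private final String snapshotName;                      // reused by every attempt
    private final Set<String> doneRegions = new HashSet<>();
    private String lastJobId;                               // ID of the previous attempt

    StatsJobTracker(String snapshotName) {
        this.snapshotName = snapshotName;
    }

    String snapshotName() {
        return snapshotName;
    }

    String lastJobId() {
        return lastJobId;
    }

    /** Records the outcome of one attempt: its job ID and the regions it finished. */
    void recordAttempt(String jobId, List<String> succeededRegions) {
        lastJobId = jobId;
        doneRegions.addAll(succeededRegions);
    }

    /** Regions that still need a mapper; all others can be skipped or run as no-ops. */
    List<String> pendingRegions(List<String> allRegions) {
        List<String> pending = new ArrayList<>();
        for (String region : allRegions) {
            if (!doneRegions.contains(region)) {
                pending.add(region);
            }
        }
        return pending;
    }
}
```

   A retry would then launch mappers only for `pendingRegions(...)`, against the same `snapshotName()`.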

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services
