ambari-dev mailing list archives

From "Nate Cole" <nc...@hortonworks.com>
Subject Re: Review Request 29298: RU: Cannot Retry on failure
Date Tue, 23 Dec 2014 12:49:00 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/29298/#review65935
-----------------------------------------------------------

Ship it!

- Nate Cole


On Dec. 22, 2014, 10:52 p.m., Tom Beerbower wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/29298/
> -----------------------------------------------------------
> 
> (Updated Dec. 22, 2014, 10:52 p.m.)
> 
> 
> Review request for Ambari, Alejandro Fernandez and Nate Cole.
> 
> 
> Bugs: AMBARI-8852
>     https://issues.apache.org/jira/browse/AMBARI-8852
> 
> 
> Repository: ambari
> 
> 
> Description
> -------
> 
> During a rolling upgrade (RU), a failure occurred in the "Client Components" group, on the "Service Check HBASE, MAPREDUCE2, HDFS, YARN" item.
> The UI presented me with a Retry button.  However, the server rejected this request:
> 
> PUT /api/v1/clusters/ysru2/upgrades/5/upgrade_groups/4/upgrade_items/30
> {"UpgradeItem":{"status":"PENDING"}}
> 
> {
>   "status" : 400,
>   "message" : "java.lang.IllegalArgumentException: Can not transition a stage from FAILED to PENDING"
> }
> 
> I believe this is the current expected behavior, since the failure is not marked to hold.
> However, on any service check failure, the user should be able to retry (or maybe on any failure? Actions should be idempotent).
> 
> ----
> 
> Allow Retry - mark a stage (upgrade item) to allow any failed task to be retried. This means that if a failure occurs during the execution of a task, the stage and task will transition to HOLDING_FAILED. Once in the HOLDING_FAILED state, the stage can be pushed to PENDING (retry) or FAILED. Transitioning the stage to FAILED will cause the remaining tasks in that stage to be ABORTED. It never makes sense to allow the remaining tasks of a stage to continue executing after the stage has been accepted as FAILED. However, the remaining stages of the upgrade request may be allowed to execute...
> 
> Skippable - mark a stage to allow it to be skipped in the event of a failure so that the remaining stages may still execute. This means that when a stage's state is set to FAILED, it will not trigger the remaining stages of the request to abort.
> By separating the concepts of retry and skippable, we can be more flexible in how we define the behavior of the upgrade. For example, the core masters upgrade item should be marked as allow_retry = true and skippable = false. If a failure occurs during this stage, you should be able to retry. If the failure cannot be resolved, then the entire upgrade request should be aborted.
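
[Editorial sketch, not part of the patch under review.] The retry and skippable rules described above can be summarized as a small state model. This is a minimal illustration under stated assumptions: the class `StageRetrySketch`, its `Status` enum, and the three helper methods are hypothetical names invented for this example and are not Ambari's actual classes or API. Only the state names (PENDING, HOLDING_FAILED, FAILED, ABORTED) and the two flags come from the description.

```java
// Hypothetical sketch of the rules in the description; not Ambari code.
public class StageRetrySketch {

    // Subset of states named in the description (IN_PROGRESS is assumed for completeness).
    enum Status { PENDING, IN_PROGRESS, HOLDING_FAILED, FAILED, ABORTED }

    // Status a stage enters when one of its tasks fails.
    // With allow_retry the failure is held; otherwise the stage fails outright.
    static Status onTaskFailure(boolean allowRetry) {
        return allowRetry ? Status.HOLDING_FAILED : Status.FAILED;
    }

    // Whether a user-requested transition is accepted.
    // Only a held failure can be pushed to PENDING (retry) or accepted as FAILED,
    // so FAILED -> PENDING is rejected, matching the 400 response quoted above.
    static boolean canTransition(Status from, Status to) {
        return from == Status.HOLDING_FAILED
                && (to == Status.PENDING || to == Status.FAILED);
    }

    // Whether the remaining stages of the request abort once this stage is FAILED.
    // A skippable stage does not trigger the abort.
    static boolean abortsRemainingStages(Status stageStatus, boolean skippable) {
        return stageStatus == Status.FAILED && !skippable;
    }

    public static void main(String[] args) {
        // Core masters item: allow_retry = true, skippable = false.
        Status afterFailure = onTaskFailure(true);
        System.out.println("after failure: " + afterFailure);
        System.out.println("retry allowed: " + canTransition(afterFailure, Status.PENDING));
        System.out.println("abort rest if accepted as FAILED: "
                + abortsRemainingStages(Status.FAILED, false));
    }
}
```

Keeping the two flags as independent inputs mirrors the separation argued for in the description: `allowRetry` only affects what happens at the moment of failure, while `skippable` only affects what a FAILED stage does to the rest of the request.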
> 
> 
> Diffs
> -----
> 
>   ambari-server/src/main/java/org/apache/ambari/server/actionmanager/ActionScheduler.java ccecad9
>   ambari-server/src/main/java/org/apache/ambari/server/actionmanager/HostRoleCommand.java f71e2d5
>   ambari-server/src/main/java/org/apache/ambari/server/actionmanager/Stage.java 57fadf7
>   ambari-server/src/main/java/org/apache/ambari/server/controller/AmbariActionExecutionHelper.java 17d5782
>   ambari-server/src/main/java/org/apache/ambari/server/controller/AmbariCustomCommandExecutionHelper.java c8ae61d
>   ambari-server/src/main/java/org/apache/ambari/server/controller/AmbariManagementControllerImpl.java 19ee6d9
>   ambari-server/src/main/java/org/apache/ambari/server/controller/KerberosHelper.java 562ce9e
>   ambari-server/src/main/java/org/apache/ambari/server/controller/internal/ClusterStackVersionResourceProvider.java 9329ea9
>   ambari-server/src/main/java/org/apache/ambari/server/controller/internal/HostStackVersionResourceProvider.java 3b1b462
>   ambari-server/src/main/java/org/apache/ambari/server/controller/internal/RequestResourceProvider.java 3c4524a
>   ambari-server/src/main/java/org/apache/ambari/server/controller/internal/StageResourceProvider.java c174a9c
>   ambari-server/src/main/java/org/apache/ambari/server/controller/internal/UpgradeGroupResourceProvider.java 47f6237
>   ambari-server/src/main/java/org/apache/ambari/server/controller/internal/UpgradeResourceProvider.java efc3713
>   ambari-server/src/main/java/org/apache/ambari/server/orm/entities/StageEntity.java a7bc948
>   ambari-server/src/main/java/org/apache/ambari/server/upgrade/UpgradeCatalog200.java d59d8a1
>   ambari-server/src/main/java/org/apache/ambari/server/utils/StageUtils.java e6e51a1
>   ambari-server/src/main/resources/Ambari-DDL-MySQL-CREATE.sql d6229b3
>   ambari-server/src/main/resources/Ambari-DDL-Oracle-CREATE.sql cb8f776
>   ambari-server/src/main/resources/Ambari-DDL-Postgres-CREATE.sql 4599390
>   ambari-server/src/main/resources/Ambari-DDL-Postgres-EMBEDDED-CREATE.sql 1e6631e
>   ambari-server/src/main/resources/Ambari-DDL-SQLServer-CREATE.sql 8836f04
>   ambari-server/src/test/java/org/apache/ambari/server/actionmanager/ExecutionCommandWrapperTest.java 948f137
>   ambari-server/src/test/java/org/apache/ambari/server/actionmanager/TestActionDBAccessorImpl.java 7e4f850
>   ambari-server/src/test/java/org/apache/ambari/server/actionmanager/TestActionManager.java 01a40f4
>   ambari-server/src/test/java/org/apache/ambari/server/actionmanager/TestActionScheduler.java edbb71d
>   ambari-server/src/test/java/org/apache/ambari/server/actionmanager/TestStage.java bde19a1
>   ambari-server/src/test/java/org/apache/ambari/server/agent/TestHeartbeatHandler.java a6df0db
>   ambari-server/src/test/java/org/apache/ambari/server/controller/AmbariManagementControllerTest.java 72a22e6
>   ambari-server/src/test/java/org/apache/ambari/server/controller/internal/StageResourceProviderTest.java 4c47e94
>   ambari-server/src/test/java/org/apache/ambari/server/serveraction/ServerActionExecutorTest.java 96c0539
>   ambari-server/src/test/java/org/apache/ambari/server/stageplanner/TestStagePlanner.java dd2a519
>   ambari-server/src/test/java/org/apache/ambari/server/upgrade/UpgradeCatalog200Test.java f8d061a
> 
> Diff: https://reviews.apache.org/r/29298/diff/
> 
> 
> Testing
> -------
> 
> Results :
> 
> Tests run: 2447, Failures: 0, Errors: 0, Skipped: 13
> 
> [INFO] ------------------------------------------------------------------------
> [INFO] BUILD SUCCESS
> [INFO] ------------------------------------------------------------------------
> [INFO] Total time: 27:49 min
> [INFO] Finished at: 2014-12-21T23:52:28-05:00
> [INFO] Final Memory: 42M/496M
> [INFO] ------------------------------------------------------------------------
> 
> 
> Thanks,
> 
> Tom Beerbower
> 
>

