hadoop-yarn-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Haibo Chen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-5269) Bubble exceptions and errors all the way up the calls, including to clients.
Date Thu, 13 Apr 2017 04:45:41 GMT

    [ https://issues.apache.org/jira/browse/YARN-5269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15967126#comment-15967126
] 

Haibo Chen commented on YARN-5269:
----------------------------------

bq.  let's restrict the focus for this jira to showing any exception back to the client as
and when we determine that a problem occurred writing to HBase
One the client side, any error in the response will be wrapped in an exception (after retries)
and thrown correctly for TImelineV2Client.putEntities() calls. On the server side, we wrap
timelinecollector.putEntities() and timelinecollector.putEntitiesAsyn() calls with try-catch
clause in TimelineCollectorWebService, so any problem writing to HBase, that TimelineCollector
indicates in the form of exceptions thrown by putEntities()/putEntitiesAsync(), will be returned
to client as 500 error response. 

What we are missing though is checking TimelineWriteResponse returned by TimelineCollector.putEntities().
Per discussion above with Varun and its javadoc, TimelineWriteResponse allows details indications
of errors and individual entities that they are associated with. To comply with current TimelineCollector
API, it seems that we should inspect TimelineWriteResponse and return all found errors to
clients.  

However, TimelineWriteResponse is hardly useful given our current implementation for two reasons:

1) TimelineWriteResponse returned by TimelineCollector.putEntities() is what is returned by
TimelineWriter.write().  HBaseTImelineWriter.write() is essentially an async operation, the
returned TimelineResponse never contains any error. Problems are solely indicated as IOException
{code}
public class HBaseTimelineWriterImpl extends AbstractService implements
    TimelineWriter {
 public TimelineWriteResponse write(String clusterId, String userId,
      String flowName, String flowVersion, long flowRunId, String appId,
      TimelineEntities data) throws IOException {

    TimelineWriteResponse putStatus = new TimelineWriteResponse();
    // defensive coding to avoid NPE during row key construction
    if ((flowName == null) || (appId == null) || (clusterId == null)
        || (userId == null)) {
      LOG.warn("Found null for one of: flowName=" + flowName + " appId=" + appId
          + " userId=" + userId + " clusterId=" + clusterId
          + " . Not proceeding with writing to hbase");
      return putStatus;
    }
    ...
    return putStatus;
}
{code}
In cases where flowName/appId/userId/clusterId/ is null, for example, no error is included
in the response.
2) We can no longer just return 500 with an exception message to clients. More involved structure
is needed to return errors from TimelineCollector to TimelineV2Client, and callers of timelinev2Client.putEntity(TimelineEntity...
entities) will now be expected to handle all cases of errors unless we change timelinev2Client.putEntity()
to only accept one entity per call

It is OK for now to not handle TimelineWriteResponse given that HBaseTimelineWriter always
return a dummy one and it is the only backend implementation. Thus, I am removing the yarn5355-merge-blocker
label. 
But once we expect alternative backend/TimelineWriter implementations, we need to either add
support in TimelineCollectorWebService and TimelineV2Client to handle detailed error response,
or remove TimelineWriteResponse from TimelineWriter.write() signature completely if we cannot
really make use of it like we desire to.

> Bubble exceptions and errors all the way up the calls, including to clients.
> ----------------------------------------------------------------------------
>
>                 Key: YARN-5269
>                 URL: https://issues.apache.org/jira/browse/YARN-5269
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>    Affects Versions: YARN-2928
>            Reporter: Joep Rottinghuis
>            Assignee: Haibo Chen
>              Labels: YARN-5355, yarn-5355-merge-blocker
>
> Currently we ignore (swallow) exception from the HBase side in many cases (reads and
writes).
> Also, on the client side, neither TimelineClient#putEntities (the v2 flavor) nor the
#putEntitiesAsync method return any value.
> For the second drop we may want to consider how we properly bubble up exceptions throughout
the write and reader call paths and if we want to return a response in putEntities and some
future kind of result for putEntitiesAsync.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org


Mime
View raw message