edgent-dev mailing list archives

From "Dale LaBossiere (JIRA)" <j...@apache.org>
Subject [jira] [Created] (EDGENT-382) A RuntimeException thrown while processing a tuple brings down the whole topology
Date Mon, 20 Feb 2017 22:55:44 GMT
Dale LaBossiere created EDGENT-382:

             Summary: A RuntimeException thrown while processing a tuple brings down the whole topology
                 Key: EDGENT-382
                 URL: https://issues.apache.org/jira/browse/EDGENT-382
             Project: Edgent
          Issue Type: Bug
          Components: Runtime
            Reporter: Dale LaBossiere

I encountered the above in the context of the WIoTP connector, and
there may be a problem there as well, but it’s trivial to demonstrate the
problem in a more general context.

Specifically, a RuntimeException thrown from a Topology.poll(), generate(), or source() supplier,
or from an unisolated user function downstream of the source (such as the function passed to
map() or sink()), causes the topology to terminate immediately.  That typically causes the
process to terminate as well.

It's unclear to me which parts of the runtime should be doing what with respect to this.

Things need to be more resilient in the face of transient errors, particularly transient
connector problems.  As an example, MqttPublisher.accept() achieved resiliency in the face
of transient connection problems by logging instead of throwing.  The IotpDevice connector just
throws... which at a certain level is OK/desirable... if the runtime were to handle resiliency
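
As a rough illustration of the "log instead of throw" approach described above, here is a
minimal sketch in plain Java.  It does not use any Edgent or connector API: the class
ResilientSink and the helper logInsteadOfThrow() are hypothetical names invented for this
example, and the "flaky" consumer merely stands in for a connector that fails transiently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.logging.Level;
import java.util.logging.Logger;

public class ResilientSink {
    private static final Logger LOG = Logger.getLogger(ResilientSink.class.getName());

    /**
     * Hypothetical helper (not an Edgent API): wraps a consumer so that a
     * RuntimeException is logged rather than propagated to the caller.
     */
    public static <T> Consumer<T> logInsteadOfThrow(Consumer<T> delegate) {
        return tuple -> {
            try {
                delegate.accept(tuple);
            } catch (RuntimeException e) {
                // The failing tuple is dropped; later tuples are still processed.
                LOG.log(Level.WARNING, "Dropping tuple after failure: " + tuple, e);
            }
        };
    }

    public static void main(String[] args) {
        List<Integer> delivered = new ArrayList<>();
        // Stand-in for a connector whose publish fails transiently on one tuple.
        Consumer<Integer> flaky = t -> {
            if (t == 2) {
                throw new IllegalStateException("transient connection problem");
            }
            delivered.add(t);
        };
        Consumer<Integer> resilient = logInsteadOfThrow(flaky);
        for (int t = 1; t <= 3; t++) {
            resilient.accept(t);
        }
        // Tuple 2 was logged and dropped; tuples 1 and 3 were delivered.
        System.out.println(delivered);
    }
}
```

The trade-off is that the caller never learns of the failure, so a wrapper like this only
makes sense where dropping a tuple is acceptable.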

Note that a RuntimeException from a Topology.events() supplier, or even from a downstream
function, doesn't result in topology termination.  That's because the runtime thread that
blocks awaiting the next supplied tuple doesn't see the RuntimeException.  And for the
downstream case, the stream is Isolated, so again the runtime thread doesn't see the
exception.  That said, the thread internal to Isolate silently terminates in the face of a
downstream exception.  Ugh.  (Barrier looks to have a similar problem.)
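
The general hazard described above can be reproduced with plain Java threads.  The sketch
below is only an analogy and makes no claim about how Isolate is actually implemented:
IsolatedWorkerDemo is a hypothetical class in which a worker thread drains a queue and
calls a downstream function.  When that function throws, the worker dies, and the thread
enqueueing tuples never sees the exception.  An uncaught-exception handler is installed
here so the failure is at least reported.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

public class IsolatedWorkerDemo {

    /** Runs the demo and returns the tuples the downstream function processed. */
    public static List<Integer> run() {
        BlockingQueue<Integer> queue = new LinkedBlockingQueue<>();
        List<Integer> processed = new CopyOnWriteArrayList<>();

        // Downstream user function that fails on one particular tuple.
        Consumer<Integer> downstream = t -> {
            if (t == 2) {
                throw new IllegalStateException("downstream failure");
            }
            processed.add(t);
        };

        Thread worker = new Thread(() -> {
            try {
                while (true) {
                    // An uncaught RuntimeException here ends the worker thread.
                    downstream.accept(queue.take());
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        // Without a handler like this, nothing but the JVM's default stack
        // trace would indicate that the worker has gone away.
        worker.setUncaughtExceptionHandler(
                (thread, ex) -> System.err.println("worker terminated: " + ex));
        worker.setDaemon(true);
        worker.start();

        try {
            for (int t = 1; t <= 3; t++) {
                queue.put(t); // the upstream thread never sees the exception
            }
            worker.join(2000); // returns once the worker has died on tuple 2
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return processed;
    }

    public static void main(String[] args) {
        // Tuple 3 is enqueued but never processed, because the worker is gone.
        System.out.println("processed: " + run());
    }
}
```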

There needs to be some clear / prominent doc on all of this, what the design / behavior is
supposed to be, and then we can address any issues in the light of that understanding.

