Nathan,
We have a single-node Storm cluster (one machine in total) running Storm 0.8.2. We are using drpc.execute
from Mule to execute two different topologies (one after another) with text and a POJO (serialized
to a string). We're having an issue where Storm appears, rarely and at random, to drop
jobs: the call only returns once we hit the timeout (conf.put(Config.TOPOLOGY_MESSAGE_TIMEOUT_SECS, 600)),
at which point we get the following exception on the client (Mule) side:
org.apache.thrift7.TApplicationException: execute failed: unknown result
    at backtype.storm.generated.DistributedRPC$Client.recv_execute(DistributedRPC.java:82)
    at backtype.storm.generated.DistributedRPC$Client.execute(DistributedRPC.java:61)
    at backtype.storm.utils.DRPCClient.execute(DRPCClient.java:54)
Analysis of the Storm worker JVM logs indicates that the jobs complete just fine; the calls
just don't come back from drpc.execute() until the timeout, and then only with the unhelpful
exception shown above.
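
To make the call pattern concrete, here is a minimal sketch of the sequence described above. It is illustrative only, not the actual Mule code. `DrpcExecutor` is a hypothetical stand-in for `DRPCClient.execute(function, args)` so that the sketch runs without a cluster, and the function names and arguments are placeholders. It records how long each call takes, which makes it easy to see whether a failure lines up with the 600-second timeout.

```java
import java.util.ArrayList;
import java.util.List;

public class DrpcSequenceSketch {

    /** Hypothetical stand-in for DRPCClient.execute(function, args). */
    public interface DrpcExecutor {
        String execute(String function, String args) throws Exception;
    }

    /** What happened to one call: a result or an error, plus the elapsed time. */
    public static final class CallOutcome {
        public final String function;
        public final String result;      // null if the call failed
        public final Exception error;    // null if the call succeeded
        public final long elapsedMillis;

        public CallOutcome(String function, String result, Exception error, long elapsedMillis) {
            this.function = function;
            this.result = result;
            this.error = error;
            this.elapsedMillis = elapsedMillis;
        }

        public boolean succeeded() {
            return error == null;
        }
    }

    /**
     * Runs each {function, args} pair in order, one after another.
     * A failed call is recorded and the remaining calls still run.
     */
    public static List<CallOutcome> runInOrder(DrpcExecutor drpc, String[][] requests) {
        List<CallOutcome> outcomes = new ArrayList<CallOutcome>();
        for (String[] request : requests) {
            String function = request[0];
            String args = request[1];
            long start = System.nanoTime();
            String result = null;
            Exception error = null;
            try {
                result = drpc.execute(function, args);
            } catch (Exception e) {
                // Keep the exception so it can be matched against the elapsed time.
                error = e;
            }
            long elapsedMillis = (System.nanoTime() - start) / 1000000L;
            outcomes.add(new CallOutcome(function, result, error, elapsedMillis));
        }
        return outcomes;
    }

    public static void main(String[] argv) {
        // Fake executor so the sketch runs without a cluster; it echoes its input.
        DrpcExecutor fake = new DrpcExecutor() {
            public String execute(String function, String args) {
                return function + " <- " + args;
            }
        };
        String[][] requests = {
            {"first-topology", "some text"},           // placeholder function names
            {"second-topology", "{\"pojo\":true}"}     // POJO serialized to a string
        };
        for (CallOutcome o : runInOrder(fake, requests)) {
            System.out.println(o.function + " ok=" + o.succeeded()
                    + " took " + o.elapsedMillis + " ms");
        }
    }
}
```

Against a real cluster, the executor would wrap a DRPCClient instance. A call whose error is the TApplicationException above and whose elapsed time is close to 600000 ms would match the symptom we are seeing.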
We are wondering whether moving to Storm 0.9.0.1 would help with this. Have you seen this
before? We looked at the code for DistributedRPC and can't imagine how a message could end up
without a result.
Thank you,
Randy