cloudstack-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ding Yuan <y...@ece.utoronto.ca>
Subject Re: Review Request 19917: Improvements on exception handlers (JIRA-6242)
Date Wed, 02 Apr 2014 21:17:01 GMT
Thanks all for the quick comments!
If i understand the discussion correctly, I will just change all the added log printing statements
to debug verbosity. I will upload a new patch for that shortly. 

Now a bit back story: the reason we are doing this is that we recently did an analysis on
many bugs collected from JIRA to understand why today’s cloud system fails. And we found
that almost all of the cluster-wide failures are caused by buggy exception handling, which
would often turn a component failure into a cluster-wide one. One of the common bug pattern
is ignoring some exceptions — allowing them to propagate and finally turn into disaster.
Therefore we built a simple static checking tool just to check some simple rules for exception
handling, such as if an exception is ignored. Admittedly, it would be much harder to reason
about the potential consequences caused for ignoring a certain exception, that’s why without
much more domain knowledge I can only recommend to 1) avoid over-catching an exception, especially
when the handling logic will swallow it, and 2) log them, as what this patch does.

Nevertheless, it seems the four cases I mentioned in JIRA-6242 are particularly suspicious.
It might be worthwhile to double check their correctness if you have time. I am reposting
them below.

Thanks!
Ding

========================= 
Case 1: 
Line: 375, File: "com/cloud/network/ovs/OvsTunnelManagerImpl.java" 

326: try { 
327: String myIp = getGreEndpointIP(dest.getHost(), nw); 
  .. .. .. 
373: } catch (Exception e) { 
374: // I really thing we should do a better handling of these exceptions 
375: s_logger.warn("Ovs Tunnel network created tunnel failed", e); 
376: } 

Comment seems to suggest for a better handling. 
========================= 
========================= 
Case 2: 
Line: 2295, File: "com/cloud/resource/ResourceManagerImpl.java" 

2284: try { 
2285: /* 
2286: * FIXME: this is a buggy logic, check with alex. Shouldn't 
2287: * return if propagation return non null 
2288: */ 
2289: Boolean result = _clusterMgr.propagateResourceEvent(h.getId(), ResourceState.Event.UpdatePassword);

2290: if (result != null) { 
2291: return result; 
2292: } 
2293: 
2294: doUpdateHostPassword(h.getId()); 
2295: } catch (AgentUnavailableException e) { 
2296: } 

Seem from the comment the logic should be fixed. 
A similar code snipeet is at: 
  Line: 2276, File: "com/cloud/resource/ResourceManagerImpl.java" 
========================= 

========================= 
Case 3: 

  Line: 184, File: "org/apache/cloudstack/api/command/user/autoscale/CreateAutoScaleVmGroupCmd.java"


174: try 
175: { 
176: success = _autoScaleService.configureAutoScaleVmGroup(this); 
177: if (success) { 
178: vmGroup = _entityMgr.findById(AutoScaleVmGroup.class, getEntityId()); 
179: AutoScaleVmGroupResponse responseObject = _responseGenerator.createAutoScaleVmGroupResponse(vmGroup);

180: setResponseObject(responseObject); 
181: responseObject.setResponseName(getCommandName()); 
182: } 
183: } catch (Exception ex) { 
184: // TODO what will happen if Resource Layer fails in a step inbetween 
185: s_logger.warn("Failed to create autoscale vm group", ex); 
186: } 

The comment, "TODO", seems to suggest for better handling. 
========================= 
========================= 
Case 4: 

  Line: 222, File: "com/cloud/api/ApiDispatcher.java" 
This snippet is moved to ParamProcessWorker.java in the trunk. 

203: try { 
204: setFieldValue(field, cmd, paramObj, parameterAnnotation); 
     .. .. 
220: } catch (CloudRuntimeException cloudEx) { 
221: s_logger.error("CloudRuntimeException", cloudEx); 
222: // FIXME: Better error message? This only happens if the API command is not executable,
which typically 
223: //means 
224: // there was 
225: // and IllegalAccessException setting one of the parameters. 
226: throw new ServerApiException(ApiErrorCode.INTERNAL_ERROR, "Internal error executing API
command " + cmd.getCommandName().substring(0, cmd.getCommandName().length() - 8)); 
227: } 

The "FIXME" comment seems to suggest for getter error message. 
=========================
On Apr 2, 2014, at 4:37 PM, Daan Hoogland <daan.hoogland@gmail.com> wrote:

> Ding Yuan,
> 
> Any objections to that?
> 
> On Wed, Apr 2, 2014 at 10:32 PM, Alena Prokharchyk
> <Alena.Prokharchyk@citrix.com> wrote:
>> 
>> 
>> On 4/2/14, 1:27 PM, "Daan Hoogland" <daan.hoogland@gmail.com> wrote:
>> 
>>> I think we agree indeed. Doesn't mean we should start this discuss
>>> thread or write a arch guideline on the wiki somewhere. Maybe Ding
>>> Yuan wants to do a preliminary version of it?
>> 
>> Wiki guide would be useful indeed.
>> 
>> 
>>> 
>>> In the meantime I don't think that it hurts for the present patch to
>>> do everything in debug and decide about higher levels needed later.
>> 
>> Agree.
>> 
>> 
>>> 
>>> regards,
>>> 
>>> On Wed, Apr 2, 2014 at 10:11 PM, Alena Prokharchyk
>>> <Alena.Prokharchyk@citrix.com> wrote:
>>>> Daan,
>>>> 
>>>> Correct me if I¹m wrong, but all of the logging added by Ding, fall
>>>> under
>>>> "to go with it or to indicate passing a certain code path². I¹ve just
>>>> noticed that some of them were added with DEBUG, and some with WARN
>>>> level,
>>>> and wanted to correct that.
>>>> 
>>>> So we should:
>>>> 
>>>> 1) For sure: never print them out in WARN as there is nothing admin
>>>> should
>>>> do in this case, because the code just handles them by ignoring.
>>>> 2) Figure out what would be the correct level to log them with: INFO or
>>>> DEBUG
>>>> 
>>>> From ³Logging best practices² articles, I can see that people use INFO
>>>> as
>>>> a ³storyline² of normal application behavior, and DEBUG for sort of
>>>> information that helps you to track down the failure cases scenarios.
>>>> To
>>>> me, stuff added by Ding, falls under second category. But I might be
>>>> wrong
>>>> as I don¹t recall on the spot any discussions happening on the debug
>>>> topic, from the mailing list.
>>>> 
>>>> -Alena.
>>>> 
>>>> 
>>>> On 4/2/14, 12:57 PM, "Daan Hoogland" <daan.hoogland@gmail.com> wrote:
>>>> 
>>>>> Alena,
>>>>> 
>>>>> What I read in your comment is a description of INFO vs WARN. Debug
>>>>> would be only for outputting stacktraces to go with it or to indicate
>>>>> passing a certain code path.
>>>>> 
>>>>> Agree?
>>>>> 
>>>>> On Wed, Apr 2, 2014 at 8:31 PM, Alena Prokharchyk
>>>>> <alena.prokharchyk@citrix.com> wrote:
>>>>>> 
>>>>>> -----------------------------------------------------------
>>>>>> This is an automatically generated e-mail. To reply, visit:
>>>>>> https://reviews.apache.org/r/19917/#review39324
>>>>>> -----------------------------------------------------------
>>>>>> 
>>>>>> 
>>>>>> Is there a reason why logs for some exceptions are being logged in
>>>>>> DEBUG mode, and some in WARN? From my point of view, if the code
only
>>>>>> catches it and doesn't error out, it should be logged in DEBUG. Lots
of
>>>>>> Admins are seeking for WARN statements in the log, and they might
be
>>>>>> confused seeing WARN w/o further failure or retry.
>>>>>> 
>>>>>> - Alena Prokharchyk
>>>>>> 
>>>>>> 
>>>>>> On April 2, 2014, 1:55 p.m., Ding Yuan wrote:
>>>>>>> 
>>>>>>> -----------------------------------------------------------
>>>>>>> This is an automatically generated e-mail. To reply, visit:
>>>>>>> https://reviews.apache.org/r/19917/
>>>>>>> -----------------------------------------------------------
>>>>>>> 
>>>>>>> (Updated April 2, 2014, 1:55 p.m.)
>>>>>>> 
>>>>>>> 
>>>>>>> Review request for cloudstack.
>>>>>>> 
>>>>>>> 
>>>>>>> Repository: cloudstack-git
>>>>>>> 
>>>>>>> 
>>>>>>> Description
>>>>>>> -------
>>>>>>> 
>>>>>>> This is the patch for JIRA-6242. See
>>>>>>> https://issues.apache.org/jira/browse/CLOUDSTACK-6242 for more
>>>>>>> details.
>>>>>>> Thanks!
>>>>>>> 
>>>>>>> 
>>>>>>> Diffs
>>>>>>> -----
>>>>>>> 
>>>>>>> 
>>>>>>> engine/orchestration/src/com/cloud/agent/manager/AgentManagerImpl.java
>>>>>>> 0d41bc1
>>>>>>> 
>>>>>>> engine/orchestration/src/com/cloud/agent/manager/ClusteredAgentManager
>>>>>>> Im
>>>>>>> pl.java 01508a4
>>>>>>> 
>>>>>>> engine/orchestration/src/com/cloud/vm/VirtualMachineManagerImpl.java
>>>>>>> 3e088db
>>>>>>> 
>>>>>>> engine/orchestration/src/org/apache/cloudstack/engine/datacenter/entit
>>>>>>> y/
>>>>>>> api/db/dao/EngineDataCenterDaoImpl.java 4b6818e
>>>>>>>  engine/schema/src/com/cloud/dc/dao/DataCenterDaoImpl.java ea5039f
>>>>>>>  engine/schema/src/com/cloud/host/dao/HostDaoImpl.java 426c90d
>>>>>>>  engine/schema/src/com/cloud/storage/dao/StoragePoolHostDaoImpl.java
>>>>>>> e42eaf4
>>>>>>>  engine/schema/src/com/cloud/storage/dao/VMTemplateDaoImpl.java
>>>>>>> 34fdca5
>>>>>>>  engine/schema/src/com/cloud/upgrade/dao/Upgrade2214to30.java
>>>>>>> 58dd916
>>>>>>>  engine/schema/src/com/cloud/vm/dao/ConsoleProxyDaoImpl.java
5e9c2f0
>>>>>>>  engine/schema/src/com/cloud/vm/dao/SecondaryStorageVmDaoImpl.java
>>>>>>> 1f382d6
>>>>>>> 
>>>>>>> engine/storage/src/org/apache/cloudstack/storage/datastore/DataObjectM
>>>>>>> an
>>>>>>> agerImpl.java 6ed1274
>>>>>>> 
>>>>>>> framework/ipc/src/org/apache/cloudstack/framework/serializer/OnwireCla
>>>>>>> ss
>>>>>>> Registry.java 83c8a42
>>>>>>> 
>>>>>>> plugins/hypervisors/xen/src/com/cloud/hypervisor/xen/discoverer/XcpSer
>>>>>>> ve
>>>>>>> rDiscoverer.java 0ad6dc4
>>>>>>> 
>>>>>>> plugins/hypervisors/xen/src/com/cloud/hypervisor/xen/resource/XenServe
>>>>>>> rC
>>>>>>> onnectionPool.java b779085
>>>>>>> 
>>>>>>> plugins/hypervisors/xen/src/com/cloud/hypervisor/xen/resource/XenServe
>>>>>>> rS
>>>>>>> torageProcessor.java e512046
>>>>>>> 
>>>>>>> plugins/storage/volume/solidfire/src/org/apache/cloudstack/storage/dat
>>>>>>> as
>>>>>>> tore/lifecycle/SolidFirePrimaryDataStoreLifeCycle.java af6a77a
>>>>>>>  server/src/com/cloud/resource/ResourceManagerImpl.java f9a59ba
>>>>>>>  server/src/com/cloud/server/ConfigurationServerImpl.java b8da4c8
>>>>>>> 
>>>>>>> services/console-proxy/server/src/com/cloud/consoleproxy/ConsoleProxyT
>>>>>>> hu
>>>>>>> mbnailHandler.java 06f21d3
>>>>>>>  utils/src/com/cloud/utils/net/NetUtils.java 6350986
>>>>>>> 
>>>>>>> Diff: https://reviews.apache.org/r/19917/diff/
>>>>>>> 
>>>>>>> 
>>>>>>> Testing
>>>>>>> -------
>>>>>>> 
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> 
>>>>>>> Ding Yuan
>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> --
>>>>> Daan
>>>> 
>>> 
>>> 
>>> 
>>> --
>>> Daan
>> 
> 
> 
> 
> -- 
> Daan
> 


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message