openwhisk-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Tyson Norris <tnor...@adobe.com.INVALID>
Subject Re: Allow decision about action result inclusion in logs on a per call basis
Date Tue, 10 Sep 2019 22:02:16 GMT
Thanks for the info. It makes sense. I think it is slightly confusing that this field is not
invoked anywhere in the code, because you only invoke from another class managed externally.

Thanks
Tyson

´╗┐On 9/10/19, 2:45 AM, "Ruediger Maass" <ruediger.maass@de.ibm.com> wrote:

    Hi Tyson, please find my answers below.
    
    A couple questions:
    * " Action results may contain sensible data that should not be logged." - 
    I don't think this solves that problem, correct? E.g. the result or error 
    could have sensitive data still, right? Since this is generated based on 
    the user code and that code's error messages which may have just included 
    the result details you were trying to hide.
    
    > The error result in my example is created by the managed runtime, not by 
    the user action code (that has not been started at this point). The error 
    message in this case will never contain any user messages. On the other 
    hand, user error messages that are driven by the action code are like 
    normal log statements: the customer decides what is logged and which 
    content error messages contain. This discussion is a good example why we 
    do not propose any hard-coded strategy but a flexible flag that can be set 
    as needed by product-specific SPI implementations. We know there are 
    different opinions. We will set the flag with our product-specific code as 
    we need. The generic character of ActivationFileStorage is still preserved
    . The flag makes the class even more reusable, we try to avoid to 
    introduce just another specialized copy of that class.
    
    * Does it make sense to have the data be different in Activation json vs 
    user_log?
    One approach I've been wondering about is applying the same log to the 
    Activation json, e.g.: do not store the result *anywhere* for success 
    results. Obviously this cannot work for async invocations, but we have 
    many cases where this data is never used, and just bloats the database. 
    Some factors that may affect ability to NOT store the result include: sync 
    vs async, "debugging" vs "prod", error vs success, large vs small. Your 
    case signals to me that this is useful, at least to have as a config flag, 
    but I would apply it at the Activation level, and not just downstream in 
    the user_log.
    
    > We also see that activations consume database space. But currently we 
    want/need the current behavior of OpenWhisk as it is in order to provide a 
    robust product implementation (especially also for action sequences). With 
    or without the proposed flag it would still be easy to add a feature flag 
    (or SPI implementation) like you discussed for changing the application 
    json in the database if that would fit better for your product 
    implementation. As said, the default behavior of OpenWhisk is not changed 
    by our change in any way. Please also note how tiny the code change is 
    (github.com/apache/openwhisk/pull/4604) - we do not add more complexity to 
    the code with this little change.
    
    PR is github.com/apache/openwhisk/pull/4604
    
    Thanks, Ruediger
    
    
    
    From:   Tyson Norris <tnorris@adobe.com.INVALID>
    To:     "dev@openwhisk.apache.org" <dev@openwhisk.apache.org>
    Date:   06/09/2019 20:56
    Subject:        [EXTERNAL] Re: Allow decision about action result 
    inclusion in logs on a per call basis
    
    
    
    A couple questions:
    * " Action results may contain sensible data that should not be logged." - 
    I don't think this solves that problem, correct? E.g. the result or error 
    could have sensitive data still, right? Since this is generated based on 
    the user code and that code's error messages which may have just included 
    the result details you were trying to hide.
    * Does it make sense to have the data be different in Activation json vs 
    user_log? 
    
    One approach I've been wondering about is applying the same log to the 
    Activation json, e.g.: do not store the result *anywhere* for success 
    results. Obviously this cannot work for async invocations, but we have 
    many cases where this data is never used, and just bloats the database. 
    Some factors that may affect ability to NOT store the result include: sync 
    vs async, "debugging" vs "prod", error vs success, large vs small. Your 
    case signals to me that this is useful, at least to have as a config flag, 
    but I would apply it at the Activation level, and not just downstream in 
    the user_log.
    
    Thanks
    Tyson
    
    
    
    
    On 9/6/19, 10:21 AM, "Ruediger Maass" <ruediger.maass@de.ibm.com> wrote:
    
        corrected the subject of this email, sorry about this.
     
     
     
        From:   Ruediger Maass/Germany/IBM
        To:     dev@openwhisk.apache.org
        Date:   06/09/2019 19:18
        Subject:        Re: [EXTERNAL] Re: Allow decision about action result 
        inclusion in logs on a per call basis
     
     
     
        Ok, I see the problem. I'll ty to make it more clear:
     
        - with "action log" I mean the output of action invocations that is 
    logged 
        for the users (stored in the userlogs-XXXX.log files of the invokers 
    and 
        controllers (for action sequences)
     
        - with "activation json" I mean the json document for an activation 
    like 
        this:
        {
            "activationId": "5a2b19119c574ce7ab19119c572ce730",
            "annotations": [
                {
                    "key": "path",
                    "value": 
    "ruediger.maass@de.ibm.com_MySpaceUS1/compileError"
                }, 
               ...etc ....
            ],
            "duration": 451,
            "end": 1567721624876,
            "logs": [],   // no log lines in this case
            "name": "compileError",
            "namespace": "ruediger.maass@de.ibm.com_MySpaceUS1",
            "publish": false,
            "response": {
                "result": {
                    "error": "Initialization has failed due to: SyntaxError: 
        Unexpected identifier\n    at NodeActionRunner.init 
        (/nodejsAction/runner.js:79:109)\n    at doInit 
        (/nodejsAction/src/service.js:142:31)\n    at initCode 
        (/nodejsAction/src/service.js:81:24)\n    at 
    /nodejsAction/app.js:69:13\n 
         at Layer.handle [as handle_request] 
        (/node_modules/express/lib/router/layer.js:95:5)\n    at next 
        (/node_modules/express/lib/router/route.js:131:13)\n    at 
    Route.dispatch 
        (/node_modules/express/lib/router/route.js:112:3)\n    at Layer.handle 
    [as 
        handle_request] (/node_modules/express/lib/router/layer.js:95:5)\n at 
        /node_modules/express/lib/router/index.js:277:22\n    at 
        Function.process_params 
        (/node_modules/express/lib/router/index.js:330:12)"
                },
                "status": "action developer error",
                "success": false
            },
            "start": 1567721624425,
            "subject": "ruediger.maass@de.ibm.com",
            "version": "0.0.1"
        }
     
        - We do not want to change the activation json. We want to change the 
        output to the action log that is produced from the activation json.
     
        - one activation json produces one line in the action log of type 
        'activation_record' and n lines of type 'user_log', one user_log line 
    per 
        output line that is printed by the action. The user_log lines are 
    stored 
        in the 'logs' field of the activation json. This would be the output 
    in 
        the action log for the above mentioned invocation:
     
        1 line of type action_record and 0 lines of type user_log, I formatted 
    the 
        one line for better readability:
     
        {"activationId":"68484f5c911d412b884f5c911df12b45",
        "duration":498,
        "end":1567785044285,
           ...etc. ...
           // here's the result from above we do not want to be suppressed:
        "message":"{\"error\":\"Initialization has failed due to: SyntaxError: 
    
        Unexpected identifier\\n 
          at NodeActionRunner.init (/nodejsAction/runner.js:79:109)\\n 
          at doInit (/nodejsAction/src/service.js:142:31)\\n
          at initCode (/nodejsAction/src/service.js:81:24)\\n
          at /nodejsAction/app.js:69:13\\n
          at Layer.handle [as handle_request] 
        (/node_modules/express/lib/router/layer.js:95:5)\\n
          at next (/node_modules/express/lib/router/route.js:131:13)\\n
          at Route.dispatch 
    (/node_modules/express/lib/router/route.js:112:3)\\n
          at Layer.handle [as handle_request] 
        (/node_modules/express/lib/router/layer.js:95:5)\\n
          at /node_modules/express/lib/router/index.js:277:22\\n
          at Function.process_params 
        (/node_modules/express/lib/router/index.js:330:12)\"}",
        "name":"compileError",
        "namespace":"ruediger.maass@de.ibm.com_MySpaceUS1",
        "path":"ruediger.maass@de.ibm.com_MySpaceUS1/compileError",
        "response":{
          "status":"action developer error",
          "success":false
        },
        "type":"activation_record", 
     
          ... and some more fields...
        }
     
        - We want to achieve that the 'result' field in the activation json is 
    not 
        suppressed in the error case (success=false) and printed out to the 
    action 
        log so that the user can find more information in the action log about 
    the 
        error. But we do not want the 'result' field content be part in the 
    action 
        log if an activation succeeds (success=true). As already said, this is 
    
        just one example and other strategies for adding/suppressing results 
    may 
        be applicable that also cannot be solved with a simple flag from 
    outside 
        (by an annotation or a system configuration flag). The small change we 
    
        request does not change the behavior of OpenWhisk and avoids a 
    hard-coded 
        strategy.
     
        Requested PR is 
    https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fapache%2Fopenwhisk%2Fpull%2F4604&amp;data=02%7C01%7Ctnorris%40adobe.com%7Cedb497dcc60143f9cc2b08d735d39fb9%7Cfa7b1b5a7b34438794aed2c178decee1%7C0%7C1%7C637037055386943428&amp;sdata=6yls7OEDwxt3i%2FSquXRA4oSLXsQPxSjFWebauccgde4%3D&amp;reserved=0=

    
     
        Thanks, Ruediger 
     
     
     
     
        From:   Rodric Rabbah <rodric@gmail.com>
        To:     dev@openwhisk.apache.org
        Date:   06/09/2019 02:05
        Subject:        [EXTERNAL] Re: Allow decision about action result 
        inclusion in logs on a per call basis
     
     
     
        I think it?s not clear to me when you say ?log?. 
     
        The activation record has a log field and a result field. Are you 
    trying 
        to add error logs to the result field?
     
        At first I thought you meant you wanted to omit the result field from 
    the 
        activation record except during errors. But I think I misunderstood.
     
        Maybe an example will be helpful. 
     
        -r
     
        > On Sep 5, 2019, at 7:31 PM, Ruediger Maass 
    <ruediger.maass@de.ibm.com> 
        wrote:
        > 
        > Hi there,
        > 
        > may I ask for comments? Annotations for the decision whether results 
    
        > should be added in the action log or not would not fix the problem 
    we 
        try 
        > to solve and is a different, independent question. 
        > 
        > We want to add error information in the log but not print results in 
    the 
     
        > good (i.e., success) case. This can only be done if one inspects the 
    
        > activation 'success' field (or the 'error' field in the result) and 
    in 
        > this case log the result that contains the error information. We 
    could 
        > hard-code a strategy in `ActivationFileStorage`. But this has the 
        > disadvantage that it would change the default behavior of OpenWhisk 
    and 
        > not everyone might like this and may have different opinions about 
    the 
        > "right" strategy. Therefore we propose the more flexible way of 
        > introducing a flag, let anyone decide on the right strategy, and not 
    
        > hard-coding a strategy. In the default case, there would be no 
        behavioral 
        > change of OpenWhisk with the flag, i.e. no one needs to care about 
    the 
        > flag if not wanted. But it would help us and maybe others to solve 
    the 
        > described problem.
        > 
        > Thanks, Ruediger
        > 
        > 
        > 
        > From:   Ruediger Maass/Germany/IBM
        > To:     dev@openwhisk.apache.org
        > Date:   04/09/2019 11:07
        > Subject:        Re: Allow decision about action result inclusion in 
    logs 
     
        > on a per call basis
        > 
        > 
        > Let me please add the information that the behavior of OpenWhisk is 
    not 
        > changed by this fix. The provided feature must be actively chosen.
        > 
        > Thanks, Ruediger
        > 
        > 
        > 
        > 
        > From:   "Ruediger Maass" <ruediger.maass@de.ibm.com>
        > To:     dev@openwhisk.apache.org
        > Date:   04/09/2019 10:42
        > Subject:        [EXTERNAL] Re: Allow decision about action result 
        > inclusion in logs on a per call basis
        > 
        > 
        > 
        > Thanks for the discussion! 
        > 
        > This PR is only about suppressing results in action logs. If a user 
        still 
        > wants to have the result in the action log then he or she can simply 
    add 
        a 
        > 
        > log statement in the action for that. Why should we introduce a new 
        > mechanism for this like an annotation for automatically adding the 
        result 
        > to the log (or not)? Seems complicated. 
        > 
        > In our specific case we need the flag per invocation because we want 
    to 
        > the add the result in error cases. For example: main is not defined, 
    
        > compile error in action, etc. see below:
        > 
        > 1) no main() in js 
        >        "result": {
        >            "error": "Initialization has failed due to: 
    ReferenceError: 
        > main is not defined\n
        >                at eval (eval at <anonymous> 
        > (/nodejsAction/runner.js:79:109), <anonymous>:8:8)\n
        >                at eval (eval at <anonymous> 
        > (/nodejsAction/runner.js:79:109), <anonymous>:8:14)\n
        >                at NodeActionRunner.init 
        (/nodejsAction/runner.js:79:45)\n
        >                at doInit (/nodejsAction/src/service.js:142:31)\n
        >                at initCode (/nodejsAction/src/service.js:81:24)\n
        >                at /nodejsAction/app.js:69:13\n
        >                at Layer.handle [as handle_request] 
        > (/node_modules/express/lib/router/layer.js:95:5)\n
        >                at next 
        > (/node_modules/express/lib/router/route.js:131:13)\n
        >                at Route.dispatch 
        > (/node_modules/express/lib/router/route.js:112:3)\n
        >                at Layer.handle [as handle_request] 
        > (/node_modules/express/lib/router/layer.js:95:5)"
        > 
        > 2) compile Error
        >        "result": {
        >            "error": "Initialization has failed due to: SyntaxError: 
        > Unexpected identifier\n
        >                at NodeActionRunner.init 
        > (/nodejsAction/runner.js:79:109)\n
        >                at doInit (/nodejsAction/src/service.js:142:31)\n
        >                at initCode (/nodejsAction/src/service.js:81:24)\n
        >                at /nodejsAction/app.js:69:13\n
        >                at Layer.handle [as handle_request] 
        > (/node_modules/express/lib/router/layer.js:95:5)\n
        >                at next 
        > (/node_modules/express/lib/router/route.js:131:13)\n
        >                at Route.dispatch 
        > (/node_modules/express/lib/router/route.js:112:3)\n
        >                at Layer.handle [as handle_request] 
        > (/node_modules/express/lib/router/layer.js:95:5)\n
        >                at /node_modules/express/lib/router/index.js:277:22\n
        >                at Function.process_params 
        > (/node_modules/express/lib/router/index.js:330:12)"
        > 
        > 
        > This kind of information cannot be logged anywhere else and does not 
    
        > contain sensible information. It is very helpful information for the 
    
        > customer to correct the action. So we do not want to suppress this 
        > information.
        > 
        > 
        > 
        > From:   James Thomas <jthomas.uk@gmail.com>
        > To:     dev@openwhisk.apache.org
        > Date:   03/09/2019 15:24
        > Subject:        [EXTERNAL] Re: Allow decision about action result 
        > inclusion in logs on a per call basis
        > 
        > 
        > 
        > This is a sensible change and I agree with what Rodric has 
    suggested:
        > can we make this a per-action annotation?
        > 
        >> On Tue, 3 Sep 2019 at 13:28, Rodric Rabbah <rodric@gmail.com> 
    wrote:
        >> 
        >>> Action results may contain sensible data that should not be 
    logged.
        >> 
        >> I interpret this to mean: not stored in the activation record.
        >> 
        >> If this is what you mean, why not make this a feature controlled 
    per 
        > action
        >> using an annotation and let the user decide: ok to save the result, 
    not 
     
        > ok
        >> to save the result, only save the result on error.
        >> 
        >> -r
        >> 
        >> On Tue, Sep 3, 2019 at 8:17 AM Ruediger Maass 
        > <ruediger.maass@de.ibm.com>
        >> wrote:
        >> 
        >>> Action results may contain sensible data that should not be 
    logged. 
        > There
        >>> is a system configuration flag (writeResultToFile) that can be 
    used to
        >>> switch on or off the result inclusion in logs. But this is a 
    global 
        > switch
        >>> that holds for all activations. In our case we want to able to 
    decide 
        > per
        >>> activation whether or not the result should be included in the log 
    or 
        > not.
        >>> In our special case we want results to be included in case of 
    errors 
        > (in
        >>> other words, our predicate function for the decision is: 'Does the 
    
        > result
        >>> contain an error field?'). But also other decision logic may be
        >>> applicable.
        >>> 
        >>> This PR 
        > 
        
    https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fapache%2Fopenwhisk%2Fpull%2F4604&amp;data=02%7C01%7Ctnorris%40adobe.com%7Cedb497dcc60143f9cc2b08d735d39fb9%7Cfa7b1b5a7b34438794aed2c178decee1%7C0%7C1%7C637037055386943428&amp;sdata=6yls7OEDwxt3i%2FSquXRA4oSLXsQPxSjFWebauccgde4%3D&amp;reserved=0=

     
     
        > 
        > is a small change
        >>> that extends ActivationFileStorage.
        >>> 
        >>> Please help and comment on the PR.
        >>> 
        >>> Thank you, Ruediger
        >>> 
        >>> 
        > 
        > 
        > 
        > -- 
        > Regards,
        > James Thomas
        > 
        > 
        > 
        > 
        > 
        > 
        > 
        > 
        > 
        > 
     
     
     
     
     
     
     
    
    
    
    
    
    
    

Mime
View raw message