nifi-dev mailing list archives

From Andy Christianson <aichr...@protonmail.com>
Subject minifi-cpp: Expression language core API
Date Thu, 09 Nov 2017 16:06:45 GMT
MiNiFi - C++ Devs,

I am currently working on MINIFICPP-49, the expression language feature. While expression
compilation and evaluation are fairly self-contained, at the very least the API to access expression
evaluation will touch core components.

Here is how NiFi is currently exposing expression evaluation to processors:

    ...
    try {
        // read the url property from the context
        final String urlstr = trimToEmpty(context.getProperty(PROP_URL).evaluateAttributeExpressions(requestFlowFile).getValue());
        final URL url = new URL(urlstr);
    ...

While we have the opportunity now to improve this, we have a couple of design constraints: the
expression code comes from properties, and dynamic evaluation of it requires a flow file as
input.

Because expressions are defined as processor properties, it is natural to expose expression
evaluation via the ProcessContext API. The current minifi-cpp API to get static properties
is as follows:

    bool getProperty(const std::string &name, std::string &value) {
      return processor_node_->getProperty(name, value);
    }

If we do not wish to introduce a Property type with its own evaluateAttributeExpressions method,
we could simply introduce another ProcessContext method for evaluating dynamic properties:

    bool evaluateProperty(const std::string &name, const core::FlowFile &flow_file, std::string &value) {
      ...
    }

The implementation of this would compile the expression (the raw value as returned by getProperty(...))
if it has not yet been compiled, then evaluate the compiled expression against the provided
FlowFile. The end result is an API similar to, but simpler than, the NiFi interface. The alternative
is to provide the expression primitives to processors and allow them to manage compilation/evaluation
on their own. This would increase complexity across all processors which support expression
properties, which will likely be most processors.
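
To make the compile-on-first-use idea concrete, here is a minimal, self-contained sketch. Everything in it is an assumption for illustration only: FlowFile is a stand-in for core::FlowFile, CompiledExpression and compileExpression are hypothetical names, and the toy compiler understands nothing beyond literal text and a whole-value "${attribute}" reference. It is not the MINIFICPP-49 implementation, and it ignores thread safety.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for core::FlowFile: just a bag of attributes.
struct FlowFile {
  std::map<std::string, std::string> attributes;
};

// Hypothetical compiled form: a callable evaluated against a flow file.
using CompiledExpression = std::function<std::string(const FlowFile &)>;

// Toy compiler (assumption): handles only a literal value or a value that is
// exactly "${attr}". The real lexer/parser would replace this function.
CompiledExpression compileExpression(const std::string &raw) {
  if (raw.size() > 3 && raw.compare(0, 2, "${") == 0 && raw.back() == '}') {
    const std::string attr = raw.substr(2, raw.size() - 3);
    return [attr](const FlowFile &ff) {
      auto it = ff.attributes.find(attr);
      return it == ff.attributes.end() ? std::string() : it->second;
    };
  }
  return [raw](const FlowFile &) { return raw; };
}

class ProcessContext {
 public:
  void setProperty(const std::string &name, const std::string &value) {
    properties_[name] = value;
    compiled_.erase(name);  // drop any stale compiled expression
  }

  // Static lookup: returns the raw property value.
  bool getProperty(const std::string &name, std::string &value) const {
    auto it = properties_.find(name);
    if (it == properties_.end()) return false;
    value = it->second;
    return true;
  }

  // Dynamic lookup: compiles the raw value on first use, then evaluates the
  // cached compiled expression against the given flow file.
  bool evaluateProperty(const std::string &name, const FlowFile &flow_file,
                        std::string &value) {
    auto cached = compiled_.find(name);
    if (cached == compiled_.end()) {
      std::string raw;
      if (!getProperty(name, raw)) return false;
      cached = compiled_.emplace(name, compileExpression(raw)).first;
    }
    value = cached->second(flow_file);
    return true;
  }

 private:
  std::unordered_map<std::string, std::string> properties_;
  std::unordered_map<std::string, CompiledExpression> compiled_;
};
```

With this shape, a processor only calls evaluateProperty(...) with the flow file it is handling; caching and compilation stay inside ProcessContext.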

The next important question which impacts core minifi is whether or not expression language
should be an extension. Whether or not it is an extension, some kind of standard interface
to expressions will need to be made available to all processors. Here are the pros/cons of
putting it in an extension, as far as I can tell:

Pros:

- Reduce compiled size of minifi somewhat when the feature is disabled (the lexer/parser is
currently ~4300 lines of C++ with no additional library or runtime dependencies)
- Allow for alternate expression language implementations in the future

Cons:

- Additional complexity from needing to add Expression primitives, a standard Expression compiler
API, dynamic object loading, and an empty (NoOp) implementation if the extension is not included
- Additional vtable lookups on an operation which will be invoked very frequently (every lookup
of a property which supports expressions, on every flow file)
- Makes it harder for gcc/clang/etc. to inline/optimize expression language functions
- Core processors (e.g. GetFile/PutFile, where expression language will almost certainly be
desired for file paths and other properties) will depend on an optional extension
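
For reference, the extension route might need roughly the following pieces in core. This is only a sketch under assumptions: Expression, ExpressionCompiler, NoOpExpression and NoOpCompiler are hypothetical names, FlowFile is a stand-in for core::FlowFile, and dynamic object loading is left out entirely. It is meant to show where the virtual calls mentioned above would come from, not to propose a final interface.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for core::FlowFile: just a bag of attributes.
struct FlowFile {
  std::map<std::string, std::string> attributes;
};

// Hypothetical expression primitive that core would expose to processors.
class Expression {
 public:
  virtual ~Expression() = default;
  // One virtual call per evaluation, i.e. per property lookup per flow file.
  virtual std::string evaluate(const FlowFile &flow_file) const = 0;
};

// Hypothetical compiler API that an extension would implement.
class ExpressionCompiler {
 public:
  virtual ~ExpressionCompiler() = default;
  virtual std::unique_ptr<Expression> compile(const std::string &raw) const = 0;
};

// Fallback used when no expression language extension is present: the raw
// property value is returned unchanged, whatever the flow file contains.
class NoOpExpression : public Expression {
 public:
  explicit NoOpExpression(std::string raw) : raw_(std::move(raw)) {}
  std::string evaluate(const FlowFile &) const override { return raw_; }

 private:
  std::string raw_;
};

class NoOpCompiler : public ExpressionCompiler {
 public:
  std::unique_ptr<Expression> compile(const std::string &raw) const override {
    return std::unique_ptr<Expression>(new NoOpExpression(raw));
  }
};
```

Note that with the NoOp fallback, a property such as "${filename}" would be passed through literally, which is the situation core processors like GetFile/PutFile would be in without the extension.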

I would like to hear feedback from the dev community on these two important topics (the interface
to the expression language and whether or not the implementation should be an extension) before
writing the code that touches core components. The API question is ultimately more important
because it affects all current and future processor authors. The decision of whether or not it
is an extension is easier to reverse later.

Regards,

Andy I.C.