nifi-issues mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] [nifi] james94 commented on pull request #4242: NIFI-7411: Integrate H2O Driverless AI MSP in NiFi
Date Fri, 01 May 2020 23:27:04 GMT

james94 commented on pull request #4242:
URL: https://github.com/apache/nifi/pull/4242#issuecomment-622609665


   @pvillard31 
   
   To answer your first question, "**does the current implementation implies that all fields
of the input record must be used for the prediction?**"
   
   - No, not all fields of the input record need to be used for the prediction. In your
example, if we pass features `A,B,C` to the MOJO Pipeline, it filters out the features it
doesn't need: it ignores feature `C` and predicts label `D` from features `A,B`. Users
therefore don't have to remove fields manually.
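
To illustrate the filtering described above, here is a minimal sketch in plain Java. It does not use the Driverless AI MOJO runtime; the class `FeatureFilterSketch`, the method `selectFeatures`, and the feature list are hypothetical stand-ins for what the pipeline does internally.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: a stand-in for the pipeline's own feature selection,
// not the actual MOJO runtime API.
final class FeatureFilterSketch {

    // Keep only the record fields that the (hypothetical) model declares as inputs.
    static Map<String, Object> selectFeatures(Map<String, Object> record,
                                              List<String> modelFeatures) {
        Map<String, Object> selected = new LinkedHashMap<>();
        for (String name : modelFeatures) {
            if (record.containsKey(name)) {
                selected.put(name, record.get(name));
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        Map<String, Object> record = new LinkedHashMap<>();
        record.put("A", 1.0);
        record.put("B", 2.0);
        record.put("C", 3.0); // not a model input, so it is dropped

        // Assumed model inputs: A and B only.
        System.out.println(selectFeatures(record, Arrays.asList("A", "B")));
    }
}
```

With the assumed inputs `A,B`, field `C` never reaches the model, which is why the flow does not need a separate step to strip it.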
   
   To answer your second question, "**what will be the name of the field for the prediction,
is there a way to specify/force the name?**"
   
   - The MOJO Pipeline already has the prediction field name(s). When the MOJO Pipeline is
built in Driverless AI, the metadata it is given includes the predicted field name(s). In
the processor's predict() method, I use the MojoPipeline model to make the prediction on the
input test data and then convert the resulting MojoFrame into a predictedRecordMap. This hash
map contains key-value pairs of one or more predicted field names and field values, so the
resulting predictedRecord also holds those predicted field names and values. When the user
configures the CSVRecordSetWriter, they can choose "Inherit Record Schema" for Schema Access
Strategy to get the predicted field names from the predictedRecord.
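
The conversion step described above can be sketched roughly as follows. This is plain Java under stated assumptions, not the processor's actual code: the arrays stand in for the column names and values read from the MojoFrame, and `PredictedRecordSketch`, `toPredictedMap`, and the field name `D` are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: models the "prediction output -> predictedRecordMap" step
// with plain arrays instead of a real MojoFrame.
final class PredictedRecordSketch {

    // Pair each predicted field name with its value, preserving column order.
    static Map<String, Object> toPredictedMap(String[] names, Object[] values) {
        if (names.length != values.length) {
            throw new IllegalArgumentException("names and values must have the same length");
        }
        Map<String, Object> predicted = new LinkedHashMap<>();
        for (int i = 0; i < names.length; i++) {
            predicted.put(names[i], values[i]);
        }
        return predicted;
    }

    public static void main(String[] args) {
        // Hypothetical prediction output: one predicted field named "D".
        Map<String, Object> predicted =
                toPredictedMap(new String[] {"D"}, new Object[] {0.87});
        System.out.println(predicted.keySet());
    }
}
```

Because the field names travel with the values in the map, a writer that inherits the record schema can pick them up without the user naming them.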
   
   I also have a GitHub repo with two NiFi templates and some example data for using the
**ExecuteMojoScoringRecord** processor in a Hydraulic System Predictive Maintenance use case.
Because this processor uses a Driverless AI MOJO Scoring Pipeline, the user will need a
Driverless AI license key to use the processor.
   
   https://github.com/james94/Hydraulic-System-Predictive-Maintenance


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org