nifi-issues mailing list archives

From "Shawn Weeks (Jira)" <j...@apache.org>
Subject [jira] [Updated] (NIFI-6986) ValidateRecord should optionally validate if nullable fields are present
Date Thu, 30 Jan 2020 21:56:00 GMT

     [ https://issues.apache.org/jira/browse/NIFI-6986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shawn Weeks updated NIFI-6986:
------------------------------
    Status: Patch Available  (was: In Progress)

> ValidateRecord should optionally validate if nullable fields are present
> ------------------------------------------------------------------------
>
>                 Key: NIFI-6986
>                 URL: https://issues.apache.org/jira/browse/NIFI-6986
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Extensions
>            Reporter: Mark Payne
>            Assignee: Shawn Weeks
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, if a field is nullable according to the schema, ValidateRecord considers the
record to be valid, even if the field is missing completely. For some use cases, this is desirable.
For example, it is common to drop fields in JSON when the field's value is null, because it
can drastically reduce the size of the JSON.
> However, in other use cases, this is not desirable. For example, in a CSV file, we may
want to require that each Record contains the appropriate number of fields. It may be acceptable,
for instance, to have a line like "1234, John Smith, , , ," but not to have a line like "1234,
John Smith".
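
To make the difference between those two lines concrete, here is a minimal sketch that counts the fields in each. The class and method names (`CsvFieldCountSketch`, `countFields`, `hasAllFields`) are hypothetical and not part of NiFi, and the naive comma split ignores quoting and escaping, which a real CSV reader would have to handle.

```java
// Hypothetical illustration only; not NiFi code.
final class CsvFieldCountSketch {

    /** Counts comma-separated fields, keeping trailing empty ones. */
    static int countFields(String line) {
        // A negative limit stops String.split from discarding trailing empty strings,
        // so a trailing comma still counts as an (empty) final field.
        return line.split(",", -1).length;
    }

    /** True when the line carries exactly the expected number of fields. */
    static boolean hasAllFields(String line, int expectedFields) {
        return countFields(line) == expectedFields;
    }
}
```

With six expected fields, `hasAllFields("1234, John Smith, , , ,", 6)` is true because the line has six fields (four of them blank), while `hasAllFields("1234, John Smith", 6)` is false because only two fields are present.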
> ValidateRecord should be updated with a new Property: "Allow Missing Null Values". If
the value is `true` (the default, to avoid changing behavior between versions), the Processor
should behave as it does now, where the absence of the field is synonymous with a null value.
In this case, a line like "1234, John Smith" would be valid when the CSV is expecting 6 fields,
as long as the last 4 fields are nullable.
> But if the value of this new property is `false`, the Processor should require that all
fields be present in the data, even if the field has a null value. In this case, a line like
"1234, John Smith" would be invalid if the CSV were expected to contain 6 fields.
> The `WriteJsonResult` class has a method: `private boolean isFieldPresent(RecordField
field, Record record)`. This method should really exist on `Record` itself with a slightly
different signature: `boolean isFieldPresent(RecordField field)`. It should have a default
implementation, akin to the one in `WriteJsonResult`, and `WriteJsonResult`
should then simply use that method.
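
A rough sketch of what such a default method could look like follows. It uses stand-in types (`SketchField`, `SketchRecord`, `MapBackedRecord`) rather than NiFi's real `RecordField` and `Record`, and it assumes that presence is decided by checking the field's name and aliases against the record's raw field names; the actual logic in `WriteJsonResult` may differ.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for RecordField: a name plus optional aliases. */
final class SketchField {
    final String name;
    final List<String> aliases;

    SketchField(String name, List<String> aliases) {
        this.name = name;
        this.aliases = aliases;
    }
}

/** Hypothetical stand-in for Record, carrying the proposed default method. */
interface SketchRecord {
    /** Names of the fields that actually appear in the underlying data. */
    Set<String> getRawFieldNames();

    /** Assumed logic: a field is present if its name or any alias appears in the data. */
    default boolean isFieldPresent(SketchField field) {
        Set<String> rawNames = getRawFieldNames();
        if (rawNames.contains(field.name)) {
            return true;
        }
        for (String alias : field.aliases) {
            if (rawNames.contains(alias)) {
                return true;
            }
        }
        return false;
    }
}

/** Simple map-backed record; a key mapped to null still counts as present. */
final class MapBackedRecord implements SketchRecord {
    private final Map<String, Object> values;

    MapBackedRecord(Map<String, Object> values) {
        this.values = new LinkedHashMap<>(values);
    }

    @Override
    public Set<String> getRawFieldNames() {
        return values.keySet();
    }
}
```

The point of the default method is that a field whose value is explicitly null is still reported as present, whereas a field that never appears in the data is not.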
> `StandardSchemaValidator` should then be updated to use this method to validate that records
have all required fields, as configured. `SchemaValidationContext` should also be updated
to indicate whether or not the presence of null values should be validated.
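
As a hedged sketch of how the pieces might fit together: a context object carries the "Allow Missing Null Values" flag, and the validator consults it for each nullable field. All type names here (`FieldDef`, `ValidationContextSketch`, `ValidatorSketch`) are hypothetical, a plain `Map` stands in for the record, and `Map.containsKey` stands in for the proposed `isFieldPresent`; this is not the actual `StandardSchemaValidator` code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical schema field: a name and whether null is allowed. */
final class FieldDef {
    final String name;
    final boolean nullable;

    FieldDef(String name, boolean nullable) {
        this.name = name;
        this.nullable = nullable;
    }
}

/** Hypothetical stand-in for SchemaValidationContext with the new flag. */
final class ValidationContextSketch {
    final List<FieldDef> fields;
    final boolean allowMissingNullValues;

    ValidationContextSketch(List<FieldDef> fields, boolean allowMissingNullValues) {
        this.fields = fields;
        this.allowMissingNullValues = allowMissingNullValues;
    }
}

/** Hypothetical validator that returns one message per problem found. */
final class ValidatorSketch {
    static List<String> validate(Map<String, Object> record, ValidationContextSketch context) {
        List<String> problems = new ArrayList<>();
        for (FieldDef field : context.fields) {
            // containsKey plays the role of the proposed isFieldPresent check.
            boolean present = record.containsKey(field.name);
            if (!present) {
                if (!field.nullable) {
                    problems.add(field.name + " is required but missing");
                } else if (!context.allowMissingNullValues) {
                    problems.add(field.name + " is nullable but must be present");
                }
            } else if (record.get(field.name) == null && !field.nullable) {
                problems.add(field.name + " must not be null");
            }
        }
        return problems;
    }
}
```

With the flag set to `true`, a record that omits a nullable field produces no problems, matching the current behavior; with `false`, the same record yields one problem unless the field is present with an explicit null.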



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
