arrow-dev mailing list archives

From "Neal Richardson (Jira)" <j...@apache.org>
Subject [jira] [Created] (ARROW-7047) [C++][Dataset] Filter expressions should not require exact type match
Date Fri, 01 Nov 2019 22:06:00 GMT
Neal Richardson created ARROW-7047:
--------------------------------------

             Summary: [C++][Dataset] Filter expressions should not require exact type match
                 Key: ARROW-7047
                 URL: https://issues.apache.org/jira/browse/ARROW-7047
             Project: Apache Arrow
          Issue Type: New Feature
          Components: C++ - Dataset
            Reporter: Neal Richardson


It is hard for users to ensure that scalars have exactly the same type as the fields
they are compared against in Expressions. For one, FieldExpressions don't carry a type reference,
so when I construct {{field_ref("col1") > scalar(42)}}, I don't know what type col1 has
and cannot make scalar(42) match it. Even if the field type were available,
I couldn't determine what type the scalar should have in an expression such as
{{(field_ref("col1") + field_ref("col2")) > scalar(42)}}.

We should allow CompareExpressions to cast their inputs as necessary. Casting should work among
integer types, among floating point types, and between integers and floats. The same goes for
date/timestamp types, and when a string scalar is compared against a date/timestamp column, the
string should probably be parsed as a datetime. We also need to think about DictionaryTypes
(though in practice this is moot until we have comparison kernels that work on strings).
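
The casting rules proposed above could be sketched roughly as follows. This is a minimal,
self-contained illustration and not Arrow's implementation: the Kind enum and the functions
IsNumeric, IsTemporal, CanCastScalarTo and CastTargetFor are hypothetical names, and the policy
of always casting the scalar to the field's type is an assumption, since the issue leaves the
exact policy open.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for Arrow's data types (not arrow::Type).
enum class Kind { Int32, Int64, Float32, Float64, Date32, Timestamp, String };

bool IsNumeric(Kind k) {
  return k == Kind::Int32 || k == Kind::Int64 || k == Kind::Float32 ||
         k == Kind::Float64;
}

bool IsTemporal(Kind k) { return k == Kind::Date32 || k == Kind::Timestamp; }

// Assumed policy: the scalar is cast to the field's type, never the reverse.
bool CanCastScalarTo(Kind scalar, Kind field) {
  if (scalar == field) return true;                              // exact match
  if (IsNumeric(scalar) && IsNumeric(field)) return true;        // ints and floats, any width
  if (IsTemporal(scalar) && IsTemporal(field)) return true;      // date <-> timestamp
  if (scalar == Kind::String && IsTemporal(field)) return true;  // parse string as datetime
  return false;
}

// Type the scalar should be cast to before the comparison runs.
// Callers are expected to check CanCastScalarTo first.
Kind CastTargetFor(Kind field, Kind scalar) {
  assert(CanCastScalarTo(scalar, field));
  return field;
}
```

Under these assumptions, comparing a Float64 column with an Int64 scalar such as scalar(42)
would cast the scalar to Float64, while comparing an Int32 column with a String scalar would be
rejected. The sketch ignores overflow and precision loss (for example, an Int64 scalar that
does not fit in Int32) and does not cover field-to-field expressions or DictionaryTypes.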

[~fsaintjacques][~bkietz]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
