asterixdb-dev mailing list archives

From Xikui Wang <xik...@uci.edu>
Subject Re: Filter incoming data by query predicates
Date Fri, 07 Dec 2018 01:14:50 GMT
Hi Sandra,

Yes. It will store the entire record. Note that applying a function to a
feed is different from adding a filter to a feed. To make the difference
clearer, here is an example.

Imagine the data feed as a big dataset called FeedDataset, and you want to
store the ingested data into the TargetDataset. An equivalent statement
that moves data from the feed to the target dataset looks like this:

insert into TargetDataset(select value f from FeedDataset f);

If you apply a function called "testlib#process_func" to the feed, the
equivalent statement looks like this:

insert into TargetDataset(select value testlib#process_func(f) from
FeedDataset f);

If you also have a filter function called "testlib#filter_func", and you add
it to the feed using the WHERE clause, the equivalent statement becomes this:

insert into TargetDataset(select value testlib#process_func(f) from
FeedDataset f where testlib#filter_func(f) == TRUE);

Thus, the filter function and the applied function are two separate,
orthogonal things. In the last example, some incoming records are filtered
out by the function in the WHERE clause (filter_func), and the remaining
records are still processed by the applied function (process_func). You can
use whichever one fits your needs, or both. :)
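
If it helps, the same semantics can be modeled in a few lines of plain
Python. This is only a rough sketch and not AsterixDB code: the bodies of
filter_func and process_func, the ingest helper, and the sample records are
all made up for illustration. The point is that the filter only decides
whether a record is stored, while the applied function decides what is
stored.

```python
# Plain-Python model of the feed semantics described above; not AsterixDB code.
# filter_func and process_func are hypothetical stand-ins for the UDFs
# testlib#filter_func and testlib#process_func.

def filter_func(record):
    # Stand-in predicate: returns a boolean, like a UDF with output type ABOOLEAN.
    return record["fname"] in {"Sandra", "Xikui"}

def process_func(record):
    # Stand-in applied function: returns a new record, leaving the input untouched.
    return {**record, "processed": True}

def ingest(feed, apply=None, where=None):
    """Return the records that would land in the target dataset."""
    target = []
    for record in feed:
        if where is not None and not where(record):
            continue  # filtered out: nothing is stored for this record
        target.append(apply(record) if apply is not None else record)
    return target

# Made-up sample data standing in for FeedDataset.
feed_dataset = [{"id": 1, "fname": "Sandra"}, {"id": 2, "fname": "Bob"}]

# Filter only: whole records are stored, never the boolean itself.
filtered_only = ingest(feed_dataset, where=filter_func)

# Filter plus applied function: surviving records are transformed before storing.
filtered_and_applied = ingest(feed_dataset, apply=process_func, where=filter_func)
```

In this model, filtered_only holds the complete first record even though
filter_func returns a boolean, so the filter's output type does not need to
change in order to pass the surviving records on to another function.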

Best,
Xikui

On Thu, Dec 6, 2018 at 8:21 AM sandraskarshaug@gmail.com <
sandraskarshaug@gmail.com> wrote:

> Hi again!
>
> I am currently trying to make use of the filtering by query predicate
> example which was discussed in another thread here ("Build UDF project"),
> see below:
>
> *connect feed UserFeed to dataset EmpDataset WHERE
> testlib#wordDetector(fname) = TRUE;*
> start feed UserFeed;
>
> . using the wordDetector UDF found here:
> https://github.com/idleft/asterix-udf-template/blob/master/src/main/java/org/apache/asterix/external/library/WordInListFunction.java
>
> However, the output type of this UDF, as defined in library_descriptor.xml
> is "ABOOLEAN". Will it still store the entire record (InputRecordType) in
> the EmpDataset, or only the boolean value? And, if I would like to use the
> records which pass the filtering in wordDetector as input to another UDF,
> would I need to change the output type of the UDF? If so, the check
> "testlib#wordDetector(fname) = TRUE;*" will not work anymore, due to the
> output being an entire record instead of only a boolean.
>
> I appreciate your help!
>
> Best regards,
> Sandra
>
>
>
