drill-dev mailing list archives

From mehant baid <baid.meh...@gmail.com>
Subject Re: Some questions on UDFs
Date Sat, 04 Jul 2015 21:14:27 GMT
For a detailed example of using the ComplexWriter interface, take a look
at the Mappify
<https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/Mappify.java>
(kvgen) function. The function itself is very simple; it relies on the
utility methods in MappifyUtility
<https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/MappifyUtility.java>
and MapUtility
<https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/vector/complex/MapUtility.java>,
which perform most of the work.

Currently we don't have a generic infrastructure for handling errors coming
out of functions. However, there is UserException, which, when raised,
ensures that Drill does not swallow the error message carried by that
exception. So you can probably throw a UserException that includes the
failing input from your function to make sure it propagates to the user.
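
As a rough illustration of "throw with the failing input", here is a
minimal, self-contained sketch. It does not use Drill's classes:
UserFacingException is a hypothetical stand-in for UserException, and
parseOrFail is a made-up function body. It only shows the idea of catching
the low-level failure and rethrowing with the offending value in the
message.

```java
public class UdfErrorSketch {

    /** Hypothetical stand-in for Drill's UserException; NOT the real class. */
    static class UserFacingException extends RuntimeException {
        UserFacingException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /**
     * Made-up body of a UDF's eval step: parse a string as a long, or fail
     * with a message that names the function and quotes the failing input.
     */
    static long parseOrFail(String input) {
        try {
            return Long.parseLong(input.trim());
        } catch (NumberFormatException e) {
            // Keep the original exception as the cause and put the input
            // in the message so the user can see which value failed.
            throw new UserFacingException(
                "parse_long: cannot parse input '" + input + "'", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parseOrFail(" 42 "));
        try {
            parseOrFail("4x2");
        } catch (UserFacingException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In a real UDF the stand-in would be replaced by UserException itself; how
it is constructed depends on the Drill version, so check the class in your
release.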

Thanks
Mehant

On Sat, Jul 4, 2015 at 1:48 PM, Jacques Nadeau <jacques@apache.org> wrote:

> *Holders are for both input and output.  You can also use ComplexWriter for
> output and FieldReader for input if you want to write or read a complex
> value.
>
> I don't think we've provided a really clean way to construct a
> Repeated*Holder for output purposes.  You can probably do it by reaching
> into a bunch of internal interfaces in Drill.  However, I would recommend
> using the ComplexWriter output pattern for now.  This will be a little less
> efficient but substantially less brittle.  I suggest you open up a jira for
> using a Repeated*Holder as an output.
>
> On Sat, Jul 4, 2015 at 1:38 PM, Ted Dunning <ted.dunning@gmail.com> wrote:
>
> > Holders are for input, I think.
> >
> > Try the different kinds of writers.
> >
> >
> >
> > On Sat, Jul 4, 2015 at 12:49 PM, Jim Bates <jbates@maprtech.com> wrote:
> >
> > > I've got a RepeatedHolder working as a @Param. I was working on a
> > > custom aggregator function using DrillAggFunc. In this I can do simple
> > > things, but if I want to build a list of values and do something with it
> > > in the final output method, I think I need to use RepeatedHolders in the
> > > @Workspace. To do that I need to create a new one in the setup method.
> > > I can't get one built. They all require a BufferAllocator to be passed
> > > in to build it. I have not found a way to get an allocator yet. Any
> > > suggestions?
> > >
> > > > On Sat, Jul 4, 2015 at 1:37 PM, Ted Dunning <ted.dunning@gmail.com> wrote:
> > >
> > > > > If you look at the zip function in
> > > > > https://github.com/mapr-demos/simple-drill-functions you can find an
> > > > > example of building a structure.
> > > >
> > > > The basic idea is that your output is denoted as
> > > >
> > > >         @Output
> > > >         BaseWriter.ComplexWriter writer;
> > > >
> > > > The pattern for building a list of lists of integers is like this:
> > > >
> > > >         writer.setValueCount(n);
> > > >         ...
> > > >         BaseWriter.ListWriter outer = writer.rootAsList();
> > > >         outer.start(); // [ outer list
> > > >         ...
> > > > >         for (...) {  // for each inner list
> > > > >             BaseWriter.ListWriter inner = outer.list();
> > > > >             inner.start(); // [ inner list
> > > > >             for (...) {  // for each inner list element
> > > > >                 inner.integer().writeInt(accessor.get(i));
> > > > >             }
> > > > >             inner.end();   // ] inner list
> > > > >         }
> > > >         outer.end(); // ] outer list
> > > >
> > > >
> > > >
> > > > > On Sat, Jul 4, 2015 at 10:29 AM, Jim Bates <jbates@maprtech.com> wrote:
> > > >
> > > > > > I have working aggregation and simple UDFs. I've been trying to
> > > > > > document and understand each of the options available in a Drill
> > > > > > UDF: the different FunctionScopes, the ones that are allowed and
> > > > > > the ones that are not; the impact of different cost categories;
> > > > > > and the different steps needed to handle any of the supported data
> > > > > > types and structures in Drill.
> > > > >
> > > > > > Here are a few of my current roadblocks. Any pointers would be
> > > > > > greatly appreciated.
> > > > >
> > > > >
> > > > > >    1. I've been trying to understand how to correctly use
> > > > > >    RepeatedHolders of whatever type. For this discussion let's start
> > > > > >    with a RepeatedBigIntHolder. I'm trying to figure out the best way
> > > > > >    to create a new one. I have not figured out where in the existing
> > > > > >    Drill code someone does this. If I use a RepeatedBigIntHolder as a
> > > > > >    Workspace object, it is null to start with. I created a new one in
> > > > > >    the startup section of the UDF but the vector was null. I can find
> > > > > >    no reference for creating a new BigIntVector. There is a way to
> > > > > >    create a BigIntVector and I did find an example of creating a new
> > > > > >    VarCharVector, but I can't do that using the Drill jar files from
> > > > > >    1.0. The org.apache.drill.common.types.TypeProtos and the
> > > > > >    org.apache.drill.common.types.TypeProtos.MinorType classes do not
> > > > > >    appear to be accessible from the Drill jar files.
> > > > > >    2. What is the best way to close out a UDF in the event it
> > > > > >    generates an exception? Are there specific steps one should follow
> > > > > >    to make a clean exit in a catch block that are beneficial to Drill?
> > > > >
> > > >
> > >
> >
>
