hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Update of "Hive/DeveloperGuide/UDTF" by PaulYang
Date Sat, 17 Jul 2010 01:51:18 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "Hive/DeveloperGuide/UDTF" page has been changed by PaulYang.
http://wiki.apache.org/hadoop/Hive/DeveloperGuide/UDTF?action=diff&rev1=1&rev2=2

--------------------------------------------------

  
  == GenericUDTF Interface ==
  
- To create a UDTF, you must extend the GenericUDTF abstract class and implement the initialize()
and process() methods. The initialize method is called by Hive to notify the UDTF the argument
types to expect. The UDTF must then return an object inspector corresponding to the row objects
that the UDTF will generate.
+ A custom UDTF can be created by extending the GenericUDTF abstract class and then implementing
the {{{initialize}}}, {{{process}}}, and possibly {{{close}}} methods. Hive calls the initialize
method to notify the UDTF of the argument types to expect. The UDTF must then return
an object inspector corresponding to the row objects that the UDTF will generate. Once initialize()
has been called, Hive will give rows to the UDTF using the process() method. While in process(),
the UDTF can produce and forward rows to other operators by calling forward(). Lastly, Hive
will call the close() method when all the rows have been processed by the UDTF.
+ 
+ UDTF Example:
+ 
+ {{{
+ package org.apache.hadoop.hive.contrib.udtf.example;
+ 
+ import java.util.ArrayList;
+ 
+ import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
+ import org.apache.hadoop.hive.ql.metadata.HiveException;
+ import org.apache.hadoop.hive.ql.udf.generic.GenericUDTF;
+ import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
+ import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory;
+ import org.apache.hadoop.hive.serde2.objectinspector.StructObjectInspector;
+ import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
+ 
+ 
+ /**
+  * GenericUDTFCount2 outputs the number of rows seen, twice. It's output twice
+  * to test outputting of rows on close with lateral view.
+  *
+  */
+ public class GenericUDTFCount2 extends GenericUDTF {
+ 
+   Integer count = Integer.valueOf(0);
+   Object[] forwardObj = new Object[1];
+ 
+   @Override
+   public void close() throws HiveException {
+     // All input rows have been seen: emit the final count as a one-column row, twice.
+     forwardObj[0] = count;
+     forward(forwardObj);
+     forward(forwardObj);
+   }
+ 
+   @Override
+   public StructObjectInspector initialize(ObjectInspector[] argOIs) throws UDFArgumentException {
+     ArrayList<String> fieldNames = new ArrayList<String>();
+     ArrayList<ObjectInspector> fieldOIs = new ArrayList<ObjectInspector>();
+     fieldNames.add("col1");
+     fieldOIs.add(PrimitiveObjectInspectorFactory.javaIntObjectInspector);
+     return ObjectInspectorFactory.getStandardStructObjectInspector(fieldNames,
+         fieldOIs);
+   }
+ 
+   @Override
+   public void process(Object[] args) throws HiveException {
+     // Only count the row here; nothing is forwarded until close().
+     count = Integer.valueOf(count.intValue() + 1);
+   }
+ 
+ }
+ 
+ }}}
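
The example above forwards rows only from close(). As a rough illustration of the full call order described earlier, including forwarding from inside process(), here is a minimal, self-contained sketch. It does not use Hive at all: {{{MiniUDTF}}}, {{{SplitAndCount}}}, and {{{run}}} are hypothetical stand-ins invented for this illustration, and initialize() and object inspectors are left out for brevity.

```java
import java.util.ArrayList;
import java.util.List;

public class UdtfLifecycleSketch {

  /** Hypothetical stand-in for Hive's GenericUDTF; NOT the real class. */
  abstract static class MiniUDTF {
    private final List<Object[]> collected = new ArrayList<>();

    abstract void process(Object[] args);

    void close() {}

    /** Hands a row to the (simulated) downstream operator. */
    protected final void forward(Object[] row) {
      collected.add(row.clone()); // copy, because the UDTF reuses its row array
    }

    List<Object[]> collected() {
      return collected;
    }
  }

  /** Forwards one row per comma-separated token, then a summary row on close(). */
  static class SplitAndCount extends MiniUDTF {
    private int count = 0;
    private final Object[] forwardObj = new Object[1];

    @Override
    void process(Object[] args) {
      for (String token : ((String) args[0]).split(",")) {
        forwardObj[0] = token;
        forward(forwardObj); // rows can be produced while still in process()
        count++;
      }
    }

    @Override
    void close() {
      // Kept as a String so every forwarded row has the same single-column shape.
      forwardObj[0] = "rows=" + count;
      forward(forwardObj);
    }
  }

  /** Simulates the call order: process() once per input row, then close(). */
  static List<Object[]> run(MiniUDTF udtf, List<Object[]> inputRows) {
    for (Object[] row : inputRows) {
      udtf.process(row);
    }
    udtf.close();
    return udtf.collected();
  }

  public static void main(String[] args) {
    List<Object[]> in = new ArrayList<>();
    in.add(new Object[] {"a,b"});
    in.add(new Object[] {"c"});
    for (Object[] out : run(new SplitAndCount(), in)) {
      System.out.println(out[0]);
    }
  }
}
```

In a real UDTF, every row passed to forward() should match the struct object inspector returned from initialize(), which is why the summary row in this sketch stays a single string column.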
  
  For reference, here is the abstract class:
  
  {{{
- /**
-  * Licensed to the Apache Software Foundation (ASF) under one
-  * or more contributor license agreements.  See the NOTICE file
-  * distributed with this work for additional information
-  * regarding copyright ownership.  The ASF licenses this file
-  * to you under the Apache License, Version 2.0 (the
-  * "License"); you may not use this file except in compliance
-  * with the License.  You may obtain a copy of the License at
-  *
-  *     http://www.apache.org/licenses/LICENSE-2.0
-  *
-  * Unless required by applicable law or agreed to in writing, software
-  * distributed under the License is distributed on an "AS IS" BASIS,
-  * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-  * See the License for the specific language governing permissions and
-  * limitations under the License.
-  */
  
  package org.apache.hadoop.hive.ql.udf.generic;
  
