hive-user mailing list archives

From Chen Song <chen.song...@gmail.com>
Subject Re: How can I get the constant value from the ObjectInspector in the UDF
Date Wed, 26 Sep 2012 21:08:06 GMT
With my limited knowledge of Hive, I don't think it is possible to get the
actual value of the argument, and I don't think it is, or should be, designed
to provide that information. *initialize* is intended only for
decoding the meta structure (type and its associated evaluation mechanism)
of the arguments. Storing any specific values of arguments at runtime is an
anti-pattern in my opinion. Can you elaborate on why you really need
the constant value in your case?
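
If the goal is only to size a data structure from the constant, one possible
workaround is to build it lazily on the first evaluate call. Below is a minimal
sketch of that idea. It uses plain Java stand-ins rather than Hive's classes:
MovingSumSketch, its method signatures, and the use of type-name strings in
place of ObjectInspectors are all hypothetical and are not the GenericUDF API.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for a GenericUDF-style function; NOT Hive's API.
// It only mirrors the two-phase contract: initialize() once, evaluate() per row.
class MovingSumSketch {
    private Deque<Long> window; // built lazily, on the first evaluate() call
    private int windowSize;
    private long runningSum;

    // Phase 1: called once. Only type metadata is checked; no values are read.
    void initialize(String[] argumentTypeNames) {
        if (argumentTypeNames.length != 2) {
            throw new IllegalArgumentException("msum expects exactly 2 arguments");
        }
        if (!"int".equals(argumentTypeNames[1])) {
            throw new IllegalArgumentException(
                "The 2nd argument is expected to be \"int\" but \""
                    + argumentTypeNames[1] + "\" is found");
        }
    }

    // Phase 2: called once per row with the actual values.
    // The window is created from the constant the first time it is seen.
    long evaluate(long value, int size) {
        if (window == null) {
            if (size <= 0) {
                throw new IllegalArgumentException("window size must be positive");
            }
            windowSize = size;
            window = new ArrayDeque<>(size);
        }
        window.addLast(value);
        runningSum += value;
        if (window.size() > windowSize) {
            // Drop the oldest row once the window is full.
            runningSum -= window.removeFirst();
        }
        return runningSum;
    }
}
```

With this approach the cost of the null check is paid on every row, but
initialize stays free of any runtime values.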

On your 2nd question, you can get the type information from the object
inspector. For example, if you expect the 1st argument to be a string, you can
use the following code snippet.


> // Assumes a field: private StringObjectInspector stringObjectInspector;
> Category category = arguments[0].getCategory();
> String typeName = arguments[0].getTypeName();
> // Compare type names with equals(); == only tests reference identity.
> if (category == Category.PRIMITIVE
>     && (Constants.STRING_TYPE_NAME.equals(typeName)
>         || Constants.VOID_TYPE_NAME.equals(typeName))) {
>   if (Constants.STRING_TYPE_NAME.equals(typeName)) {
>     stringObjectInspector = (StringObjectInspector) arguments[0];
>   }
> } else {
>   throw new UDFArgumentTypeException(0, "The "
>       + GenericUDFUtils.getOrdinal(1) + " argument is expected to be \""
>       + Constants.STRING_TYPE_NAME + "|" + Constants.VOID_TYPE_NAME
>       + "\" but \"" + typeName + "\" is found");
> }
>
>
Chen

On Thu, Sep 27, 2012 at 5:04 AM, java8964 java8964 <java8964@hotmail.com> wrote:

>  I understand your message. But in this situation, I want to do the
> following:
>
> 1) I want to get the value 10 in the initialization stage. I understand
> your point that the value will only be available in the evaluate stage, but
> keep in mind that the 10 in my example is a constant value. It
> won't change between evaluations. It is the kind of value I should be able to
> get in the initialization stage, right? The Hive query analyzer should
> understand that this function parameter is in fact a constant value, and
> should be able to provide it to me during the initialization stage.
> 2) A further question: can I get more information from the object inspector?
> For example, when I write the UDF, I want to make sure the first parameter
> is a numeric type. I can get the type, so I am able to validate it.
> But if I want to raise an error in some case, I want to
> show the end user the NAME of the parameter in my error message, instead of
> just its position.
>
> For example, in a UDF call such as msum(column_name, 10), if I find that the
> type of column_name is NOT numeric, I want the error message I
> give to the end user to say that 'column_name' should be a numeric type. But
> right now, I cannot get this information from the API. The only thing I can
> get is the category type information, but I want more.
>
> Is it possible to do that in hive 0.7.1?
>
> Thanks for your help.
>
> Yong
>
> ------------------------------
> Date: Thu, 27 Sep 2012 02:32:19 +0900
> Subject: Re: How can I get the constant value from the ObjectInspector in
> the UDF
> From: chen.song.82@gmail.com
> To: user@hive.apache.org
>
>
> Hi Yong
>
> The way GenericUDF works is as follows.
>
> *ObjectInspector initialize(ObjectInspector[] arguments)* is called only
> once for each GenericUDF instance used in your Hive query. This phase is for
> the preparation steps of the UDF, such as syntax checking and type inference.
>
> *Object evaluate(DeferredObject[] arguments)* is called to evaluate
> against actual arguments. This should be where the actual calculation
> happens and where you can get the real values you talked about.
>
> Thanks,
> Chen
>
> On Wed, Sep 26, 2012 at 4:17 AM, java8964 java8964 <java8964@hotmail.com> wrote:
>
>  Hi, I am using Cloudera release cdh3u3, which ships Hive 0.7.1.
>
> I am trying to write a Hive UDF to calculate a moving sum.
> Right now, I am having trouble getting the constant value passed in at the
> initialization stage.
>
> For example, let's assume the function is like the following format:
>
> msum(salary, 10) --------- salary is an int type column
>
> which means the end user wants to calculate the sum of the last 10 rows of salary.
>
> I kind of know how to implement this UDF. But I have one problem right now.
>
> 1) This is not a UDAF, as each row will return one value, the moving
> sum.
> 2) I created a UDF class that extends GenericUDF.
> 3) I can get the column types from the ObjectInspector[] passed to me in
> the initialize() method to verify that 'salary' and 10 are both
> numeric types (the latter needs to be an integer).
> 4) But I also want to get the real value, 10 in this case, in the
> initialize() stage, so I can create the corresponding data structure based
> on the value the end user specified here.
> 5) I looked through the javadoc of the ObjectInspector class. I know that at
> run time the real class of the 2nd parameter is WritableIntObjectInspector.
> I can get the type, but how can I get the real value?
> 6) This is a kind of constant ObjectInspector, which should be able to give
> the value to me, as it already knows the type is int. But how?
> 7) I don't want to get the value at the evaluate stage. Can I get
> this value at the initialize stage?
>
> Thanks
>
> Yong
>
>
>
>
> --
> Chen Song
>
>
>


-- 
Chen Song
