pig-dev mailing list archives

From Dmitriy Ryaboy <dvrya...@gmail.com>
Subject Re: toJSON function for tuples, bags and strings, PIG-2641
Date Tue, 10 Apr 2012 14:53:33 GMT
first question: you can do this when outputSchema() is called, as it's
passed the input schema. IIRC, in trunk you have hooks to pass that
info to the backend in a udf.
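
To make the goal concrete: once the field aliases have been captured from the input schema, building the JSON is mostly a matter of pairing them with the tuple's values. A minimal sketch in plain Python follows; it is not Pig API code. The function to_json and its arguments are hypothetical names, the "$0"-style fallback for unnamed fields is an assumption, and only a flat tuple is handled (nested tuples and bags would come out as plain JSON arrays).

```python
import json


def to_json(field_names, values):
    """Pair schema aliases with a tuple's values and serialize the result.

    field_names: aliases taken from the input schema (None if a field is unnamed)
    values: the tuple's values, in the same order as field_names
    """
    if len(field_names) != len(values):
        raise ValueError("schema and tuple have different arity")
    record = {}
    for pos, (name, value) in enumerate(zip(field_names, values)):
        # Assumption: unnamed fields fall back to a positional "$N" key.
        key = name if name is not None else "$%d" % pos
        record[key] = value
    # sort_keys makes the output deterministic, which helps when testing.
    return json.dumps(record, sort_keys=True)


if __name__ == "__main__":
    print(to_json(["id", "name"], (1, "pig")))
```

Without the aliases, the only keys available are positions, which is why the field names are needed to finish the UDF.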

second question: see the discussion on the JsonLoader jira. Short answer:
it's non-trivial, and there's no clear decision on what the most sensible
thing to do is (other than "map", which is unlikely to be what you want).
Rather than do something bad and then be stuck with a poor decision, the
approach for now is to let people provide their own schema instead.

D

On Tue, Apr 10, 2012 at 1:48 AM, Russell Jurney
<russell.jurney@gmail.com> wrote:
> Followup question: would it be nice if JsonLoader inferred schemas when
> none is present, according to some defaults?
>
> On Tue, Apr 10, 2012 at 12:48 AM, Russell Jurney
> <russell.jurney@gmail.com> wrote:
>
>> Is there a way to get the field names in an EvalFunc? I am close to done
>> but... no cigar :)  I need these to finish.
>>
>>
>> On Mon, Apr 9, 2012 at 11:03 PM, Russell Jurney <russell.jurney@gmail.com> wrote:
>>
>>> So far this is not easy.
>>>
>>>
>>> On Mon, Apr 9, 2012 at 5:42 PM, Russell Jurney <russell.jurney@gmail.com> wrote:
>>>
>>>> I see Jackson being used in the Mozilla stuff.  It looks pretty
>>>> straightforward.
>>>>
>>>>
>>>> On Mon, Apr 9, 2012 at 5:38 PM, Dmitriy Ryaboy <dvryaboy@gmail.com> wrote:
>>>>
>>>>> Jackson is your friend.
>>>>>
>>>>> On Mon, Apr 9, 2012 at 5:14 PM, Russell Jurney <
>>>>> russell.jurney@gmail.com> wrote:
>>>>> > I need to be able to JSONize and return json:chararray's of any pig
>>>>> > datatypes, to be able to index complex types in ElasticSearch via
>>>>> > Wonderdog.  See: https://issues.apache.org/jira/browse/PIG-2641
>>>>> >
>>>>> > Does anyone have existing code they can contribute to a toJSON UDF
>>>>> > that handles all these types?
>>>>> >
>>>>> > For instance, Mozilla has this Map to JSON UDF:
>>>>> >
>>>>> > https://github.com/mozilla-metrics/akela/blob/master/src/main/java/com/mozilla/pig/eval/json/MapToJson.java
>>>>> >
>>>>> > It is apache licensed, so I think I can paste it into a general
>>>>> > toJSON UDF?
>>>>> >
>>>>> >
>>>>> > Elephant-bird has this code, which turns JSON to Maps:
>>>>> >
>>>>> > https://github.com/kevinweil/elephant-bird/blob/master/src/java/com/twitter/elephantbird/pig/piggybank/JsonStringToMap.java
>>>>> >
>>>>> >  ehh... thinking out loud... I'm just gonna do this in JRuby. If that
>>>>> > has issues, Python.
>>>>> >
>>>>> > Solved! :)
>>>>> >
>>>>> > --
>>>>> > Russell Jurney twitter.com/rjurney russell.jurney@gmail.com datasyndrome.com
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Russell Jurney twitter.com/rjurney russell.jurney@gmail.com datasyndrome.com
>>>>
>>>
>>>
>>>
>>> --
>>> Russell Jurney twitter.com/rjurney russell.jurney@gmail.com datasyndrome.com
>>>
>>
>>
>>
>> --
>> Russell Jurney twitter.com/rjurney russell.jurney@gmail.com datasyndrome.com
>>
>
>
>
> --
> Russell Jurney twitter.com/rjurney russell.jurney@gmail.com datasyndrome.com
