arrow-user mailing list archives

From Justin Polchlopek <jpolchlo...@azavea.com>
Subject Re: Achieving parity with Java extension types in Python
Date Tue, 29 Oct 2019 16:26:05 GMT
That sounds about right.  We're doing some work here that might require
this feature sooner rather than later, and if we decide to go the route that
needs this improved support, I'd be happy to make this PR.  Thanks for
pointing out that issue.  I'll be sure to tag any contribution with that
ticket number.

On Tue, Oct 29, 2019 at 9:01 AM Joris Van den Bossche <
jorisvandenbossche@gmail.com> wrote:

>
> On Mon, 28 Oct 2019 at 22:41, Wes McKinney <wesmckinn@gmail.com> wrote:
>
>> Adding dev@
>>
>> I don't believe we have APIs yet for plugging in user-defined Array
>> subtypes. I assume you've read
>>
>>
>> http://arrow.apache.org/docs/python/extending_types.html#defining-extension-types-user-defined-types
>>
>> There may be some JIRA issues already about this (defining subclasses
>> of pa.Array with custom behavior) -- since Joris has been working on
>> this I'm interested in more comments
>>
>
> Yes, there is https://issues.apache.org/jira/browse/ARROW-6176 for
> exactly this issue.
> What I proposed there is to allow one to subclass pyarrow.ExtensionArray
> and to attach this to an attribute on the custom ExtensionType (e.g.
> __arrow_ext_array_class__, in line with the other __arrow_ext_.. methods).
> That should make it possible to achieve functionality similar to what is
> available in Java, I think.
>
> If that seems like a good way to do this, I think we would certainly
> welcome a PR for that (otherwise I can also look into it before 1.0).
>
> Joris
>
>
>>
>> On Mon, Oct 28, 2019 at 3:56 PM Justin Polchlopek
>> <jpolchlopek@azavea.com> wrote:
>> >
>> > Hi!
>> >
>> > I've been working through understanding extension types in Arrow.  It's
>> > a great feature, and I've had no problems getting things working in
>> > Java/Scala; however, Python has been a bit of a different story.  Not
>> > that I am unable to create and register extension types in Python, but
>> > rather that I can't seem to recreate the functionality provided by the
>> > Java API's ExtensionTypeVector class.
>> >
>> > In Java, ExtensionType::getNewVector() provides a clear pathway from
>> > the registered type to output a vector in something other than the
>> > underlying vector type, and I am at a loss for how to get this same
>> > functionality in Python.  Am I missing something?
>> >
>> > Thanks for any hints.
>> > -Justin
>>
>
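
For reference, the mechanism Joris proposes in the quoted reply (an array
class attached to the extension type through an attribute) can be sketched
in plain Python. This is only an illustration of the dispatch idea, not
pyarrow's implementation: the classes below are stand-ins that do not import
pyarrow, and every name except __arrow_ext_array_class__ (taken from the
proposal) is hypothetical.

```python
# Plain-Python stand-ins; nothing here is the real pyarrow API.
# Only the attribute name __arrow_ext_array_class__ comes from the proposal.


class ExtensionArray:
    """Stand-in for a generic extension array: a type plus storage values."""

    def __init__(self, ext_type, storage):
        self.type = ext_type
        self.storage = list(storage)


class ExtensionType:
    """Stand-in for an extension type that knows which array class to use."""

    # Default: types that do not override the attribute get the generic class.
    __arrow_ext_array_class__ = ExtensionArray

    def wrap_array(self, storage):
        # Hypothetical helper: look the class up on the type so that
        # subclasses can swap in their own array class.
        array_cls = self.__arrow_ext_array_class__
        return array_cls(self, storage)


class PointArray(ExtensionArray):
    """Custom array class carrying behavior specific to PointType."""

    def xs(self):
        # Return the first coordinate of every stored (x, y) pair.
        return [x for x, _ in self.storage]


class PointType(ExtensionType):
    # Attach the custom array class to the type, as in the proposal.
    __arrow_ext_array_class__ = PointArray


class PlainType(ExtensionType):
    # No override: arrays of this type use the generic ExtensionArray.
    pass


points = PointType().wrap_array([(1, 2), (3, 4)])
plain = PlainType().wrap_array([10, 20])
```

Here `points` is a PointArray, so `points.xs()` is available, while `plain`
falls back to the generic class. That is roughly the role
ExtensionType::getNewVector() plays on the Java side: the registered type
decides which concrete container its data is exposed through.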
