arrow-user mailing list archives

From Joris Van den Bossche
Subject Re: Achieving parity with Java extension types in Python
Date Tue, 29 Oct 2019 12:42:40 GMT
On Mon, 28 Oct 2019 at 22:41, Wes McKinney wrote:

> Adding dev@
> I don't believe we have APIs yet for plugging in user-defined Array
> subtypes. I assume you've read
> There may be some JIRA issues already about this (defining subclasses
> of pa.Array with custom behavior) -- since Joris has been working on
> this I'm interested in more comments

Yes, there is a JIRA issue for exactly this.
What I proposed there is to allow subclassing pyarrow.ExtensionArray and
attaching the subclass to an attribute on the custom ExtensionType (e.g.
__arrow_ext_array_class__, in line with the other __arrow_ext_.. methods).
That should make it possible to achieve functionality similar to what is
available in Java, I think.

If that seems like a good way to do this, I think we would certainly welcome
a PR for that (otherwise I can also look into it before 1.0).


> On Mon, Oct 28, 2019 at 3:56 PM Justin Polchlopek wrote:
> >
> > Hi!
> >
> > I've been working through understanding extension types in Arrow. It's
> > a great feature, and I've had no problems getting things working in
> > Java/Scala; however, Python has been a bit of a different story. Not
> > that I am unable to create and register extension types in Python, but
> > rather that I can't seem to recreate the functionality provided by the
> > Java API's ExtensionTypeVector class.
> >
> > In Java, ExtensionType::getNewVector() provides a clear pathway from
> > the registered type to output a vector in something other than the
> > underlying vector type, and I am at a loss for how to get this same
> > functionality in Python. Am I missing something?
> >
> > Thanks for any hints.
> > -Justin
