arrow-user mailing list archives

From Joris Van den Bossche <jorisvandenboss...@gmail.com>
Subject Re: Indexing, encoding, transformations and processing with PyArrow - GitHub 6284
Date Tue, 28 Jan 2020 07:43:17 GMT
On Tue, 28 Jan 2020 at 08:36, Athanassios I. Hatzis <athanassios@healis.eu>
wrote:

>
> There was also the following question in my email that was not answered.
> > > I also noticed that there is NumPy integration and you can convert
> > > easily from NumPy to Arrow but the reverse direction has several
> > > limitations. For example I cannot create view for StringArray
> > > (NotImplementedError: NumPy array view is only supported for primitive
> > > types). But string() (utf8) is in the list of your primitive types.
> > > Any plans for supporting this type with NumPy soon ?
>
> Could you please suggest or point to a piece of code on how to convert
> arrow.StringArray to numpy for further processing ? Do I have to forget
> the view with the to_numpy() method and make a copy in order to process
> it, modify it in NumPy ?
>
You can "convert" a pyarrow string array to a numpy array (e.g. with
np.array(pyarrow_array), which gives you an object-dtype numpy array), but
you cannot create a numpy view on the array. A pyarrow string array has a
variable-length string dtype, which numpy does not support. So in this
case you will always need to make a copy (and convert to an object dtype).

Joris
