arrow-user mailing list archives

From Wes McKinney
Subject Re: Indexing, encoding, transformations and processing with PyArrow - GitHub 6284
Date Tue, 28 Jan 2020 21:34:57 GMT
On Tue, Jan 28, 2020 at 1:36 AM Athanassios I. Hatzis wrote:
> On Mon, 2020-01-27 at 10:25 -0600, Wes McKinney wrote:
> > I asked to move this discussion here because we use the dev@ and user@
> > mailing list for discussions (this is explained in the GitHub issue
> > template)
> Sure, I noticed this, but then I can hardly find any reason for opening an issue at GitHub.
> As a user I find it a lot easier to open and track an issue for replies at GitHub than
> registering and searching in email lists, and in my opinion it's a lot easier and far more
> efficient for other users too, especially newcomers, to search and find relevant answers.
> By the way, how am I supposed to search and view this user list online from a Web explorer
> GUI like the one at GitHub? Is there a web link?

Here are the links

We have the GitHub issues as a way to capture information from users
who are not yet familiar with the project.

> > treated as a valid floating point value in algorithms like dictionary_encode
> Hi Wes, I was not aware that np.nan and None are not treated equivalently; thanks for
> clarifying this with your Notebook. I can understand the logic behind this, but it has
> serious flaws that originate from SQL, an implementation of Codd's relational theory.
> This is one of the reasons that I am promoting Associative Semiotic Hypergraph as a
> data model for processing data in queries. Associations (hyperedge set connecting n data
> items) are the equivalent of table records, but null values are excluded. Therefore in my
> system data should always be clean of missing values. Anyway, as you suggest, I need to
> maintain some custom code for this.
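
To make the None versus NaN distinction above concrete, here is a minimal
pure-Python sketch of a dictionary encoder. It is an illustration only, not
PyArrow's implementation: the function name dictionary_encode is reused here
as a hypothetical helper, and collapsing every NaN into a single dictionary
entry is an assumption of this sketch, not something stated in this thread.

```python
import math


def dictionary_encode(values):
    """Return (dictionary, indices); None is a null, NaN is a regular value."""
    dictionary = []     # distinct non-null values, in order of first appearance
    positions = {}      # lookup key -> index into `dictionary`
    indices = []        # one entry per input value; None marks a null slot
    nan_key = object()  # single stand-in key, because NaN != NaN in Python

    for value in values:
        if value is None:
            indices.append(None)  # missing: it gets no dictionary entry
            continue
        is_nan = isinstance(value, float) and math.isnan(value)
        key = nan_key if is_nan else value
        if key not in positions:
            positions[key] = len(dictionary)
            dictionary.append(value)
        indices.append(positions[key])
    return dictionary, indices


dictionary, indices = dictionary_encode(
    [1.5, float("nan"), None, 1.5, float("nan")]
)
```

Under these assumptions NaN occupies a slot in the dictionary like any other
float, while None only ever appears as a null index. Custom code that wants
the two treated alike would have to map NaN to None before encoding.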
> There was also the following question in my email that was not answered.
> > > I also noticed that there is NumPy integration and you can convert easily from NumPy
> > > to Arrow, but the reverse direction has several limitations. For example I cannot
> > > create a view for StringArray (NotImplementedError: NumPy array view is only supported
> > > for primitive types). But string() (utf8) is in the list of your primitive types. Any
> > > plans for supporting this type with NumPy soon?
> Could you please suggest or point to a piece of code on how to convert arrow.StringArray
> to numpy for further processing? Do I have to forget the view with the to_numpy() method
> and make a copy in order to process it and modify it in NumPy?
> Thank you for your time
> Athan
