arrow-user mailing list archives

From Fernando Herrera <fernando.j.herr...@gmail.com>
Subject Re: Are Arrow, Flight and Plasma suitable for my use case?
Date Fri, 19 Mar 2021 09:20:16 GMT
Hi Matias,

If you are going to do tensor operations, then you could use the Arrow
tensor representation.

https://arrow.apache.org/docs/python/generated/pyarrow.Tensor.html

However, I don't think the data stored in the tensor will be compressed.
It will be laid out in order in memory, so you can share the tensors with
other processes.

I hope that helps
Fernando

On Fri, Mar 19, 2021 at 8:52 AM Matias Guijarro <matias.guijarro@free.fr> wrote:

> Hi !
>
> I recently learned about Apache Arrow, and as a preliminary study I would
> like to know if it can be a good choice for my use case, or if I have to
> look for another technology (or to craft something specific on my own !).
>
> I could not really find answers to my questions in the FAQ or reading
> articles and blogs, but I may have missed some information so I apologize
> in advance if my questions have already been answered.
>
> Arrow is all about storing columnar data. What can be the content of the
> elements in a column ?
>
> In my case, I have scalar values (numbers), 1D arrays and 2D arrays.
> The 2D arrays can be quite big (4000x4000 float32, for example).
> So, we could imagine long tables, hundreds of thousands of rows, containing
> a mix of those data types.
>
> I wonder if Arrow stays efficient for this kind of data ? In particular,
> rows of 2D data arrays in a column may be difficult to handle with the
> same level of optimization ? (just guessing)
>
> Is there some compression in Arrow ? I am thinking about blosc kind of
> compression (like in the dead "bcolz" project - by the way someone already
> wondered about Arrow + Blosc: https://github.com/Blosc/bcolz/issues/300)
>
> Another use case I have is for multiple processes on the same computer
> to access the Arrow in-memory store ; it seems to me Plasma does this
> job, but I wonder about the trade-offs ?
>
> Thanks in advance for your advice - any help would be highly appreciated !
>
> Cheers,
> Matias.