arrow-user mailing list archives

From "Sisneros, Dominic E (FAA)" <Dominic.E.Sisne...@faa.gov>
Subject unsubscribe
Date Fri, 15 Jan 2021 17:08:47 GMT


Dominic Sisneros
FAA, WSA Engineering Services, AJW-2W13B
Office: 801-320-2377
Cell: 801-558-1966

-----Original Message-----
From: Wes McKinney <wesmckinn@gmail.com> 
Sent: Friday, January 15, 2021 8:38 AM
To: user@arrow.apache.org
Subject: Re: compute::Take & ChunkedArrays

You can do that, but note that the current implementation is not efficient; see:

https://github.com/apache/arrow/blob/master/cpp/src/arrow/compute/kernels/vector_selection.cc#L1909

Rather than pre-concatenating the chunks (which can easily fail) and then invoking Take on
the resulting concatenated Array, it would be better to do an O(N log K) take on the chunks
directly, where N is the number of take indices and K is the number of chunks. The log K
factor comes from one binary search over the chunk offsets per index.

For example, if you have chunks of size

10
50
100
20

then the algorithm computes the following offset table:

0
10
60
160
180

Indices relative to the whole ChunkedArray are translated to (chunk number, intra-chunk index)
pairs by binary search in the offset table.

For example, a take with [5, 40, 100, 170] is translated to:

(chunk=0, relative_index=5)
(1, 30)
(2, 40)
(3, 10)
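
The translation above can be sketched in Python using only the standard library. This is a minimal illustration, not Arrow's implementation: `build_offsets` and `resolve` are hypothetical names, and each chunk is represented only by its length.

```python
from bisect import bisect_right
from itertools import accumulate


def build_offsets(chunk_lengths):
    """Return cumulative start offsets, with the total length as the last entry."""
    return [0] + list(accumulate(chunk_lengths))


def resolve(offsets, index):
    """Map a logical index to (chunk number, index within that chunk)."""
    if not 0 <= index < offsets[-1]:
        raise IndexError(f"index {index} out of bounds")
    # The rightmost offset <= index identifies the owning chunk.
    # Using bisect_right also skips over empty chunks.
    chunk = bisect_right(offsets, index) - 1
    return chunk, index - offsets[chunk]


offsets = build_offsets([10, 50, 100, 20])   # [0, 10, 60, 160, 180]
pairs = [resolve(offsets, i) for i in [5, 40, 100, 170]]
# pairs == [(0, 5), (1, 30), (2, 40), (3, 10)]
```

Each lookup is one binary search over K + 1 offsets, which is where the O(N log K) bound comes from.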

Consecutive indices from the same chunk are batched together, and then Take is invoked on the
respective chunk (with bounds checking disabled) to produce one chunk of the resulting output
ChunkedArray.
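
The batching step might look like the following sketch. It assumes the indices have already been resolved to (chunk, relative index) pairs, uses plain Python lists as stand-ins for Arrow chunks, and `take_batched` is a hypothetical name. List indexing stands in for the per-chunk Take call.

```python
from itertools import groupby
from operator import itemgetter


def take_batched(chunks, pairs):
    """Group consecutive pairs that share a chunk and take once per group.

    Returns a list of output chunks, one per run of same-chunk indices.
    """
    output = []
    for chunk_no, run in groupby(pairs, key=itemgetter(0)):
        relative = [rel for _, rel in run]
        source = chunks[chunk_no]
        # Stand-in for invoking Take on a single chunk; bounds are assumed
        # to have been validated when the pairs were computed.
        output.append([source[r] for r in relative])
    return output


chunks = [list(range(0, 10)), list(range(10, 60))]
result = take_batched(chunks, [(0, 5), (0, 7), (1, 30), (0, 1)])
# result == [[5, 7], [40], [1]]
```

Only runs of consecutive indices are merged, so the output preserves the order of the take indices; a chunk that is revisited later produces a separate output chunk.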

It might be helpful to copy this to the appropriate Jira issue (I'm sure there is one already)
to assist the person who implements this.

Thanks,
Wes

On Mon, Jan 11, 2021 at 10:01 AM Niranda Perera <niranda.perera@gmail.com> wrote:
>
> Hi all,
>
> I was wondering how the Take API works with ChunkedArrays?
> For example, if we have a ChunkedArray[100] made of Array1[50] and Array2[50], and
> I want an element from each array, can I pass something like [10, 60] as the indices?
>
> --
> Niranda Perera
> @n1r44
> +1 812 558 8884 / +94 71 554 8430
> https://www.linkedin.com/in/niranda