arrow-user mailing list archives

From Simon Dumke <simon.du...@ipp.mpg.de>
Subject Re: Record-Level Access
Date Fri, 30 Aug 2019 18:59:37 GMT
Hi Wes,

thanks for the feedback.
I actually share your reservations regarding performance. I just think that
the Arrow structure seems ideal for working with tabular data (especially
for efficient filtering and selection), and after that a final step would
(I think) often involve traversing the remaining data in a row-oriented
fashion. You probably have a good overview of the ecosystem around
Arrow: aren't there any SQL engines etc. using Arrow that have probably
already put some thought into this? Or was your answer really
limited to the specific Java case, and such a concept does exist somewhere
else, like in the C++ lib?
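
For illustration, the pattern described above (columnar filtering first,
then a row-oriented pass over what remains) could be sketched in plain Java
as follows. This is not Arrow's API: Batch, RowView, and describeRowsAbove
are hypothetical names, and ordinary arrays stand in for Arrow vectors.

```java
import java.util.ArrayList;
import java.util.List;

public class RowViewSketch {

    /** Hypothetical stand-in for a columnar batch: one array per column, equal lengths. */
    static final class Batch {
        final int[] id;
        final double[] value;
        final String[] label;

        Batch(int[] id, double[] value, String[] label) {
            if (id.length != value.length || id.length != label.length) {
                throw new IllegalArgumentException("columns must have equal length");
            }
            this.id = id;
            this.value = value;
            this.label = label;
        }

        int rowCount() {
            return id.length;
        }
    }

    /** Hypothetical reusable cursor exposing one row at a time; it copies no data. */
    static final class RowView {
        private final Batch batch;
        private int index;

        RowView(Batch batch) {
            this.batch = batch;
        }

        /** Points the view at another row and returns the same view object. */
        RowView moveTo(int rowIndex) {
            if (rowIndex < 0 || rowIndex >= batch.rowCount()) {
                throw new IndexOutOfBoundsException("row " + rowIndex);
            }
            this.index = rowIndex;
            return this;
        }

        int id() { return batch.id[index]; }
        double value() { return batch.value[index]; }
        String label() { return batch.label[index]; }
    }

    static List<String> describeRowsAbove(Batch batch, double threshold) {
        // Step 1 (columnar): scan only the value column and record matching row indices.
        int[] selection = new int[batch.rowCount()];
        int selected = 0;
        for (int i = 0; i < batch.rowCount(); i++) {
            if (batch.value[i] > threshold) {
                selection[selected++] = i;
            }
        }

        // Step 2 (row-oriented): visit each surviving row through the reusable view.
        List<String> out = new ArrayList<>();
        RowView row = new RowView(batch);
        for (int k = 0; k < selected; k++) {
            row.moveTo(selection[k]);
            out.add(row.id() + ":" + row.label() + "=" + row.value());
        }
        return out;
    }

    public static void main(String[] args) {
        Batch batch = new Batch(
            new int[] {1, 2, 3},
            new double[] {0.5, 2.5, 7.0},
            new String[] {"a", "b", "c"});
        for (String line : describeRowsAbove(batch, 1.0)) {
            System.out.println(line);
        }
    }
}
```

Reusing a single view object avoids allocating one object per row, which is
one way to limit the overhead of a record interface; whether that is enough
for performance-sensitive code is not measured here.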

I'll certainly put some thought into this, and if I come up with a
sensible solution, I'd be happy to contribute it.

Kind regards,
Simon

BTW: I've seen quite a few of your talks (on YouTube) and read some of your
articles while looking into Arrow and its surrounding ecosystem,
therefore: Thanks for all you have done and invested for Arrow in
particular and for the open source community in general! I (as probably
many others) very much appreciate that!




On 30 August 2019 19:27:31, Wes McKinney <wesmckinn@gmail.com> wrote:

> hi Simon -- I don't think there is any such Row accessor class in Java
> but you are welcome to contribute one to the project. For
> performance-sensitive applications, using a record interface might not be
> the best idea, but I can understand the convenience for some use cases.
>
> - Wes
>
> On Fri, Aug 30, 2019 at 4:55 AM Simon Dumke <simon.dumke@ipp.mpg.de> wrote:
>>
>> Hi all,
>>
>> I did not find anything (and so no definite answer) in the docs, so I
>> thought I'd ask here:
>>
>> Does Arrow (and at this point my main concern is Arrow for Java) support
>> any kind of concept that allows "record-level access" (so, a "row") to
>> data in an Arrow RecordBatch or Table? I would have thought that even in
>> column-oriented analytics etc. this would be a common last-step access
>> pattern across many use cases, but I could not find any references to
>> such a thing.
>>
>> Thanks and kind regards,
>> Simon



