arrow-dev mailing list archives

From "Katelman, Michael" <Michael.Katel...@CubistSystematic.com>
Subject RE: list scope question
Date Thu, 11 May 2017 15:28:51 GMT
I do see a streaming reader and writer, which might be what I need, although I'm having some
minor problems with the reader.

-----Original Message-----
From: Katelman, Michael [mailto:Michael.Katelman@CubistSystematic.com] 
Sent: Thursday, May 11, 2017 9:37
To: dev@arrow.apache.org
Subject: RE: list scope question

Thanks, Wes.

I was able to get the setup you outlined working as long as I explicitly communicate the location
of the footer to processes wanting to read the table (I'm not sure if there's a better way
of doing this). One other thing, though, that I would ultimately like to do is share a table
that is conceptually growing. Is that something that arrow is intended for? 

I see in the code that a lot of the table-related data structures are immutable and that the
footer is written only on close, so perhaps not. But, any thoughts on a use case like that
would be appreciated. 

-Mike
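
The difference between a footer written only on close and a layout that a reader can follow while the data is still growing can be sketched with a toy length-prefixed framing. This is not Arrow's wire format, and `AppendFrame` and `ReadCompleteFrames` are hypothetical names introduced only for illustration. The assumption is that each batch is stored as a 4-byte length followed by its payload, so a reader can scan from the start and stop at the first incomplete frame without needing a footer location.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Toy framing (an assumption for illustration, NOT Arrow's format):
// each batch is a 4-byte native-endian length followed by the payload bytes.
void AppendFrame(std::vector<std::uint8_t>& stream, const std::string& payload) {
  const std::uint32_t size = static_cast<std::uint32_t>(payload.size());
  std::uint8_t header[sizeof(size)];
  std::memcpy(header, &size, sizeof(size));
  stream.insert(stream.end(), header, header + sizeof(size));
  stream.insert(stream.end(), payload.begin(), payload.end());
}

// Returns every complete frame currently present. A trailing partial frame
// (for example, a write still in progress) is skipped instead of being an error.
std::vector<std::string> ReadCompleteFrames(const std::vector<std::uint8_t>& stream) {
  std::vector<std::string> frames;
  std::size_t pos = 0;
  while (stream.size() - pos >= sizeof(std::uint32_t)) {
    std::uint32_t size = 0;
    std::memcpy(&size, stream.data() + pos, sizeof(size));
    if (stream.size() - pos - sizeof(size) < size) break;  // incomplete frame
    pos += sizeof(size);
    frames.emplace_back(reinterpret_cast<const char*>(stream.data() + pos), size);
    pos += size;
  }
  return frames;
}
```

Calling `ReadCompleteFrames` again after the writer appends another frame returns one more entry, which is the behavior a conceptually growing table would need. A footer-based layout, by contrast, is only readable once the footer exists and its location is known.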

-----Original Message-----
From: Wes McKinney [mailto:wesmckinn@gmail.com]
Sent: Wednesday, May 10, 2017 22:11
To: dev@arrow.apache.org
Subject: Re: list scope question

hi Mike,

I recommend using record batches along with io::MemoryMappedFile. You can write the table
with ipc::FileWriter:

https://github.com/apache/arrow/blob/master/cpp/src/arrow/ipc/writer.h#L111


and then read it with ipc::FileReader. If you use MemoryMappedFile, then the read will be
zero-copy (no memory allocated), so ideal for a multiple-process shared memory setting. You
can use one large record batch or multiple smaller record batches. Let us know if you run
into issues.

Thanks
Wes
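
The zero-copy idea described above can be sketched with plain POSIX calls. This does not use Arrow's API; `MappedColumn`, `WriteColumn`, `MapColumn`, and `UnmapColumn` are hypothetical names, and the file is assumed to hold nothing but raw 64-bit integers. The point is only that the reader gets a pointer into the mapping rather than a copy, and that several processes mapping the same file can share the same physical pages.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Non-owning view over int64 values that live inside a memory mapping.
struct MappedColumn {
  const std::int64_t* values = nullptr;
  std::size_t length = 0;
  void* base = nullptr;   // start of the mapping, needed for munmap
  std::size_t bytes = 0;  // size of the mapping in bytes
};

// Writes the values as raw bytes; stands in for the "writer" process.
bool WriteColumn(const std::string& path, const std::vector<std::int64_t>& values) {
  std::FILE* f = std::fopen(path.c_str(), "wb");
  if (f == nullptr) return false;
  const std::size_t n = std::fwrite(values.data(), sizeof(std::int64_t), values.size(), f);
  const bool closed = std::fclose(f) == 0;
  return closed && n == values.size();
}

// Maps the file read-only. No value is copied: out->values points into the mapping.
bool MapColumn(const std::string& path, MappedColumn* out) {
  const int fd = open(path.c_str(), O_RDONLY);
  if (fd < 0) return false;
  struct stat st;
  if (fstat(fd, &st) != 0 || st.st_size == 0) {
    close(fd);
    return false;
  }
  const std::size_t bytes = static_cast<std::size_t>(st.st_size);
  void* base = mmap(nullptr, bytes, PROT_READ, MAP_SHARED, fd, 0);
  close(fd);  // the mapping remains valid after the descriptor is closed
  if (base == MAP_FAILED) return false;
  out->base = base;
  out->bytes = bytes;
  out->values = static_cast<const std::int64_t*>(base);
  out->length = bytes / sizeof(std::int64_t);
  return true;
}

void UnmapColumn(MappedColumn* col) {
  if (col->base != nullptr) munmap(col->base, col->bytes);
  *col = MappedColumn{};
}
```

One process would call `WriteColumn` once, and any number of readers would call `MapColumn` on the same path and index `values` directly. A real Arrow file also carries a schema and metadata, which this sketch leaves out.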

On Wed, May 10, 2017 at 7:06 PM, Katelman, Michael <Michael.Katelman@cubistsystematic.com>
wrote:
> Great!
>
> What I'd like to do is create a table that can be shared among multiple processes. I
> see, e.g., this in the comments where RecordBatch is defined:
>
> // A record batch is a simpler and more rigid table data structure intended for
> // use primarily in shared memory IPC. It contains a schema (metadata) and a
> // corresponding sequence of equal-length Arrow arrays
> class ARROW_EXPORT RecordBatch {
>
> But I wasn't entirely sure what the author had in mind.
>
> -Mike
>
> -----Original Message-----
> From: Jason Altekruse [mailto:altekrusejason@gmail.com]
> Sent: Wednesday, May 10, 2017 17:59
> To: dev@arrow.apache.org
> Subject: Re: list scope question
>
> I think we opted not to create a user list yet, as using Arrow is likely to require
> some willingness to poke around in the library until we can fill out the docs and add more
> complete usage examples. As far as I know, this is the place to ask. What were you looking
> to do?
>
> On Wed, May 10, 2017 at 2:28 PM, Katelman, Michael <Michael.Katelman@cubistsystematic.com>
> wrote:
>
>> Is there a good place for user-oriented arrow questions and discussions?
>> (my apologies in advance if this isn't the appropriate venue)
>>
>> -Mike
>>
>>
>>
>>
>>
>> DISCLAIMER: This e-mail message and any attachments are intended 
>> solely for the use of the individual or entity to which it is 
>> addressed and may contain information that is confidential or legally 
>> privileged. If you are not the intended recipient, you are hereby 
>> notified that any dissemination, distribution, copying or other use 
>> of this message or its attachments is strictly prohibited. If you 
>> have received this message in error, please notify the sender 
>> immediately and permanently delete this message and any attachments.
>>
>>
>>
>>
>
>
>
>
>