arrow-user mailing list archives

From Wes McKinney <wesmck...@gmail.com>
Subject Re: Help with writing Apache Arrow tables to shared memory.
Date Wed, 03 Oct 2018 20:21:51 GMT
hi Bipin -- I will reply to your mail on the dev@ mailing list but it
may take me some time. I'm traveling internationally to conferences
and also have been focused on moving the 0.11 release forward.

- Wes
On Wed, Oct 3, 2018 at 12:00 PM Bipin Mathew <bipinmathew@gmail.com> wrote:
>
> Good Morning Everyone,
>
>     I originally posted this question to the dev channel, not knowing a user channel
> was available. This channel is probably more appropriate, and I am hoping the kind souls
> here can help me. How, fundamentally, are we expected to copy, or indeed directly write, an
> Arrow table to shared memory using the C++ SDK? Currently, I have an implementation like this:
>
>>  77   std::shared_ptr<arrow::Buffer> B;
>>  78   std::shared_ptr<arrow::io::BufferOutputStream> buffer;
>>  79   std::shared_ptr<arrow::ipc::RecordBatchWriter> writer;
>>  80   arrow::MemoryPool* pool = arrow::default_memory_pool();
>>  81   arrow::io::BufferOutputStream::Create(4096,pool,&buffer);
>>  82   std::shared_ptr<arrow::Table> table;
>>  83   karrow::ArrowHandle *h;
>>  84   h = (karrow::ArrowHandle *)Kj(khandle);
>>  85   table = h->table;
>>  86
>>  87   arrow::ipc::RecordBatchStreamWriter::Open(buffer.get(),table->schema(),&writer);
>>  88   writer->WriteTable(*table);
>>  89   writer->Close();
>>  90   buffer->Finish(&B);
>>  91
>>  92   // printf("Investigate Memory usage.");
>>  93   // getchar();
>>  94
>>  95
>>  96   std::shared_ptr<arrow::io::MemoryMappedFile> mm;
>>  97   arrow::io::MemoryMappedFile::Create("/dev/shm/arrow_table",B->size(),&mm);
>>  98   mm->Write(B->data(),B->size());
>>  99   mm->Close();
>
>
> "table" on line 85 is a shared_ptr to an arrow::Table object. As you can see there, I
> write to an arrow::Buffer, then write that to a memory-mapped file. Is there a more direct approach?
> I watched this video of a talk @Wes McKinney gave here:
>
> https://www.dremio.com/webinars/arrow-c++-roadmap-and-pandas2/
>
> Where a method, arrow::MemoryMappedBuffer, was referenced, but I have not seen any documentation
> regarding this function. Has it been deprecated?
>
> Also, as I mentioned, "table" up there is an arrow::Table object. I create it column-wise
> using various arrow::[type]Builder functions. Is there any way to write the original
> table directly into shared memory? Any guidance on the proper way to do these things would
> be greatly appreciated.
>
> Regards,
>
> Bipin

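The memory-mapping step in the quoted snippet (lines 96-99) can be sketched with plain POSIX calls, which may help when reasoning about where the copies happen. This is only a rough illustration and does not use Arrow at all: WriteToMappedFile and ReadMappedFile are hypothetical helper names, and the bytes passed in stand in for whatever the IPC stream writer produced. It assumes a POSIX system; on Linux a path under /dev/shm would place the file in shared memory.

```cpp
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper (not part of Arrow): create `path`, size it to `size`
// bytes, map it read/write, and copy `data` into the mapping.
bool WriteToMappedFile(const std::string& path, const void* data,
                       std::size_t size) {
  if (size == 0) return false;  // mmap rejects zero-length mappings
  int fd = ::open(path.c_str(), O_RDWR | O_CREAT | O_TRUNC, 0600);
  if (fd < 0) return false;
  // The file must be grown to its final size before it is mapped.
  if (::ftruncate(fd, static_cast<off_t>(size)) != 0) {
    ::close(fd);
    return false;
  }
  void* addr = ::mmap(nullptr, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
  ::close(fd);  // the mapping remains valid after the descriptor is closed
  if (addr == MAP_FAILED) return false;
  std::memcpy(addr, data, size);  // the single copy into the shared region
  bool ok = ::msync(addr, size, MS_SYNC) == 0;
  ok = (::munmap(addr, size) == 0) && ok;
  return ok;
}

// Hypothetical reader: map the file read-only and copy its bytes out.
bool ReadMappedFile(const std::string& path, std::vector<unsigned char>* out) {
  int fd = ::open(path.c_str(), O_RDONLY);
  if (fd < 0) return false;
  struct stat st;
  if (::fstat(fd, &st) != 0 || st.st_size <= 0) {
    ::close(fd);
    return false;
  }
  std::size_t size = static_cast<std::size_t>(st.st_size);
  void* addr = ::mmap(nullptr, size, PROT_READ, MAP_SHARED, fd, 0);
  ::close(fd);
  if (addr == MAP_FAILED) return false;
  const unsigned char* p = static_cast<const unsigned char*>(addr);
  out->assign(p, p + size);
  return ::munmap(addr, size) == 0;
}
```

In the quoted code, B->data() and B->size() would play the role of `data` and `size`. The sketch still copies once from the heap buffer into the mapping, so it mirrors the existing approach rather than removing the intermediate buffer.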