arrow-dev mailing list archives

From: Micah Kornfield <emkornfi...@gmail.com>
Subject: Re: Text data structures-optimized layout in Arrow
Date: Sun, 03 Mar 2019 03:50:12 GMT

Hi Edmon,
This sounds interesting. I'm not aware of any optimized text memory layout
beyond our standard string layout. Are there more details about the work
you are doing?  It is a little bit hard to tell if this is a good fit for
Arrow from your description.

Thanks,
Micah

On Sat, Mar 2, 2019 at 7:39 PM Edmon Begoli <ebegoli@berkeley.edu> wrote:

> Colleagues:
>
> A colleague and I are working on optimized structures for memory and disk
> layout for raw and pre-processed text using specialized data structures,
> and with a goal of efficient I/O, inter-process transmissions, and
> media/memory storage of text-oriented data (e.g. clinical narratives,
> radiology and pathology reports, etc.)
>
> Has anyone on the Arrow dev team tackled this problem of efficient text
> storage yet?
> (not just plain text, but storing data structures in an Arrow format)
>
> If not, would you welcome a contribution?
>
> Thank you,
> Edmon
>
