arrow-dev mailing list archives

From Julian Hyde <>
Subject Re: Adding a Map logical type to the Arrow metadata
Date Wed, 19 Jul 2017 17:37:07 GMT
List<Struct<K, V>> isn’t the only physical representation that makes sense, because
it doesn’t take advantage of the facts that (a) keys can be re-ordered and (b) keys are unique.

So, another viable physical representation would be Struct<List<K>, List<V>>,
with the keys sorted. If keys are constant width and in contiguous memory then binary search
is very fast.
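
A minimal sketch of that sorted-keys layout, in plain Python lists rather than real Arrow buffers. The function names (build_sorted_map_column, lookup) are hypothetical, and sharing a single offsets array between the key list and the value list is a simplifying assumption; it is not an existing Arrow API.

```python
from bisect import bisect_left

def build_sorted_map_column(rows):
    """Flatten a list of dicts into (offsets, keys, values), keys sorted per row."""
    offsets, keys, values = [0], [], []
    for row in rows:
        for k in sorted(row):          # sort keys within each row
            keys.append(k)
            values.append(row[k])      # values follow the key order
        offsets.append(len(keys))      # end position of this row's entries
    return offsets, keys, values

def lookup(offsets, keys, values, row, key):
    """Binary-search for `key` inside the slice of `keys` that belongs to `row`."""
    lo, hi = offsets[row], offsets[row + 1]
    i = bisect_left(keys, key, lo, hi)
    if i < hi and keys[i] == key:
        return values[i]
    return None                        # key not present in this row
```

Because each row's keys are sorted and contiguous, a lookup costs O(log n) comparisons in the row's entry count, at the price of sorting when the column is built.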

I am not claiming that this physical representation is better than yours. But the fact that
there is more than one means the choice is not a no-brainer.


> On Jul 18, 2017, at 12:10 PM, Wes McKinney <> wrote:
> I recently created
> and wanted to discuss on the mailing list to hear opinions about how
> to proceed.
> Some systems, like Spark [1], Presto [2], or Drill have a Map<K, V>
> composite type. These are sometimes stored in Parquet as a repeated
> struct, or in Arrow as List<item: Struct<key: K, value: V>>.
> While we can represent in-memory map data as List<Struct<K, V>>, it
> may be useful to add a new logical type to the set of supported
> logical types [3]. The idea is that the memory format between Map<K,
> V> and List<Struct<K, V>> is identical, so this is strictly a logical
> construct, similar to date/time values having the same in-memory
> format as the corresponding integer types (int32/int64).
> For Arrow implementations that do not provide a first-class Map
> container, they could process the data as though it were a repeated
> struct. It would be helpful to us in C++ to have an arrow::MapArray
> container because we could convert to / from this type and other data
> structures like Python dictionaries. It would also be helpful to
> faithfully transport the MAP logical type from Parquet [4].
> Let me know what others think. One question I have is whether the
> repeated struct in-memory representation makes sense as the canonical
> map representation.
> Thanks
> Wes
> [1]:
> [2]:
> [3]:
> [4]:
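
For comparison, the repeated-struct layout described in the quoted message can be sketched in the same plain-Python style: one offsets array for the list, plus one child array per struct field. The helper names are hypothetical and stand in for the conversion to and from Python dictionaries mentioned above; this is an illustration under those assumptions, not the arrow::MapArray API.

```python
def maps_to_list_of_structs(rows):
    """Encode a list of dicts as offsets plus parallel key/value child arrays."""
    offsets, keys, values = [0], [], []
    for row in rows:
        for k, v in row.items():       # insertion order is kept, no sorting
            keys.append(k)
            values.append(v)
        offsets.append(len(keys))      # end position of this row's structs
    return offsets, keys, values

def list_of_structs_to_maps(offsets, keys, values):
    """Decode the layout back into a list of Python dicts."""
    return [
        dict(zip(keys[start:end], values[start:end]))
        for start, end in zip(offsets, offsets[1:])
    ]
```

An implementation without a first-class map container could stop at the (offsets, keys, values) triple and treat it as an ordinary list of structs.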
