Hi Chris,
Have you read through the "reading and writing streaming format" docs [1]? If that doesn't work, or you have something different in mind, some code samples of what you are currently doing might help.

I'll add that I think the dictionary APIs in Java aren't the most ergonomic, so if you have ideas for improving them, feel free to propose something.


[1] https://arrow.apache.org/docs/java/ipc.html#writing-and-reading-streaming-format

On Sat, Jul 25, 2020 at 5:49 AM Chris Nuernberger <chris@techascent.com> wrote:

Using the Java API for serialization, it is not clear to me how to use the per-batch dictionary functionality of the Arrow binary format. Specifically, the stream writer class expects the dictionaries to be defined when it loads the schema, so it isn't clear how it will handle assigning a dictionary to a provider when saving a batch.

Is there an example that clarifies this use case?

Thanks for any input or feedback,