I want to have a Java process read the content of DataFrames produced by a Python process. The Java and Python processes run on different hosts.

The solution I can think of is to have the Python process serialize the DataFrame and save it to Redis, and have the Java process read the value from Redis and parse it.

The solution I found serializes the DataFrame to Python bytes ('pybytes'):
(from https://stackoverflow.com/questions/57949871/how-to-set-get-pandas-dataframes-into-redis-using-pyarrow)
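import pandas as pd
import pyarrow as pa
import redis

# Example DataFrame; the column name "A" is assumed for illustration
df = pd.DataFrame({"A": [1, 2, 3]})

r = redis.Redis(host='localhost', port=6379, db=0)

context = pa.default_serialization_context()
r.set("key", context.serialize(df).to_buffer().to_pybytes())

# Reading the value back in Python:
print(context.deserialize(r.get("key")))
#    A
# 0  1
# 1  2
# 2  3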

Can these serialized 'pybytes' be parsed on the Java side? If not, what is the proper way to achieve this?



Best Regards,
Jiaxing Wang