arrow-user mailing list archives

From Jesse Wang <hello....@gmail.com>
Subject Python and Java interoperability
Date Tue, 21 Jul 2020 09:57:24 GMT
Hi,
I want to have a Java process read the content of DataFrames produced by a
Python process. The Java and Python processes run on different hosts.

The solution I can think of is to have the Python process serialize the
DataFrame and save it to redis, and have the Java process parse the data.

The solution I found serializes the DataFrame to Python bytes with
`to_pybytes()` (from
https://stackoverflow.com/questions/57949871/how-to-set-get-pandas-dataframes-into-redis-using-pyarrow
):
```
import pandas as pd
import pyarrow as pa
import redis

df = pd.DataFrame({'A': [1, 2, 3]})
r = redis.Redis(host='localhost', port=6379, db=0)

context = pa.default_serialization_context()
r.set("key", context.serialize(df).to_buffer().to_pybytes())
context.deserialize(r.get("key"))
   A
0  1
1  2
2  3

```

Can these serialized bytes be parsed on the Java side? If not, what is the
proper way to achieve this?

Thanks!

-- 

Best Regards,
Jiaxing Wang
