arrow-user mailing list archives

From Joris Van den Bossche <jorisvandenboss...@gmail.com>
Subject Re: pyarrow.Table.to_pandas() raise ValueError: Found non-unique column index
Date Mon, 11 May 2020 08:27:55 GMT
Hi Maqy,

Can you rename the columns?

Currently, the to_pandas method does not support converting pyarrow Tables
to pandas DataFrames if there are duplicate column names present.
With some effort it might be possible to support this, though, if someone is
interested in looking into it.
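
A minimal sketch of the renaming workaround, assuming suffixed names such as
'int_1' are acceptable. The helpers make_unique and to_pandas_unique are
hypothetical names made up for illustration, not part of pyarrow; the only
pyarrow calls used are Table.column_names, Table.rename_columns and
Table.to_pandas.

```python
def make_unique(names):
    """Return a copy of `names` in which repeated entries get a numeric suffix."""
    used = set()
    counts = {}
    result = []
    for name in names:
        candidate = name
        n = counts.get(name, 0)
        # Keep bumping the suffix until the candidate collides with nothing.
        while candidate in used:
            n += 1
            candidate = f"{name}_{n}"
        counts[name] = n
        used.add(candidate)
        result.append(candidate)
    return result


def to_pandas_unique(table):
    """Rename duplicate columns of a pyarrow Table, then convert it to pandas.

    `table` is assumed to be a pyarrow.Table; rename_columns returns a new
    Table and leaves the original untouched.
    """
    renamed = table.rename_columns(make_unique(table.column_names))
    return renamed.to_pandas()
```

With the table from the original message, to_pandas_unique(table) would
produce a DataFrame whose columns are 'int' and 'int_1'. If the names are
known up front, calling table.rename_columns(['a', 'b']) directly is simpler.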

Best,
Joris

On Mon, 11 May 2020 at 07:18, maqy <454618260@qq.com> wrote:

> I use pyarrow to receive Arrow data sent from Java; the data consists of
> two int columns. The Python code I use is:
>
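> ```
> import pyarrow as pa
>
> # socket_server is assumed to be a listening socket created elsewhere
> client, addr = socket_server.accept()
> my_file = client.makefile("rb")
>
> reader = pa.RecordBatchStreamReader(my_file)
> table = reader.read_all()
>
> # raises ValueError: Found non-unique column index
> df = table.to_pandas()
> ```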
>
> The reason for this problem is that the column names of the table are
> ['int', 'int']. How should I solve this problem?
>
> Best regards,
> maqy
