ignite-notifications mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] [ignite] dmelnichuk commented on issue #6553: IGNITE-11854 Fix performance problem when creating arrays
Date Wed, 05 Jun 2019 14:00:53 GMT
dmelnichuk commented on issue #6553: IGNITE-11854 Fix performance problem when creating arrays
URL: https://github.com/apache/ignite/pull/6553#issuecomment-499095495
 
 
   > I've tried the patch and it looks like something is still wrong.
   > Is it the expected behavior?
   
   Well, there is always room for improvement, but to be able to deliver `pyignite` in a reasonable
time, I started from the following premises:
   
   - all Ignite data types that represent signed integers (`Byte`, `Short`, `Integer`, and
`Long`) must be represented with Python's unbounded integer (`int`) type,
   - the implementation of sequential Ignite data types (arrays, `Map`, `Sequence`) must be based
on the representations of the corresponding single-value types. For example, `ByteArrayObject` uses
internal methods of `ByteObject`.
   
   Both measures have simplified the code greatly. So, yes, it is the expected behavior: to
create an array, the client really has to create each of its elements iteratively, as if building
it from an abstract `Iterable[int]`.
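
To illustrate the trade-off described above, here is a minimal sketch in plain Python. It is not `pyignite` code: `pack_elementwise` and `pack_bulk` are hypothetical names, and the one-byte-per-element layout is an assumption made only for the comparison. Both functions produce the same payload, but the first one does a separate packing call for every element.

```python
import struct


def pack_elementwise(values):
    """Serialize signed bytes one at a time, as if from an Iterable[int]."""
    # One struct.pack call per element: simple and generic, but slow for big inputs.
    return b"".join(struct.pack("<b", v) for v in values)


def pack_bulk(data):
    """Serialize an existing bytes-like object in a single copy."""
    return bytes(data)


# The same 256 byte values, once as unsigned raw bytes and once as signed ints.
payload = bytearray(range(256))
signed = [b - 256 if b > 127 else b for b in payload]

assert pack_elementwise(signed) == pack_bulk(payload)
```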
   
   >In case yes, what can be solution to put big binary data (more than 100MB ) into Ignite
using thin python client?
   
   I must admit, I don't have a simple solution to this problem. I think it may be possible
to provide a fast method of creating `ByteArray` from `bytearray` as a targeted optimization,
while still keeping its default `Iterable[int]` mapping. Retrieving `ByteArray` from Ignite
as `bytearray` would definitely break the user API, though.
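
A fast path of this kind could look roughly like the sketch below. This is only an assumption about the shape of such an optimization, not an existing `pyignite` function: `encode_byte_array` is a hypothetical helper that returns just the raw element payload and ignores any type codes or length headers of the wire format.

```python
from typing import Iterable, Union


def encode_byte_array(value: Union[bytes, bytearray, memoryview, Iterable[int]]) -> bytes:
    """Hypothetical helper: build the raw payload of a byte array."""
    if isinstance(value, (bytes, bytearray, memoryview)):
        # Fast path: bytes-like input is copied in one step.
        return bytes(value)
    # Default path: any iterable of signed ints (-128..127), element by element.
    return bytes(v & 0xFF for v in value)


# Both input styles yield the same payload.
assert encode_byte_array(bytearray(b"\x00\x7f\xff")) == b"\x00\x7f\xff"
assert encode_byte_array([0, 127, -1]) == b"\x00\x7f\xff"
```

The default mapping stays `Iterable[int]`, so existing callers are unaffected; only the write side gains a shortcut.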
   
   But I can't help but wonder: is storing large amounts of unstructured data in Ignite, or
retrieving it from there, really a use case? Especially with the Python client? Python is not
particularly famous for its byte stream processing speed.
   
   > Also, I've noticed that instead of retrieving bytearray I'm getting a list of ints
when calling `cache.get()` and some of the values are negative? Is it a defect?
   
   Not at all, it is expected behavior. As I said earlier, Java's `Byte` is a signed integer
(-128…127), while Python's `bytearray` consists of unsigned elements (0…255). You are
not losing data; it is just a matter of representation: for example, -1 and 255 are the same byte, `0xFF`.
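
The conversion between the two representations can be done on the application side. The following is a minimal sketch in plain Python; `signed_to_bytes` and `bytes_to_signed` are hypothetical helper names, not part of `pyignite`.

```python
def signed_to_bytes(values):
    """Map Java-style signed bytes (-128..127) to a Python bytes object."""
    # Masking with 0xFF keeps the low 8 bits, so -1 becomes 255, -128 becomes 128.
    return bytes(v & 0xFF for v in values)


def bytes_to_signed(data):
    """Map unsigned bytes (0..255) to Java-style signed integers."""
    return [b - 256 if b > 127 else b for b in data]


signed = [72, 105, -1, -128, 127]   # e.g. a list of ints read from the cache
raw = signed_to_bytes(signed)

assert raw == b"Hi\xff\x80\x7f"
assert bytes_to_signed(raw) == signed  # the round trip loses nothing
```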
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services
