ignite-dev mailing list archives

From Вадим Опольский <vaopols...@gmail.com>
Subject Re: IGNITE-13
Date Wed, 01 Mar 2017 14:17:38 GMT
Hi Valentin!

Thank you for the comments.

I have added a new method that writes directly to BinaryOutputStream instead
of going through an intermediate array:
https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/BinaryUtilsNew.java
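
As a rough illustration of the difference between the two approaches (this is
not the code from BinaryUtilsNew.java), the sketch below encodes a string to
UTF-8 and writes each byte to the output as soon as it is produced. It uses
java.io.ByteArrayOutputStream as a stand-in for BinaryOutputStream, and the
names DirectUtf8WriteSketch, writeViaArray and writeDirect are hypothetical.
It assumes well-formed strings (no unpaired surrogates) and standard UTF-8,
which may differ in detail from what BinaryUtils#strToUtf8Bytes produces.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch; ByteArrayOutputStream stands in for BinaryOutputStream.
class DirectUtf8WriteSketch {
    /** Two-step path: encode into a temporary array, then copy it to the stream. */
    static void writeViaArray(String val, ByteArrayOutputStream out) {
        byte[] arr = val.getBytes(StandardCharsets.UTF_8); // intermediate array
        out.write(arr, 0, arr.length);                     // extra copy
    }

    /** Direct path: each encoded byte goes straight to the stream. */
    static void writeDirect(String val, ByteArrayOutputStream out) {
        for (int i = 0; i < val.length(); ) {
            int cp = val.codePointAt(i);
            i += Character.charCount(cp);

            if (cp < 0x80) {
                // 1 byte: 0xxxxxxx
                out.write(cp);
            } else if (cp < 0x800) {
                // 2 bytes: 110xxxxx 10xxxxxx
                out.write(0xC0 | (cp >> 6));
                out.write(0x80 | (cp & 0x3F));
            } else if (cp < 0x10000) {
                // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
                out.write(0xE0 | (cp >> 12));
                out.write(0x80 | ((cp >> 6) & 0x3F));
                out.write(0x80 | (cp & 0x3F));
            } else {
                // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
                out.write(0xF0 | (cp >> 18));
                out.write(0x80 | ((cp >> 12) & 0x3F));
                out.write(0x80 | ((cp >> 6) & 0x3F));
                out.write(0x80 | (cp & 0x3F));
            }
        }
    }

    public static void main(String[] args) {
        String val = "Test";
        ByteArrayOutputStream a = new ByteArrayOutputStream();
        ByteArrayOutputStream b = new ByteArrayOutputStream();
        writeViaArray(val, a);
        writeDirect(val, b);
        // Both paths should produce the same bytes for well-formed input.
        System.out.println(Arrays.equals(a.toByteArray(), b.toByteArray()));
    }
}
```

Both methods should yield identical bytes for well-formed input; the direct
variant simply avoids allocating and copying the temporary array.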

Here is the benchmark:
https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/MyBenchmark.java

Unit test:
https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/BinaryOutputStreamTest.java

Statistics
https://github.com/javaller/MyBenchmark/blob/master/out_01_03_17.txt

Benchmark                                 Mode  Cnt    Score    Error  Units
MyBenchmark.binaryHeapOutputInDirect      avgt   50  111,337 ± 0,742  ns/op
MyBenchmark.binaryHeapOutputStreamDirect  avgt   50   23,847 ± 0,303  ns/op


Vadim

2017-02-28 4:29 GMT+03:00 Valentin Kulichenko <valentin.kulichenko@gmail.com>:

> Hi Vadim,
>
> Looks like you accidentally removed dev list from the thread, adding it
> back.
>
> I think there is still a misunderstanding. What I propose is to modify
> the BinaryUtils#strToUtf8Bytes so that it writes directly to BinaryOutputStream
> instead of intermediate array. This should decrease memory consumption and
> can also increase performance as we will avoid 'writeByteArray' step at
> the end.
>
> Does it make sense to you?
>
> -Val
>
> On Mon, Feb 27, 2017 at 6:55 AM, Вадим Опольский <vaopolskij@gmail.com>
> wrote:
>
>> Hi, Valentin!
>>
>> What do you think about using the methods of BinaryOutputStream:
>>
>> 1) writeByteArray(byte[] val)
>> 2) writeCharArray(char[] val)
>> 3) write (byte[] arr, int off, int len)
>>
>> String val = "Test";
>> out.writeByteArray(val.getBytes(StandardCharsets.UTF_8));
>>
>> String val = "Test";
>> out.writeCharArray(val.toCharArray());
>>
>> String val = "Test";
>> InputStream stream = new ByteArrayInputStream(
>>     val.getBytes(StandardCharsets.UTF_8));
>> byte[] buffer = new byte[1024];
>> int len;
>> // Copy the encoded bytes to the output in chunks.
>> while ((len = stream.read(buffer)) != -1) {
>>     out.write(buffer, 0, len);
>> }
>>
>> What else can we use ?
>>
>> Vadim
>>
>>
>> 2017-02-25 2:21 GMT+03:00 Valentin Kulichenko <
>> valentin.kulichenko@gmail.com>:
>>
>>> Hi Vadim,
>>>
>>> Which method implements the approach described in the ticket? From what
>>> I see, all writeToStringX versions are still encoding into an intermediate
>>> array and then call out.writeByteArray. What we need to test is the
>>> approach where bytes are written directly into the stream during encoding.
>>> Encoding algorithm itself should stay the same for now, otherwise we will
>>> not know how to interpret the result.
>>>
>>> It looks like there is some misunderstanding here, so please let me know
>>> if anything is still unclear. I will be happy to answer your questions.
>>>
>>> -Val
>>>
>>> On Wed, Feb 22, 2017 at 7:22 PM, Valentin Kulichenko <
>>> valentin.kulichenko@gmail.com> wrote:
>>>
>>>> Hi Vadim,
>>>>
>>>> Thanks, I will review this week.
>>>>
>>>> -Val
>>>>
>>>> On Wed, Feb 22, 2017 at 2:28 AM, Вадим Опольский <vaopolskij@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi Valentin!
>>>>>
>>>>> https://issues.apache.org/jira/browse/IGNITE-13
>>>>>
>>>>> I created BinaryWriterExImplNew (which extends BinaryWriterExImpl) and
>>>>> added new methods with the changes described in the ticket:
>>>>>
>>>>> https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/BinaryWriterExImplNew.java
>>>>>
>>>>> I created a benchmark for BinaryWriterExImplNew:
>>>>>
>>>>> https://github.com/javaller/MyBenchmark/blob/master/src/main/java/org/sample/ExampleTest.java
>>>>>
>>>>> I ran the benchmark and compared the results:
>>>>>
>>>>> https://github.com/javaller/MyBenchmark/blob/master/totalstat.txt
>>>>>
>>>>> # Run complete. Total time: 00:10:24
>>>>> Benchmark                                    Mode  Cnt        Score       Error  Units
>>>>> ExampleTest.binaryHeapOutputStream1          avgt   50  1114999,207 ± 16756,776  ns/op
>>>>> ExampleTest.binaryHeapOutputStream2          avgt   50  1118149,320 ± 17515,961  ns/op
>>>>> ExampleTest.binaryHeapOutputStream3          avgt   50  1113678,657 ± 17652,314  ns/op
>>>>> ExampleTest.binaryHeapOutputStream4          avgt   50  1112415,051 ± 18273,874  ns/op
>>>>> ExampleTest.binaryHeapOutputStream5          avgt   50  1111366,583 ± 18282,829  ns/op
>>>>> ExampleTest.binaryHeapOutputStreamACSII      avgt   50  1112079,667 ± 16659,532  ns/op
>>>>> ExampleTest.binaryHeapOutputStreamUTFCustom  avgt   50  1114949,759 ± 16809,669  ns/op
>>>>> ExampleTest.binaryHeapOutputStreamUTFNIO     avgt   50  1121462,325 ± 19836,466  ns/op
>>>>>
>>>>> Is it OK? What's the next step? Do I have to move this JMH benchmark to
>>>>> the Ignite project?
>>>>>
>>>>> Vadim Opolski
>>>>>
>>>>> 2017-02-21 1:06 GMT+03:00 Valentin Kulichenko <
>>>>> valentin.kulichenko@gmail.com>:
>>>>>
>>>>>> Hi Vadim,
>>>>>>
>>>>>> I'm not sure I understand your benchmarks and how they verify the
>>>>>> optimization discussed here. Basically, here is what needs to be done:
>>>>>>
>>>>>> 1. Create a benchmark for BinaryWriterExImpl#doWriteString method.
>>>>>> 2. Run the benchmark with current implementation.
>>>>>> 3. Make the change described in the ticket.
>>>>>> 4. Run the benchmark with these changes.
>>>>>> 5. Compare results.
>>>>>>
>>>>>> Makes sense? Let me know if anything is unclear.
>>>>>>
>>>>>> -Val
>>>>>>
>>>>>> On Mon, Feb 20, 2017 at 8:51 AM, Вадим Опольский <
>>>>>> vaopolskij@gmail.com> wrote:
>>>>>>
>>>>>>> Hello everybody!
>>>>>>>
>>>>>>> https://issues.apache.org/jira/browse/IGNITE-13
>>>>>>>
>>>>>>> Valentin, I have just finished the benchmark (with JMH):
>>>>>>> https://github.com/javaller/MyBenchmark.git
>>>>>>>
>>>>>>> It measures how long serialization takes.
>>>>>>>
>>>>>>> For instance: https://github.com/javaller/MyBenchmark/blob/master/out200217.txt
>>>>>>>
>>>>>>> To run it, do the following:
>>>>>>>
>>>>>>> 1) clone it - git clone https://github.com/javaller/MyBenchmark.git
>>>>>>>
>>>>>>> 2) install it - mvn install
>>>>>>>
>>>>>>> 3) run benchmarks - java -Xms1024m -Xmx4096m -jar target\benchmarks.jar
>>>>>>>
>>>>>>> Vadim Opolski
>>>>>>>
>>>>>>> 2017-02-15 0:52 GMT+03:00 Valentin Kulichenko <
>>>>>>> valentin.kulichenko@gmail.com>:
>>>>>>>
>>>>>>>> Vladimir,
>>>>>>>>
>>>>>>>> I think we misunderstood each other. My understanding of this
>>>>>>>> optimization is the following.
>>>>>>>>
>>>>>>>> Currently string serialization is done in two steps (see
>>>>>>>> BinaryWriterExImpl#doWriteString):
>>>>>>>>
>>>>>>>> strArr = BinaryUtils.strToUtf8Bytes(val); // Encode string into byte array.
>>>>>>>> out.writeByteArray(strArr);               // Write byte array into stream.
>>>>>>>>
>>>>>>>> What this ticket suggests is to write directly into the stream while
>>>>>>>> the string is being encoded, without an intermediate array. This both
>>>>>>>> reduces memory consumption and eliminates the array copy step.
>>>>>>>>
>>>>>>>> I updated the ticket and added this explanation there.
>>>>>>>>
>>>>>>>> Vadim, can you create a micro benchmark and check if it gives any
>>>>>>>> improvement?
>>>>>>>>
>>>>>>>> -Val
>>>>>>>>
>>>>>>>> On Sun, Feb 12, 2017 at 10:38 PM, Vladimir Ozerov <
>>>>>>>> vozerov@gridgain.com> wrote:
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> It is hard to say whether it makes sense or not. No doubt, it
>>>>>>>>> could speed up the marshalling process at the cost of 2x memory
>>>>>>>>> required for strings. From my previous experience with marshalling
>>>>>>>>> micro-optimizations, we will hardly ever notice a speedup in a
>>>>>>>>> distributed environment.
>>>>>>>>>
>>>>>>>>> But there is another side: it could speed up our queries, because
>>>>>>>>> we will not have to unmarshal the string on every field access. So I
>>>>>>>>> would try to make this optimization optional and then measure query
>>>>>>>>> performance with classes having lots of strings. It could give us
>>>>>>>>> interesting results.
>>>>>>>>>
>>>>>>>>> On Mon, Feb 13, 2017 at 5:37 AM, Valentin Kulichenko <
>>>>>>>>> valentin.kulichenko@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Vladimir,
>>>>>>>>>>
>>>>>>>>>> Can you please take a look and provide your thoughts? Can this be
>>>>>>>>>> applied to the binary marshaller? From what I recall, it serializes
>>>>>>>>>> strings a bit differently from the optimized marshaller, so I'm not
>>>>>>>>>> sure.
>>>>>>>>>>
>>>>>>>>>> -Val
>>>>>>>>>>
>>>>>>>>>> On Fri, Feb 10, 2017 at 5:16 PM, Dmitriy Setrakyan <
>>>>>>>>>> dsetrakyan@apache.org> wrote:
>>>>>>>>>>
>>>>>>>>>>> On Thu, Feb 9, 2017 at 11:26 PM, Valentin Kulichenko <
>>>>>>>>>>> valentin.kulichenko@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>> > Hi Vadim,
>>>>>>>>>>> >
>>>>>>>>>>> > I don't think it makes much sense to invest into
>>>>>>>>>>> > OptimizedMarshaller.
>>>>>>>>>>> > However, I would check if this optimization is applicable to
>>>>>>>>>>> > BinaryMarshaller, and if yes, implement it.
>>>>>>>>>>> >
>>>>>>>>>>>
>>>>>>>>>>> Val, in this case can you please update the ticket?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> >
>>>>>>>>>>> > -Val
>>>>>>>>>>> >
>>>>>>>>>>> > On Thu, Feb 9, 2017 at 11:05 PM, Вадим Опольский <vaopolskij@gmail.com>
>>>>>>>>>>> > wrote:
>>>>>>>>>>> >
>>>>>>>>>>> > > Dear sirs!
>>>>>>>>>>> > >
>>>>>>>>>>> > > I want to resolve issue IGNITE-13 -
>>>>>>>>>>> > > https://issues.apache.org/jira/browse/IGNITE-13
>>>>>>>>>>> > >
>>>>>>>>>>> > > Is it still relevant?
>>>>>>>>>>> > >
>>>>>>>>>>> > > Vadim Opolski
>>>>>>>>>>> > >
>>>>>>>>>>> >
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
