ignite-dev mailing list archives

From Ivan Rakov <ivan.glu...@gmail.com>
Subject Re: [DISCUSSION] Urgent Ignite bug fix release
Date Mon, 28 Aug 2017 15:02:03 GMT
Backport issue: https://issues.apache.org/jira/browse/IGNITE-6204

Best Regards,
Ivan Rakov

On 28.08.2017 17:19, Anton Vinogradov wrote:
> Igniters,
>
> It seems 2.2 is an urgent bugfix release, so it should be based on 2.1.
> In this case all other issues with fixVersion = 2.2 should be moved to 2.3.
>
> Currently, I see 835 issues with fixVersion = 2.2
>
> Seems we should have only 4 issues with fixVersion = 2.2:
> - Change default max memory size from 80% to 20% -
> https://issues.apache.org/jira/browse/IGNITE-6182
> - Detecting low memory on ignite node startup -
> https://issues.apache.org/jira/browse/IGNITE-6003
> - Backport of improvements to checkpoint algorithm -
> https://issues.apache.org/jira/browse/IGNITE-????
> - ML profile is missing in 2.1 binary release -
> https://issues.apache.org/jira/browse/IGNITE-6193
>
> Please correct me in case I've missed something.
>
> Ivan,
>> Let's include in the release the tickets with optimizations of the
>> checkpointing algorithm:
>> https://issues.apache.org/jira/browse/IGNITE-6178
>> https://issues.apache.org/jira/browse/IGNITE-6033
>> https://issues.apache.org/jira/browse/IGNITE-5961
>> This will help users who are experiencing problems with slow checkpoints.
>> Also, let's include https://issues.apache.org/jira/browse/IGNITE-6183 to
>> soften the "Ignite node crashed in the middle of checkpoint" message as per
>> discussion
>> <http://apache-ignite-developers.2346864.n4.nabble.com/Ignite-close-G-stop-name-true-Change-flag-cancel-to-false-td20473.html>.
> All of these issues should have fixVersion 2.3.
> Please create backport issue and link it to all necessary issues.
>
> P.S. I've created branch ignite-2.2; currently it is identical to ignite-2.1.
>
> On Mon, Aug 28, 2017 at 4:05 PM, Anton Vinogradov <avinogradov@gridgain.com>
> wrote:
>
>> Denis,
>>
>>> BTW, who is considered to be the release manager of this release?
>> I'll do it.
>>
>> On Mon, Aug 28, 2017 at 3:54 PM, Seliverstov Igor <gvvinblade@gmail.com>
>> wrote:
>>
>>> OK, the check happens at node start time or on a NODE_JOIN event.
>>>
>>> In general it looks like this:
>>>
>>> 1) calculate expected used memory = heap max + system cache max + all
>>> custom policies max + default policy size, and put it into a node attribute
>>>
>>> 2) get total physical memory, calculate the amount that is expected to be
>>> safe to use (leave 4 GB minimum or 20% of available memory for the OS)
>>>
>>> 3) if expected used memory + expected used memory of other nodes on the
>>> host > the safe-to-use amount, start calculating suggestions
>>>
>>> 4) each Ignite instance needs at least 512 MB heap + 40 MB system cache +
>>> 100 MB default policy; if available memory is less, we cannot suggest
>>> anything reasonable: print a warning and stop the calculation
>>>
>>> 5) check heap size: it shouldn't exceed 30% of available memory, i.e.
>>> (total_memory - memory reserved for OS) * 30% for all JVMs; if it exceeds
>>> that, suggest the just-calculated value or the 512 MB minimum
>>>
>>> 6) check if the system cache size was changed, suggest the default value if so
>>>
>>> 7) in case 100 MB * policies count < available memory, suggest using the
>>> default policy with max size equal to the remaining memory (available -
>>> heap - system cache)
>>>
>>> 8) calculate the new size for each memory policy: its user-defined size *
>>> (remaining / (all_policies_size * nodes_cnt)), i.e. in proportion to the
>>> remaining memory divided by the number of nodes on the host, or the 100 MB
>>> minimum
>>>
>>> 9) print suggestions
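
Steps 1) to 3) and 8) above can be sketched in Java roughly as follows. This is only an illustration of the arithmetic as described in the message, not the code from the patch: the class and method names are hypothetical, the OS reserve is read as "the larger of 4 GB and 20% of physical memory", and the node count is assumed to be at least 1. With the numbers from the example warning elsewhere in the thread (heap max 3543MB, policies 40960MB + 10MB) and a system cache max of 100MB, step 1 gives 3543 + 100 + 40970 = 44613MB, which matches the "requested" value shown there.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch; class and method names are not taken from Ignite. */
class MemoryCheckSketch {
    static final long MIN_OS_RESERVE_MB = 4096; // step 2: "leave 4 GB minimum"
    static final long MIN_POLICY_MB = 100;      // step 8: "100 MB minimum"

    /** Step 1: memory a single node expects to use, in MB. */
    static long expectedUsedMb(long heapMaxMb, long sysCacheMaxMb, long allPoliciesMaxMb) {
        return heapMaxMb + sysCacheMaxMb + allPoliciesMaxMb;
    }

    /** Step 2: physical memory minus the OS reserve (assumed: larger of 4 GB and 20%). */
    static long safeToUseMb(long totalPhysicalMb) {
        long osReserveMb = Math.max(MIN_OS_RESERVE_MB, totalPhysicalMb / 5);
        return Math.max(0L, totalPhysicalMb - osReserveMb);
    }

    /** Step 3: suggestions are needed only when all nodes on the host exceed the budget. */
    static boolean needsSuggestions(long ownMb, long otherNodesMb, long safeMb) {
        return ownMb + otherNodesMb > safeMb;
    }

    /**
     * Step 8: scale each policy by remaining / (allPoliciesSize * nodesCnt),
     * never going below 100 MB. Assumes nodesCnt >= 1.
     */
    static Map<String, Long> scalePolicies(Map<String, Long> userMaxMb, long remainingMb, int nodesCnt) {
        long allPoliciesMb = 0;
        for (long size : userMaxMb.values())
            allPoliciesMb += size;

        Map<String, Long> suggested = new LinkedHashMap<>();
        for (Map.Entry<String, Long> e : userMaxMb.entrySet()) {
            long scaled = allPoliciesMb == 0
                ? 0
                : e.getValue() * remainingMb / (allPoliciesMb * nodesCnt);
            suggested.put(e.getKey(), Math.max(MIN_POLICY_MB, scaled));
        }
        return suggested;
    }

    public static void main(String[] args) {
        long requested = expectedUsedMb(3543, 100, 40960 + 10);
        long safe = safeToUseMb(15942);
        System.out.println("requested=" + requested + "MB, safe=" + safe
            + "MB, needsSuggestions=" + needsSuggestions(requested, 0, safe));
    }
}
```

The 100 MB floor in `scalePolicies` is what turns a 10MB policy into a 100MB suggestion once the proportional share falls below the minimum.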
>>>
>>>
>>>
>>> 2017-08-28 15:10 GMT+03:00 Dmitriy Setrakyan <dsetrakyan@apache.org>:
>>>
>>>> Igor, can you please describe the algorithm with all the thresholds?
>>>>
>>>> On Mon, Aug 28, 2017 at 4:56 AM, Seliverstov Igor <gvvinblade@gmail.com>
>>>> wrote:
>>>>
>>>>> The suggestion here is based on the initial settings, and it is so because
>>>>> there are no other nodes on the host in the example.
>>>>>
>>>>> The algorithm tries to preserve the original ratio of memory policies,
>>>>> keeping the numbers reasonable (for example, after some threshold it will
>>>>> suggest not using several memory policies if there is not enough memory
>>>>> for all of them) and taking into consideration the node count on the
>>>>> host, each JVM heap, the memory needed for the OS, etc.
>>>>>
>>>>> 2017-08-28 14:38 GMT+03:00 Dmitriy Setrakyan <dsetrakyan@apache.org>:
>>>>>
>>>>>> Looks good, but why in the example provided are we suggesting 8GB? 2
>>>>>> nodes with 8GB will completely exhaust the available memory. I would
>>>>>> suggest 6 or 7GB.
>>>>>>
>>>>>> Also, why 100MB for the default policy? Anything under 1GB seems too small.
>>>>>> Can you please comment?
>>>>>>
>>>>>> D.
>>>>>>
>>>>>> On Mon, Aug 28, 2017 at 3:31 AM, Seliverstov Igor <gvvinblade@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> One more example of possible warning:
>>>>>>>
>>>>>>> -----------------------------------------------------
>>>>>>> Excessive memory usage by Ignite node process (performance may drop)
>>>>>>> [requested=44613MB, available=15942MB].
>>>>>>>
>>>>>>> Please tune the following settings as suggested:
>>>>>>>    MemoryPolicyConfiguration.initialSize for bigPlc: 8102MB
>>>>>>>    MemoryPolicyConfiguration.maxSize     for bigPlc: 8102MB
>>>>>>>    MemoryPolicyConfiguration.initialSize for dfltPlc: 100MB
>>>>>>>    MemoryPolicyConfiguration.maxSize     for dfltPlc: 100MB
>>>>>>>
>>>>>>> Current settings:
>>>>>>>    Java Heap  maxSize: 3543MB
>>>>>>>    Java Heap initSize: 250MB
>>>>>>>    MemoryPolicyConfiguration.initialSize for bigPlc: 256MB
>>>>>>>    MemoryPolicyConfiguration.maxSize     for bigPlc: 40960MB
>>>>>>>    MemoryPolicyConfiguration.initialSize for dfltPlc: 10MB
>>>>>>>    MemoryPolicyConfiguration.maxSize     for dfltPlc: 10MB
>>>>>>>    The overall expected memory usage by all Ignite nodes on the host:
>>>>>>> 44613MB
>>>>>>> -----------------------------------------------------
>>>>>>>
>>>>>>> Your thoughts?
>>>>>>>
>>>>>>> 2017-08-28 5:06 GMT+03:00 Denis Magda <dmagda@apache.org>:
>>>>>>>
>>>>>>>> Guys,
>>>>>>>>
>>>>>>>> ML lib profile is missing in 2.1 release! That must be fixed and rolled
>>>>>>>> out in this emergency release:
>>>>>>>> https://issues.apache.org/jira/browse/IGNITE-6193
>>>>>>>>
>>>>>>>> Oleg, Yuri, please step in and handle the issue.
>>>>>>>>
>>>>>>>> BTW, who is considered to be the release manager of this release?
>>>>>>>> —
>>>>>>>> Denis
>>>>>>>>
>>>>>>>>> On Aug 25, 2017, at 2:29 PM, Dmitriy Setrakyan <dsetrakyan@apache.org>
>>>>>>>>> wrote:
>>>>>>>>> I like the format proposed by Denis, very clear.
>>>>>>>>>
>>>>>>>>> However, I also do not understand why a user should change the size of
>>>>>>>>> some system cache. How would a user ever know what value to put there?
>>>>>>>>> This value should be configured by Ignite automatically.
>>>>>>>>>
>>>>>>>>> D.
>>>>>>>>>
>>>>>>>>> On Fri, Aug 25, 2017 at 2:24 PM, Denis Magda <dmagda@apache.org> wrote:
>>>>>>>>>> Igor,
>>>>>>>>>>
>>>>>>>>>> Let me suggest this format.
>>>>>>>>>>
>>>>>>>>>> ---------------------------------------------
>>>>>>>>>> Excessive memory usage by Ignite node process (performance may drop)
>>>>>>>>>> [requested=29251MB, available=15942MB]
>>>>>>>>>>
>>>>>>>>>> Please tune the following settings:
>>>>>>>>>>   [MemoryConfiguration.defaultMemoryPolicySize = suggested value]
>>>>>>>>>>   MemoryConfiguration.systemCacheMaxSize = suggested value
>>>>>>>>>>   [MemoryPolicyConfiguration.maxSize for {policy_name_1} = suggested value]
>>>>>>>>>>   [MemoryPolicyConfiguration.maxSize for {policy_name_2} = suggested value]
>>>>>>>>>>
>>>>>>>>>> Current settings:
>>>>>>>>>>    [DefaultMemoryPolicySize = value]
>>>>>>>>>>    [{policy_name_1} size = value]
>>>>>>>>>>    [{policy_name_1} size = value]
>>>>>>>>>>    SystemCacheInitialSize = value
>>>>>>>>>>    SystemCacheMaxSize = value
>>>>>>>>>>    Java Heap Init Size = value
>>>>>>>>>>    Java Heap Max Size = value
>>>>>>>>>>
>>>>>>>>>> The overall memory usage by all Ignite nodes on the host: value
>>>>>>>>>> -------------------------------------------
>>>>>>>>>>
>>>>>>>>>> Records in […] are optional. If a custom memory policy is not set or the
>>>>>>>>>> default memory policy is overridden, the output will miss some of the rows.
>>>>>>>>>> As for systemCacheMaxSize, it should be shown ONLY if the parameter was set
>>>>>>>>>> explicitly by user code. Otherwise, the platform should be wise enough to
>>>>>>>>>> instantiate it properly depending on the host memory usage.
>>>>>>>>>>
>>>>>>>>>> —
>>>>>>>>>> Denis
>>>>>>>>>>
>>>>>>>>>>> On Aug 25, 2017, at 1:49 PM, Seliverstov Igor <gvvinblade@gmail.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>> The message without logging layout:
>>>>>>>>>>>
>>>>>>>>>>> Not enough memory for current process [required=29251MB, available=15942MB].
>>>>>>>>>>> Please change MemoryConfiguration.systemCacheMaxSize and
>>>>>>>>>>> MemoryConfiguration.defaultMemoryPolicySize to decrease memory allocated
>>>>>>>>>>> for each node.
>>>>>>>>>>>
>>>>>>>>>>> Current settings:
>>>>>>>>>>>   HeapInit=250MB
>>>>>>>>>>>   HeapMax=3543MB
>>>>>>>>>>>   DefaultMemoryPolicySize=12753MB
>>>>>>>>>>>   SystemCacheInitialSize=40MB
>>>>>>>>>>>   SystemCacheMaxSize=100MB
>>>>>>>>>>>
>>>>>>>>>>> Other ignite instanses on the server require: 12853MB
>>>>>>>>>>>
>>>>>>>>>>> I think it makes sense to describe what these numbers consist of.
>>>>>>>>>>> We simply say which parameters have an impact on how much memory the
>>>>>>>>>>> instance needs, and their actual values.
>>>>>>>>>>>
>>>>>>>>>>> We also note that more than one Ignite instance is running on the server
>>>>>>>>>>> or workstation, which also consumes memory.
>>>>>>>>>>>
>>>>>>>>>>> On Aug 25, 2017 at 21:30, "Dmitriy Setrakyan" <dsetrakyan@apache.org>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Igor, what is this flood of WARN messaging coming after the text? Are we
>>>>>>>>>>>> really going to print this whole thing out?
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Aug 25, 2017 at 9:49 AM, Seliverstov Igor <gvvinblade@gmail.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> This message appears on topology change in case the available memory is
>>>>>>>>>>>>> exceeded
>>>>>>>>>>>>>
>>>>>>>>>>>>> 2017-08-25 19:47 GMT+03:00 Seliverstov Igor <gvvinblade@gmail.com>:
>>>>>>>>>>>>>> An example of current impl:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> [2017-08-25 19:44:37,740][WARN ][disco-event-worker-#29%internal.GridHomePathSelfTest0%][GridDiscoveryManager]
>>>>>>>>>>>>>> [2017-08-25 19:44:37,740][WARN ][disco-event-worker-#29%internal.GridHomePathSelfTest0%][GridDiscoveryManager] Not enough memory for current process [required=29251MB, available=15942MB].
>>>>>>>>>>>>>> [2017-08-25 19:44:37,740][WARN ][disco-event-worker-#29%internal.GridHomePathSelfTest0%][GridDiscoveryManager] Please change MemoryConfiguration.systemCacheMaxSize and MemoryConfiguration.defaultMemoryPolicySize to decrease memory allocated for each node.
>>>>>>>>>>>>>> [2017-08-25 19:44:37,740][WARN ][disco-event-worker-#29%internal.GridHomePathSelfTest0%][GridDiscoveryManager]
>>>>>>>>>>>>>> [2017-08-25 19:44:37,740][WARN ][disco-event-worker-#29%internal.GridHomePathSelfTest0%][GridDiscoveryManager] Current settings:
>>>>>>>>>>>>>> [2017-08-25 19:44:37,740][WARN ][disco-event-worker-#29%internal.GridHomePathSelfTest0%][GridDiscoveryManager]   HeapInit=250MB
>>>>>>>>>>>>>> [2017-08-25 19:44:37,741][WARN ][disco-event-worker-#29%internal.GridHomePathSelfTest0%][GridDiscoveryManager]   HeapMax=3543MB
>>>>>>>>>>>>>> [2017-08-25 19:44:37,741][WARN ][disco-event-worker-#29%internal.GridHomePathSelfTest0%][GridDiscoveryManager]   DefaultMemoryPolicySize=12753MB
>>>>>>>>>>>>>> [2017-08-25 19:44:37,741][WARN ][disco-event-worker-#29%internal.GridHomePathSelfTest0%][GridDiscoveryManager]   SystemCacheInitialSize=40MB
>>>>>>>>>>>>>> [2017-08-25 19:44:37,741][WARN ][disco-event-worker-#29%internal.GridHomePathSelfTest0%][GridDiscoveryManager]   SystemCacheMaxSize=100MB
>>>>>>>>>>>>>> [2017-08-25 19:44:37,741][WARN ][disco-event-worker-#29%internal.GridHomePathSelfTest0%][GridDiscoveryManager]
>>>>>>>>>>>>>> [2017-08-25 19:44:37,741][WARN ][disco-event-worker-#29%internal.GridHomePathSelfTest0%][GridDiscoveryManager] Other ignite instanses on the server require: 12853MB
>>>>>>>>>>>>>> [2017-08-25 19:44:37,741][WARN ][disco-event-worker-#29%internal.GridHomePathSelfTest0%][GridDiscoveryManager]
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 2017-08-25 17:40 GMT+03:00 Sergey Kozlov <skozlov@gridgain.com>:
>>>>>>>>>>>>>>> I suppose we should not forget the JVM heap size, and should suggest
>>>>>>>>>>>>>>> reducing both options
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Fri, Aug 25, 2017 at 5:24 PM, Dmitriy Setrakyan <dsetrakyan@apache.org>
>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Igor, I would change the message. How about this:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Required RAM size is larger than total physical memory available for OS.
>>>>>>>>>>>>>>>>> Please change MemoryConfiguration.WhichProperty and
>>>>>>>>>>>>>>>>> MemoryPolicyConfiguration.WhichProperty to decrease memory allocated for
>>>>>>>>>>>>>>>>> each node.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Also, can we calculate what the memory size allocated for each node should
>>>>>>>>>>>>>>>> be? In that case we should suggest it.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> D.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Fri, Aug 25, 2017 at 7:20 AM, Seliverstov Igor <gvvinblade@gmail.com>
>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> What do you guys think about the following warning?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> [2017-08-25 17:17:04,718][INFO ][test-runner-#1%internal.GridHomePathSelfTest%][GridHomePathSelfTest0] System cache's MemoryPolicy size is configured to 40 MB. Use MemoryConfiguration.systemCacheMemorySize property to change the setting.
>>>>>>>>>>>>>>>>> [2017-08-25 17:17:04,718][WARN ][test-runner-#1%internal.GridHomePathSelfTest%][GridHomePathSelfTest0]
>>>>>>>>>>>>>>>>> >>> Required RAM size is larger than total physical memory available for OS.
>>>>>>>>>>>>>>>>> >>> Check your configuration to avoid swap partition usage.
>>>>>>>>>>>>>>>>> >>> Use MemoryConfiguration and MemoryPolicyConfiguration to change the settings.
>>>>>>>>>>>>>>>>> >>> Physical memory [required=16397MB, available=15942MB]
>>>>>>>>>>>>>>>>> [2017-08-25 17:17:04,726][WARN ][test-runner-#1%internal.GridHomePathSelfTest%][GridHomePathSelfTest0] Peer class loading is enabled (disable it in production for performance and deployment consistency reasons)
>>>>>>>>>>>>>>>>> [2017-08-25 17:17:04,726][INFO ][test-runner-#1%internal.GridHomePathSelfTest%][GridHomePathSelfTest0] Configured caches [in 'sysMemPlc' memoryPolicy: ['ignite-sys-cache']]
>>>>>>>>>>>>>>>>> [2017-08-25 17:17:04,731][INFO ][test-runner-#1%internal.GridHomePathSelfTest%][GridHomePathSelfTest0] 3-rd party licenses can be found at: /home/gvvinblade/projects/ignite/incubator-ignite/libs/licenses
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 2017-08-25 13:26 GMT+03:00 Yakov Zhdanov <yzhdanov@apache.org>:
>>>>>>>>>>>>>>>>>> Agree, let's release new version including tickets mentioned by Denis and
>>>>>>>>>>>>>>>>>> Ivan.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> --Yakov
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>> Sergey Kozlov
>>>>>>>>>>>>>>> GridGain Systems
>>>>>>>>>>>>>>> www.gridgain.com
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>
>>
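
One concern raised in the quoted thread is that the warning was printed as a flood of separate WARN lines. A minimal sketch of one way to avoid that is to build the whole text first and hand it to the logger in a single call. The class and method names below are hypothetical and are not Ignite API; the wording follows the layout proposed in the thread, and whether the logger then emits it as one record depends on the logging setup.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch; not part of the Ignite code base. */
class MemoryWarningSketch {
    /** Builds the full multi-line warning so it can be logged with one call. */
    static String format(long requestedMb, long availableMb,
                         Map<String, Long> suggestedMb, Map<String, Long> currentMb) {
        StringBuilder sb = new StringBuilder();
        sb.append("Excessive memory usage by Ignite node process (performance may drop) [requested=")
          .append(requestedMb).append("MB, available=").append(availableMb).append("MB].\n\n");
        sb.append("Please tune the following settings as suggested:\n");
        appendRows(sb, suggestedMb);
        sb.append("\nCurrent settings:\n");
        appendRows(sb, currentMb);
        return sb.toString();
    }

    /** One indented "name: valueMB" row per setting, in insertion order. */
    private static void appendRows(StringBuilder sb, Map<String, Long> rows) {
        for (Map.Entry<String, Long> e : rows.entrySet())
            sb.append("   ").append(e.getKey()).append(": ").append(e.getValue()).append("MB\n");
    }

    public static void main(String[] args) {
        Map<String, Long> suggested = new LinkedHashMap<>();
        suggested.put("MemoryPolicyConfiguration.maxSize for bigPlc", 8102L);
        suggested.put("MemoryPolicyConfiguration.maxSize for dfltPlc", 100L);

        Map<String, Long> current = new LinkedHashMap<>();
        current.put("Java Heap maxSize", 3543L);
        current.put("MemoryPolicyConfiguration.maxSize for bigPlc", 40960L);
        current.put("MemoryPolicyConfiguration.maxSize for dfltPlc", 10L);

        // A real node would pass this string to its logger once instead of printing it.
        System.out.print(format(44613, 15942, suggested, current));
    }
}
```

A `LinkedHashMap` is used so the rows appear in the order they were added, which keeps the "suggested" and "current" sections easy to compare.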

