ignite-issues mailing list archives

From "Vladimir Ozerov (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (IGNITE-6300) BinaryObject's set size estimator
Date Thu, 07 Sep 2017 14:11:00 GMT

    [ https://issues.apache.org/jira/browse/IGNITE-6300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16156989#comment-16156989 ]

Vladimir Ozerov edited comment on IGNITE-6300 at 9/7/17 2:10 PM:
-----------------------------------------------------------------

[~avinogradov], [~cyberdemon], this is a very questionable feature.

Estimation should be based on the storage format, not on the binary object, which is only
a part of that format. Moreover, it would also depend heavily on configuration. For example,
if your object is ~1.9Kb in size and the page size is also ~2Kb, then almost the whole page
will be filled with data. But if your object is ~2.1Kb and the page size is 4Kb, the whole
page will be able to accommodate only 1 object. So your binary-based estimator would show
2Kb per entry, while in reality it would be 4Kb per entry. Another example is indexes:
consumed space depends heavily on how many indexes there are.
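
The page-size arithmetic above can be sketched as a toy model. This is only an illustration under loud assumptions: entries are never split across pages unless they exceed one page, and page headers, free-list reuse and index pages are ignored. The function name {{effective_bytes_per_entry}} is hypothetical and is not an Ignite API.

```python
import math


def effective_bytes_per_entry(entry_size: int, page_size: int) -> float:
    """Bytes of page space consumed per entry in a simplified page model.

    Assumptions (not Ignite's real layout): no page header overhead,
    an entry that fits in a page is never fragmented, and an entry
    larger than a page occupies a whole number of pages.
    """
    if entry_size <= 0 or page_size <= 0:
        raise ValueError("sizes must be positive")

    entries_per_page = page_size // entry_size
    if entries_per_page == 0:
        # Entry is larger than one page: round up to whole pages.
        return float(math.ceil(entry_size / page_size) * page_size)

    # The unused tail of the page is charged to the entries stored in it.
    return page_size / entries_per_page


# The two cases from the comment, with sizes in bytes as rough stand-ins
# for ~1.9Kb / ~2Kb and ~2.1Kb / 4Kb.
small = effective_bytes_per_entry(1900, 2048)
large = effective_bytes_per_entry(2100, 4096)
print(small, large)
```

Under these assumptions the second case costs a full 4096-byte page per entry even though the object itself is about 2100 bytes, which is the gap a purely binary-based estimator would miss.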

Another major point of concern is that we store objects in binary format as is at the moment,
but this is merely an accidental implementation detail rather than a design decision. When we
implement compression, the format will be different. When we fix SQL performance issues with
{{BigDecimal}} and {{Date}} data types, the format will be different. Etc.

The correct implementation is some internal metrics rather than a synthetic "estimator". E.g., we
can count all data and index pages for a cache and divide their total size by the number of
entries - this would give you a real and accurate estimate. If this is too expensive, then we
can sample a part of the pages. But the bottom line is that any estimation outside of a real
Ignite instance is useless.
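
A minimal sketch of the metric described above, assuming the page counts, page size and entry count have already been read from a running node. Both function names and all parameters are hypothetical; no actual Ignite metrics API is used here, and the sampled variant covers data pages only.

```python
from typing import Sequence


def bytes_per_entry(data_pages: int, index_pages: int,
                    page_size: int, entries: int) -> float:
    """Exact variant: total page space of a cache divided by its entry count."""
    if entries <= 0:
        raise ValueError("cache must contain at least one entry")
    return (data_pages + index_pages) * page_size / entries


def sampled_bytes_per_entry(entries_in_sampled_pages: Sequence[int],
                            page_size: int) -> float:
    """Cheaper variant: inspect only a sample of data pages.

    entries_in_sampled_pages holds the number of entries found in each
    sampled page; index pages are not included in this estimate.
    """
    total_entries = sum(entries_in_sampled_pages)
    if total_entries <= 0:
        raise ValueError("sample contains no entries")
    return len(entries_in_sampled_pages) * page_size / total_entries


# Example with made-up numbers: 1000 data pages, 200 index pages,
# 4096-byte pages and 3000 entries.
print(bytes_per_entry(1000, 200, 4096, 3000))
print(sampled_bytes_per_entry([1, 1, 2, 1], 4096))
```

Both variants need numbers that only a live instance can provide, which is the point of the paragraph above.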

Don't waste your time on this.



> BinaryObject's set size estimator
> ---------------------------------
>
>                 Key: IGNITE-6300
>                 URL: https://issues.apache.org/jira/browse/IGNITE-6300
>             Project: Ignite
>          Issue Type: New Feature
>            Reporter: Anton Vinogradov
>            Assignee: Dmitriy Sorokin
>
> Need to provide some API to estimate requirements for any data model.
> For example:
> 1) You have classes A,B and C with known fields.
> 2) You know that you have to keep 1M of A, 2M of B and 45K of C.
> 3) BinarySizeEstimator should return you expected memory consumption on actual Ignite version without starting a node.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
