From: "Vladimir Ozerov (JIRA)"
To: issues@ignite.apache.org
Reply-To: dev@ignite.apache.org
Date: Thu, 7 Sep 2017 14:11:00 +0000 (UTC)
Subject: [jira] [Comment Edited] (IGNITE-6300) BinaryObject's set size estimator

[ https://issues.apache.org/jira/browse/IGNITE-6300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16156989#comment-16156989 ]

Vladimir Ozerov edited comment on IGNITE-6300 at 9/7/17 2:10 PM:
-----------------------------------------------------------------

[~avinogradov], [~cyberdemon], this is a very questionable feature. Estimation should be based on the storage format, not on the binary object, which is only one part of that format. Moreover, it would depend heavily on configuration. For example, if your object is ~1.9KB in size and the page size is also ~2KB, then almost the whole page will be filled with data. But if your object is ~2.1KB and the page size is 4KB, the whole page will be able to accommodate only 1 object. So your binary-based estimator would show 2KB per entry, while in reality it would be 4KB per entry. Another example is indexes: consumed space depends heavily on how many indexes there are.
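
The page-size arithmetic in the example above can be checked with a minimal sketch. It assumes a deliberately simplified model that is not taken from Ignite itself: pages have no header overhead, an entry never spans pages, and leftover space in a page is wasted. The function name `effective_bytes_per_entry` and the byte values are hypothetical.

```python
def effective_bytes_per_entry(object_size: int, page_size: int) -> float:
    """Page space consumed per entry under a simplified model.

    Assumptions (illustrative only, not Ignite's real page layout):
    no page header, no entry spanning pages, leftover space is wasted.
    """
    if object_size <= 0 or page_size <= 0:
        raise ValueError("sizes must be positive")
    if object_size > page_size:
        raise ValueError("this simplified model does not cover objects larger than a page")
    # Only whole objects fit into a page.
    entries_per_page = page_size // object_size
    # The whole page is charged to the entries it holds.
    return page_size / entries_per_page


# ~1.9 KB object in a 2 KB page, then ~2.1 KB object in a 4 KB page.
for object_size, page_size in [(1946, 2048), (2150, 4096)]:
    cost = effective_bytes_per_entry(object_size, page_size)
    print(f"object={object_size} B, page={page_size} B -> {cost:.0f} B per entry")
```

In the second case the object-based figure is about 2150 bytes, while the page-based cost under this model is 4096 bytes, which is the gap the comment describes.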
Another major point of concern is that we currently store objects in binary format as is, but this is merely an accidental implementation detail rather than a design decision. When we implement compression, the format will be different. When we fix SQL performance issues with the {{BigDecimal}} and {{Date}} data types, the format will be different. And so on.

The correct implementation is some internal metric rather than a synthetic "estimator". E.g., we can count all data and index pages for a cache and divide their total size by the number of entries; this would give you a real and accurate estimate. If this is too expensive, then we can sample a subset of pages. But the bottom line is that any estimation outside of a real Ignite instance is useless. Don't waste your time on this.

> BinaryObject's set size estimator
> ---------------------------------
>
>                 Key: IGNITE-6300
>                 URL: https://issues.apache.org/jira/browse/IGNITE-6300
>             Project: Ignite
>          Issue Type: New Feature
>            Reporter: Anton Vinogradov
>            Assignee: Dmitriy Sorokin
>
> Need to provide some API to estimate requirements for any data model.
> For example:
> 1) You have classes A, B and C with known fields.
> 2) You know that you have to keep 1M of A, 2M of B and 45K of C.
> 3) BinarySizeEstimator should return the expected memory consumption on the actual Ignite version without starting a node.
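
The page-counting metric proposed in the comment can be sketched as follows. This is only an illustration of the arithmetic: the function name `measured_bytes_per_entry` and all input numbers are hypothetical, and how a running Ignite node would actually expose page and entry counts is not specified in this thread.

```python
def measured_bytes_per_entry(data_pages: int, index_pages: int,
                             page_size: int, entries: int) -> float:
    """Average storage cost per entry derived from page counts.

    All inputs are assumed to come from a running node; the names here
    are placeholders, not an existing Ignite API.
    """
    if entries <= 0:
        raise ValueError("entries must be positive")
    if data_pages < 0 or index_pages < 0 or page_size <= 0:
        raise ValueError("page counts must be non-negative and page size positive")
    # Total bytes held by the cache: every data and index page counts in full.
    total_bytes = (data_pages + index_pages) * page_size
    return total_bytes / entries


# Hypothetical cache: 1000 data pages, 250 index pages, 4 KB pages, 2000 entries.
print(measured_bytes_per_entry(1000, 250, 4096, 2000))  # 2560.0
```

Because index pages are included in the total, the result already reflects the index overhead that an object-only estimator would miss.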