Mailing-List: contact dev-help@asterixdb.incubator.apache.org; run by ezmlm
Reply-To: dev@asterixdb.incubator.apache.org
MIME-Version: 1.0
References: <0-11857465121765544286-2307673520383560858-asterixdb=googlecode.com@googlecode.com> <5559346D.8080206@ics.uci.edu>
Date: Sun, 17 May 2015 21:22:25 -0700
Subject: Re: Issue 868 in asterixdb: About prefix merge policy behavior
From: Young-Seok Kim
To: "asterixdb-dev@googlegroups.com"
Cc: dev@asterixdb.incubator.apache.org
Content-Type: multipart/alternative; boundary=001a11c345c08b6e4c0516538ad4

--001a11c345c08b6e4c0516538ad4
Content-Type: text/plain; charset=UTF-8

There should be proper flow control policies which consider applications' needs.

Best,
Young-Seok

On Sun, May 17, 2015 at 8:25 PM, Chen Li wrote:

> A general question is: if the speed of incoming records is higher than
> the merge and disk-flush speed, what should we do? Can we reject the
> insertion requests?
>
> Chen
>
> On Sun, May 17, 2015 at 5:38 PM, Michael Carey wrote:
> > Moving this to the incubator dev list (just looking it over again).
> > Q - if when a merge finishes there's a bigger backlog of components - will it
> > currently consider doing a more-ways merge? (Instead of 5, if there are 13
> > sitting there when the 5-merge finishes - will a 13-merge be initiated?)
> > Just curious. We do probably need to think about some sort of better flow
> > control here - the storage gate should presumably slow down admissions if it
> > can't keep up - have to ponder what that might mean. (I have a better idea
> > of what it could mean for feeds than for normal inserts.) One could argue
> > that an increasing backlog is a sign that we should be scaling out the
> > number of partitions for the dataset (future work but important work :-)).
> >
> > Cheers,
> > Mike
> >
> > On 4/15/15 2:33 PM, asterixdb@googlecode.com wrote:
> >
> > Status: New
> > Owner: ----
> > Labels: Type-Defect Priority-Medium
> >
> > New issue 868 by kiss...@gmail.com: About prefix merge policy behavior
> > https://code.google.com/p/asterixdb/issues/detail?id=868
> >
> > I describe how the current prefix merge policy works, based on observations
> > from ingestion experiments.
> > Sattam observed similar behavior as well.
> > The observed behavior seems a bit unexpected, so I post the observation here
> > to consider a better merge policy and/or a better lsm index design regarding
> > merge operations.
> >
> > The aqls used for the experiment are shown at the end of this writing.
> >
> > Prefix merge policy decides to merge disk components based on the following
> > conditions
> > 1. Look at the candidate components for merging in oldest-first order. If
> > one exists, identify the prefix of the sequence of all such components for
> > which the sum of their sizes exceeds MaxMergableComponentSize. Schedule a
> > merge of those components into a new component.
> > 2. If a merge from 1 doesn't happen, see if the set of candidate components
> > for merging exceeds MaxToleranceComponentCnt.
> > If so, schedule a merge of all of the current candidates into a new
> > single component.
> > Also, the prefix merge policy doesn't allow concurrent merge operations
> > for a single index partition.
> > In other words, if there is a scheduled or an on-going merge operation,
> > a new merge operation is not scheduled even if the above conditions are met.
> >
> > Based on this merge policy, the following situation can occur.
> > Suppose MaxToleranceComponentCnt = 5 and 5 disk components were flushed
> > to disk.
> > When the 5th disk component is flushed, the prefix merge policy schedules
> > a merge operation to merge the 5 components.
> > While the merge operation is scheduled and running, concurrently
> > ingested records generate more disk components.
> > As long as a merge operation is not fast enough to keep up with the
> > speed at which incoming ingested records generate 5 disk components,
> > the number of disk components increases as time goes on.
> > So, the slower merge operations are, the more disk components there will
> > be as time goes on.
> >
> > I also attached the output of the command "ls -alR asterixdb
> > instance for an ingestion experiment>", which was executed after the
> > ingestion was over.
> > The attached file shows that for the primary index (whose directory is
> > FsqCheckinTweet_idx_FsqCheckinTweet), ingestion generated 20 disk
> > components, where each disk component consists of a btree (the filename
> > has suffix _b) and a bloom filter (the filename has suffix _f), and
> > MaxMergableComponentSize is set to 1GB.
> > It also shows that for the secondary index (whose directory is
> > FsqCheckinTweet_idx_sifCheckinCoordinate), ingestion generated more than
> > 1400 components, where each disk component consists of a dictionary btree
> > (suffix: _b), an inverted list (suffix: _i), a deleted-key btree (suffix:
> > _d), and a bloom filter for the deleted-key btree (suffix: _f).
> > Even though the ingestion was over, since our merge operation happens
> > asynchronously, the merge operation continues and eventually merges all
> > mergable disk components according to the described merge policy.
> >
> > ------------------------------------------
> > AQLs for the ingestion experiment
> > ------------------------------------------
> > drop dataverse STBench if exists;
> > create dataverse STBench;
> > use dataverse STBench;
> >
> > create type FsqCheckinTweetType as closed {
> >     id: int64,
> >     user_id: int64,
> >     user_followers_count: int64,
> >     text: string,
> >     datetime: datetime,
> >     coordinates: point,
> >     url: string?
> > }
> > create dataset FsqCheckinTweet (FsqCheckinTweetType) primary key id
> >
> > /* this index type is only available in the kisskys/hilbertbtree branch.
> > however, you can easily replace the sif index with an inverted keyword
> > index on the text field and you will see similar behavior */
> > create index sifCoordinate on FsqCheckinTweet(coordinates) type sif(-180.0,
> > -90.0, 180.0, 90.0);
> >
> > /* create feed */
> > create feed TweetFeed
> > using file_feed
> > (("fs"="localfs"),
> > ("path"="127.0.0.1:////Users/kisskys/Data/SynFsqCheckinTweet.adm"),("format"="adm"),("type-name"="FsqCheckinTweetType"),("tuple-interval"="0"));
> >
> > /* connect feed */
> > use dataverse STBench;
> > set wait-for-completion-feed "true";
> > connect feed TweetFeed to dataset FsqCheckinTweet;
> >
> > Attachments:
> > storage-layout.txt 574 KB
> >
> > --
> > You received this message because you are subscribed to the Google Groups
> > "asterixdb-dev" group.
> > To unsubscribe from this group and stop receiving emails from it, send an
> > email to asterixdb-dev+unsubscribe@googlegroups.com.
> > For more options, visit https://groups.google.com/d/optout.

--001a11c345c08b6e4c0516538ad4--
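
The merge conditions quoted in the issue, and the backlog scenario they lead to, can be sketched in Python. This is a toy model under stated assumptions, not AsterixDB code. The names pick_merge and simulate_backlog are hypothetical. Condition 1 is read as "the shortest oldest-first prefix whose total size exceeds the limit". Condition 2 uses >= so that it fires on the 5th flush as in the example, although the description says "exceeds". The simulation further assumes one flush per tick, a merge cost proportional to the number of inputs, and that a merged component is no longer a merge candidate.

```python
from typing import List, Optional


def pick_merge(sizes: List[int], max_mergable_size: int,
               max_tolerance_count: int, merge_running: bool) -> Optional[int]:
    """Return how many of the oldest candidates to merge, or None.

    sizes holds the candidate component sizes in oldest-first order.
    """
    # No concurrent merges for a single index partition.
    if merge_running or not sizes:
        return None
    # Condition 1: oldest-first prefix whose summed size exceeds the limit.
    total = 0
    for width, size in enumerate(sizes, start=1):
        total += size
        if total > max_mergable_size:
            return width
    # Condition 2: too many candidates, so merge all of them.
    if len(sizes) >= max_tolerance_count:
        return len(sizes)
    return None


def simulate_backlog(ticks: int, ticks_per_input: int,
                     max_tolerance_count: int = 5) -> List[int]:
    """Return the number of candidate components after each tick."""
    sizes: List[int] = []   # candidate components, oldest first
    merging = 0             # width of the running merge (0 means idle)
    remaining = 0           # ticks until the running merge finishes
    history = []
    for _ in range(ticks):
        sizes.append(1)     # assumption: one unit-size flush per tick
        if merging:
            remaining -= 1
            if remaining == 0:
                # Merged inputs leave the candidate set (assumption: the
                # merged output is too large to be a candidate again).
                del sizes[:merging]
                merging = 0
        # Size limit set high so only condition 2 fires in this toy run.
        width = pick_merge(sizes, 10**9, max_tolerance_count, merging > 0)
        if width is not None:
            merging = width
            remaining = width * ticks_per_input
        history.append(len(sizes))
    return history
```

In this model, simulate_backlog(75, 1) never holds more than 9 candidates, because merging 5 components takes exactly as long as flushing 5 new ones. With simulate_backlog(75, 2) each merge finishes to find a backlog twice as large as the one it started with, which mirrors the growth described in the issue.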