Subject: Re: Unbalanced tablets or extra rfiles
To: user@accumulo.apache.org
From: Andrew Hulbert
Date: Tue, 7 Jun 2016 17:48:02 -0400

Yeah, it looks like in both cases there are files that have ~del
markers but are also referenced as file: entries for tablets. I assume
there's no problem with having both? Most are many, many months old.

Many actually seem to have multiple file: assignments (multiple rows in
the metadata table)... which shouldn't happen, right?

I also assume that the files in the directory don't particularly matter
since they are assigned to other tablets in the metadata table.

Cool & thanks again. Fun to learn the internals.

-Andrew

On 06/07/2016 05:34 PM, Josh Elser wrote:
> re #1, you can try grep'ing over the Accumulo metadata table to see if
> there are references to the file. It's possible that some files might
> be kept around for table snapshots (but these should eventually be
> compacted per Mike's point in #3, I believe).
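
For anyone digging through the archives later, a rough sketch of what
that metadata grep might look like in the 1.7 shell (the rfile name
below is made up):

  root@instance> table accumulo.metadata
  root@instance accumulo.metadata> grep F00001ab.rf
  root@instance accumulo.metadata> scan -c file
  root@instance accumulo.metadata> scan -b ~del -e ~dem

The grep turns up any tablet rows still carrying a file: entry for that
rfile; the scan over the ~del row range lists the garbage collector's
deletion candidates.
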
> Mike Drob wrote:
>> 1) Is your Accumulo Garbage Collector process running? It will
>> delete un-referenced files.
>> 2) I've heard it said that 200 tablets per tserver is the sweet
>> spot, but it depends a lot on your read and write patterns.
>> 3) https://accumulo.apache.org/1.7/accumulo_user_manual#_table_compaction_major_everything_idle
>>
>> On Tue, Jun 7, 2016 at 4:03 PM, Andrew Hulbert wrote:
>>>
>>> Hi all,
>>>
>>> A few questions on behavior if you have any time...
>>>
>>> 1. When looking in Accumulo's HDFS directories I'm seeing a
>>> situation where "tablets" aka "directories" for a table have more
>>> than the default 1G split threshold worth of rfiles in them. In one
>>> large instance, we have 400G worth of rfiles in the default_tablet
>>> directory (a mix of A-, C-, and F-type rfiles). We took one of
>>> these tables and compacted it, and now there is appropriately ~1G
>>> worth of files in HDFS. On an unrelated table we have tablets with
>>> 100+G of bulk-imported rfiles in the tablet's HDFS directory.
>>>
>>> This seems to be common across multiple clouds. All the ingest is
>>> done via batch writing. Is anyone aware of why this would happen or
>>> if it is even important? Perhaps these are leftover rfiles from
>>> some process. Their timestamps cover large date ranges.
>>>
>>> 2. There's been some discussion on the number of files per tserver
>>> for efficiency. Are there any limits on the size of rfiles for
>>> efficiency? For instance, I assume that compacting all the files
>>> into a single rfile per 1G split is more efficient because it
>>> avoids merging (but maybe decreases concurrency). However, would it
>>> be better to have 500 tablets per node on a table with 1G splits
>>> versus having 50 tablets with 10G splits? Assuming HDFS and
>>> Accumulo don't mind 10G files!
>>>
>>> 3. Is there any way to force idle tablets to actually major compact
>>> other than the shell? It seems like it never happens.
>>>
>>> Thanks!
>>>
>>> Andrew
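
Re: #3, also for the archives: the property behind Mike's link can be
set per table, and a compaction can be forced and waited on from the
shell. A rough sketch; the table name is made up:

  root@instance> config -t mytable -s table.compaction.major.everything.idle=30m
  root@instance> compact -t mytable -w

The first line lowers the idle threshold so tablets with no recent
writes may get compacted down to one file sooner (per the manual,
there's no hard guarantee an idle tablet will be compacted); the second
forces a full major compaction immediately and blocks until it
completes.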