Date: Sat, 18 Apr 2015 18:54:59 +0000 (UTC)
From: "Eric Yang (JIRA)"
To: dev@chukwa.apache.org
Subject: [jira] [Reopened] (CHUKWA-744) Refactor ETL process for HBaseWriter

     [ https://issues.apache.org/jira/browse/CHUKWA-744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Yang reopened CHUKWA-744:
------------------------------

Missed ETL files for HBase parsing.

> Refactor ETL process for HBaseWriter
> ------------------------------------
>
>                 Key: CHUKWA-744
>                 URL: https://issues.apache.org/jira/browse/CHUKWA-744
>             Project: Chukwa
>          Issue Type: Task
>          Components: Data Processors
>    Affects Versions: 0.6.0
>            Reporter: Eric Yang
>            Assignee: Eric Yang
>
> The current ETL classes are based on the Demux MapProcessor and ReduceProcessor. The processors were designed to pass in an archive key embedded in the processor, as well as a ChunkSaver to preserve chunks that cannot be parsed. This is fine when running a MapReduce-based demux job to process data: the short-lived task spills the ChunkSaver contents into a separate file for later examination. However, in the long-running Chukwa agent the processors can leak memory over time, because chunks are saved in ChunkSaver without cleanup.
> It would be better to redesign the parser classes with well-defined behavior. If a chunk cannot be parsed, the parser should throw ParseException to the upper layer, which can retry or log to the agent log.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
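
For illustration, the redesign proposed in the issue description (throw ParseException to the caller rather than holding unparsable chunks in memory) might look roughly like the sketch below. Everything here is hypothetical and not taken from the Chukwa code base: the `ChunkParser` interface, the toy `KeyValueParser`, the `parseWithRetry` helper, and the use of `java.text.ParseException` are all assumptions, since the issue does not say which ParseException class is meant.

```java
import java.nio.charset.StandardCharsets;
import java.text.ParseException;
import java.util.ArrayList;
import java.util.List;

public class ParserSketch {

    /** Hypothetical contract: return parsed records or throw; never keep the bad chunk. */
    public interface ChunkParser {
        List<String> parse(byte[] chunk) throws ParseException;
    }

    /** Toy parser (assumption): expects one "key=value" record per line, UTF-8 encoded. */
    public static class KeyValueParser implements ChunkParser {
        @Override
        public List<String> parse(byte[] chunk) throws ParseException {
            String text = new String(chunk, StandardCharsets.UTF_8);
            List<String> records = new ArrayList<>();
            int offset = 0;
            for (String line : text.split("\n")) {
                // A record needs a non-empty key before '='.
                if (line.indexOf('=') < 1) {
                    throw new ParseException("Malformed record: " + line, offset);
                }
                records.add(line);
                offset += line.length() + 1;
            }
            return records;
        }
    }

    /**
     * Upper layer: retries up to maxAttempts times, writes each failure to the
     * supplied log, and then gives up. No reference to the chunk is retained,
     * so a long-running process does not accumulate unparsable chunks.
     */
    public static List<String> parseWithRetry(ChunkParser parser, byte[] chunk,
                                              int maxAttempts, List<String> log) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return parser.parse(chunk);
            } catch (ParseException e) {
                log.add("attempt " + attempt + ": " + e.getMessage());
            }
        }
        return new ArrayList<>();
    }
}
```

In this sketch the caller decides what happens to a failed chunk, so the retry and logging policy lives in one place instead of inside each processor. Retrying only helps when the failure is transient; a deterministic parser would simply log and move on.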