Date: Sat, 14 Mar 2015 21:41:40 +0000 (UTC)
From: "Eric Yang (JIRA)"
To: dev@chukwa.apache.org
Reply-To: dev@chukwa.apache.org
Subject: [jira] [Comment Edited] (CHUKWA-734) Gora Storage System for Chuckwa Logs

    [ https://issues.apache.org/jira/browse/CHUKWA-734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14362042#comment-14362042 ]

Eric Yang edited comment on CHUKWA-734 at 3/14/15 9:40 PM:
-----------------------------------------------------------

Thanks Lewis. A few suggestions:

1. The hbase-client dependency needs to be included for the test case to pass:

{code}
<dependency>
  <groupId>org.apache.hbase</groupId>
  <artifactId>hbase-client</artifactId>
  <version>${hbase.version}</version>
</dependency>
{code}

2. gora.properties is better hosted in the conf directory instead of src/main/resources. This allows users to configure it at deployment time instead of hardcoding it into the jar file.

3. We may want to generate two gora.properties files, one for the test case and one for release. The test one can run with an in-memory database to reduce test running time. The production one is preconfigured with HBase to make it easier for newcomers to adopt this solution.

4. We probably want a developer guide for GoraWriter. It is really powerful stuff that enriches Chukwa's capability to write to different storage systems. A tutorial could help new developers.

5. I encountered an issue when I configured gora.properties to write to HBase from the Chukwa agent. I get this error:

{code}
2015-03-14 14:12:59.451 java[11075:636025] Unable to load realm info from SCDynamicStore
Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.hbase.HTableDescriptor.addFamily(Lorg/apache/hadoop/hbase/HColumnDescriptor;)V
	at org.apache.gora.hbase.store.HBaseMapping$HBaseMappingBuilder.build(HBaseMapping.java:174)
	at org.apache.gora.hbase.store.HBaseStore.readMapping(HBaseStore.java:811)
	at org.apache.gora.hbase.store.HBaseStore.initialize(HBaseStore.java:116)
	at org.apache.gora.store.DataStoreFactory.initializeDataStore(DataStoreFactory.java:101)
	at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:160)
	at org.apache.gora.store.DataStoreFactory.getDataStore(DataStoreFactory.java:277)
	at org.apache.hadoop.chukwa.datacollection.writer.gora.GoraWriter.init(GoraWriter.java:67)
	at org.apache.hadoop.chukwa.datacollection.writer.gora.GoraWriter.<init>(GoraWriter.java:53)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
	at java.lang.Class.newInstance(Class.java:374)
	at org.apache.hadoop.chukwa.datacollection.writer.PipelineStageWriter.init(PipelineStageWriter.java:100)
	at org.apache.hadoop.chukwa.datacollection.writer.PipelineStageWriter.<init>(PipelineStageWriter.java:48)
	at org.apache.hadoop.chukwa.datacollection.connector.PipelineConnector.start(PipelineConnector.java:87)
	at org.apache.hadoop.chukwa.datacollection.agent.ChukwaAgent.main(ChukwaAgent.java:292)
{code}

This is what I added to gora.properties:

{code}
gora.datastore.default=org.apache.gora.hbase.store.HBaseStore
gora.datastore.autocreateschema=true
{code}

I am not sure if the last error was caused by the default Chukwa agent attempting to write system metrics into HBase using Gora. This raises an interesting question about how we want to configure the mapping of data types to writers.

> Gora Storage System for Chuckwa Logs
> ------------------------------------
>
>                 Key: CHUKWA-734
>                 URL: https://issues.apache.org/jira/browse/CHUKWA-734
>             Project: Chukwa
>          Issue Type: New Feature
>          Components: Data Collection
>   Affects Versions: 0.6.0
>            Reporter: Lewis John McGibbney
>             Fix For: 0.7.0
>
>         Attachments: CHUKWA-734.patch, CHUKWA-734v2.patch
>
>   Original Estimate: 5h
>  Remaining Estimate: 5h
>
> I would like to build a Gora-backed log-to-datastore module for Chukwa. I am going to work on this today.
> Gora is an in-memory data modeling and storage abstraction: http://gora.apache.org
> Gora powers the Apache Nutch 2.X software, which generates a bunch of log data. Having a Chukwa monitoring tool for Nutch would be grand.
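
A note on the NoSuchMethodError in item 5 of the comment: the descriptor in the message ends in V, so the Gora HBase module was compiled against an HTableDescriptor whose addFamily returns void. One possible cause (an assumption, not confirmed anywhere in this thread) is that the HBase jar on the agent classpath declares addFamily with a different return type. Below is a minimal sketch of a reflection check that could help confirm or rule this out. The class name MethodSignatureProbe and its describe helper are hypothetical, made up for illustration; only standard java.lang.reflect and java.security calls are used.

```java
import java.lang.reflect.Method;
import java.security.CodeSource;
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic helper, not part of Chukwa, Gora, or HBase.
public class MethodSignatureProbe {

    // Lists every public method called methodName on cls as
    // "returnType name(paramType, ...)" so the return type is visible.
    public static List<String> describe(Class<?> cls, String methodName) {
        List<String> signatures = new ArrayList<String>();
        for (Method m : cls.getMethods()) {
            if (!m.getName().equals(methodName)) {
                continue;
            }
            StringBuilder sb = new StringBuilder();
            sb.append(m.getReturnType().getName()).append(' ');
            sb.append(m.getName()).append('(');
            Class<?>[] params = m.getParameterTypes();
            for (int i = 0; i < params.length; i++) {
                if (i > 0) {
                    sb.append(", ");
                }
                sb.append(params[i].getName());
            }
            sb.append(')');
            signatures.add(sb.toString());
        }
        return signatures;
    }

    public static void main(String[] args) {
        // Defaults target the class and method named in the stack trace.
        String className = args.length > 0 ? args[0] : "org.apache.hadoop.hbase.HTableDescriptor";
        String methodName = args.length > 1 ? args[1] : "addFamily";
        try {
            Class<?> cls = Class.forName(className);
            // Shows which jar actually supplied the class at runtime.
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            System.out.println("Loaded from: " + (src == null ? "(bootstrap classpath)" : src.getLocation()));
            for (String signature : describe(cls, methodName)) {
                System.out.println(signature);
            }
        } catch (ClassNotFoundException e) {
            System.out.println(className + " is not on the classpath");
        }
    }
}
```

Compiled and run with the same classpath as the Chukwa agent, this prints the jar that provided HTableDescriptor and the addFamily signatures it declares. If none of them returns void, the Gora HBase module and the HBase client on the classpath were likely built against different HBase versions.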