chukwa-dev mailing list archives

From "Eric Yang (JIRA)" <j...@apache.org>
Subject [jira] Commented: (CHUKWA-22) Need index for chukwa sequence files
Date Mon, 12 Oct 2009 17:09:31 GMT

    [ https://issues.apache.org/jira/browse/CHUKWA-22?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12764748#action_12764748 ]

Eric Yang commented on CHUKWA-22:
---------------------------------

Chukwa's demux processor already orders the data, so about 95% of the time the writes to
HBase should be sequential.  My test machines also have 16GB of RAM, so I am not seeing the
memory and throughput problems yet.  Maybe my dataset is too small when writing to HBase.
The paper is an interesting read. Thanks for sharing.  I am open to suggestions on indexing
Chukwa data.  Perhaps the data could be managed using TFiles, but this would make Chukwa
repeat a lot of work that HBase already does.  That is something I would like to avoid.
Something to think about.

> Need index for chukwa sequence files
> ------------------------------------
>
>                 Key: CHUKWA-22
>                 URL: https://issues.apache.org/jira/browse/CHUKWA-22
>             Project: Hadoop Chukwa
>          Issue Type: New Feature
>          Components: Data Processors
>         Environment: Redhat EL 5.1 and Java 6
>            Reporter: Eric Yang
>            Assignee: Eric Yang
>
> Chukwa has the ability to collect a large volume of data, but the lack of an index prevents
the Chukwa front end from serving data straight from HDFS.  This JIRA is the placeholder for
designing an indexing service for Chukwa.  The plan is to create an indexing service based on
available software like Lucene or Katta.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

