hadoop-common-dev mailing list archives

From "Arun C Murthy (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-342) Design/Implement a tool to support archival and analysis of logfiles.
Date Tue, 04 Jul 2006 03:04:31 GMT
     [ http://issues.apache.org/jira/browse/HADOOP-342?page=all ]

Arun C Murthy updated HADOOP-342:
---------------------------------

    Attachment: logalyzer.patch

Here's the 'logalyzer' tool.

Doug: I felt that it made sense to create an org.apache.hadoop.tools package for logalyzer
and other such tools in the future... let me know if you prefer it to be in some other package
and I'll update it accordingly.

thanks,
Arun

> Design/Implement a tool to support archival and analysis of logfiles.
> ---------------------------------------------------------------------
>
>          Key: HADOOP-342
>          URL: http://issues.apache.org/jira/browse/HADOOP-342
>      Project: Hadoop
>         Type: New Feature

>     Reporter: Arun C Murthy
>  Attachments: logalyzer.patch
>
> Requirements:
>   a) Create a tool to support archival of logfiles (from diverse sources) in hadoop's dfs.
>   b) The tool should also support analysis of the logfiles via grep/sort primitives.
>      The tool should allow for fairly generic pattern 'grep's and let users 'sort' the
>      matching lines (from grep) on 'columns' of their choice.
>   E.g. from hadoop logs: Look for all log-lines with 'FATAL' and sort them first on
>   timestamps (column x) and then on column y.
> Design/Implementation:
>   a) Log Archival
>     Archival of logs from diverse sources can be accomplished using the *distcp* tool
>     (HADOOP-341).
>   
>   b) Log analysis
>     The idea is to enable users of the tool to perform analysis of logs via grep/sort
>     primitives.
>     This can be accomplished via a relatively simple Map-Reduce task where the map does
>     the *grep* for the given pattern via RegexMapper and then the implicit *sort* (reducer)
>     is used with a custom Comparator which performs the user-specified comparison (columns).
>     The sort/grep specs can be fairly powerful by letting the user of the tool use Java's
>     built-in regex patterns (java.util.regex).
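
The grep-then-sort idea described above can be illustrated with a small, self-contained
sketch in plain Java. This is not the code in logalyzer.patch and it does not use Hadoop's
RegexMapper or any Map-Reduce API; it only mimics the two steps in memory with
java.util.regex and a column-based Comparator. The class name LogGrepSortSketch, its
method names, the whitespace column separator and the sample log lines are all
hypothetical, chosen for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration only: an in-memory stand-in for the map (grep)
// and sort (custom Comparator) steps, not the actual logalyzer implementation.
public class LogGrepSortSketch {

    // "Map" step: keep only the lines in which the pattern is found.
    static List<String> grep(List<String> lines, Pattern pattern) {
        List<String> matches = new ArrayList<>();
        for (String line : lines) {
            if (pattern.matcher(line).find()) {
                matches.add(line);
            }
        }
        return matches;
    }

    // "Sort" step: split each line on the separator regex and compare the
    // chosen 0-based columns in the given order, as plain strings.
    // A column that is missing from a line is treated as an empty string.
    static Comparator<String> byColumns(String separatorRegex, int... columns) {
        Pattern separator = Pattern.compile(separatorRegex);
        return (a, b) -> {
            String[] fieldsA = separator.split(a);
            String[] fieldsB = separator.split(b);
            for (int c : columns) {
                String va = c < fieldsA.length ? fieldsA[c] : "";
                String vb = c < fieldsB.length ? fieldsB[c] : "";
                int cmp = va.compareTo(vb);
                if (cmp != 0) {
                    return cmp;
                }
            }
            return 0;
        };
    }

    // Grep for the pattern, then sort the matching lines on the given columns.
    static List<String> grepAndSort(List<String> lines, String grepRegex,
                                    String separatorRegex, int... columns) {
        List<String> matches = grep(lines, Pattern.compile(grepRegex));
        matches.sort(byColumns(separatorRegex, columns));
        return matches;
    }

    public static void main(String[] args) {
        // Made-up sample lines; column 0 is the date and column 1 the time.
        List<String> log = Arrays.asList(
            "2006-07-03 22:10:05 FATAL dfs.DataNode disk failure",
            "2006-07-03 21:00:00 INFO mapred.JobTracker started",
            "2006-07-03 09:15:42 FATAL mapred.TaskTracker lost heartbeat",
            "2006-07-02 23:59:59 FATAL dfs.NameNode out of memory");

        // All 'FATAL' lines, sorted on column 0 and then on column 1.
        for (String line : grepAndSort(log, "FATAL", "\\s+", 0, 1)) {
            System.out.println(line);
        }
    }
}
```

In a real Map-Reduce job the matching lines would presumably be emitted by the map and
ordered by the framework's sort using the custom Comparator, rather than collected and
sorted in a single list as in this sketch.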

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

