chukwa-dev mailing list archives

From "Jerome Boulon (JIRA)" <j...@apache.org>
Subject [jira] Commented: (CHUKWA-4) Collectors don't finish writing .done datasink from last .chukwa datasink when stopped using bin/stop-collectors
Date Fri, 26 Mar 2010 21:58:27 GMT

    [ https://issues.apache.org/jira/browse/CHUKWA-4?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12850369#action_12850369 ]

Jerome Boulon commented on CHUKWA-4:
------------------------------------

I'm exploring a couple of options:

1- Use the local fileSystem instead of HDFS in the first place. This is working pretty well
for me, since my HDFS is S3 and I cannot really write to it directly.
2- At startup time, any .chukwa file whose last access time is older than two rotation
periods cannot be a file that a collector is still writing to. So we can:
  2.1 Open the file and just read it; if that fails you have to do 2.2 anyway, so I'm not
sure that always doing 2.2 wouldn't be the better option.
  2.2 Open the file, create a new SequenceFile, copy the data from one to the other, then
close the new file and rename it. To avoid being trapped by the same issue in case of a
kill -9/crash, we need to use another extension like .recover, and at startup time we can
delete any .recover file.
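Step 2.2 could be sketched roughly as below. This is a simplified stand-in, not the actual
Chukwa code: it uses length-prefixed byte records instead of the real Hadoop SequenceFile
reader/writer, and the class and method names are hypothetical. It only illustrates the
copy-complete-records-then-rename recovery idea:

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

public class DatasinkRecovery {

    // Copy every complete record from a possibly-truncated sink file into a
    // fresh .recover file, then rename the result to .done. Records here are
    // length-prefixed byte blobs -- a stand-in for real SequenceFile records.
    public static int recover(File chukwaFile, File doneFile) throws IOException {
        File recoverFile = new File(chukwaFile.getPath() + ".recover");
        int copied = 0;
        try (DataInputStream in = new DataInputStream(new FileInputStream(chukwaFile));
             DataOutputStream out = new DataOutputStream(new FileOutputStream(recoverFile))) {
            while (true) {
                int len;
                try {
                    len = in.readInt();
                } catch (EOFException cleanEnd) {
                    break; // clean end of file
                }
                byte[] record = new byte[len];
                try {
                    in.readFully(record);
                } catch (EOFException truncated) {
                    break; // partial record left by the crash: drop it
                }
                out.writeInt(len);
                out.write(record);
                copied++;
            }
        }
        // Rename last, so a crash mid-recovery leaves only a .recover file,
        // which the next startup can simply delete and redo.
        if (!recoverFile.renameTo(doneFile)) {
            throw new IOException("rename failed: " + recoverFile);
        }
        return copied;
    }
}
```

Renaming only after the copy completes is what makes the .recover extension safe: a crash at
any point leaves either the original .chukwa file or a deletable .recover file, never a
half-written .done file.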

Also, at this point there's no valid way to process an invalid SequenceFile in Hadoop, so
paying the price when we know that something could be wrong seems better than crashing an
M/R job with 300 input files and then having to parse them all ....
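The startup scan implied by options 2 and 2.2 might look roughly like this. It uses plain
java.nio.file rather than Hadoop's FileSystem API, and the class name, method name, and
rotation-period parameter are assumptions made for the sketch:

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class StartupScan {

    // At collector startup: delete any leftover .recover file from an
    // interrupted recovery, and return the .chukwa files untouched for more
    // than two rotation periods -- those cannot belong to a live collector,
    // so they are safe candidates for recovery.
    public static List<Path> findStaleSinks(Path sinkDir, long rotationMillis)
            throws IOException {
        long cutoff = System.currentTimeMillis() - 2 * rotationMillis;
        List<Path> stale = new ArrayList<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(sinkDir)) {
            for (Path p : entries) {
                String name = p.getFileName().toString();
                if (name.endsWith(".recover")) {
                    Files.delete(p); // interrupted recovery: redo from the .chukwa file
                } else if (name.endsWith(".chukwa")
                        && Files.getLastModifiedTime(p).toMillis() < cutoff) {
                    stale.add(p);
                }
            }
        }
        return stale;
    }
}
```

Using two rotation periods as the cutoff leaves a full rotation interval of slack, so a
collector that is merely slow to roll over is never mistaken for a dead one.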


> Collectors don't finish writing .done datasink from last .chukwa datasink when stopped
> using bin/stop-collectors
> ----------------------------------------------------------------------------------------------------------------
>
>                 Key: CHUKWA-4
>                 URL: https://issues.apache.org/jira/browse/CHUKWA-4
>             Project: Hadoop Chukwa
>          Issue Type: Bug
>          Components: data collection
>         Environment: I am running on our local cluster. This is a Linux machine that I
> also run a Hadoop cluster from.
>            Reporter: Andy Konwinski
>            Priority: Minor
>
> When I use start-collectors, it creates the datasink as expected and writes to it as per
> normal, i.e. it writes to the .chukwa file, and rollovers work fine when it renames the
> .chukwa file to .done. However, when I use bin/stop-collectors to shut down the running
> collector, it leaves a .chukwa file in the HDFS file system. I'm not sure whether this is
> a valid sink or not, but I think the collector should gracefully clean up the datasink
> and rename it to .done before exiting.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

