hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Trivial Update of "Chukwa_Processes_and_Data_Flow" by BillGraham
Date Wed, 03 Feb 2010 01:40:12 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "Chukwa_Processes_and_Data_Flow" page has been changed by BillGraham.
http://wiki.apache.org/hadoop/Chukwa_Processes_and_Data_Flow?action=diff&rev1=2&rev2=3

--------------------------------------------------

   1. Collectors close chunks and rename them to {{{*.done}}}
    * from: {{{logs/*.chukwa}}}
    * to: {{{logs/*.done}}} 
-  1. DemuxManager wakes up every 20 seconds, runs M/R to merges {{{*.done}}} files and moves them.
+  1. DemuxManager checks for {{{*.done}}} files every 20 seconds.
+   1. If {{{*.done}}} files exist, moves files in place for demux processing:
-   * from: {{{logs/*.done}}}
+    * from: {{{logs/*.done}}}
-   * to: {{{demuxProcessing/mrInput}}}
+    * to: {{{demuxProcessing/mrInput}}}
+   1. If demux is successful within 3 attempts, archives the completed files:
-   * to: {{{demuxProcessing/mrOutput}}}
+    * from: {{{demuxProcessing/mrOutput}}}
-   * to: {{{{{{dataSinkArchives/[yyyyMMdd]/*/*.done}}} 
+    * to: {{{dataSinkArchives/[yyyyMMdd]/*/*.done}}} 
+   1. Otherwise moves the completed files to an error folder:
+    * from: {{{demuxProcessing/mrOutput}}}
+    * to: {{{dataSinkArchives/InError/[yyyyMMdd]/*/*.done}}} 
   1. PostProcessManager wakes up every few minutes and aggregates, orders and de-dups record files.
    * from: {{{postProcess/demuxOutputDir_*/[clusterName]/[dataType]/[dataType]_[yyyyMMdd]_[HH].R.evt}}}
    * to: {{{repos/[clusterName]/[dataType]/[yyyyMMdd]/[HH]/[mm]/[dataType]_[yyyyMMdd]_[HH]_[N].[N].evt}}}
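
The DemuxManager steps in the revised text can be sketched roughly as follows. This is a minimal illustration on a local filesystem, not Chukwa's actual implementation: the function `demux_cycle`, the `run_demux` callback, and the `day` parameter are hypothetical names introduced here. As a simplification, the sketch archives the staged {{{*.done}}} files straight from the staging directory, whereas the page lists {{{demuxProcessing/mrOutput}}} as the source, and it flattens the {{{[yyyyMMdd]/*/}}} level into a single date folder.

```python
import shutil
from datetime import datetime, timezone
from pathlib import Path
from typing import Callable, Optional

MAX_ATTEMPTS = 3  # the page says "within 3 attempts"


def demux_cycle(root: Path,
                run_demux: Callable[[Path], bool],
                day: Optional[str] = None) -> str:
    """Run one check-and-process cycle (hypothetical helper).

    Returns "idle" if no *.done files exist, "archived" if demux
    succeeded, or "error" if every attempt failed.
    """
    done_files = sorted((root / "logs").glob("*.done"))
    if not done_files:
        return "idle"

    day = day or datetime.now(timezone.utc).strftime("%Y%m%d")

    # Move files in place for demux processing: logs/*.done -> mrInput
    staging = root / "demuxProcessing" / "mrInput"
    staging.mkdir(parents=True, exist_ok=True)
    for f in done_files:
        shutil.move(str(f), str(staging / f.name))

    # any() stops at the first successful attempt, so run_demux is
    # called at most MAX_ATTEMPTS times.
    ok = any(run_demux(staging) for _ in range(MAX_ATTEMPTS))

    # Success goes to the dated archive; failure goes to InError.
    archive = root / "dataSinkArchives"
    target = archive / day if ok else archive / "InError" / day
    target.mkdir(parents=True, exist_ok=True)
    for f in sorted(staging.glob("*.done")):
        shutil.move(str(f), str(target / f.name))
    return "archived" if ok else "error"
```

Under the same assumptions, the 20-second check would be a loop that calls `demux_cycle(root, run_demux)` and then sleeps for 20 seconds before checking again.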

