chukwa-dev mailing list archives

From "Ahmed Fathalla (JIRA)" <j...@apache.org>
Subject [jira] Commented: (CHUKWA-4) Collectors don't finish writing .done datasink from last .chukwa datasink when stopped using bin/stop-collectors
Date Wed, 07 Apr 2010 05:49:33 GMT

    [ https://issues.apache.org/jira/browse/CHUKWA-4?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12854357#action_12854357 ]

Ahmed Fathalla commented on CHUKWA-4:
-------------------------------------

I made the changes Jerome recommended and tried them out; it seems to be working correctly.
Please take a look and let me know if you have any comments.

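import org.apache.hadoop.chukwa.ChukwaArchiveKey;
import org.apache.hadoop.chukwa.ChunkImpl;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.ChecksumException;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;
import org.apache.log4j.Logger;

public class CopySequenceFile {
  // Log under this class's own name rather than LocalWriter's
  static Logger log = Logger.getLogger(CopySequenceFile.class);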
  private static SequenceFile.Writer seqFileWriter = null;
  private static SequenceFile.Reader seqFileReader = null; 
  private static FSDataOutputStream newOutputStr = null;
  
  public static void main(String args[]) {
  }
	
  public static void createValidSequenceFile(Configuration conf, String originalFileDir,
      String originalFileName, FileSystem localFs) {
    try {
      String originalCompleteDir = originalFileDir + originalFileName;
      Path originalPath = new Path(originalCompleteDir);
      int extensionIndex = originalFileName.indexOf(".chukwa", 0);
      String recoverDoneFileName = originalFileName.substring(0, extensionIndex) + ".recoverDone";
      String recoverDoneDir = originalFileDir + recoverDoneFileName;
      Path recoverDonePath = new Path(recoverDoneDir);
      String recoverFileName = originalFileName.substring(0, extensionIndex) + ".recover";
      String recoverDir = originalFileDir + recoverFileName;
      Path recoverPath = new Path(recoverDir);
      String doneFileName = originalFileName.substring(0, extensionIndex) + ".done";
      String doneDir = originalFileDir + doneFileName;
      Path donePath = new Path(doneDir);

      newOutputStr = localFs.create(recoverPath);
      seqFileWriter = SequenceFile.createWriter(conf, newOutputStr,
        ChukwaArchiveKey.class, ChunkImpl.class,
        SequenceFile.CompressionType.NONE, null);
      seqFileReader = new SequenceFile.Reader (localFs, originalPath, conf);
        
      System.out.println("key class name is " + seqFileReader.getKeyClassName());
      System.out.println("value class name is " + seqFileReader.getValueClassName());
      ChukwaArchiveKey key = new ChukwaArchiveKey();
      ChunkImpl evt = ChunkImpl.getBlankChunk();
      try {
        while (seqFileReader.next(key, evt)) {
          seqFileWriter.append(key, evt);
        }
      } catch (ChecksumException e) {
        // This exception occurs when we read a bad chunk while copying
        log.warn("Encountered Bad Chunk while copying .chukwa file, continuing", e);
      }
      // Close the reader, writer, and stream first so the recovered file is
      // fully flushed before it is renamed
      seqFileReader.close();
      seqFileWriter.close();
      newOutputStr.close();
      try {
        localFs.rename(recoverPath, recoverDonePath); // Rename the destination file from .recover to .recoverDone
        localFs.delete(originalPath, false); // Delete the original .chukwa file
        localFs.rename(recoverDonePath, donePath); // Rename .recoverDone to .done
      } catch (Exception e) {
        log.warn("Error occurred while renaming .recover to .recoverDone, deleting .chukwa, or renaming .recoverDone to .done", e);
        e.printStackTrace();
      }
    } catch (Exception e) {
      log.warn("Error during .chukwa file recovery", e);
      e.printStackTrace();
    }
  }
}
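
The rename order in the class above can be tried without a Hadoop installation. The following is only a minimal sketch under stated assumptions: it uses java.nio.file on a plain local directory instead of Hadoop's FileSystem, it replaces the record-by-record SequenceFile copy with a plain byte copy, and the names RecoverRenameSketch, withSuffix, and promote are hypothetical and not part of Chukwa. It also checks for a trailing ".chukwa" with endsWith rather than indexOf, so a file name without that extension is rejected up front instead of failing inside substring.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Illustrative only: mirrors the rename order on a plain local directory. */
public class RecoverRenameSketch {

  /** Swaps a trailing ".chukwa" for the given suffix, e.g. ".done". */
  public static String withSuffix(String chukwaFileName, String newSuffix) {
    String ext = ".chukwa";
    if (!chukwaFileName.endsWith(ext)) {
      throw new IllegalArgumentException("not a .chukwa file: " + chukwaFileName);
    }
    return chukwaFileName.substring(0, chukwaFileName.length() - ext.length()) + newSuffix;
  }

  /** Copies the sink to .recover, then promotes it to .done in three steps. */
  public static Path promote(Path dir, String chukwaFileName) throws IOException {
    Path original = dir.resolve(chukwaFileName);
    Path recover = dir.resolve(withSuffix(chukwaFileName, ".recover"));
    Path recoverDone = dir.resolve(withSuffix(chukwaFileName, ".recoverDone"));
    Path done = dir.resolve(withSuffix(chukwaFileName, ".done"));

    // Stand-in for the SequenceFile copy loop (an assumption of this sketch);
    // Files.write closes the output before any rename happens.
    Files.write(recover, Files.readAllBytes(original));

    Files.move(recover, recoverDone); // .recover -> .recoverDone
    Files.delete(original);           // drop the original .chukwa
    Files.move(recoverDone, done);    // .recoverDone -> .done
    return done;
  }

  public static void main(String[] args) throws IOException {
    Path dir = Files.createTempDirectory("sink");
    Files.write(dir.resolve("example.chukwa"), "payload".getBytes(StandardCharsets.UTF_8));
    Path done = promote(dir, "example.chukwa");
    System.out.println(done.getFileName());
  }
}
```

One reading of the intermediate .recoverDone name (an interpretation, not something stated in this message) is that it marks a copy that finished completely, so a crash before the original .chukwa file is deleted leaves a state that can be told apart from a half-written .recover file.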


> Collectors don't finish writing .done datasink from last .chukwa datasink when stopped using bin/stop-collectors
> ----------------------------------------------------------------------------------------------------------------
>
>                 Key: CHUKWA-4
>                 URL: https://issues.apache.org/jira/browse/CHUKWA-4
>             Project: Hadoop Chukwa
>          Issue Type: Bug
>          Components: data collection
>         Environment: I am running on our local cluster. This is a Linux machine that I also run the Hadoop cluster from.
>            Reporter: Andy Konwinski
>            Priority: Minor
>
> When I use start-collectors, it creates the datasink as expected and writes to it as
> normal, i.e. it writes to the .chukwa file, and rollovers work fine, renaming the .chukwa
> file to .done. However, when I use bin/stop-collectors to shut down the running collector,
> it leaves a .chukwa file in the HDFS file system. I am not sure whether this is a valid
> sink or not, but I think the collector should gracefully clean up the datasink and rename
> it to .done before exiting.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

