hadoop-hdfs-user mailing list archives

From Denny Ye <denny...@gmail.com>
Subject Re: Nonempty files become empty after being saved to Hdfs.
Date Fri, 24 Feb 2012 07:55:13 GMT
hi Peter,
    Is there really no operation log at all? Please enable the state-change
log at DEBUG level.
    I checked the code path for creating the 'symlink' file. It looks like a
regular operation, same as the other NameNode interfaces. GC pauses, the
directory lock, or the shell script would be my doubtful points.
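For reference, raising the state-change log to DEBUG on the NameNode is normally done through its log4j configuration. A minimal sketch follows; the logger name matches the one appearing in the log lines quoted below, but verify it against the conf/log4j.properties shipped with your Hadoop version:

```properties
# NameNode conf/log4j.properties: raise the state-change logger to DEBUG
log4j.logger.org.apache.hadoop.hdfs.StateChange=DEBUG
```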

Denny Ye

2012/2/24 Tianqiang Peter Li <peterltq@gmail.com>

> Hi, Denny,
> Thanks for the info; you were right about this. After some digging, I found
> there was no data loss. The root cause of the empty files is as follows:
> during file rotation, the old file was closed and a new empty file was
> created/opened; then hdfs tried to create the small file (acting as a
> symbolic link in hdfs) for it, but for some reason (gc?) the operation to
> create this symlink file was stuck for 2 minutes. Right after it finally
> succeeded, scribe ran its periodic check, found the file had already passed
> the rotation window (the rotation period is 1 min), and rotated out this
> newly-created, still-empty file. It's good to know there is no data
> integrity issue. I am still looking into where this big hiccup comes from.
> Anyway, thanks for the helpful comments on this.
> -Peter
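The timing race Peter describes can be sketched as a small simulation. This is only an illustration of the reported timeline (the 2-minute stall and the 1-minute rotation window come from the thread; the function and variable names are hypothetical, not Scribe's actual code):

```python
ROTATION_WINDOW_SEC = 60  # scribe's rotation period from the thread (1 min)

def should_rotate(created_at, now):
    """Periodic check: rotate any file older than the rotation window,
    regardless of whether any data has been written to it yet."""
    return now - created_at >= ROTATION_WINDOW_SEC

# Timeline reconstructed from the thread (seconds, illustrative):
t_create = 0                   # new empty file opened after old file closes
t_symlink_done = 120           # symlink-file creation stalls ~2 min (gc?)
t_check = t_symlink_done + 1   # periodic rotation check fires right after

bytes_written = 0              # nothing was written while the call was stuck

if should_rotate(t_create, t_check):
    # the brand-new file is already past the window, so it is rotated
    # out while still empty, matching the zero-byte files seen on HDFS
    print(f"rotated out after {t_check - t_create}s with {bytes_written} bytes")
```

The check is purely age-based, so any stall longer than the window between open and first write reproduces the empty-file symptom.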
> On Sun, Feb 19, 2012 at 6:25 PM, Denny Ye <dennyy99@gmail.com> wrote:
>> hi Peter,
>>        I have met this problem in our environment before. It may be caused
>> by an empty data stream from Scribe to HDFS: in the fixed-rotation model, a
>> file may receive no data at all, and the NameNode only starts the 'allocate
>> block' operation when the first data packet arrives from the HDFS client.
>> This is just one hint regarding your doubt.
>> -Regards
>> Denny Ye
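The lazy allocation Denny describes also explains the missing allocateBlock line in the logs below. A toy model, not the real NameNode code (all class and path names here are hypothetical):

```python
class MiniNameNode:
    """Toy model: create() only opens the file and takes a lease; a block
    is allocated on the first data packet, so a stream that is never
    written to produces no allocateBlock log entry at all."""
    def __init__(self):
        self.log = []

    def create(self, path):
        self.log.append(f"create {path}")
        return MiniStream(self, path)

class MiniStream:
    def __init__(self, nn, path):
        self.nn, self.path, self.block = nn, path, None

    def write(self, data):
        if self.block is None:   # first packet triggers block allocation
            self.block = f"blk_for_{self.path}"
            self.nn.log.append(f"allocateBlock {self.path}")

    def close(self):
        self.nn.log.append(f"completeFile {self.path}")

nn = MiniNameNode()

empty = nn.create("/prefix_00409")
empty.close()                    # no write -> no allocateBlock line

normal = nn.create("/prefix_00410")
normal.write(b"payload")         # first packet -> allocateBlock logged
normal.close()
```

Running this leaves `create` and `completeFile` entries for both paths, but an `allocateBlock` entry only for the one that was written to, mirroring the normal/problematic log pair quoted further down.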
>> 2012/2/19 Tianqiang Peter Li <peterltq@gmail.com>
>>> Hi, guys,
>>> I am using scribe to write data directly to hdfs. It works well most of
>>> the time, but sporadically some of the files (1-2 out of 6000 per day)
>>> written to hdfs turn out to be empty. Interestingly, this only happens
>>> when hdfs is busy handling lots of other write requests at the same time.
>>> Below are logs pulled from the namenode's output; it seems the
>>> allocateBlock operation that should follow the create operation is
>>> missing, unlike in other normal operations.
>>> Here is a comparison of a normal file and an abnormal one:
>>> ----------- *Normal* one  (
>>> file:/prefix-2012-02-18_00410 )---------------------------------
>>> 2012-02-18 02:32:15,873 INFO
>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: ugi=user ip=/
>>>        cmd=create      src=/prefix-2012-02-18_00410
>>> dst=null        perm=user:supergroup:rw-r--r--
>>> 2012-02-18 02:32:15,905 INFO org.apache.hadoop.hdfs.StateChange: *BLOCK*
>>> NameSystem.allocateBlock*: /prefix-2012-02-18_00410.
>>> blk_-2839230806711337016_129091175
>>> 2012-02-18 02:32:17,422 INFO org.apache.hadoop.hdfs.StateChange:
>>> Removing lease on  file /prefix-2012-02-18_00410 from client
>>> DFSClient_-36442571
>>> 2012-02-18 02:32:17,422 INFO org.apache.hadoop.hdfs.StateChange: DIR*
>>> NameSystem.completeFile: file /prefix-2012-02-18_00410 is closed by
>>> DFSClient_-36442571
>>> ----------- End --------------
>>> ----------- *Problematic* one (
>>> file:/prefix-2012-02-18_00409 )----------------------------
>>> 2012-02-18 02:30:47,287 INFO
>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: ugi=user ip=/
>>>        cmd=create      src=/prefix-2012-02-18_00409
>>> dst=null        perm=user:supergroup:rw-r--r--
>>> *<comment: no allocateBlock entry is found, and there is a 60-90 sec gap
>>> between the two surrounding log entries>*
>>> 2012-02-18 02:32:15,648 INFO org.apache.hadoop.hdfs.StateChange:
>>> Removing lease on  file /prefix-2012-02-18_00409 from client
>>> DFSClient_748605466
>>> 2012-02-18 02:32:15,648 INFO org.apache.hadoop.hdfs.StateChange: DIR*
>>> NameSystem.completeFile: file /prefix-2012-02-18_00409 is closed by
>>> DFSClient_748605466
>>> ----------- End -------------------
>>> Not sure whether this is more scribe-related or hdfs-related, so let me
>>> try asking the question on this mailing list.
>>> The 90-sec delay looks suspicious to me and, I guess, might be the cause
>>> of this issue. Have you guys run into this situation before? Any light
>>> you can shed on why this happens, and how to work around it to guarantee
>>> the entire file is successfully saved into Hdfs? Any places I should
>>> focus on for improvements, either on the hdfs side (e.g. hdfs tuning, gc
>>> tuning) or the application side (e.g. timeout settings on hdfs api calls
>>> in scribe)? Your suggestions/thoughts are greatly appreciated.
>>> Regards,
>>> Peter
