hadoop-common-user mailing list archives

From Mohit Anchlia <mohitanch...@gmail.com>
Subject Re: Sync and Data Replication
Date Sun, 10 Jun 2012 19:17:19 GMT
On Sun, Jun 10, 2012 at 9:39 AM, Harsh J <harsh@cloudera.com> wrote:

> Mohit,
>
> On Sat, Jun 9, 2012 at 11:11 PM, Mohit Anchlia <mohitanchlia@gmail.com>
> wrote:
> > Thanks, Harsh, for the detailed info. It clears things up. The only thing
> > on those pages that concerns me is what happens when the client crashes.
> > It says you could lose up to a block's worth of information. Is this still
> > true given that the NN would auto-close the file?
>
> Where does it say this exactly? It is true that immediate readers will
> not get the last block (as it remains open and uncommitted), but once
> the lease recovery kicks in the file is closed successfully and the
> last block is indeed made available, so there's no 'data loss'.
>

I saw it in the "Coherency Model" -> "consequences of application design"
paragraph.

Thanks for the information. At least it means I don't have to worry about
data loss when a file is synced but never closed.

>
> > Is it good practice to reduce the NN default value so that it auto-closes
> > files in less than 1 hr?
>
> I've not seen people do this or need to do this. Most don't run into such
> a situation, and it is vital to properly close() files or sync() file
> streams before making them available to readers. HBase manages open
> files during WAL-recovery using lightweight recoverLease APIs that
> were added for its benefit, so it doesn't need to wait for an hour for
> WALs to close and recover data.
>
> --
> Harsh J
>
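
To check my understanding of the above, here is a rough sketch in plain Java. It is a toy model only, not Hadoop code: the class LeaseSketch, its methods, and the single-hard-limit rule are all made up for illustration, and real HDFS has more moving parts (for example, a separate soft limit). It shows unsynced bytes staying hidden from readers until close(), sync(), an expired lease, or an explicit recoverLease() call publishes them.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of write leases on a NameNode-like tracker. NOT Hadoop code. */
class LeaseSketch {
    // One hour, matching the auto-close delay mentioned in this thread.
    static final long HARD_LIMIT_MS = 60L * 60 * 1000;

    private static final class OpenFile {
        long lastRenewalMs; // last time the writer was heard from
        long visibleBytes;  // bytes that readers can see
        long pendingBytes;  // bytes in the open, uncommitted last block
        boolean closed;
    }

    private final Map<String, OpenFile> files = new HashMap<>();

    void create(String path, long nowMs) {
        OpenFile f = new OpenFile();
        f.lastRenewalMs = nowMs;
        files.put(path, f);
    }

    /** The writer appends bytes; they stay hidden until published. */
    void write(String path, long bytes, long nowMs) {
        OpenFile f = lookup(path, true);
        f.pendingBytes += bytes;
        f.lastRenewalMs = nowMs;
    }

    /** sync(): publish pending bytes but keep the file open. */
    void sync(String path, long nowMs) {
        OpenFile f = lookup(path, true);
        publish(f);
        f.lastRenewalMs = nowMs;
    }

    /** close(): publish pending bytes and close the file. */
    void close(String path) {
        finish(lookup(path, true));
    }

    /** Another client forces recovery now instead of waiting for expiry. */
    boolean recoverLease(String path) {
        finish(lookup(path, false));
        return true; // in this toy model, recovery always completes at once
    }

    /** Periodic check: close files whose writer has been silent too long. */
    void tick(long nowMs) {
        for (OpenFile f : files.values()) {
            if (!f.closed && nowMs - f.lastRenewalMs >= HARD_LIMIT_MS) {
                finish(f);
            }
        }
    }

    long visibleBytes(String path) { return lookup(path, false).visibleBytes; }

    boolean isClosed(String path) { return lookup(path, false).closed; }

    private OpenFile lookup(String path, boolean mustBeOpen) {
        OpenFile f = files.get(path);
        if (f == null) throw new IllegalArgumentException("no such file: " + path);
        if (mustBeOpen && f.closed) throw new IllegalStateException("closed: " + path);
        return f;
    }

    private static void publish(OpenFile f) {
        f.visibleBytes += f.pendingBytes;
        f.pendingBytes = 0;
    }

    private static void finish(OpenFile f) {
        publish(f);
        f.closed = true;
    }

    public static void main(String[] args) {
        LeaseSketch nn = new LeaseSketch();
        nn.create("/wal", 0L);
        nn.write("/wal", 100L, 0L);          // the writer "crashes" after this
        System.out.println("visible before recovery: " + nn.visibleBytes("/wal"));
        nn.recoverLease("/wal");             // a reader forces recovery early
        System.out.println("visible after recovery:  " + nn.visibleBytes("/wal"));
    }
}
```

If I have this right, the last block of a crashed writer is hidden only until recovery runs, and recoverLease() just makes that happen sooner than the one-hour expiry.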
