accumulo-user mailing list archives

From John Vines <john.w.vi...@ugov.gov>
Subject Re: EXTERNAL: Re: Accumulo and file locality
Date Mon, 25 Jun 2012 18:09:37 GMT
Essentially it will write to the local disk. Technically, it is writing to
HDFS, and when it writes to HDFS that write goes to local disk among other
places. When you read from that tserver, it will read through HDFS. And if
the datanode on that box hosts the block being read, then there will be read
locality. Once the tablet migrates to a tserver without locality, there
will be reads across the network until a major compaction occurs and makes
the file local again.

John
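
To make the sequence above concrete, here is a toy model in Python. It is only a sketch, not Accumulo or HDFS code: the host names and the functions `place_replicas` and `is_local_read` are hypothetical, and replica placement is reduced to "first copy on the writer's own datanode if it has one, remaining copies on other nodes in list order" (real HDFS placement also considers racks and load).

```python
def place_replicas(writer, datanodes, replication=3):
    """Simplified placement: first replica goes to the writer's own datanode
    when the writer runs one; the rest go to other datanodes in list order."""
    others = [d for d in datanodes if d != writer]
    if writer in datanodes:
        return [writer] + others[:replication - 1]
    # Writer has no co-located datanode: every replica lands on a remote node.
    return others[:replication]


def is_local_read(tserver, replicas):
    """A read is local only if the reading tserver's host holds a replica."""
    return tserver in replicas


datanodes = ["node1", "node2", "node3", "node4"]

# A tserver on node1 writes a file: one replica lands on node1 itself.
replicas = place_replicas("node1", datanodes)
print(is_local_read("node1", replicas))   # read on the writing host

# The tablet migrates to node4, which holds no replica of that file.
print(is_local_read("node4", replicas))   # read goes over the network

# A major compaction on node4 rewrites the file, so node4 gets a replica.
compacted = place_replicas("node4", datanodes)
print(is_local_read("node4", compacted))  # locality restored
```

In this model the three checks come out local, remote, local, which mirrors the write, migrate, and major-compaction steps described above.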

On Mon, Jun 25, 2012 at 1:58 PM, Cardon, Tejay E <tejay.e.cardon@lmco.com> wrote:

>  John, let me make sure I understand.  When a tserver is running on the
> same physical box as a datanode, it will write to the local disk.  HDFS
> will then replicate that write across the network.  When I read from that
> tserver, it will not need a network read (assuming no failures).  Is that
> correct?
>
> Thanks,
> Tejay
>
> *From:* John Vines [mailto:john.w.vines@ugov.gov]
> *Sent:* Monday, June 25, 2012 11:46 AM
> *To:* user@accumulo.apache.org
> *Subject:* EXTERNAL: Re: Accumulo and file locality
>
> When a tserver writes, it writes out to hdfs. When you utilize the hdfs
> api, data will be written first to the local datanode. So it does go to
> local disk, but it's local disk as well as others via datanodes. So each
> tserver should run a datanode so you actually get locality. If there is no
> datanode where the tserver is, then all reads and writes go over the
> network, which is suboptimal.
>
> John
>
> On Mon, Jun 25, 2012 at 1:21 PM, William Slacum <wslacum@gmail.com> wrote:
>
> The loggers will write to local disk, however, the TabletServer will
> write out files to HDFS during major and minor compactions.
>
> I don't know how complex the tablet assignment algorithm is, but it's
> safe to assume that if your tablet spans multiple HDFS blocks, a
> TServer will, in all likelihood, only be hosting 1 HDFS block of a
> given tablet at any given time, and do fetches for other HDFS (and
> RFile) blocks as the need arises. There is a caching mechanism for
> holding on to RFile blocks.
>
>
> On Mon, Jun 25, 2012 at 10:13 AM, Cardon, Tejay E
> <tejay.e.cardon@lmco.com> wrote:
> > All,
> >
> >                 If I understand things correctly, when an Accumulo tablet
> > server writes data, things are organized such that those writes go to the
> > local disk (ie each tablet server writes and reads data to/from the disk
> > local to that server).  Is this correct?  And if so, then is it correct to
> > assume that every tablet server should run on an HDFS data node?  Or am I
> > completely off base here?
> >
> >
> >
> > Thanks,
> >
> > Tejay Cardon
>
