accumulo-user mailing list archives

From "Dickson, Matt MR" <matt.dick...@defence.gov.au>
Subject RE: Meaning of < in METADATA table [SEC=UNOFFICIAL]
Date Thu, 26 Jun 2014 01:47:00 GMT
UNOFFICIAL

I just noticed that the early release of the O'Reilly book 'Accumulo' on Safari Books Online has
a chapter dedicated to the metadata table.  It outlines each of the entries nicely.

-----Original Message-----
From: David Medinets [mailto:david.medinets@gmail.com]
Sent: Thursday, 26 June 2014 00:14
To: accumulo-user
Subject: Re: Meaning of < in METADATA table [SEC=UNOFFICIAL]

I've added this information to my "Data Distribution Throughout the Accumulo Cluster" page
at https://github.com/medined/D4M_Schema/blob/master/docs/data_distribution.md#user-content-example-of-splits-inside-the-metadata-table.

On Tue, Jun 24, 2014 at 11:13 PM, Eric Newton <eric.newton@gmail.com> wrote:
> That's an excellent idea.  I would still like to distill the 
> information into a talk, but it would be nice if the information was also in the code.
>
>
>
>
> On Tue, Jun 24, 2014 at 10:53 PM, Sean Busbey <busbey@cloudera.com> wrote:
>>
>> Couldn't we add this kind of documentation to e.g. the MetadataSchema 
>> class?
>>
>> Or some developer-focused architecture docs?
>>
>>
>>
>> On Jun 24, 2014 9:48 PM, "Eric Newton" <eric.newton@gmail.com> wrote:
>> >
>> > "<" means no end-row.  It's the last tablet, which is often called 
>> > the default tablet.
>> >
>> > So,
>> >
>> >> 3p< ~tab:~pr \x00
>> >
>> >
>> > You can decode this as a tablet covering (-inf, +inf) for table id 3p.
>> >
>> >> 3p;a ~tab:~pr \x00
>> >> 3p;m ~tab:~pr \x01a
>> >> 3p;z ~tab:~pr \x01m
>> >> 3p<  ~tab:~pr \x01z
>> >
>> >
>> > This table, id "3p" has splits: (-inf, a], (a, m], (m, z], (z, +inf).
>> > "pr" stands for "end-row of previous tablet", which we often 
>> > shorten to "prevrow". Tildes sort late in UTF-8, which is important 
>> > for some race conditions when a table is splitting.
>> >
>> > I'm putting together a new presentation on the decoding of the 
>> > metadata tables.  I need to get the presentation approved by my 
>> > colleagues, so it's going to be some time before it is ready.  I 
>> > would like to write a shell Formatter that would make the metadata more
>> > human friendly.
>> >
>> > If you have any other questions about the metadata table, please ask.
>> > I'll make sure the answers are in the presentation.
>> >
>> > -Eric
>> >
>> >
>> >
>> >
>> > On Tue, Jun 24, 2014 at 10:21 PM, William Slacum 
>> > <wilhelm.von.cloud@accumulo.net> wrote:
>> >>
>> >> < is a byte used for doing an ordering on rows that share the same 
>> >> prefix.
>> >>
>> >> There was a presentation floating around on the specifics of the 
>> >> metadata table at one point. I believe that helps tablet 
>> >> information sort before the last tablet, which is suffixed with 
>> >> '~', to force it to sort after the other tablets. We'll probably 
>> >> get an Eric or Keith email soon laying down the law, but that's what I remember.
>> >>
>> >>
>> >> On Tue, Jun 24, 2014 at 9:57 PM, Dickson, Matt MR 
>> >> <matt.dickson@defence.gov.au> wrote:
>> >>>
>> >>> UNOFFICIAL
>> >>>
>> >>> When looking up rfile references in the metadata table we normally see
>> >>> <tableid>;<range> for the rowid.  I've noticed some rowids are <tableid><,
>> >>> e.g. 3p<
>> >>>
>> >>> Is this because the table is small and hasn't been split or some 
>> >>> other reason?
>> >>
>> >>
>> >
>
>
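
For reference, the decoding Eric describes above can be sketched in Python. This is only a minimal sketch based on the example entries in this thread: it assumes a row ID is either <tableid>;<end-row> or <tableid><, and that the ~tab:~pr value starts with \x00 when there is no previous end-row and with \x01 followed by the previous end-row otherwise. The function names (parse_row, decode_prev_row, tablet_range) are made up for illustration and are not part of Accumulo.

```python
from typing import Optional, Tuple


def parse_row(row: str) -> Tuple[str, Optional[str]]:
    """Split a metadata row ID into (table_id, end_row).

    end_row is None for the default (last) tablet, whose row ID is
    "<tableid><". Assumes table IDs never contain ';' or '<'.
    """
    table_id, sep, end_row = row.partition(";")
    if sep:
        return table_id, end_row
    if row.endswith("<"):
        return row[:-1], None
    raise ValueError("not a tablet row ID: %r" % row)


def decode_prev_row(value: bytes) -> Optional[bytes]:
    """Decode a ~tab:~pr value as it appears in the thread's examples.

    Assumption: a leading \\x00 means "no previous end-row" and a leading
    \\x01 is followed by the previous tablet's end-row.
    """
    if value[:1] == b"\x00":
        return None
    if value[:1] == b"\x01":
        return value[1:]
    raise ValueError("unexpected prevrow encoding: %r" % value)


def tablet_range(row: str, value: bytes) -> str:
    """Render the row range a tablet covers, e.g. "(a, m]"."""
    _, end_row = parse_row(row)
    prev_row = decode_prev_row(value)
    lower = "-inf" if prev_row is None else prev_row.decode("utf-8")
    upper = "+inf)" if end_row is None else end_row + "]"
    return "(" + lower + ", " + upper


# The four example entries for table "3p" quoted in the thread.
entries = [
    ("3p;a", b"\x00"),
    ("3p;m", b"\x01a"),
    ("3p;z", b"\x01m"),
    ("3p<", b"\x01z"),
]
for row, value in entries:
    print(row, "->", tablet_range(row, value))
```

Because ';' is byte 0x3B and '<' is byte 0x3C, the default tablet's row (3p<) sorts after every 3p;<end-row> row of the same table, which matches the ordering of the entries shown above.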
