accumulo-user mailing list archives

From David Medinets <david.medin...@gmail.com>
Subject Re: Meaning of < in METADATA table [SEC=UNOFFICIAL]
Date Wed, 25 Jun 2014 14:14:12 GMT
I've added this information to my "Data Distribution Throughout the
Accumulo Cluster" page at
https://github.com/medined/D4M_Schema/blob/master/docs/data_distribution.md#user-content-example-of-splits-inside-the-metadata-table.
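
As a rough illustration of the decoding Eric describes in the quoted message below, here is a minimal Python sketch. It assumes only what the thread shows: a row of the form <tableid>;<endrow> or <tableid><, and a ~tab:~pr value that is \x00 for "no previous end row" or \x01 followed by the previous tablet's end row. The names TabletExtent and decode_tablet are hypothetical helpers made up for this example and are not part of Accumulo.

```python
from typing import NamedTuple, Optional


class TabletExtent(NamedTuple):
    """Hypothetical holder for one decoded metadata entry."""
    table_id: str
    prev_end_row: Optional[str]  # None means -inf (no previous tablet)
    end_row: Optional[str]       # None means +inf (the default tablet)

    def __str__(self) -> str:
        lo = "-inf" if self.prev_end_row is None else self.prev_end_row
        hi = "+inf)" if self.end_row is None else self.end_row + "]"
        return f"{self.table_id}: ({lo}, {hi}"


def decode_tablet(row: str, prev_row_value: bytes) -> TabletExtent:
    """Decode a metadata row and its ~tab:~pr value into a tablet range."""
    if ";" in row:
        # "<tableid>;<endrow>": a tablet with an explicit end row.
        table_id, end_row = row.split(";", 1)
    elif row.endswith("<"):
        # "<tableid><": no end row, i.e. the last (default) tablet.
        table_id, end_row = row[:-1], None
    else:
        raise ValueError(f"unrecognized metadata row: {row!r}")

    if prev_row_value[:1] == b"\x00":
        # A leading \x00 byte marks "no previous end row".
        prev_end_row = None
    elif prev_row_value[:1] == b"\x01":
        # A leading \x01 byte is followed by the previous tablet's end row.
        prev_end_row = prev_row_value[1:].decode("utf-8", errors="replace")
    else:
        raise ValueError(f"unrecognized ~pr value: {prev_row_value!r}")

    return TabletExtent(table_id, prev_end_row, end_row)


if __name__ == "__main__":
    # The four entries from the example in this thread.
    entries = [
        ("3p;a", b"\x00"),
        ("3p;m", b"\x01a"),
        ("3p;z", b"\x01m"),
        ("3p<", b"\x01z"),
    ]
    for row, value in entries:
        print(decode_tablet(row, value))
```

Run against the four example entries, this prints one range per tablet, from (-inf, a] through (z, +inf). Checking for ";" before "<" is a deliberate choice in this sketch, so that an end row which itself ends in "<" is not mistaken for the default tablet.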

On Tue, Jun 24, 2014 at 11:13 PM, Eric Newton <eric.newton@gmail.com> wrote:
> That's an excellent idea.  I would still like to distill the information
> into a talk, but it would be nice if the information was also in the code.
>
>
>
>
> On Tue, Jun 24, 2014 at 10:53 PM, Sean Busbey <busbey@cloudera.com> wrote:
>>
>> Couldn't we add this kind of documentation to e.g. the MetadataSchema
>> class?
>>
>> Or some developer-focused architecture docs?
>>
>>
>>
>> On Jun 24, 2014 9:48 PM, "Eric Newton" <eric.newton@gmail.com> wrote:
>> >
>> > "<" means no end-row.  It's the last tablet, which is often called the
>> > default tablet.
>> >
>> > So,
>> >
>> >> 3p< ~tab:~pr \x00
>> >
>> >
>> > You can decode this as a tablet covering (-inf, +inf) for table id 3p.
>> >
>> >> 3p;a ~tab:~pr \x00
>> >> 3p;m ~tab:~pr \x01a
>> >> 3p;z ~tab:~pr \x01m
>> >> 3p<  ~tab:~pr \x01z
>> >
>> >
>> > This table, id "3p", has four tablets: (-inf, a], (a, m], (m, z], (z, +inf).
>> > "pr" stands for "end-row of previous tablet", which we often shorten to
>> > "prevrow". Tildes sort late in UTF-8, which is important for some race
>> > conditions when a table is splitting.
>> >
>> > I'm putting together a new presentation on the decoding of the metadata
>> > tables.  I need to get the presentation approved by my colleagues, so it's
>> > going to be some time before it is ready.  I would like to write a shell
>> > Formatter that would make the metadata more human friendly.
>> >
>> > If you have any other questions about the metadata table, please ask.
>> > I'll make sure the answers are in the presentation.
>> >
>> > -Eric
>> >
>> >
>> >
>> >
>> > On Tue, Jun 24, 2014 at 10:21 PM, William Slacum
>> > <wilhelm.von.cloud@accumulo.net> wrote:
>> >>
>> >> < is a byte used for doing an ordering on rows that share the same
>> >> prefix.
>> >>
>> >> There was a presentation floating around on the specifics of the
>> >> metadata table at one point. I believe that helps tablet information sort
>> >> before the last tablet, which is suffixed with '~', to force it to sort
>> >> after the other tablets. We'll probably get an Eric or Keith email soon
>> >> laying down the law, but that's what I remember.
>> >>
>> >>
>> >> On Tue, Jun 24, 2014 at 9:57 PM, Dickson, Matt MR
>> >> <matt.dickson@defence.gov.au> wrote:
>> >>>
>> >>> UNOFFICIAL
>> >>>
>> >>> When looking up rfile references in the metadata table we normally see
>> >>> <tableid>;<range> for the rowid.   I've noticed some rowids are
>> >>> <tableid><, e.g. 3p<
>> >>>
>> >>> Is this because the table is small and hasn't been split or some other
>> >>> reason?
>> >>
>> >>
>> >
>
>
