hbase-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: Hbase fast access
Date Fri, 21 Oct 2016 20:46:46 GMT
bq. this search is carried out through map-reduce on region servers?

No map-reduce. The region server uses its own thread(s).

bq. all updates are done in memory o disk access

Can you clarify? Some letters seem to be missing.
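
As a rough illustration of the write and read path described in the quoted
message below (log first, then the in-memory store, then a flush to disk, with
reads checking memory before store files), here is a toy Python sketch. It is
not HBase code: the class name TinyLSMStore, its methods and the
flush_threshold parameter are made up for illustration, and real HBase adds
much more (block cache, bloom filters, files stored on HDFS).

```python
import bisect


class TinyLSMStore:
    """Toy model of one store: a log, a memstore and sorted store files."""

    def __init__(self, flush_threshold=3):
        self.wal = []          # sequential log, appended to before the memstore
        self.memstore = {}     # recent writes held in memory
        self.store_files = []  # immutable sorted [(key, value), ...], oldest first
        self.flush_threshold = flush_threshold  # hypothetical tuning knob

    def put(self, key, value):
        self.wal.append((key, value))  # 1. sequential log write
        self.memstore[key] = value     # 2. in-memory update
        if len(self.memstore) >= self.flush_threshold:
            self.flush()               # 3. flush once the memstore is "full"

    def flush(self):
        """Write the memstore out as one new sorted, immutable store file."""
        if self.memstore:
            self.store_files.append(sorted(self.memstore.items()))
            self.memstore = {}
            self.wal.clear()  # simplification: flushed edits no longer need the log

    def get(self, key):
        """Check memory first, then store files from newest to oldest."""
        if key in self.memstore:
            return self.memstore[key]
        for store_file in reversed(self.store_files):
            # (key,) sorts just before any (key, value) pair with the same key,
            # so bisect_left lands on that pair if it exists.
            i = bisect.bisect_left(store_file, (key,))
            if i < len(store_file) and store_file[i][0] == key:
                return store_file[i][1]
        return None

    def compact(self):
        """Merge all store files into one; newer values win."""
        merged = {}
        for store_file in self.store_files:  # oldest first, so later files overwrite
            merged.update(store_file)
        self.store_files = [sorted(merged.items())] if merged else []


store = TinyLSMStore(flush_threshold=2)
store.put("row1", "a")
store.put("row2", "b")    # second put reaches the threshold and flushes
store.put("row1", "c")    # newer value for row1 stays in the memstore
print(store.get("row1"))  # served from memory
print(store.get("row2"))  # served from a store file
```

Note that the lookup runs in the calling thread with no map-reduce job
involved, and that compaction only reduces how many files a read has to check.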

On Fri, Oct 21, 2016 at 1:43 PM, Mich Talebzadeh <mich.talebzadeh@gmail.com>
wrote:

> thanks
>
> having read the docs, it appears to me that the main reasons for hbase being
> faster are:
>
>
>    1. it behaves like an rdbms such as oracle etc. reads are looked for in
>    the buffer cache for consistent reads and if not found then store files
>    on disks are searched. Does this mean that this search is carried out
>    through map-reduce on region servers?
>    2. when the data is written it is written to log file sequentially
>    first, then to in-memory store, sorted like b-tree of rdbms and then
>    flushed to disk. this is exactly what checkpoint in an rdbms does
>    3. one can point out that hbase is faster because log structured merge
>    tree (LSM-trees)  has less depth than a B-tree in rdbms.
>    4. all updates are done in memory o disk access
>    5. in summary LSM-trees reduce disk access when data is read from disk
>    because of reduced seek time again less depth to get data with LSM-tree
>
>
> appreciate any comments
>
>
> cheers
>
>
> Dr Mich Talebzadeh
>
>
>
> LinkedIn * https://www.linkedin.com/profile/view?id=
> AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
> <https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCd
> OABUrV8Pw>*
>
>
>
> http://talebzadehmich.wordpress.com
>
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
>
>
> On 21 October 2016 at 17:51, Ted Yu <yuzhihong@gmail.com> wrote:
>
> > See this earlier blog post:
> >
> > http://www.cyanny.com/2014/03/13/hbase-architecture-analysis-part1-logical-architecture/
> >
> > w.r.t. compaction in Hive, it is used to compact deltas into a base file
> > (in the context of transactions).  Likely they're different.
> >
> > Cheers
> >
> > On Fri, Oct 21, 2016 at 9:08 AM, Mich Talebzadeh <
> > mich.talebzadeh@gmail.com>
> > wrote:
> >
> > > Hi,
> > >
> > > Can someone explain in a nutshell how Hbase uses the log-structured
> > > merge-tree (LSM-tree) as its data storage architecture?
> > >
> > > The idea of merging smaller files into larger files periodically to
> > > reduce disk seeks, is this a similar concept to compaction in HDFS or Hive?
> > >
> > > Thanks
> > >
> > >
> > > On 21 October 2016 at 15:27, Mich Talebzadeh <
> mich.talebzadeh@gmail.com>
> > > wrote:
> > >
> > > > > Sorry, that should read Hive, not Spark, here:
> > > > >
> > > > > Say compared to Spark that is basically a SQL layer relying on
> > > > > different engines (mr, Tez, Spark) to execute the code
> > > >
> > > > Dr Mich Talebzadeh
> > > >
> > > >
> > > >
> > > > LinkedIn * https://www.linkedin.com/profile/view?id=
> > > AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
> > > > <https://www.linkedin.com/profile/view?id=
> > AAEAAAAWh2gBxianrbJd6zP6AcPCCd
> > > OABUrV8Pw>*
> > > >
> > > >
> > > >
> > > > http://talebzadehmich.wordpress.com
> > > >
> > > >
> > > > *Disclaimer:* Use it at your own risk. Any and all responsibility for
> > any
> > > > loss, damage or destruction of data or any other property which may
> > arise
> > > > from relying on this email's technical content is explicitly
> > disclaimed.
> > > > The author will in no case be liable for any monetary damages arising
> > > from
> > > > such loss, damage or destruction.
> > > >
> > > >
> > > >
> > > > On 21 October 2016 at 13:17, Ted Yu <yuzhihong@gmail.com> wrote:
> > > >
> > > >> Mich:
> > > > >> Here is a brief description of hbase architecture:
> > > > >> https://hbase.apache.org/book.html#arch.overview
> > > > >>
> > > > >> You can also get more details from Lars George's or Nick Dimiduk's
> > > > >> books.
> > > > >>
> > > > >> HBase doesn't support SQL directly. There is no cost based
> > > > >> optimization.
> > > >>
> > > >> Cheers
> > > >>
> > > > >> > On Oct 21, 2016, at 1:43 AM, Mich Talebzadeh <mich.talebzadeh@gmail.com>
> > > > >> > wrote:
> > > >> >
> > > >> > Hi,
> > > >> >
> > > >> > This is a general question.
> > > >> >
> > > > >> > Is Hbase fast because Hbase uses Hash tables and provides random
> > > > >> > access, and it stores the data in indexed HDFS files for faster lookups?
> > > > >> >
> > > > >> > Say compared to Spark that is basically a SQL layer relying on
> > > > >> > different engines (mr, Tez, Spark) to execute the code (although it has
> > > > >> > a Cost Based Optimizer), how does Hbase fare, beyond relying on these
> > > > >> > engines?
> > > >> > Thanks
> > > >> >
> > > >> >
> > > >> > Dr Mich Talebzadeh
> > > >> >
> > > >> >
> > > >> >
> > > >> > LinkedIn * https://www.linkedin.com/profile/view?id=
> > > AAEAAAAWh2gBxianrbJ
> > > >> d6zP6AcPCCdOABUrV8Pw
> > > >> > <https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrb
> > > >> Jd6zP6AcPCCdOABUrV8Pw>*
> > > >> >
> > > >> >
> > > >> >
> > > >> > http://talebzadehmich.wordpress.com
> > > >> >
> > > >> >
> > > >> > *Disclaimer:* Use it at your own risk. Any and all responsibility
> > for
> > > >> any
> > > >> > loss, damage or destruction of data or any other property which
> may
> > > >> arise
> > > >> > from relying on this email's technical content is explicitly
> > > disclaimed.
> > > >> > The author will in no case be liable for any monetary damages
> > arising
> > > >> from
> > > >> > such loss, damage or destruction.
> > > >>
> > > >
> > > >
> > >
> >
>
