impala-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Dimitris Tsirogiannis <dtsirogian...@cloudera.com>
Subject Re: Questions about Statestore and Catalogservice
Date Fri, 08 Dec 2017 19:15:39 GMT
Hi Lars,

You should think of the statestore as a pub/sub system used by the other
impala processes (impalads and catalogd). One example topic is the
"catalog_update" which is used to communicate catalog changes (e.g.
adding/dropping a table, etc) from the catalog to all the coordinators
(impalads). Periodically, the statestore will contact the catalog to ask
for metadata changes that get disseminated to the cluster (coordinators).

All other user-triggered operations (DDL/DML, INVALIDATE, REFRESH, etc) use
more or less the following path. The issuing impalad sends the request to
the catalog, the catalog processes it and then responds back to the
impalad; this is a synchronous operation. In the background, the statestore
performs the task described above to propagate metadata changes to all the
other impalad nodes. That's at a high level how impalads, catalogd and
statestore interact with each other. Is this what you're interested in or
are you looking for more low-level implementation details?

Thanks
Dimitris

On Fri, Dec 8, 2017 at 2:48 AM, Lars Francke <lars.francke@gmail.com> wrote:

> Hi,
>
> I'm trying to understand how the communication between the components
> works.
>
> I understand that an impala daemon subscribes to the statestore. The
> statestore seems to have the concept of heartbeats and topics. But I'm not
> sure what topics are all about.
>
> The docs also say that only the statestore communicates with the catalog
> service. How does that happen? How is a INVALIDATE/REFRESH statement routed
> from a daemon to the catalog service and back?
>
> I'm sure I'll have follow-up questions but this would already be very
> helpful. Thank you!
>
> Cheers,
> Lars
>

Mime
View raw message