kudu-user mailing list archives

From William Berkeley <wdberke...@cloudera.com>
Subject Re: Kudu hashes and Java hashes
Date Tue, 28 Aug 2018 16:58:51 GMT
> 1. We have multiple Kudu clients (Reducers).

Would it be better if each one has a single session to a single tablet
writing a large number of records,

or multiple sessions writing to different tablets (total number of records
is the same)?


The advantage I see in writing to a single tablet from a single reducer is
that if the reducer is scheduled locally to the leader replica of the
tablet then one network hop is eliminated. However, the client, AFAIK,
doesn't offer a general mechanism to know where a write will go. If the
table is purely range partitioned it is possible, but not if the table has
hash partitioning. Since leadership can change at any time, it wouldn't be
a reliable mechanism anyway. To compare, Kudu does offer a way to split a
scan into scan tokens, which can be serialized and rehydrated into a
scanner on a machine where a suitable replica lives.
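
For comparison, here is a minimal sketch of that flow with the Java client
(the master address and table name are placeholders): tokens are built
centrally, shipped as bytes to workers near replicas, and rehydrated there
into scanners.

    import java.util.List;
    import org.apache.kudu.client.KuduClient;
    import org.apache.kudu.client.KuduScanToken;
    import org.apache.kudu.client.KuduScanner;
    import org.apache.kudu.client.KuduTable;
    import org.apache.kudu.client.RowResult;
    import org.apache.kudu.client.RowResultIterator;

    public class ScanTokenSketch {
        public static void main(String[] args) throws Exception {
            // "master1:7051" and "my_table" are placeholders.
            KuduClient client = new KuduClient.KuduClientBuilder("master1:7051").build();
            KuduTable table = client.openTable("my_table");

            // Build one token per partition of the scan; each token knows its replicas.
            List<KuduScanToken> tokens = client.newScanTokenBuilder(table).build();

            for (KuduScanToken token : tokens) {
                byte[] serialized = token.serialize();   // ship these bytes to a worker

                // On the worker: rehydrate the token into a scanner and read rows.
                KuduScanner scanner = KuduScanToken.deserializeIntoScanner(serialized, client);
                while (scanner.hasMoreRows()) {
                    RowResultIterator rows = scanner.nextRows();
                    while (rows.hasNext()) {
                        RowResult row = rows.next();
                        // process row ...
                    }
                }
                scanner.close();
            }
            client.close();
        }
    }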


So it doesn't really matter, as long as there are enough rows for most of
the tablets being written to that the execution time isn't dominated by
round trips to n servers (vs. to 1 server).


Are you seeing a specific problem where Kudu isn't as fast as you
anticipated, or where reducers writing to many tablets is a bottleneck?


> 2. Assuming it is preferable to have a 1-to-1 relationship, i.e. 1 Reducer
should write to 1 Tablet. What would be the proper implementation to reduce
the number of connections between reducers and different tablets, i.e. if
there are 128 reducers (each gathers its own set of unique hashes) and 128
tablets, then ideally each reducer should write to 1 tablet, but not to
each of 128 tablets.


I don't think the 1-1 relationship is preferable if, for each reducer, the
number of rows written per tablet is large enough to fill multiple batches
(10,000, say, is good enough).
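
To make the batching concrete, here's a rough sketch of a reducer writing
through one session with background flushing (the master address, table
name, column names, and the 10,000-operation buffer are placeholders, not
recommendations specific to your setup):

    import org.apache.kudu.client.Insert;
    import org.apache.kudu.client.KuduClient;
    import org.apache.kudu.client.KuduSession;
    import org.apache.kudu.client.KuduTable;
    import org.apache.kudu.client.PartialRow;
    import org.apache.kudu.client.SessionConfiguration;

    public class ReducerWriteSketch {
        public static void main(String[] args) throws Exception {
            // "master1:7051", "my_table", and the columns are placeholders.
            KuduClient client = new KuduClient.KuduClientBuilder("master1:7051").build();
            KuduTable table = client.openTable("my_table");

            KuduSession session = client.newSession();
            // Let the client batch rows per tablet and flush in the background.
            session.setFlushMode(SessionConfiguration.FlushMode.AUTO_FLUSH_BACKGROUND);
            session.setMutationBufferSpace(10000);   // operations buffered before a flush

            for (int i = 0; i < 1_000_000; i++) {
                Insert insert = table.newInsert();
                PartialRow row = insert.getRow();
                row.addString("case_id", "case-" + i);
                row.addLong("value", i);
                session.apply(insert);               // buffered; no RPC per row in this mode
            }

            session.flush();                         // push any remaining buffered rows
            // In real code, also check session.getPendingErrors() for failed rows.
            session.close();
            client.close();
        }
    }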


> Why have the questions arisen: there is a hash implementation in Java and
another one in Kudu. Is there any chance to ensure Java Reducers use the
same hash function as the Kudu partitioning hash?


The current Java implementation of the encodings of the primary and
partition keys can be found at
https://github.com/apache/kudu/blob/master/java/kudu-client/src/main/java/org/apache/kudu/client/KeyEncoder.java,
so you could adjust your code to match that and, possibly with a bit of
additional custom code, be able to tell ahead of time which tablet a row
belongs to. However, that code is implementation, not interface: I don't
imagine it changing, but it could, and there's no guarantee it won't.
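
If you do go down that road, the change in your partitioner would have
roughly this shape: a Hadoop Partitioner whose partition number is the Kudu
hash bucket. The kuduBucketFor() helper below is hypothetical; you would
implement it by porting the column encoding and hash from KeyEncoder.java,
and the number of reducers would need to match the table's hash-bucket
count. ArchiveKey and getCaseId() are from your snippet below.

    import org.apache.hadoop.mapreduce.Partitioner;

    public class KuduAlignedPartitioner extends Partitioner<ArchiveKey, Object> {
        @Override
        public int getPartition(ArchiveKey key, Object value, int numReduceTasks) {
            // kuduBucketFor() must reproduce Kudu's encoding of the hash column(s)
            // and its hash function, so the result matches the table's bucketing.
            int bucket = kuduBucketFor(key.getCaseId());
            return bucket % numReduceTasks;
        }

        // Hypothetical helper: port the logic from KeyEncoder.java (linked above).
        private int kuduBucketFor(Object caseId) {
            throw new UnsupportedOperationException("port from KeyEncoder.java");
        }
    }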


As a feature request, I don't think adding the ability to determine the
row-to-tablet mapping would be a good idea, because writes must go to the
tablet leader, and leadership can change at any time, so such a feature
still wouldn't provide a reliable way to determine the row-to-tablet-server
mapping.


-Will

On Tue, Aug 28, 2018 at 1:08 AM Sergejs Andrejevs <S.Andrejevs@intrum.com>
wrote:

> Hi there,
>
>
>
> We're running Map-Reduce jobs in Java and the Reducers write to Kudu.
>
> In Java we use the hashCode() function to send results from Mappers to
> Reducers, e.g.
>
>     public int getPartition(ArchiveKey key, Object value, int numReduceTasks) {
>         int hash = key.getCaseId().hashCode();
>         return (hash & Integer.MAX_VALUE) % numReduceTasks;
>     }
>
> There is also a partitioning hash function in Kudu tables.
>
>
>
> Therefore, there are 2 questions:
>
> 1. We have multiple Kudu clients (Reducers).
>
> Would it be better if each one has a single session to a single tablet
> writing a large number of records,
>
> or multiple sessions writing to different tablets (total number of records
> is the same)?
>
> 2. Assuming it is preferable to have a 1-to-1 relationship, i.e. 1 Reducer
> should write to 1 Tablet. What would be the proper implementation to reduce
> the number of connections between reducers and different tablets, i.e. if
> there are 128 reducers (each gathers its own set of unique hashes) and 128
> tablets, then ideally each reducer should write to 1 tablet, but not to
> each of 128 tablets.
>
>
>
> Why have the questions arisen: there is a hash implementation in Java and
> another one in Kudu. Is there any chance to ensure Java Reducers use the
> same hash function as the Kudu partitioning hash?
>
>
>
>
>
> Best regards,
>
> Sergejs
>
