kudu-user mailing list archives

From Dan Burkert <danburk...@apache.org>
Subject Re: Kudu hashes and Java hashes
Date Tue, 28 Aug 2018 17:14:52 GMT
I'm only aware of one reason you'd want to pre-partition the data before
inserting it into Kudu, and that's if you are sorting the input data prior
to inserting.  Having a way to map a row to a partition means the sort step
can be done per-partition instead of globally, which can help reduce memory
usage.  Impala does this via the C++ KuduPartitioner
<https://kudu.apache.org/cpp-client-api/classkudu_1_1client_1_1KuduPartitioner.html>
API, but there isn't yet a Java equivalent.
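
To illustrate the per-partition sort idea, here is a minimal Java sketch. It
assumes you already have some function that maps a row to a partition index;
`toyPartition` below is a hypothetical stand-in based on `String.hashCode()`
and is NOT Kudu's partitioning hash. The class and method names are made up
for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.ToIntFunction;

final class PerPartitionSort {
    /**
     * Groups rows into numPartitions buckets using the supplied
     * row-to-partition function, then sorts each bucket independently.
     */
    static <T extends Comparable<T>> List<List<T>> sortPerPartition(
            List<T> rows, int numPartitions, ToIntFunction<T> partitionOf) {
        List<List<T>> buckets = new ArrayList<>(numPartitions);
        for (int i = 0; i < numPartitions; i++) {
            buckets.add(new ArrayList<>());
        }
        // Route every row to its bucket.
        for (T row : rows) {
            buckets.get(partitionOf.applyAsInt(row)).add(row);
        }
        // Each sort only touches one bucket's rows.
        for (List<T> bucket : buckets) {
            Collections.sort(bucket);
        }
        return buckets;
    }

    /** Hypothetical placeholder mapping; this is not Kudu's hash function. */
    static int toyPartition(String key, int numPartitions) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }
}
```

In a real job each bucket would typically be sorted by a separate task, so no
single task has to hold and sort the full input at once.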

- Dan

On Tue, Aug 28, 2018 at 9:58 AM, William Berkeley <wdberkeley@cloudera.com>
wrote:

> > 1. We have multiple Kudu clients (Reducers).
> >
> > Would it be better if each one has a single session to a single tablet
> > writing a large number of records,
> >
> > or multiple sessions writing to different tablets (total number of records
> > is the same)?
>
>
> The advantage I see in writing to a single tablet from a single reducer is
> that if the reducer is scheduled locally to the leader replica of the
> tablet then one network hop is eliminated. However, the client, AFAIK,
> doesn't offer a general mechanism to know where a write will go. If the
> table is purely range partitioned it is possible, but not if the table has
> hash partitioning. Since leadership can change at any time, it wouldn't be
> a reliable mechanism anyway. By comparison, Kudu does offer a way to split
> a scan into scan tokens, which can be serialized and then rehydrated into
> a scanner on a machine where a suitable replica lives.
>
>
> So it doesn't really matter, as long as there are a good enough number of
> rows for most of the tablets being written to, so that the execution time
> isn't dominated by roundtrips to n servers (vs. to 1 server).
>
>
> Are you seeing a specific problem where Kudu isn't as fast as you
> anticipated, or where reducers writing to many tablets is a bottleneck?
>
>
> > 2. Assuming it is preferable to have a 1-to-1 relationship, i.e. 1
> > reducer should write to 1 tablet, what would be the proper implementation
> > to reduce the number of connections from reducers to different tablets?
> > I.e. if there are 128 reducers (each gathers its own set of unique hashes)
> > and 128 tablets, then ideally each reducer should write to 1 tablet, not
> > to each of the 128 tablets.
>
>
> I don't think the 1-1 relationship is preferable if, for each reducer, the
> number of rows written per tablet is large enough to fill multiple batches
> (10,000, say, is good enough).
>
>
> > Why the questions arose: there is a hash implementation in Java and
> > another one in Kudu. Is there any way to ensure Java reducers use the
> > same hash function as the Kudu partitioning hash?
>
>
> The current Java implementation of the encodings of the primary and
> partition keys can be found at
> <https://github.com/apache/kudu/blob/master/java/kudu-client/src/main/java/org/apache/kudu/client/KeyEncoder.java>,
> so you could adjust your code to match that and, possibly with a bit of
> additional custom code, be able to tell ahead of time which tablet a row
> belongs to. However, it is implementation, not interface. I don't imagine
> it changing, but it could, and there's no guarantee it won't.
>
>
> I don't think a feature request for the ability to determine the
> row-to-tablet mapping would be a good idea: writes must go to the tablet
> leader, and leadership can change at any time, so such a feature still
> wouldn't provide a reliable way to determine the row-to-tablet-server
> mapping.
>
>
> -Will
>
> On Tue, Aug 28, 2018 at 1:08 AM Sergejs Andrejevs <S.Andrejevs@intrum.com>
> wrote:
>
>> Hi there,
>>
>>
>>
>> We're running MapReduce jobs in Java, and the reducers write to Kudu.
>>
>> In Java we use the hashCode() function to route results from mappers to
>> reducers, e.g.
>>
>>     public int getPartition(ArchiveKey key, Object value, int numReduceTasks) {
>>         int hash = key.getCaseId().hashCode();
>>         return (hash & Integer.MAX_VALUE) % numReduceTasks;
>>     }
>>
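
As a side note on the snippet above, the following self-contained sketch shows
why the hash is masked with `Integer.MAX_VALUE` before taking the modulus. The
class and method names are hypothetical and exist only for illustration; the
masking expression itself is the one from the quoted `getPartition`.

```java
final class ReducerPartitionDemo {
    /** Same arithmetic as the quoted getPartition: clear the sign bit, then mod. */
    static int maskedPartition(int hash, int numReduceTasks) {
        return (hash & Integer.MAX_VALUE) % numReduceTasks;
    }

    /**
     * Naive alternative shown for contrast only. Math.abs(Integer.MIN_VALUE)
     * is still Integer.MIN_VALUE, so the result can be negative.
     */
    static int absPartition(int hash, int numReduceTasks) {
        return Math.abs(hash) % numReduceTasks;
    }
}
```

The masked form always yields an index in `[0, numReduceTasks)`, whereas the
`Math.abs` form can return a negative index when the hash is
`Integer.MIN_VALUE`. Either way, this only controls which reducer a key goes
to; it says nothing about which Kudu tablet the row will land in.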
>> There is also a partitioning hash function in Kudu tables.
>>
>>
>>
>> Therefore, there are 2 questions:
>>
>> 1. We have multiple Kudu clients (Reducers).
>>
>> Would it be better if each one has a single session to a single tablet
>> writing a large number of records,
>>
>> or multiple sessions writing to different tablets (total number of
>> records is the same)?
>>
>> 2. Assuming it is preferable to have a 1-to-1 relationship, i.e. 1 reducer
>> should write to 1 tablet, what would be the proper implementation to reduce
>> the number of connections from reducers to different tablets? I.e. if there
>> are 128 reducers (each gathers its own set of unique hashes) and 128
>> tablets, then ideally each reducer should write to 1 tablet, not to each
>> of the 128 tablets.
>>
>>
>>
>> Why the questions arose: there is a hash implementation in Java and
>> another one in Kudu. Is there any way to ensure Java reducers use the
>> same hash function as the Kudu partitioning hash?
>>
>>
>>
>>
>>
>> Best regards,
>>
>> Sergejs
>>
>
