impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Matthew Jacobs (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-3710: Kudu DML should ignore conflicts, pt2
Date Tue, 08 Nov 2016 16:52:15 GMT
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-3710: Kudu DML should ignore conflicts, pt2
......................................................................


Patch Set 1:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/4985/1/be/src/exec/data-sink.cc
File be/src/exec/data-sink.cc:

Line 119:     if (!dst_stats->__isset.kudu_stats) dst_stats->__set_kudu_stats(TKuduDmlStats());
> pass 0 in c'tor of TKuduDmlStats?
I can't bc there's no parameter, I think because I use optional fields.


Line 161:       ss << "NumRowErrors: " << stats.kudu_stats.num_row_errors <<
endl;
> General question: Is there an easy way to distinguish the different types o
It is unfortunate and probably not helpful enough, but I'm very worried about us presenting
incorrect information to the user which I think is worse. The issue is that we can't trust
the Kudu error codes enough. E.g. they re-use error codes for some different cases where I'm
not confident enough that we'll always get it right. My goal is to work with them for the
next release to get a more concrete contract. I was on the fence about having different Impala
error codes at all, but I think it's worthwhile for de-duping the logs. If you think this
is confusing, I'd prefer to just remove the warning logging w/ different error codes.


http://gerrit.cloudera.org:8080/#/c/4985/1/be/src/exec/kudu-table-sink.cc
File be/src/exec/kudu-table-sink.cc:

Line 214:           COUNTER_ADD(num_row_errors_, 1);
> This is expensive right? How about we update a local counter and COUNTER_AD
Done


Line 305:     if (!e.IsNotFound() && !e.IsAlreadyPresent() && status.ok())
{
> Why not move to L325 into an else if?
Done


Line 310:       // Kudu does not have yet have a way to programmatically differentiate between
'row
> few words mixed up
Done


http://gerrit.cloudera.org:8080/#/c/4985/1/common/thrift/ImpalaInternalService.thrift
File common/thrift/ImpalaInternalService.thrift:

Line 430: // TODO: Refactor to reflect usage by other DML statements.
> Do you intend to do that soon? The structures are a little confusing now.
Yeah I'll go ahead and do this if you think the strategy is otherwise OK.


Line 445:   // The latest observed Kudu timestamp reported by the KuduSession at this partition.
> is partition == table here?
Hm yeah I guess it's not great to call this a partition... Table I think might be confusing
too since there are many of these inserting across the cluster for the same table. How about
I just remove 'at this partition'?


-- 
To view, visit http://gerrit.cloudera.org:8080/4985
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4eb1ad91dc355ea51de261c3a14df0f9d28c879c
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs <mj@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.behm@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj@cloudera.com>
Gerrit-HasComments: Yes

Mime
View raw message