asterixdb-notifications mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jianfeng Jia (Code Review)" <>
Subject Change in hyracks[master]: ASTERIXDB-1102: VarSize Encoding to store length of String a...
Date Thu, 15 Oct 2015 07:59:05 GMT
Jianfeng Jia has posted comments on this change.

Change subject: ASTERIXDB-1102: VarSize Encoding to store length of String and ByteArray

Patch Set 5:

File hyracks/hyracks-data/hyracks-data-std/src/main/java/org/apache/hyracks/data/std/primitive/

Line 65:     public static UTF8StringPointable generateUTF8Pointable(String testString) {
> Is there a reason here this parameter is named testString?

Line 316:             GrowableArray out) throws IOException {
> I agree with your comments here. If we throw an RTE when this happens, is i
I guess we don't have the "warning" mechanism in the system now. So I will stick to the original
File hyracks/hyracks-dataflow-common/src/main/java/org/apache/hyracks/dataflow/common/data/marshalling/

Line 35: 
> Ah ok, I think this was the Singleton thing you were referring to. Is there
Good point!
There is no reason to have such confusing method name anymore. I replaced the INSTANCE method
to a public constructor
File hyracks/hyracks-util/src/main/java/org/apache/hyracks/util/bytes/

Line 23: 
> So one question I have, is that is there no allocation-free library we migh
I searched for a while, it seems our only-generate-once object program style is really odd
for the utility package.
I guess most of the utilities need to considerate the multi-thread scenario. While in our
style none of our utility is thread-safe. Instead, we rely on the outside factory pattern
to make sure there is one instance per thread. 
Given that, sadly it really hard to directly reuse the 3rd party libraries

I'm really curious about the initial experiment to compare the overhead of this object generation.

Line 106:      */
> I guess I can't think of a clean way to avoid having this method duplicate 
I can't think a nice way without creating any temporary buffer neither. 
I split the fillup quadruplet part into a separate function to reduce the redundancy a little.
File hyracks/hyracks-util/src/main/java/org/apache/hyracks/util/encoding/

Line 133: 
> Is there a reason this is commented out? Is it just not used?
The comment is originally here which shows a complementary way of encoding and decoding, it
might be used in the future.
File hyracks/hyracks-util/src/main/java/org/apache/hyracks/util/string/

Line 27: 
> Note from discussion: This is really just migrated from asterix-fuzzyjoin a

Line 143:         // a little speedup for ascii chars
> From discussion, let's just cut this and always return Character.toLowerCas

To view, visit
To unsubscribe, visit

Gerrit-MessageType: comment
Gerrit-Change-Id: I7e95df0f06984b784ebac2c84b97e56a50207d27
Gerrit-PatchSet: 5
Gerrit-Project: hyracks
Gerrit-Branch: master
Gerrit-Owner: Jianfeng Jia <>
Gerrit-Reviewer: Ian Maxon <>
Gerrit-Reviewer: Jenkins <>
Gerrit-Reviewer: Jianfeng Jia <>
Gerrit-Reviewer: Preston Carman <>
Gerrit-Reviewer: Taewoo Kim <>
Gerrit-Reviewer: Till Westmann <>
Gerrit-HasComments: Yes

View raw message