arrow-dev mailing list archives

From Andy Grove <andygrov...@gmail.com>
Subject Re: Timeline for 0.15.0 release
Date Sun, 22 Sep 2019 19:24:26 GMT
There's been quite a bit of activity in DataFusion over the past few weeks,
and there are currently two issues that I would like to see merged in time
for the release:

ARROW-6660: Minor docs update
ARROW-6089: Physical query plan for selection operator

ARROW-6089 is part of the new query execution implementation that I described
on the mailing list recently. It is new functionality that doesn't affect any
existing users, so perhaps it could be rubber-stamp approved if there are no
objections. The equivalent PRs for the other operators (projection and
aggregate) were already merged.





On Sat, Sep 21, 2019 at 7:12 PM Wes McKinney <wesmckinn@gmail.com> wrote:

> It's ideal if your GPG key is in the web of trust (i.e. you can get it
> signed by another PMC member), but is not 100% essential.
>
> Speaking of the release, there are at least 2 code changes I still
> want to get in
>
> ARROW-5717
> ARROW-6353
>
> I just pushed updates to ARROW-5717, will merge once the build is green.
>
> There are a couple of Rust patches still marked for 0.15. The rest
> seems to be documentation and a couple of integration test failures we
> should see about fixing in time.
>
> On Fri, Sep 20, 2019 at 11:26 PM Micah Kornfield <emkornfield@gmail.com>
> wrote:
> >
> > Thanks Krisztián and Wes,
> > I've gone ahead and started registering myself on all the packaging
> > sites.
> >
> > Is there any review process when adding my GPG key to the SVN file? [1]
> > doesn't seem to mention it explicitly.
> >
> > Thanks,
> > Micah
> >
> > [1] https://www.apache.org/dev/version-control.html#https-svn
> >
> > On Fri, Sep 20, 2019 at 5:01 PM Krisztián Szűcs <szucs.krisztian@gmail.com>
> > wrote:
> >
> > > On Thu, Sep 19, 2019 at 5:52 PM Wes McKinney <wesmckinn@gmail.com> wrote:
> > >
> > >> On Thu, Sep 19, 2019 at 12:13 AM Micah Kornfield <emkornfield@gmail.com>
> > >> wrote:
> > >> >>
> > >> >> The process should be well documented at this point but there are a
> > >> >> number of steps.
> > >> >
> > >> > Is [1] the up-to-date documentation for the release? Are there
> > >> > instructions for adding the code signing key to SVN?
> > >> >
> > >> > I will make a go of it. I will try to mitigate any internet issues by
> > >> > running the process from a cloud instance (I assume that isn't a problem?).
> > >> >
> > >>
> > >> Setting up a new cloud environment suitable for producing an RC may be
> > >> time consuming, but you are welcome to try. Krisztian -- are you
> > >> available next week to help Micah and potentially take over producing
> > >> the RC if there are issues?
> > >>
> > > Sure, I'll be available next week. We can also grant access to
> > > https://github.com/ursa-labs/crossbow because configuring all
> > > the CI backends can be time consuming.
> > >
> > >>
> > >> > Thanks,
> > >> > Micah
> > >> >
> > >> > [1] https://cwiki.apache.org/confluence/display/ARROW/Release+Management+Guide
> > >> >
> > >> > On Wed, Sep 18, 2019 at 8:29 AM Wes McKinney <wesmckinn@gmail.com>
> > >> > wrote:
> > >> >>
> > >> >> The process should be well documented at this point but there are a
> > >> >> number of steps. Note that you need to add your code signing key to
> > >> >> the KEYS file in SVN (that's not very hard to do). I think it's fine
> > >> >> to hand off the process to others after the VOTE but it would be
> > >> >> tricky to have multiple RMs involved with producing the source and
> > >> >> binary artifacts for the vote.
> > >> >>
> > >> >> On Tue, Sep 17, 2019 at 10:55 PM Micah Kornfield <emkornfield@gmail.com> wrote:
> > >> >> >
> > >> >> > SGTM, as well.
> > >> >> >
> > >> >> > I should have a little bit of time next week if I can help as RM, but
> > >> >> > I have a couple of concerns:
> > >> >> > 1. In the past I've had trouble downloading and validating releases.
> > >> >> > I'm a bit worried that I might have similar problems doing the
> > >> >> > necessary uploads.
> > >> >> > 2. My internet connection will likely not be great. I don't know if
> > >> >> > this would make it even less likely to be successful.
> > >> >> >
> > >> >> > Does it become problematic if I somehow have to abandon the process
> > >> >> > mid-release? Is there anyone who could serve as a backup? Are the
> > >> >> > steps well documented?
> > >> >> >
> > >> >> > Thanks,
> > >> >> > Micah
> > >> >> >
> > >> >> > On Tue, Sep 17, 2019 at 4:25 PM Neal Richardson <neal.p.richardson@gmail.com>
> > >> >> > wrote:
> > >> >> >
> > >> >> > > Sounds good to me.
> > >> >> > >
> > >> >> > > Do we have a release manager yet? Any volunteers?
> > >> >> > >
> > >> >> > > Neal
> > >> >> > >
> > >> >> > > On Tue, Sep 17, 2019 at 4:06 PM Wes McKinney <wesmckinn@gmail.com> wrote:
> > >> >> > >
> > >> >> > > > hi all,
> > >> >> > > >
> > >> >> > > > It looks like we're drawing close to being able to make the 0.15.0
> > >> >> > > > release. I would suggest "pencils down" at the end of this week and
> > >> >> > > > see if a release candidate can be produced next Monday, September 23.
> > >> >> > > > Any thoughts or objections?
> > >> >> > > >
> > >> >> > > > Thanks,
> > >> >> > > > Wes
> > >> >> > > >
> > >> >> > > > On Wed, Sep 11, 2019 at 11:23 AM Wes McKinney <wesmckinn@gmail.com> wrote:
> > >> >> > > > >
> > >> >> > > > > hi Eric -- yes, that's correct. I'm planning to amend the Format docs
> > >> >> > > > > today regarding the EOS issue and also update the C++ library
> > >> >> > > > >
> > >> >> > > > > On Wed, Sep 11, 2019 at 11:21 AM Eric Erhardt
> > >> >> > > > > <Eric.Erhardt@microsoft.com> wrote:
> > >> >> > > > > >
> > >> >> > > > > > I assume the plan is to merge the ARROW-6313-flatbuffer-alignment
> > >> >> > > > > > branch into master before the 0.15 release, correct?
> > >> >> > > > > >
> > >> >> > > > > > BTW - I believe the C# alignment changes are ready to be merged into
> > >> >> > > > > > the alignment branch - https://github.com/apache/arrow/pull/5280/
> > >> >> > > > > >
> > >> >> > > > > > Eric
> > >> >> > > > > >
> > >> >> > > > > > -----Original Message-----
> > >> >> > > > > > From: Micah Kornfield <emkornfield@gmail.com>
> > >> >> > > > > > Sent: Tuesday, September 10, 2019 10:24 PM
> > >> >> > > > > > To: Wes McKinney <wesmckinn@gmail.com>
> > >> >> > > > > > Cc: dev <dev@arrow.apache.org>; niki.lj <niki.lj@aliyun.com>
> > >> >> > > > > > Subject: Re: Timeline for 0.15.0 release
> > >> >> > > > > >
> > >> >> > > > > > I should have a little more bandwidth to help with some of the
> > >> >> > > > > > packaging starting tomorrow and going into the weekend.
> > >> >> > > > > >
> > >> >> > > > > > On Tuesday, September 10, 2019, Wes McKinney <wesmckinn@gmail.com>
> > >> >> > > > > > wrote:
> > >> >> > > > > >
> > >> >> > > > > > > Hi folks,
> > >> >> > > > > > >
> > >> >> > > > > > > With the state of nightly packaging and integration builds, things
> > >> >> > > > > > > aren't looking too good for being in release readiness by the end of
> > >> >> > > > > > > this week, but maybe I'm wrong. I'm planning to be working to close as
> > >> >> > > > > > > many issues as I can and also to help with the ongoing alignment fixes.
> > >> >> > > > > > >
> > >> >> > > > > > > Wes
> > >> >> > > > > > >
> > >> >> > > > > > > On Thu, Sep 5, 2019, 11:07 PM Micah Kornfield <emkornfield@gmail.com>
> > >> >> > > > > > > wrote:
> > >> >> > > > > > >
> > >> >> > > > > > >> Just for reference [1] has a dashboard of the current issues:
> > >> >> > > > > > >>
> > >> >> > > > > > >> https://cwiki.apache.org/confluence/display/ARROW/Arrow+0.15.0+Release
> > >> >> > > > > > >>
> > >> >> > > > > > >> On Thu, Sep 5, 2019 at 3:43 PM Wes McKinney <wesmckinn@gmail.com> wrote:
> > >> >> > > > > > >>
> > >> >> > > > > > >>> hi all,
> > >> >> > > > > > >>>
> > >> >> > > > > > >>> It doesn't seem like we're going to be in a position to release at
> > >> >> > > > > > >>> the beginning of next week. I hope that one more week of work (or
> > >> >> > > > > > >>> less) will be enough to get us there. Aside from merging the
> > >> >> > > > > > >>> alignment changes, we need to make sure that our packaging jobs
> > >> >> > > > > > >>> required for the release candidate are all working.
> > >> >> > > > > > >>>
> > >> >> > > > > > >>> If folks could remove issues from the 0.15.0 backlog that they don't
> > >> >> > > > > > >>> think they will finish by end of next week that would help focus
> > >> >> > > > > > >>> efforts (there are currently 78 issues in 0.15.0 still). I am
> > >> >> > > > > > >>> looking to tackle a few small features related to dictionaries while
> > >> >> > > > > > >>> the release window is still open.
> > >> >> > > > > > >>>
> > >> >> > > > > > >>> - Wes
> > >> >> > > > > > >>>
> > >> >> > > > > > >>> On Tue, Aug 27, 2019 at 3:48 PM Wes McKinney <wesmckinn@gmail.com>
> > >> >> > > > > > >>> wrote:
> > >> >> > > > > > >>> >
> > >> >> > > > > > >>> > hi,
> > >> >> > > > > > >>> >
> > >> >> > > > > > >>> > I think we should try to release the week of September 9, so
> > >> >> > > > > > >>> > development work should be completed by end of next week.
> > >> >> > > > > > >>> >
> > >> >> > > > > > >>> > Does that seem reasonable?
> > >> >> > > > > > >>> >
> > >> >> > > > > > >>> > I plan to get up a patch for the protocol alignment changes for
> > >> >> > > > > > >>> > C++ in the next couple of days -- I think that getting the
> > >> >> > > > > > >>> > alignment work done is the main barrier to releasing.
> > >> >> > > > > > >>> >
> > >> >> > > > > > >>> > Thanks
> > >> >> > > > > > >>> > Wes
> > >> >> > > > > > >>> >
> > >> >> > > > > > >>> > On Mon, Aug 19, 2019 at 12:25 PM Ji Liu
> > >> >> > > > > > >>> > <niki.lj@aliyun.com.invalid> wrote:
> > >> >> > > > > > >>> > >
> > >> >> > > > > > >>> > > Hi, Wes, on the Java side, I can think of several bugs that need
> > >> >> > > > > > >>> > > to be fixed or flagged.
> > >> >> > > > > > >>> > >
> > >> >> > > > > > >>> > > i. ARROW-6040: Dictionary entries are required in IPC streams
> > >> >> > > > > > >>> > > even when empty [1]
> > >> >> > > > > > >>> > > This one is under review now. However, through this PR we found
> > >> >> > > > > > >>> > > what seems to be a bug in Java reading and writing of dictionaries
> > >> >> > > > > > >>> > > in IPC, which is inconsistent with the spec [2] since it assumes
> > >> >> > > > > > >>> > > all dictionaries are at the start of the stream (see details in
> > >> >> > > > > > >>> > > the PR comments; this fix may not make it into version 0.15).
> > >> >> > > > > > >>> > > @Micah Kornfield
> > >> >> > > > > > >>> > >
> > >> >> > > > > > >>> > > ii. ARROW-1875: Write 64-bit ints as strings in integration test
> > >> >> > > > > > >>> > > JSON files [3]
> > >> >> > > > > > >>> > > The Java-side code is already checked in; the other
> > >> >> > > > > > >>> > > implementations seem not to be.
> > >> >> > > > > > >>> > >
> > >> >> > > > > > >>> > > iii. ARROW-6202: OutOfMemory in JdbcAdapter [4]
> > >> >> > > > > > >>> > > Caused by trying to load all records in one contiguous batch;
> > >> >> > > > > > >>> > > fixed by providing an iterator API for iterative reading in
> > >> >> > > > > > >>> > > ARROW-6219 [5].
> > >> >> > > > > > >>> > >
> > >> >> > > > > > >>> > > Thanks,
> > >> >> > > > > > >>> > > Ji Liu
> > >> >> > > > > > >>> > >
> > >> >> > > > > > >>> > > [1] https://github.com/apache/arrow/pull/4960
> > >> >> > > > > > >>> > > [2] https://arrow.apache.org/docs/ipc.html
> > >> >> > > > > > >>> > > [3] https://issues.apache.org/jira/browse/ARROW-1875
> > >> >> > > > > > >>> > > [4] https://issues.apache.org/jira/browse/ARROW-6202
> > >> >> > > > > > >>> > > [5] https://issues.apache.org/jira/browse/ARROW-6219
> > >> >> > > > > > >>> > >
> > >> >> > > > > > >>> > >
> > >> >> > > > > > >>> > >
> > >> >> > > > > > >>> > >
> > >> >> > > > > > >>> > > ------------------------------------------------------------------
> > >> >> > > > > > >>> > > From: Wes McKinney <wesmckinn@gmail.com>
> > >> >> > > > > > >>> > > Send Time: Monday, August 19, 2019 23:03
> > >> >> > > > > > >>> > > To: dev <dev@arrow.apache.org>
> > >> >> > > > > > >>> > > Subject: Re: Timeline for 0.15.0 release
> > >> >> > > > > > >>> > >
> > >> >> > > > > > >>> > > I'm going to work on organizing the 0.15.0 backlog some
> > >> >> > > > > > >>> > > this week. If anyone wants to help with grooming (particularly
> > >> >> > > > > > >>> > > for languages other than C++/Python, where I'm focusing), that
> > >> >> > > > > > >>> > > would be helpful. There have been almost 500 JIRA issues opened
> > >> >> > > > > > >>> > > since the 0.14.0 release, so we should make sure to check whether
> > >> >> > > > > > >>> > > there are any regressions or other serious bugs that we should
> > >> >> > > > > > >>> > > try to fix for 0.15.0.
> > >> >> > > > > > >>> > >
> > >> >> > > > > > >>> > > On Thu, Aug 15, 2019 at 6:23 PM Wes McKinney
> > >> >> > > > > > >>> > > <wesmckinn@gmail.com> wrote:
> > >> >> > > > > > >>> > > >
> > >> >> > > > > > >>> > > > The Windows wheel issue in 0.14.1 seems to be
> > >> >> > > > > > >>> > > >
> > >> >> > > > > > >>> > > > https://issues.apache.org/jira/browse/ARROW-6015
> > >> >> > > > > > >>> > > >
> > >> >> > > > > > >>> > > > I think the root cause could be the Windows changes in
> > >> >> > > > > > >>> > > >
> > >> >> > > > > > >>> > > > https://github.com/apache/arrow/commit/223ae744cc2a12c60cecb5db593263a03c13f85a
> > >> >> > > > > > >>> > > >
> > >> >> > > > > > >>> > > > I would be appreciative if a volunteer would look into what
> > >> >> > > > > > >>> > > > was wrong with the 0.14.1 wheels on Windows. Otherwise 0.15.0
> > >> >> > > > > > >>> > > > Windows wheels will be broken, too
> > >> >> > > > > > >>> > > >
> > >> >> > > > > > >>> > > > The bad wheels can be found at
> > >> >> > > > > > >>> > > >
> > >> >> > > > > > >>> > > > https://bintray.com/apache/arrow/python#files/python%2F0.14.1
> > >> >> > > > > > >>> > > >
> > >> >> > > > > > >>> > > > On Thu, Aug 15, 2019 at 1:28 PM Antoine Pitrou
> > >> >> > > > > > >>> > > > <solipsis@pitrou.net> wrote:
> > >> >> > > > > > >>> > > > >
> > >> >> > > > > > >>> > > > > On Thu, 15 Aug 2019 11:17:07 -0700 Micah Kornfield
> > >> >> > > > > > >>> > > > > <emkornfield@gmail.com> wrote:
> > >> >> > > > > > >>> > > > > > >
> > >> >> > > > > > >>> > > > > > > In C++ they are
> > >> >> > > > > > >>> > > > > > > independent, we could have 32-bit array lengths and
> > >> >> > > > > > >>> > > > > > > variable-length types with 64-bit offsets if we wanted (we
> > >> >> > > > > > >>> > > > > > > just wouldn't be able to have a List child with more than
> > >> >> > > > > > >>> > > > > > > INT32_MAX elements).
> > >> >> > > > > > >>> > > > > >
> > >> >> > > > > > >>> > > > > > I think the point is we could do this in C++ but we don't.
> > >> >> > > > > > >>> > > > > > I'm not sure we would have introduced the "Large" types if
> > >> >> > > > > > >>> > > > > > we did.
> > >> >> > > > > > >>> > > > >
> > >> >> > > > > > >>> > > > > 64-bit offsets take twice as much space as 32-bit offsets,
> > >> >> > > > > > >>> > > > > so if you're storing lots of small-ish lists or strings,
> > >> >> > > > > > >>> > > > > 32-bit offsets are preferable. So even with 64-bit array
> > >> >> > > > > > >>> > > > > lengths from the start it would still be beneficial to have
> > >> >> > > > > > >>> > > > > types with 32-bit offsets.
> > >> >> > > > > > >>> > > > >
> > >> >> > > > > > >>> > > > > > Going with the limited address space in Java and calling it
> > >> >> > > > > > >>> > > > > > a reference implementation seems suboptimal. If a consumer
> > >> >> > > > > > >>> > > > > > uses a "Large" type presumably it is because they need the
> > >> >> > > > > > >>> > > > > > ability to store more than INT32_MAX child elements in a
> > >> >> > > > > > >>> > > > > > column, otherwise it is just wasting space [1].
> > >> >> > > > > > >>> > > > >
> > >> >> > > > > > >>> > > > > Probably. Though if the individual elements (lists or
> > >> >> > > > > > >>> > > > > strings) are large, not much space is wasted in proportion,
> > >> >> > > > > > >>> > > > > so it may be simpler in such a case to always create a
> > >> >> > > > > > >>> > > > > "Large" type array.
> > >> >> > > > > > >>> > > > >
> > >> >> > > > > > >>> > > > > > [1] I suppose theoretically there might be some performance
> > >> >> > > > > > >>> > > > > > benefits on 64-bit architectures to using the native word
> > >> >> > > > > > >>> > > > > > sizes.
> > >> >> > > > > > >>> > > > >
> > >> >> > > > > > >>> > > > > Concretely, common 64-bit architectures don't do that, as
> > >> >> > > > > > >>> > > > > 32-bit is an extremely common integer size even in
> > >> >> > > > > > >>> > > > > high-performance code.
> > >> >> > > > > > >>> > > > >
> > >> >> > > > > > >>> > > > > Regards
> > >> >> > > > > > >>> > > > >
> > >> >> > > > > > >>> > > > > Antoine.
> > >> >> > > > > > >>> > > > >
> > >> >> > > > > > >>> > > > >
> > >> >> > > > > > >>> > >
> > >> >> > > > > > >>>
> > >> >> > > > > > >>
> > >> >> > > >
> > >> >> > >
> > >>
> > >
>
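
The space trade-off between 32-bit and 64-bit offsets raised in the oldest
quoted messages of this thread can be checked with a little arithmetic. What
follows is only a rough sketch in Python: the function names and the example
sizes (one million values averaging 8 or 4096 bytes) are assumptions chosen for
illustration and are not taken from any Arrow implementation. It relies only on
a variable-length array storing one offset per element plus one final end
offset.

```python
# Largest value a signed 32-bit offset can hold.
INT32_MAX = 2**31 - 1


def offsets_buffer_bytes(num_elements: int, offset_bits: int) -> int:
    """Bytes used by the offsets buffer: one offset per element plus an end offset."""
    if offset_bits not in (32, 64):
        raise ValueError("offset_bits must be 32 or 64")
    return (num_elements + 1) * (offset_bits // 8)


def offsets_share(num_elements: int, avg_value_bytes: int, offset_bits: int) -> float:
    """Fraction of (offsets + values) storage that is spent on offsets."""
    offsets = offsets_buffer_bytes(num_elements, offset_bits)
    values = num_elements * avg_value_bytes
    return offsets / (offsets + values)


def fits_32bit_offsets(total_value_bytes: int) -> bool:
    """True if every offset into the values buffer fits in a signed 32-bit integer."""
    return total_value_bytes <= INT32_MAX


if __name__ == "__main__":
    n = 1_000_000  # assumed element count, for illustration only
    for avg in (8, 4096):  # assumed average value sizes in bytes
        for bits in (32, 64):
            share = offsets_share(n, avg, bits)
            print(f"avg={avg:>4} B, {bits}-bit offsets: {share:.2%} of storage")
        print(f"avg={avg:>4} B fits 32-bit offsets: {fits_32bit_offsets(n * avg)}")
```

With these assumed numbers, offsets account for about a third of the storage
for 8-byte values with 32-bit offsets and about half with 64-bit offsets. For
4096-byte values the 64-bit offsets are well under one percent of the total,
and the values no longer fit behind 32-bit offsets at all.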
