thrift-dev mailing list archives

From Bryan Duxbury <br...@rapleaf.com>
Subject Re: hyper-inflation of generated code
Date Mon, 22 Mar 2010 15:10:03 GMT
I've noticed how big our jars can get, and I opened a ticket about
decreasing the amount of duplication with libraries some time ago, but it
hasn't been a priority yet. (The relevant tickets are
https://issues.apache.org/jira/browse/THRIFT-447 and
https://issues.apache.org/jira/browse/THRIFT-701.)

I'm all for making some changes, but is 1.6MB of jar really a problem for
you? I know that personally my project depends on 30MB of jars, only 2MB of
which is my Thrift stuff.

I'd love to work with you to get a patch in to extract some of the redundant
code. I doubt it will be that hard to do - someone just has to take a look
at it. Feel free to email me off-list if you would like to chat. I have to
imagine you could fix thrift a lot faster than you could build a competing
system from scratch.

-Bryan

On Mon, Mar 22, 2010 at 7:43 AM, tomer filiba <tomerfiliba@gmail.com> wrote:

> if you recall, i'm working on a project called xthrift, which adds passing
> objects by-reference on top of thrift. the project seemed very promising up
> until yesterday, when i realized thrift generates way too much code to make
> it feasible.
>
> i made a test case of 6 classes, each with 6 methods and 6 attributes, and
> 6 service functions that expose those. i attached the thrift file that's
> generated from my xthrift file -- it contains around 100 functions.
>
> generating java code using the thrift compiler yields a 2.2 MB java source
> file! when compiled, it yields a 1.6MB jar! in csharp and python, the
> situation is slightly better: ~700 KB. just for the sake of entropy,
> compressing (bz2) the generated java code yields a 34 KB file (a ratio
> of about 65!)
>
> for our project, that contains ~100 classes, each with ~10 methods and ~5
> attributes, plus ~50 functions, the generated java code would weigh tens if
> not hundreds of MBs, which is unacceptable, of course.
>
> looking at the generated code, it's easy to spot the redundancy: thrift
> employs a "full beta-reduction policy", i.e., it doesn't encapsulate common
> functionality into functions, instead it just repeats them over and over.
> this yields ~80,000 lines of code that mostly repeat one another.
>
> judging from the code size, i understand thrift is not meant to handle more
> than ~50 functions per project, unless you are willing to accept tens of MBs
> of library footprint.[1]
> is there any "compiler switch" or planned feature, to eliminate this code
> bloat?
>
> if not, my company will have to drop thrift and adopt an in-house solution
> (which we really hoped to avoid...)
>
>
> thanks in advance,
> -tomer
>
> [1] a 100 MB library, on today's hardware, is not unheard of, but our
> project's RAM footprint is ~30 MB... it would be a pity to require such a
> big footprint just for glue code.
>
>
>
> An NCO and a Gentleman
>
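As a rough illustration of the redundancy described in the quoted message, here is a toy code generator that emits the same per-field logic either inline for every field or once as a shared helper. This is only a sketch: the templates, the names (write_field, generate_inlined, generate_factored) and the 6x6 test shape are assumptions made for illustration, and none of it is Thrift's actual generator or its actual output.

```python
# Toy illustration only: NOT Thrift's real generator or its real output.
# All names and templates below are hypothetical.

CLASSES = [f"class{i}" for i in range(6)]
FIELDS = [f"attr{i}" for i in range(6)]

# Per-field logic repeated verbatim for every field of every class.
INLINE_TEMPLATE = (
    '    value = obj.get("{field}")\n'
    "    if value is not None:\n"
    '        out.append("{field}")\n'
    "        out.append(str(len(str(value))))\n"
    "        out.append(str(value))\n"
)

# The same logic written once as a shared helper...
HELPER = (
    "def write_field(out, name, value):\n"
    "    if value is not None:\n"
    "        out.append(name)\n"
    "        out.append(str(len(str(value))))\n"
    "        out.append(str(value))\n"
)

# ...so that each field costs a single call in the generated code.
CALL_TEMPLATE = '    write_field(out, "{field}", obj.get("{field}"))\n'


def generate_inlined(classes, fields):
    """Emit one writer per class, repeating the field logic inline."""
    src = []
    for cls in classes:
        src.append(f"def write_{cls}(obj, out):\n")
        for field in fields:
            src.append(INLINE_TEMPLATE.format(field=field))
    return "".join(src)


def generate_factored(classes, fields):
    """Emit the shared helper once, then one call per field."""
    src = [HELPER]
    for cls in classes:
        src.append(f"def write_{cls}(obj, out):\n")
        for field in fields:
            src.append(CALL_TEMPLATE.format(field=field))
    return "".join(src)


inlined = generate_inlined(CLASSES, FIELDS)
factored = generate_factored(CLASSES, FIELDS)

print("inlined source size: ", len(inlined), "characters")
print("factored source size:", len(factored), "characters")
```

Both variants produce writers with the same behaviour; only the amount of generated text differs, and the gap widens as the number of classes and fields grows.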
