incubator-esme-dev mailing list archives

From Richard Hirsch <hirsch.d...@gmail.com>
Subject Re: Removing textile from code base
Date Thu, 26 Nov 2009 16:01:25 GMT
Regarding the public timeline, it is on a different page on the new
UI, so this might solve some of our problems.

D.

On Thu, Nov 26, 2009 at 3:41 PM, Vassil Dichev <vdichev@apache.org> wrote:
>> We're not looking at the root cause of the problem.  The Textile stuff is a
>> hit if we run it on each message for each user.  This is no different than
>> having an SQL query in the code that's a Cartesian product and throwing out
>> SQL because of it.
>>
>> Let's find out where and why we keep loading the same message from the RDBMS
>> rather than going to the message cache.
>>
>> Let's find out why we're hitting the RDBMS in general... there are
>> abstractions in the system (or at least were) that make RDBMS access a local
>> thing rather than a global thing.
>>
>> I'll have time on Monday to look at this, but running around chopping off
>> pieces of code and changing functionality isn't going to get us any closer
>> to solving the problem... it's just going to cause the problem to be
>> manifest elsewhere.
>
> I did not remove the Textile parser only because it potentially causes
> problems. I think it doesn't fit very well and it's a bit of
> overkill. First of all, headings, tables and paragraphs are not a
> good conceptual fit for messages.
>
> Second, some elements from MsgParser clash with the Textile parser
> ones. For instance, links to images cannot be parsed because MsgParser
> runs first and converts them to URL elements.
>
> Third, the way parsing with Textile is currently done is inefficient
> anyway. I parse every text element separately. Since text can be
> separated by urls, tags and usernames, that means I could invoke the
> Textile parser several times per message. For instance, this message
> has 4 text elements => 4 Textile invocations:
>
>    message with #tag and @username and http://blog.esme.us url in text
>
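
To make the count above concrete, here is a minimal sketch of how one message breaks into text elements. The token pattern is an assumption for illustration only; the real MsgParser grammar is not shown in this thread, and `TextSegments` and `textSegments` are hypothetical names.

```scala
object TextSegments {
  // Hypothetical token pattern: a #tag, an @username, or an http(s) URL.
  // This only approximates what MsgParser might recognise.
  private val Token = """#\w+|@\w+|https?://\S+""".r

  /** Returns the plain-text runs left between tokens, dropping blank ones. */
  def textSegments(msg: String): List[String] =
    Token.split(msg).toList.filter(_.trim.nonEmpty)
}
```

For the example message, `TextSegments.textSegments(...)` yields four runs ("message with ", " and ", " and ", " url in text"), so a parser invoked once per run would be called four times for that one message.
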
> Yes, if the performance analysis is correct, the Textile parser is not
> the cause of the problem. It might be easier to solve the problem
> without it. We even intended to include pluggable parser
> implementations some day.
>
> AFAICT, the problem was not that the RDBMS is queried every time
> (although that's how the PublicTimeline has worked from day 1 if I
> remember correctly). The problem, as explained by Markus, was that the
> message was formatted from the raw string every time it was accessed
> for rendering a timeline. The RDBMS was mentioned tangentially by
> Michael Bechauf (or someone else?). Markus, did I get this right?
>
> I still don't see how the message could be parsed several times, since
> digestedXHTML is lazy and so will be cached (this alone should make it
> *way* easier to write efficient implementations in Scala than in Java).
>
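
A small sketch of the lazy-val behaviour referred to above. The `Message` class, its constructor and `renderCount` are hypothetical stand-ins, not the ESME classes, and the `String` result type is an assumption; only the name `digestedXHTML` comes from the thread.

```scala
final class Message(val body: String, render: String => String) {
  // Counts how often the (hypothetical) renderer actually runs.
  var renderCount = 0

  // Evaluated on first access only; later accesses on the same
  // instance reuse the stored value.
  lazy val digestedXHTML: String = {
    renderCount += 1
    render(body)
  }
}
```

Reading `digestedXHTML` twice on one instance runs the renderer once. The cached value lives in the instance, though, so if the same message were loaded again as a new object (for example from the RDBMS rather than from the message cache), it would be rendered again.
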
> I want to profile the code to find the stack traces where most strings
> are allocated. This should answer some questions.
>
> I also plan to remove rendering the public timeline on each user's
> timeline page. First of all because it's not cached, and second
> because it's not updated in real-time like the friends' timeline, but
> only after an explicit refresh of the browser. So the public timeline
> is not only slow, but might be confusing for the user, as they will
> expect it to work similarly to the personal timeline (as the layout is
> the same).
>
> Vassil
>
