nutch-dev mailing list archives

From Earl Cahill <cahi...@yahoo.com>
Subject dtrace and nutch
Date Fri, 07 Oct 2005 02:39:05 GMT
Just about to switch a box over to solaris 10, in part
so I can try and help out with nutch profiling via
dtrace.  Wondering if anyone has tried it.  

In my limited experience, dtrace kicks.  Some info is
here

http://www.sun.com/bigadmin/content/dtrace/

I think it could really aid in profiling a running
crawl.  For all the emails that say "several hours in,
my fetch slows down", I am hoping that dtrace could
work wonders.

A brief overview.  Dtrace is a dynamic tracing tool, in
the same family as strace or truss but much broader.
Rather unobtrusively, dtrace can attach to a process
and report on what is going on, from kernel space to
user space.  Dtrace has a language, called D, which
allows you to hook into probes.  Rumor has it that java
is very well covered, and I think it will give you
method names and the like.  You can hook in and say:
when this method gets called, increment a counter,
track nanoseconds, and report every two seconds on what
gets called the most and what takes the most time, from
user space to kernel space.
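As a rough illustration of the count, time, and report-every-two-seconds idea, here is a minimal D sketch. It works at the system call level rather than the java method level, since the syscall provider is the part I am sure is available on a stock Solaris 10 box. The file name syscalls.d is just a placeholder, and the script has not been run against a nutch crawl.

```d
#!/usr/sbin/dtrace -s
#pragma D option quiet

/* Remember when each system call made by the target process starts. */
syscall:::entry
/pid == $target/
{
    self->ts = timestamp;
}

/* On return, count the call and add up the nanoseconds it took. */
syscall:::return
/self->ts/
{
    @calls[probefunc] = count();
    @nanos[probefunc] = sum(timestamp - self->ts);
    self->ts = 0;
}

/* Every two seconds, print both aggregations and start over. */
tick-2sec
{
    printf("\n%-24s %10s\n", "SYSCALL", "CALLS");
    printa("%-24s %@10d\n", @calls);
    printf("\n%-24s %10s\n", "SYSCALL", "NANOSECONDS");
    printa("%-24s %@10d\n", @nanos);
    trunc(@calls);
    trunc(@nanos);
}
```

It would be run as root with something like `dtrace -s syscalls.d -p <pid of the fetcher JVM>`. printa sorts by value in ascending order, so the busiest and slowest calls end up at the bottom of each report.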

I think it would be nice to pick a few key methods (or
all of them) and write some D that would track what is
going on in those methods during a crawl.
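A sketch of what per-method tracking might look like, assuming a JVM that exposes the hotspot dtrace provider and was started with -XX:+ExtendedDTraceProbes (both are assumptions on my part, so check what your JVM actually supports). The file name methods.d is a placeholder, and method probes are known to slow the JVM down noticeably.

```d
#!/usr/sbin/dtrace -s
#pragma D option quiet

/*
 * Assumed probe arguments for method-entry:
 *   arg1/arg2 = class name pointer and length
 *   arg3/arg4 = method name pointer and length
 * The names are not NUL-terminated, so copy them in and terminate them.
 */
hotspot$target:::method-entry
{
    this->cls = (char *)copyin(arg1, arg2 + 1);
    this->cls[arg2] = '\0';
    this->mth = (char *)copyin(arg3, arg4 + 1);
    this->mth[arg4] = '\0';
    @calls[stringof(this->cls), stringof(this->mth)] = count();
}

/* Every two seconds, keep only the ten most-called methods, print, reset. */
tick-2sec
{
    trunc(@calls, 10);
    printf("\n%-48s %-24s %8s\n", "CLASS", "METHOD", "CALLS");
    printa("%-48s %-24s %@8d\n", @calls);
    trunc(@calls);
}
```

Run it as root with `dtrace -s methods.d -p <pid of the JVM>`. Class names come back in the JVM's internal slash-separated form, so the nutch classes would show up under org/apache/nutch/.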

I saw a notice where Sun engineers had a booth and
offered a free ipod (I think) if someone had an
application that they couldn't speed up.  Not sure
how it all worked out, but quite the claim.

Anyway, a world of possibilities, just wondering if
anyone is running nutch on solaris, has played with
dtrace, or is interested in doing so.

Earl


		