lucy-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Dag Lem <...@nimrod.no>
Subject Re: [lucy-user] SearchServer / ClusterSearcher - massive performance hit
Date Tue, 23 Oct 2012 15:35:57 GMT
Marvin Humphrey <marvin@rectangular.com> writes:

> On Mon, Oct 22, 2012 at 8:12 AM, Dag Lem <dag@nimrod.no> wrote:

[...]

> > If there is anything I can do to help isolate any possible problem,
> > please do tell me so (e.g. strace / perl profiling / ...)
> 
> We're not there yet.  If we see expensive queries take longer in
> ClusterSearcher, I think some Perl profiling might help.  If, however, only
> cheap queries are slower, then we'd want to focus on optimizing your
> application first.

See below for profiling of IndexSearcher and SearchServer /
ClusterSearcher using two different queries. Sorry, couldn't resist :-)

Some observations:

* Lucy::Search::IndexSearcher::top_docs (used by SearchServer) is
  about twice as slow Lucy::Search::Searcher::hits (used by
  IndexSearcher).
* The time used for object serialization with sharding surpasses the
  time spent in Lucy::Search::Searcher::hits without sharding, and
  scales with query complexity.
* Quite a lot of time is spent on (local) network communication, and
  this also seems to scale with query complexity.


Query: "fornavn:(dag) AND etternavn:(lem)"
------------------------------------------

IndexSearcher:

Total Elapsed Time = 3.506807 Seconds
  User+System Time = 3.536807 Seconds
Exclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c  Name
 77.9   2.755  2.755  10000   0.0003 0.0003  Lucy::Search::Searcher::hits
 17.9   0.635  0.685  10000   0.0001 0.0001  Lucy::Search::QueryParser::parse
 1.13   0.040  0.040      1   0.0400 0.0400  utf8::AUTOLOAD
 0.57   0.020  0.020      1   0.0200 0.0200  Lucy::Autobinding::init_autobindin
                                             gs
 0.57   0.020  0.030      6   0.0033 0.0049  IO::Socket::BEGIN
 0.57   0.020  0.147     11   0.0018 0.0134  main::BEGIN
 0.28   0.010  0.010      3   0.0033 0.0033  utf8::SWASHNEW
 0.28   0.010  0.010      5   0.0020 0.0020  Exporter::export
 0.28   0.010  0.010      6   0.0017 0.0017  DynaLoader::dl_load_file
 0.28   0.010  0.010      3   0.0033 0.0033  Storable::BEGIN
 0.28   0.010  0.010      2   0.0050 0.0048  Lucy::Search::Compiler::BEGIN
 0.28   0.010  0.020      4   0.0025 0.0049  Lucy::Document::Doc::BEGIN
 0.28   0.010  0.040      8   0.0012 0.0050  Lucy::BEGIN
 0.28   0.010  0.049      6   0.0016 0.0082  LucyX::Remote::SearchServer::BEGIN
 0.28   0.010  0.010     40   0.0002 0.0002  Exporter::import

SearchServer/ClusterSearcher server:

Total Elapsed Time = 36.55342 Seconds
  User+System Time = 14.87342 Seconds
Exclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c  Name
 37.1   5.530  5.530  10000   0.0006 0.0006  Lucy::Search::IndexSearcher::top_d
                                             ocs
 19.3   2.870  3.030  30003   0.0001 0.0001  Storable::net_mstore
 12.7   1.900  1.900  20000   0.0001 0.0001  Lucy::Search::IndexSearcher::doc_f
                                             req
 12.4   1.850 14.410  30003   0.0001 0.0005  LucyX::Remote::SearchServer::serve
                                             _rpc
 3.29   0.490  0.840  30003   0.0000 0.0000  Storable::mretrieve
 3.23   0.480  0.480  30005   0.0000 0.0000  IO::Select::handles
 3.16   0.470  0.950  30004   0.0000 0.0000  IO::Select::can_read
 2.35   0.350  0.350  10000   0.0000 0.0000  Lucy::Object::Obj::STORABLE_thaw
 2.15   0.320  0.320  60006   0.0000 0.0000  IO::Handle::read
 1.68   0.250  1.090  30003   0.0000 0.0000  Storable::thaw
 1.68   0.250 15.609      1   0.2499 15.609  LucyX::Remote::SearchServer::serve
 1.55   0.230  3.260  30003   0.0000 0.0001  Storable::_freeze
 1.08   0.160  0.160  10000   0.0000 0.0000  Lucy::Object::Obj::STORABLE_freeze
 1.08   0.160  2.060  20000   0.0000 0.0001  LucyX::Remote::SearchServer::do_do
                                             c_freq
 0.94   0.140  5.670  10000   0.0000 0.0006  LucyX::Remote::SearchServer::do_to
                                             p_docs

SearchServer/ClusterSearcher client:

Total Elapsed Time = 30.04173 Seconds
  User+System Time = 13.58173 Seconds
Exclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c  Name
 20.4   2.780  2.945  30004   0.0001 0.0001  Storable::net_mstore
 20.2   2.745 10.624  30004   0.0001 0.0004  LucyX::Remote::ClusterSearcher::_r
                                             pc
 10.4   1.425  1.485  10000   0.0001 0.0001  Lucy::Search::QueryParser::parse
 8.47   1.150  1.225  30004   0.0000 0.0000  IO::Socket::send
 4.97   0.675  1.905  30004   0.0000 0.0001  LucyX::Remote::ClusterSearcher::_c
                                             b_send
 4.71   0.640 13.084  10000   0.0001 0.0013  LucyX::Remote::ClusterSearcher::to
                                             p_docs
 4.23   0.575  1.545  30002   0.0000 0.0001  LucyX::Remote::ClusterSearcher::_c
                                             b_recv_response
 3.90   0.530  0.530  60004   0.0000 0.0000  IO::Socket::recv
 3.53   0.480  0.795  30002   0.0000 0.0000  LucyX::Remote::ClusterSearcher::_c
                                             b_recv_length
 3.28   0.445  7.675  10000   0.0000 0.0008  Lucy::Search::ANDQuery::_make_comp
                                             iler
 2.91   0.395 13.504  10000   0.0000 0.0014  Lucy::Search::Searcher::hits
 2.72   0.370  0.515  30002   0.0000 0.0000  Storable::mretrieve
 2.58   0.350  3.295  30004   0.0000 0.0001  Storable::_freeze
 1.77   0.240  0.755  30002   0.0000 0.0000  Storable::thaw
 1.66   0.225  0.225  10000   0.0000 0.0000  Lucy::Search::TopDocs::new


Query: "fornavn:(lem dag) AND etternavn:(lem dag)"
--------------------------------------------------

IndexSearcher:

Total Elapsed Time = 5.388166 Seconds
  User+System Time = 5.288166 Seconds
Exclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c  Name
 81.6   4.320  4.320  10000   0.0004 0.0004  Lucy::Search::Searcher::hits
 13.6   0.720  0.770  10000   0.0001 0.0001  Lucy::Search::QueryParser::parse
 0.76   0.040  0.040      1   0.0400 0.0400  utf8::AUTOLOAD
 0.76   0.040  0.158     11   0.0036 0.0144  main::BEGIN
 0.57   0.030  0.079      3   0.0100 0.0264  Lucy::Search::IndexSearcher::BEGIN
 0.38   0.020  0.020  10000   0.0000 0.0000  Lucy::Search::Hits::DESTROY
 0.38   0.020  0.020  10000   0.0000 0.0000  Lucy::Search::PolyQuery::DESTROY
 0.38   0.020  0.020      6   0.0033 0.0033  DynaLoader::dl_load_file
 0.19   0.010  0.010      3   0.0033 0.0033  utf8::SWASHNEW
 0.19   0.010  0.010      1   0.0100 0.0099  Lucy::Search::IndexSearcher::new
 0.19   0.010  0.010      3   0.0033 0.0033  Storable::BEGIN
 0.19   0.010  0.020      4   0.0025 0.0050  Lucy::Document::Doc::BEGIN
 0.19   0.010  0.030      8   0.0012 0.0037  Lucy::BEGIN
 0.19   0.010  0.010      7   0.0014 0.0014  IO::Handle::BEGIN
 0.19   0.010  0.030      6   0.0017 0.0049  IO::Socket::INET::BEGIN

SearchServer/ClusterSearcher server:

Total Elapsed Time = 56.87322 Seconds
  User+System Time = 25.07322 Seconds
Exclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c  Name
 38.2   9.590  9.590  10000   0.0010 0.0010  Lucy::Search::IndexSearcher::top_d
                                             ocs
 18.1   4.560  4.770  50003   0.0001 0.0001  Storable::net_mstore
 14.6   3.680  3.680  40000   0.0001 0.0001  Lucy::Search::IndexSearcher::doc_f
                                             req
 11.8   2.980 24.559  50003   0.0001 0.0005  LucyX::Remote::SearchServer::serve
                                             _rpc
 3.11   0.780  0.780  50005   0.0000 0.0000  IO::Select::handles
 2.91   0.730  1.510  50004   0.0000 0.0000  IO::Select::can_read
 2.35   0.590  5.360  50003   0.0000 0.0001  Storable::_freeze
 2.19   0.550  0.550  10000   0.0001 0.0001  Lucy::Object::Obj::STORABLE_thaw
 2.07   0.520  0.520 100006   0.0000 0.0000  IO::Handle::read
 2.03   0.510  4.190  40000   0.0000 0.0001  LucyX::Remote::SearchServer::do_do
                                             c_freq
 2.03   0.510  1.060  50003   0.0000 0.0000  Storable::mretrieve
 1.71   0.430 26.499      1   0.4299 26.498  LucyX::Remote::SearchServer::serve
 1.28   0.320  1.380  50003   0.0000 0.0000  Storable::thaw
 0.84   0.210  0.210  10000   0.0000 0.0000  Lucy::Object::Obj::STORABLE_freeze
 0.84   0.210  5.570  50003   0.0000 0.0001  Storable::nfreeze

SearchServer/ClusterSearcher client:

Total Elapsed Time = 48.25134 Seconds
  User+System Time = 20.14134 Seconds
Exclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c  Name
 22.0   4.445 17.034  50004   0.0001 0.0003  LucyX::Remote::ClusterSearcher::_r
                                             pc
 21.7   4.370  4.565  50004   0.0001 0.0001  Storable::net_mstore
 9.38   1.890  2.145  50004   0.0000 0.0000  IO::Socket::send
 9.26   1.865  1.925  10000   0.0002 0.0002  Lucy::Search::QueryParser::parse
 6.16   1.240  1.240 100004   0.0000 0.0000  IO::Socket::recv
 5.09   1.025 14.984  10000   0.0001 0.0015  Lucy::Search::ANDQuery::_make_comp
                                             iler
 4.49   0.905  3.225  50004   0.0000 0.0001  LucyX::Remote::ClusterSearcher::_c
                                             b_send
 3.53   0.710  1.505  50002   0.0000 0.0000  LucyX::Remote::ClusterSearcher::_c
                                             b_recv_length
 3.50   0.705  2.205  50002   0.0000 0.0000  LucyX::Remote::ClusterSearcher::_c
                                             b_recv_response
 2.63   0.530 20.284  10000   0.0001 0.0020  LucyX::Remote::ClusterSearcher::to
                                             p_docs
 2.48   0.500  0.635  50002   0.0000 0.0000  Storable::mretrieve
 2.38   0.480 13.809  40000   0.0000 0.0003  LucyX::Remote::ClusterSearcher::do
                                             c_freq
 2.36   0.475  0.475 250016   0.0000 0.0000  IO::Handle::fileno
 2.23   0.450  5.355  50004   0.0000 0.0001  LucyX::Remote::ClusterSearcher::_s
                                             erialize_request
 2.09   0.420  1.055  50002   0.0000 0.0000  Storable::thaw

-- 
Best regards,

Dag Lem

Mime
View raw message