incubator-connectors-commits mailing list archives

From conflue...@apache.org
Subject [CONF] Apache Connectors Framework > FAQ
Date Sat, 09 Oct 2010 08:08:00 GMT
Space: Apache Connectors Framework (https://cwiki.apache.org/confluence/display/CONNECTORS)
Page: FAQ (https://cwiki.apache.org/confluence/display/CONNECTORS/FAQ)
Comment: https://cwiki.apache.org/confluence/display/CONNECTORS/FAQ?focusedCommentId=23340306#comment-23340306

Comment added by Karl Wright:
---------------------------------------------------------------------

FWIW, my Dell tower (a Vostro 220), which has a similar 2.80GHz dual-core processor but a
much faster disk, clocks the same test at 31 docs/second, this time almost entirely CPU bound.
So my guess is that your system's disk performance is very poor for some reason.


In reply to a comment by Karl Wright:
With PostgreSQL, a somewhat different test set than I used in May, and a no-doubt much
more fragmented disk, I am now getting some 17 documents/second here, doing a file-system
crawl to a null output.  That is about half of what I saw in May.

This run used the following special PostgreSQL settings:
(1) 100 max connection handles
(2) 256MB shared buffers (which may well have been overkill, but that's what my PostgreSQL
setup had)
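
For reference, the two settings above would most likely look like this in postgresql.conf.
This is a sketch that assumes "max connection handles" refers to the standard
max_connections parameter; the comment does not name the parameters explicitly.

```
# postgresql.conf fragment (assumed mapping of the settings listed above)
max_connections = 100      # "100 max connection handles"
shared_buffers = 256MB     # possibly overkill, per the comment
```

Changing either value requires a PostgreSQL server restart to take effect.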

Connection/job settings:
(1) 100 max connections for both repository and output connections.
(2) Hop filters set to "never delete unreachable documents".

The system was almost entirely I/O bound during execution, which leads me to believe that,
since the system was brand-new in May, disk fragmentation was a major factor.  I will try
to run a benchmark where the database is on a different disk than the files being crawled,
maybe today.
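
To put the two reported figures side by side, here is a rough Python sketch of the
arithmetic. The 17 and 31 docs/second rates come from the comments in this thread; the
May rate is only implied by "about half", and the 100,000-document crawl size is a
made-up number for illustration.

```python
# Throughput figures quoted in this thread, in documents per second.
io_bound_rate = 17.0   # fragmented disk, run was almost entirely I/O bound
cpu_bound_rate = 31.0  # faster disk, run was almost entirely CPU bound


def seconds_for(doc_count: int, docs_per_second: float) -> float:
    """Return the wall-clock seconds needed to crawl doc_count documents."""
    return doc_count / docs_per_second


# Relative speedup of the faster-disk machine over the I/O-bound one.
speedup = cpu_bound_rate / io_bound_rate

# "About half of what I saw in May" implies roughly this rate for the May run.
implied_may_rate = io_bound_rate * 2

# Hypothetical crawl size (an assumption, not a number from the thread).
hypothetical_docs = 100_000
slow_minutes = seconds_for(hypothetical_docs, io_bound_rate) / 60
fast_minutes = seconds_for(hypothetical_docs, cpu_bound_rate) / 60

print(f"speedup: {speedup:.2f}x, implied May rate: {implied_may_rate:.0f} docs/s")
print(f"{hypothetical_docs} docs: {slow_minutes:.0f} min vs {fast_minutes:.0f} min")
```

The ratio only compares two different machines and test runs, so it suggests rather
than proves that the disk is the bottleneck.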

