manifoldcf-dev mailing list archives

From Karl Wright <daddy...@gmail.com>
Subject Re: Scaling in MCF
Date Thu, 13 Nov 2014 14:13:40 GMT
Hi Aeham,

It's possible that for PostgreSQL 9.x, table reindexing may not even be
very helpful.  8.x, though, was pretty broken in this regard, and
PostgreSQL would leave index references to dead tuples around until a
reindex was done.  If you can determine that reindexing is no longer
necessary, it would be a great help.

For sure, I'd create a ticket so we can at least use that as a forum to
determine how best to improve this area of the code.

Karl


On Thu, Nov 13, 2014 at 9:03 AM, Aeham Abushwashi <
aeham.abushwashi@exonar.com> wrote:

> Thanks Karl. I'll make sure jobs (and occasionally nodes) are stopped,
> restarted, etc. along the way.
>
> Re node restart, something I've seen crop up time and time again on larger
> test and production systems is the lengthy reindexing of the jobqueue
> table. It's particularly noticeable when some nodes in the cluster are
> started or stopped: you immediately see one of the remaining nodes kick
> off a table REINDEX. On a jobqueue table with 30-40M records, a single
> REINDEX TABLE jobqueue query can take up to 25 minutes. I understand the
> need for table reindexing in certain cases, but if I have to restart a
> bunch of nodes in a cluster I could easily be sat waiting for an hour or
> two because of repeated calls. In fact, to avoid these lengthy wait times
> I've coded up a hack to disable REINDEXing when I know I need to restart
> nodes a few times. I wonder if this could be addressed much better by
> centralising the invocation of REINDEX operations and/or scheduling them
> or arranging them in a way that makes repeated invocations within a short
> period of time avoidable. Just a thought...
>
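
The idea in the quoted message of making repeated REINDEX invocations
within a short period avoidable could be sketched roughly as below. This
is only an illustration, not ManifoldCF code: the class name
ReindexThrottle, the method tryAcquire, and the minimum-interval
parameter are all hypothetical. It also only throttles within a single
JVM; a real cluster would presumably need the "last reindex" timestamp
kept somewhere shared between nodes, which is not shown here.

```java
import java.util.concurrent.atomic.AtomicLong;

/**
 * Hypothetical helper (not part of ManifoldCF): allows a reindex to run
 * only if a minimum interval has passed since the last one was allowed.
 */
final class ReindexThrottle {
    // Sentinel meaning "no reindex has been allowed yet".
    private static final long NEVER = Long.MIN_VALUE;

    private final long minIntervalMillis;
    private final AtomicLong lastRunMillis = new AtomicLong(NEVER);

    ReindexThrottle(long minIntervalMillis) {
        if (minIntervalMillis < 0) {
            throw new IllegalArgumentException("interval must be >= 0");
        }
        this.minIntervalMillis = minIntervalMillis;
    }

    /**
     * Returns true if the caller may run a reindex now, and records
     * nowMillis as the time of the latest run. Returns false if a run
     * was already allowed less than minIntervalMillis ago.
     */
    boolean tryAcquire(long nowMillis) {
        while (true) {
            long last = lastRunMillis.get();
            if (last != NEVER && nowMillis - last < minIntervalMillis) {
                return false; // too soon after the previous reindex
            }
            // Compare-and-set so that concurrent callers cannot both win.
            if (lastRunMillis.compareAndSet(last, nowMillis)) {
                return true;
            }
        }
    }
}
```

A caller would check something like
throttle.tryAcquire(System.currentTimeMillis()) before issuing
REINDEX TABLE jobqueue, and skip the reindex when it returns false.
Passing the clock value in as a parameter is just a choice that makes
the sketch easy to test.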
