ManifoldCF is not limited by the number of agents processes or parallel connectors. Overall database performance is the limiting factor.
I would read this: http://manifoldcf.apache.org/release/trunk/en_US/performance-tuning.html
Also, there's a section in ManifoldCF (I believe Chapter 2) that discusses this issue.
Some five years ago, I successfully crawled 5 million web documents using PostgreSQL 8.3. PostgreSQL 9.x is faster, and with modern SSDs I expect that you will do even better. In general, I'd say it is fine to shoot for 10M - 100M documents on ManifoldCF, provided that you use a good database and that you maintain it properly.