manifoldcf-user mailing list archives

From "Nichols, Richard" <Richard.Nich...@tellabs.com>
Subject RE: Seed File?
Date Mon, 03 Jun 2013 15:22:53 GMT
Hi Karl,

The actual file system on the documentation server is not accessible directly.  All documents
must be fetched via the web interface.
On the other hand, generating a seed file 'document' that is fetched and crawled may be a
workable solution.
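
A minimal sketch of that idea, assuming the list of small-file URLs is already available as plain strings: it writes one HTML page containing a link to every document, which could then be published on the web server and entered as the single seed. The function name `build_seed_page` and the example URLs are hypothetical and are not part of ManifoldCF.

```python
from html import escape


def build_seed_page(urls, title="Seed document"):
    """Return an HTML page whose body is one hyperlink per URL.

    Hypothetical helper: the crawler is expected to fetch this page
    and follow the links it contains.
    """
    items = "\n".join(
        # Escape each URL so characters such as '&' stay valid in HTML.
        f'<li><a href="{escape(u, quote=True)}">{escape(u)}</a></li>'
        for u in urls
    )
    return (
        "<!DOCTYPE html>\n"
        f"<html><head><title>{escape(title)}</title></head>\n"
        f"<body><ul>\n{items}\n</ul></body></html>\n"
    )


if __name__ == "__main__":
    # Example URLs are placeholders, not real documents.
    example_urls = [
        "http://docs.example.com/manuals/a.pdf",
        "http://docs.example.com/manuals/b.pdf?rev=2&lang=en",
    ]
    with open("seed.html", "w", encoding="utf-8") as handle:
        handle.write(build_seed_page(example_urls))
```

The job's hop-count and include/exclude settings would still need to allow the crawl to reach the linked PDFs from this page; that part depends on the job configuration and is not shown here.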

Thanks!

Rick

From: Karl Wright [mailto:daddywri@gmail.com]
Sent: Monday, June 03, 2013 10:11 AM
To: user@manifoldcf.apache.org
Subject: Re: Seed File?

Hi Rick,

The web connector is not designed to discover documents unless you either provide URLs to
all of them (as seeds), or you provide a seed which is in fact a document that points to them
all.  You can use the RSS connector for a similar purpose - which is better for some, because
there is often software available that will generate an RSS feed automatically.  But in this
case I wonder why you don't just use the Windows Share Connector for all of your documents?
Neither the RSS nor the Web connector has any notion of security...

Karl

On Mon, Jun 3, 2013 at 10:59 AM, Nichols, Richard <Richard.Nichols@tellabs.com> wrote:

We have a documentation system where PDF documents are accessed via web URLs.  These URLs
are stored in a database along with metadata such as the file size.  We already have in place
a program that generates two lists from this database, one with documents less than 5MB in
size, and one with documents that are larger.  (We then split up the larger docs before indexing.)
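
For illustration, the two-list split described above might look like the following sketch. It assumes each database row has been reduced to a `(url, size_in_bytes)` pair and that "5MB" means 5 * 1024 * 1024 bytes; the name `partition_by_size` is hypothetical and does not come from the program mentioned in the message.

```python
# Assumed interpretation of the 5MB threshold, in bytes.
SIZE_LIMIT = 5 * 1024 * 1024


def partition_by_size(records, limit=SIZE_LIMIT):
    """Split (url, size_in_bytes) pairs into two URL lists.

    Returns (small, large): documents strictly smaller than the
    limit, and documents at or above it.
    """
    small, large = [], []
    for url, size in records:
        if size < limit:
            small.append(url)
        else:
            large.append(url)
    return small, large


if __name__ == "__main__":
    # Placeholder rows standing in for the database query result.
    rows = [
        ("http://docs.example.com/a.pdf", 120_000),
        ("http://docs.example.com/b.pdf", 9_500_000),
    ]
    small_urls, large_urls = partition_by_size(rows)
    print(len(small_urls), "small;", len(large_urls), "large")
```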

The larger (split) docs are successfully read via the Windows Shares repository connector.
However, I can't figure out how to use the list of small-file URLs to index the smaller documents.
I see a 'seeds' data entry box when creating a job using the web connector, but not a way
of pointing it to a seed file.

Am I missing something?  Is there a work-around?

Thanks,
Rick


============================================================
The information contained in this message may be privileged
and confidential and protected from disclosure. If the reader
of this message is not the intended recipient, or an employee
or agent responsible for delivering this message to the
intended recipient, you are hereby notified that any reproduction,
dissemination or distribution of this communication is strictly
prohibited. If you have received this communication in error,
please notify us immediately by replying to the message and
deleting it from your computer. Thank you. Tellabs
============================================================

