manifoldcf-dev mailing list archives

From Karl Wright <daddy...@gmail.com>
Subject Re: WebSite Crawling with Customheader Info
Date Wed, 05 Dec 2018 11:11:31 GMT
Hi Jasvinder,

That sounds like a customization you would have to make.  The Web Connector
is designed for generic web crawling, not as the basis for a custom
connector.  Indeed, I would strongly suggest that you not try to use the
web connector to retrieve your repository content but rather develop a
connector of your own meant to work with the system you are trying to get
data out of.
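
As a rough illustration of the fetch side of such a custom connector, here is a minimal sketch that attaches a custom header to an outgoing request using the JDK's own java.net.http classes (Java 11+). It does not use any ManifoldCF API. The class name, the header name X-Crawl-Token, and the token value are all hypothetical, chosen only for this example; the target site would have to check for whatever header you agree on.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.time.Duration;

public class CrawlRequestSketch {
    // Hypothetical header name (an assumption for this sketch, not a
    // ManifoldCF setting); the target site must look for the same name.
    static final String CRAWL_HEADER = "X-Crawl-Token";

    // Builds a GET request for the given URL that carries the custom header.
    // Nothing is sent here; the request object is only constructed.
    static HttpRequest build(String url, String token) {
        return HttpRequest.newBuilder(URI.create(url))
                .timeout(Duration.ofSeconds(30))
                .header(CRAWL_HEADER, token)
                .GET()
                .build();
    }

    public static void main(String[] args) {
        // Placeholder host from the question and a made-up token value.
        HttpRequest request = build("https://xxx.yy.com/", "example-shared-secret");
        // Print the headers that would accompany the request.
        System.out.println(request.headers().map());
    }
}
```

To actually fetch the page, the request would be passed to java.net.http.HttpClient (for example HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())). If the header is used to bypass authentication, its value should be treated as a shared secret.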

Thanks,
Karl


On Wed, Dec 5, 2018 at 4:54 AM jasvinder.singh@gartner.com <
jasvinder.singh@gartner.com> wrote:

> Can you please suggest whether there is a way to pass custom header info
> with the seed URL, so that the target application can determine that the
> request is coming from a crawl? For example, if my site is xxx.yy.com, can
> I have ManifoldCF send some header that I can parse on xxx.yy.com to
> determine that the request is for crawling, so that I can customize my
> code for purposes such as bypassing authentication?
>
> The reason I am asking is that, for some reason, I could not map my login
> sequence as defined in ManifoldCF, so I am looking for alternatives.
> (Since it is my own Internet site, it is OK for me to bypass
> authentication based on some custom header.)
>
> Thanks in advance for your help.
>
