JavaScript code that creates dynamic URLs is always a problem for web
crawlers.
Most web sites try to make their content crawlable by creating
alternative static links to the content.
I think Google now does some analysis/execution of JS code, but it's a
tricky problem.
I would suggest modifying the HTML parser to explicitly look for calls
being made to your function, and generate appropriate outlinks.
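
As a rough sketch of that idea (not actual Nutch plugin code), something like the following could scan the raw HTML for calls to the function and rebuild the target URLs. The function name goToPage, its two string arguments, and the "/<section>/view?id=<id>" URL layout are all made up for illustration, since the real function was not posted. The URL-building step would have to mirror whatever your JavaScript actually does.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class JsCallLinkExtractor {
    // Hypothetical function name; substitute the site's real one.
    private static final Pattern CALL =
        Pattern.compile("\\bgoToPage\\s*\\(([^)]*)\\)");

    // Matches one single- or double-quoted string literal argument.
    private static final Pattern ARG =
        Pattern.compile("'([^']*)'|\"([^\"]*)\"");

    /** Returns absolute URLs for every literal call found in the HTML. */
    static List<String> extract(String html, String baseUrl) {
        URI base = URI.create(baseUrl);
        List<String> links = new ArrayList<>();
        Matcher call = CALL.matcher(html);
        while (call.find()) {
            // Collect the quoted literals inside the parentheses.
            List<String> args = new ArrayList<>();
            Matcher arg = ARG.matcher(call.group(1));
            while (arg.find()) {
                args.add(arg.group(1) != null ? arg.group(1) : arg.group(2));
            }
            // Calls without exactly two literals (for example the function
            // definition itself, or calls passing variables) are skipped.
            if (args.size() == 2) {
                // Assumed URL layout; must match the real JS function.
                String relative = "/" + args.get(0) + "/view?id=" + args.get(1);
                links.add(base.resolve(relative).toString());
            }
        }
        return links;
    }
}
```

In Nutch, the returned strings would then need to be added to the parse result as outlinks, for example from a custom parse filter. This only handles calls with literal string arguments and does no URL encoding or HTML entity decoding; calls that pass variables or computed values would still need real JavaScript analysis.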
-- Ken
On Sep 14, 2009, at 8:04am, Mohamed Parvez wrote:
> Can anyone please throw some light on this?
>
> Thanks/Regards,
> Parvez
>
>
> On Fri, Sep 11, 2009 at 3:23 PM, Mohamed Parvez <parvez@gmail.com>
> wrote:
>
>> We have a JavaScript function, which takes some params and builds a
>> URL and
>> then uses window.location to send the user to that URL.
>>
>> Our website uses this feature a lot and most of the URLs are built
>> using
>> this function.
>>
>> I am trying to crawl using Nutch and I am also using the parse-js
>> plugin.
>>
>> But it does not look like Nutch is able to crawl these URLs.
>>
>> Am I doing something wrong, or is Nutch not able to crawl URLs built
>> by
>> a JavaScript function?
>>
>> ----
>> Thanks/Regards,
>> Parvez
>>
>>
--------------------------
Ken Krugler
TransPac Software, Inc.
<http://www.transpac.com>
+1 530-210-6378