manifoldcf-dev mailing list archives

From "Tim Steenbeke (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CONNECTORS-1567) export of web connection bandwidth throttling
Date Wed, 09 Jan 2019 07:06:00 GMT

    [ https://issues.apache.org/jira/browse/CONNECTORS-1567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16737909#comment-16737909 ]

Tim Steenbeke edited comment on CONNECTORS-1567 at 1/9/19 7:05 AM:
-------------------------------------------------------------------

But are bandwidth throttles and throttling the same thing in ManifoldCF? The bandwidth throttle is
a different object in the response JSON, or am I mistaken?

I also don't understand what you mean by old-form; the example is the response from a 'repositoryconnections'
GET call on ManifoldCF 2.11.
The documentation also only mentions throttling, not bandwidth, for both 2.11
and 2.12. ([JSON repository connector 2.12|https://manifoldcf.apache.org/release/release-2.12/en_US/programmatic-operation.html#Repository+connection+objects])

*response for curl -X GET [http://localhost:8345/mcf-api-service/json/repositoryconnections] 
-H 'content-type: application/json'*

 
{code:java}
{
            "throttle": {
                "match_description": "testable regex",
                "rate": "1.6666666E-4",
                "match": "test reg"
            },
            "max_connections": "20",
            "configuration": {
                "trust": {
                    "_attribute_trusteverything": "true",
                    "_value_": "",
                    "_attribute_urlregexp": ".*"
                },
                "bindesc": {
                    "maxkbpersecond": {
                        "_value_": "",
                        "_attribute_value": "64"
                    },
                    "_attribute_caseinsensitive": "false",
                    "maxconnections": {
                        "_value_": "",
                        "_attribute_value": "2"
                    },
                    "maxfetchesperminute": {
                        "_value_": "",
                        "_attribute_value": "12"
                    },
                    "_attribute_binregexp": "test regex",
                    "_value_": ""
                },
                "_PARAMETER_": [
                    {
                        "_value_": "tim.steenbeke@formica.digital",
                        "_attribute_name": "Email address"
                    },
                    {
                        "_value_": "all",
                        "_attribute_name": "Robots usage"
                    },
                    {
                        "_value_": "all",
                        "_attribute_name": "Meta robots tags usage"
                    },
                    {
                        "_value_": "proxyhost",
                        "_attribute_name": "Proxy host"
                    },
                    {
                        "_value_": "port",
                        "_attribute_name": "Proxy port"
                    },
                    {
                        "_value_": "domain",
                        "_attribute_name": "Proxy authentication domain"
                    },
                    {
                        "_value_": "admin",
                        "_attribute_name": "Proxy authentication user name"
                    },
                    {
                        "_value_": "5qNuZnChiobQlUozw2quhCGsgYVazxVVbAUjc3Hk5Mc=",
                        "_attribute_name": "Proxy authentication password"
                    }
                ],
                "accesscredential": [
                    {
                        "_value_": "",
                        "_attribute_type": "basic",
                        "_attribute_username": "admin",
                        "_attribute_urlregexp": "some acces creds",
                        "_attribute_password": "RkBMPT2W2ZC7XebgFp5PSuYSdCDnik4GKd130+PtXRk=",
                        "_attribute_domain": "localhost:8080"
                    },
                    {
                        "_value_": "",
                        "_attribute_type": "session",
                        "_attribute_urlregexp": "url regex"
                    }
                ]
            },
            "name": "abc_test",
            "description": "test abc",
            "isnew": "false",
            "class_name": "org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector"
        }
{code}
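
To make the distinction concrete: in the response above, `throttle` sits at the top level of the connection object, while `bindesc` sits inside `configuration`. The following is only a minimal Python sketch of reading both. It assumes a trimmed copy of the response and a single `bindesc` object (not a list), and `summarize_throttles` is a hypothetical helper name, not part of ManifoldCF.

```python
import json

# Trimmed copy of the response above; only the throttle-related objects are kept.
RESPONSE = """
{
  "throttle": {"match_description": "testable regex", "rate": "1.6666666E-4", "match": "test reg"},
  "max_connections": "20",
  "configuration": {
    "bindesc": {
      "maxkbpersecond": {"_value_": "", "_attribute_value": "64"},
      "_attribute_caseinsensitive": "false",
      "maxconnections": {"_value_": "", "_attribute_value": "2"},
      "maxfetchesperminute": {"_value_": "", "_attribute_value": "12"},
      "_attribute_binregexp": "test regex",
      "_value_": ""
    }
  },
  "name": "abc_test"
}
"""


def summarize_throttles(connection):
    """Hypothetical helper: return (top-level throttle, flattened bandwidth bin)."""
    throttle = connection.get("throttle")
    config = connection.get("configuration") or {}
    bindesc = config.get("bindesc")
    bandwidth = None
    if isinstance(bindesc, dict):  # assumption: exactly one bin, encoded as an object
        bandwidth = {
            "binregexp": bindesc.get("_attribute_binregexp"),
            # Child elements carry their setting in "_attribute_value".
            **{key: value["_attribute_value"]
               for key, value in bindesc.items()
               if isinstance(value, dict) and "_attribute_value" in value},
        }
    return throttle, bandwidth


throttle, bandwidth = summarize_throttles(json.loads(RESPONSE))
print(throttle["rate"])
print(bandwidth)
```

The two values come from different places in the document tree, which is why the question of whether they are "the same" throttling matters for the export.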
 

*For the following bandwidth setup:*

!bandwidth_test_abc.png!

*So then I would do the following to set bandwidth and throttling to null:*
{code:java}
{
            "throttle": null,            <<<--- null for throttling
            "max_connections": "20",
            "configuration": {
                "trust": {
                    "_attribute_trusteverything": "true",
                    "_value_": "",
                    "_attribute_urlregexp": ".*"
                },
                "bindesc": null,        <<<--- null for bandwidth
                "_PARAMETER_": [
                    {
                        "_value_": "tim.steenbeke@formica.digital",
                        "_attribute_name": "Email address"
                    },
                    {
                        "_value_": "all",
                        "_attribute_name": "Robots usage"
                    },
                    {
                        "_value_": "all",
                        "_attribute_name": "Meta robots tags usage"
                    },
                    {
                        "_value_": "proxyhost",
                        "_attribute_name": "Proxy host"
                    },
                    {
                        "_value_": "port",
                        "_attribute_name": "Proxy port"
                    },
                    {
                        "_value_": "domain",
                        "_attribute_name": "Proxy authentication domain"
                    },
                    {
                        "_value_": "admin",
                        "_attribute_name": "Proxy authentication user name"
                    },
                    {
                        "_value_": "5qNuZnChiobQlUozw2quhCGsgYVazxVVbAUjc3Hk5Mc=",
                        "_attribute_name": "Proxy authentication password"
                    }
                ],
                "accesscredential": [
                    {
                        "_value_": "",
                        "_attribute_type": "basic",
                        "_attribute_username": "admin",
                        "_attribute_urlregexp": "some acces creds",
                        "_attribute_password": "RkBMPT2W2ZC7XebgFp5PSuYSdCDnik4GKd130+PtXRk=",
                        "_attribute_domain": "localhost:8080"
                    },
                    {
                        "_value_": "",
                        "_attribute_type": "session",
                        "_attribute_urlregexp": "url regex"
                    }
                ]
            },
            "name": "abc_test",
            "description": "test abc",
            "isnew": "false",
            "class_name": "org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector"
        }{code}
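
The same edit can be applied programmatically instead of by hand. This is only a sketch under assumptions: `clear_throttling` is a hypothetical helper name, the input is a trimmed copy of the connection above, and whether ManifoldCF accepts `null` for these two fields is exactly what still has to be tested.

```python
import copy
import json


def clear_throttling(connection):
    """Hypothetical helper: copy the connection with both throttle objects set to null."""
    result = copy.deepcopy(connection)  # leave the caller's object untouched
    result["throttle"] = None           # connection-level throttling
    if isinstance(result.get("configuration"), dict):
        result["configuration"]["bindesc"] = None  # bandwidth bin
    return result


# Trimmed copy of the connection shown earlier.
original = {
    "throttle": {"match_description": "testable regex", "rate": "1.6666666E-4", "match": "test reg"},
    "max_connections": "20",
    "configuration": {
        "bindesc": {
            "maxkbpersecond": {"_value_": "", "_attribute_value": "64"},
            "_attribute_binregexp": "test regex",
            "_value_": "",
        },
    },
    "name": "abc_test",
}

cleared = clear_throttling(original)
print(json.dumps(cleared, indent=2))  # None is serialized as JSON null
```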
 

I'll test this out and get back to you; maybe this is related to CONNECTORS-1568.

 



> export of web connection bandwidth throttling
> ---------------------------------------------
>
>                 Key: CONNECTORS-1567
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1567
>             Project: ManifoldCF
>          Issue Type: Bug
>          Components: Web connector
>    Affects Versions: ManifoldCF 2.11, ManifoldCF 2.12
>            Reporter: Tim Steenbeke
>            Assignee: Karl Wright
>            Priority: Major
>             Fix For: ManifoldCF 2.13
>
>         Attachments: bandwidth.png, bandwidth_test_abc.png
>
>
> When exporting the web connector using the API, the export doesn't include the bandwidth throttling.
>  Then, when importing this connector into a clean ManifoldCF, it creates the connector with basic bandwidth.
>  When using the connector in a job it works properly.
> The issue here is that the connector isn't created with the correct bandwidth throttling.
>  And the connector gives issues in the UI when trying to view or edit it.
> (related to issue: [CONNECTORS-1568|https://issues.apache.org/jira/projects/CONNECTORS/issues/CONNECTORS-1568])
> e.g.:
> {code:java}
> {
>   "name": "test_web",
>   "configuration": null,
>     "_PARAMETER_": [
>       {
>         "_attribute_name": "Email address",
>         "_value_": "tim.steenbeke@formica.digital"
>       },
>       {
>         "_attribute_name": "Robots usage",
>         "_value_": "all"
>       },
>       {
>         "_attribute_name": "Meta robots tags usage",
>         "_value_": "all"
>       },
>       {
>         "_attribute_name": "Proxy host",
>         "_value_": ""
>       },
>       {
>         "_attribute_name": "Proxy port",
>         "_value_": ""
>       },
>       {
>         "_attribute_name": "Proxy authentication domain",
>         "_value_": ""
>       },
>       {
>         "_attribute_name": "Proxy authentication user name",
>         "_value_": ""
>       },
>       {
>         "_attribute_name": "Proxy authentication password",
>         "_value_": ""
>       }
>     ]
>   },
>   "description": "Website repository standard settup",
>   "throttle": null,
>   "max_connections": 10,
>   "class_name": "org.apache.manifoldcf.crawler.connectors.webcrawler.WebcrawlerConnector",
>   "acl_authority": null
> }{code}
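
The symptom in the quoted description (an export without bandwidth throttling) can be checked for mechanically before importing. The following is a minimal sketch, assuming the export has already been parsed into a Python dict; `lacks_bandwidth_bins` is a hypothetical helper name, and `bindesc` is the key seen in the 2.11 response earlier in this message.

```python
def lacks_bandwidth_bins(connection):
    """Hypothetical check: True when the connection carries no bandwidth bin."""
    config = connection.get("configuration")
    # A missing or null configuration, or a missing/null/empty bindesc, all count as "lacking".
    return not isinstance(config, dict) or not config.get("bindesc")


# An export shaped like the one in the issue: no bindesc anywhere.
exported = {"name": "test_web", "configuration": {"_PARAMETER_": []}, "throttle": None}

# A connection shaped like the 2.11 GET response: bindesc present.
fetched = {"name": "abc_test", "configuration": {"bindesc": {"_attribute_binregexp": "test regex"}}}

print(lacks_bandwidth_bins(exported))
print(lacks_bandwidth_bins(fetched))
```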


