manifoldcf-dev mailing list archives

From "Karl Wright (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CONNECTORS-1408) Request-URI Too Long
Date Fri, 14 Apr 2017 06:55:41 GMT

    [ https://issues.apache.org/jira/browse/CONNECTORS-1408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15968678#comment-15968678
] 

Karl Wright edited comment on CONNECTORS-1408 at 4/14/17 6:55 AM:
------------------------------------------------------------------

Looking carefully at the code, here are some more thoughts:

- If you are using a debugger, be careful not to simply back up and re-run the post without
waiting for the next document.  That won't work, because the streams involved are closed
after the first run-through.  I suspect this is the reason why you saw 'missing content
stream'.
- It sounds to me like the POST is in fact happening just fine, but Solr is rejecting it.
The request would never reach Solr if the POST were malformed or had too long a URI for
HttpClient, because HttpClient wouldn't allow it.
- I can see no reason why isMultipart would not be 'true' in most situations; it seems
to be gated on the variable useMultiPartPost.  The reason that ModifiedHttpSolrClient even
exists is so that we can set useMultiPartPost to "true".  I am sure it stays "true" too; I
declared it "final" locally and the code still compiles.
- Since we've been using multi-part all along, I have to conclude that the reason we're getting
the URI error is simply because the URI is too big even when the POST is multipart.
- We can easily try to include the request metadata in the multipart fields, once I am sure
where the metadata comes in: request.getParams(), or request.getQueryParams()?  If
the metadata is found in request.getParams(), then we should already be sending parameters
in the multipart data, so the problem would have to be on the Solr side.
- If I change this, though, it's still possible that Solr won't be happy with it.  We'll have
to try it and see.
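
To make the size argument concrete, here is a minimal, self-contained sketch of what happens
when all parameters are appended to the URL as a query string.  This is not the ManifoldCF
code: the class name UriLengthSketch, the helper methods, and the parameter names
"literal.id" and "literal.content" are hypothetical, and the 8192 limit is simply the figure
from the Solr log in this ticket.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

class UriLengthSketch {
    // Figure taken from the Solr log in this ticket ("URI is too large >8192").
    static final int URI_LIMIT = 8192;

    // Hypothetical stand-in for a toQueryString helper: encodes params as ?k=v&k=v.
    static String toQueryString(Map<String, String> params) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : params.entrySet()) {
            sb.append(sb.length() == 0 ? '?' : '&')
              .append(URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8))
              .append('=')
              .append(URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return sb.toString();
    }

    // True when the complete URI is longer than the assumed server-side limit.
    static boolean exceedsLimit(String uri) {
        return uri.getBytes(StandardCharsets.UTF_8).length > URI_LIMIT;
    }

    public static void main(String[] args) {
        String url = "http://localhost:8983/solr/mycore";

        Map<String, String> params = new LinkedHashMap<>();
        params.put("literal.id", "msg-1");
        System.out.println("metadata only, too long: "
            + exceedsLimit(url + toQueryString(params)));

        // Simulate a large email body travelling as a request parameter.
        params.put("literal.content", "x".repeat(10_000));
        System.out.println("with email content, too long: "
            + exceedsLimit(url + toQueryString(params)));
    }
}
```

If the large values travel as multipart fields in the request body instead, the URI stays
short no matter how big the email is, which is why it matters where the metadata enters the
request.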

It's also possible to see exactly what is going on by enabling HTTP wire debugging in HttpClient.
Then we can see the data being sent, and the URI too.  In logging.ini, you only need to set
a couple of lines to make this happen.  Have a look at:

https://hc.apache.org/httpcomponents-client-ga/logging.html

You will want to add:

{code}
log4j.logger.org.apache.http=DEBUG
log4j.logger.org.apache.http.wire=DEBUG
{code}

Please let me know if you can confirm my understanding, and determine for sure whether the
problem is on the Solr side or the ManifoldCF side.  If on the Solr side, you'll want to create
a Solr ticket.  Thanks!!



> Request-URI Too Long
> --------------------
>
>                 Key: CONNECTORS-1408
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1408
>             Project: ManifoldCF
>          Issue Type: Bug
>          Components: Email connector, Solr 6.x component
>    Affects Versions: ManifoldCF 2.6
>            Reporter: Cihad Guzel
>            Assignee: Karl Wright
>             Fix For: ManifoldCF 2.7
>
>
> I run an email connector job and follow "Simple History" in the UI. I see the following error:
> {code}
> Error from server at http://localhost:8983/solr/mycore: non ok status: 414, message:Request-URI Too Long
> {code}
> It is sent by Solr. 
> Solr logs say: 
> {code}
> HttpParser - URI is too large >8192
> {code}
> and 
> {code}
> HttpParser - bad HTTP parsed: 414 for HttpChannelOverHttp@2b6931dd{r=0,c=false,a=IDLE,uri=null}
> {code}
> ManifoldCF ModifiedHttpSolrClient.java has the following code:
> {code}
>  // It is has one stream, it is the post body, put the params in the URL
>       else {
>         String pstr = toQueryString(wparams, false);
>         HttpEntityEnclosingRequestBase postOrPut = SolrRequest.METHOD.POST == request.getMethod() ?
>             new HttpPost(url + pstr) : new HttpPut(url + pstr);
> {code}
> The "pstr" string is appended to the URL. It holds all of the Solr params, including the
> email content, so we get the "URI is too large" error when an email has large content.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
