Return-Path:
X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io
Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io
Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by cust-asf2.ponee.io (Postfix) with ESMTP id D1EB7200B6B for ; Fri, 9 Sep 2016 19:35:22 +0200 (CEST)
Received: by cust-asf.ponee.io (Postfix) id D081C160ACA; Fri, 9 Sep 2016 17:35:22 +0000 (UTC)
Delivered-To: archive-asf-public@cust-asf.ponee.io
Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id CB536160AA3 for ; Fri, 9 Sep 2016 19:35:21 +0200 (CEST)
Received: (qmail 59362 invoked by uid 500); 9 Sep 2016 17:35:20 -0000
Mailing-List: contact dev-help@lucene.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Reply-To: dev@lucene.apache.org
Delivered-To: mailing list dev@lucene.apache.org
Received: (qmail 59346 invoked by uid 99); 9 Sep 2016 17:35:20 -0000
Received: from arcas.apache.org (HELO arcas) (140.211.11.28) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 09 Sep 2016 17:35:20 +0000
Received: from arcas.apache.org (localhost [127.0.0.1]) by arcas (Postfix) with ESMTP id 84A522C014C for ; Fri, 9 Sep 2016 17:35:20 +0000 (UTC)
Date: Fri, 9 Sep 2016 17:35:20 +0000 (UTC)
From: "Yury Kartsev (JIRA)"
To: dev@lucene.apache.org
Message-ID:
In-Reply-To:
References:
Subject: [jira] [Updated] (SOLR-9493) uniqueKey generation fails if content POSTed as "application/javabin".
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394
archived-at: Fri, 09 Sep 2016 17:35:23 -0000

[ https://issues.apache.org/jira/browse/SOLR-9493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yury Kartsev updated SOLR-9493:
-------------------------------
Description:
I have run into a weird issue where the same application code (using SolrJ) fails to index a document without a unique key (which should be auto-generated by SOLR) in SolrCloud, but succeeds in indexing it in a standalone SOLR instance (or even in cloud mode, but from the web interface of one of the replicas). The only differences are the client (CloudSolrClient vs HttpSolrClient) and the SOLR URL (ZooKeeper hostname and port vs standalone SOLR instance hostname and port). The failure is reported as "org.apache.solr.client.solrj.SolrServerException: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Document is missing mandatory uniqueKey field: id".

I am using SOLR 5.1. In cloud mode I have 1 shard and 3 replicas.

After a lot of debugging and investigation (see below as well as my [StackOverflow post|http://stackoverflow.com/questions/39401792/uniquekey-generation-does-not-work-in-solrcloud-but-works-if-standalone]) I came to the conclusion that the difference between the failing and succeeding calls is simply the content type of the POST requests. A local proxy clearly shows that the request fails if the content is sent as "application/javabin" (see attached) and succeeds if the content is sent as "application/xml; charset=UTF-8" (see attached).

Would you be able to please assist?
Thank you very much in advance!

------------------------
Copying whole description and investigation here as well:
------------------------
[Documentation|https://cwiki.apache.org/confluence/display/solr/Other+Schema+Elements] states:{quote}Schema defaults and copyFields cannot be used to populate the uniqueKey field.
You can use UUIDUpdateProcessorFactory to have uniqueKey values generated automatically.{quote}
Therefore I have added my uniqueKey field to the schema:{code}
...
...
id{code}Then I have added an updateRequestProcessorChain to my solrconfig:{code}
id
{code}And made it the default for the UpdateRequestHandler:{code}
uuid
{code}
Adding new documents with a null or absent id works fine both from the web interface of one of the replicas and when using SOLR in standalone mode (non-cloud) from my application. However, when I use SolrCloud and add a document from my application (using CloudSolrClient from SolrJ), it fails with "org.apache.solr.client.solrj.SolrServerException: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Document is missing mandatory uniqueKey field: id"

All other operations, like ping or searching for documents, work fine in either mode (standalone or cloud).

INVESTIGATION (i.e. more details):

In standalone mode the update request is:{code}POST standalone_host:port/solr/collection_name/update?wt=json{code}
In SOLR cloud mode, when adding a document from one replica's web interface, the update request is (found by inspecting the call made by the web interface):{code}POST replica_host:port/solr/collection_name_shard1_replica_1/update?wt=json{code}
In both of these cases the payload is something like:{code}{
  "add": {
    "doc": {
      .....
    },
    "boost": 1.0,
    "overwrite": true,
    "commitWithin": 1000
  }
}{code}
When CloudSolrClient is used, the following happens (found through debugging):

Using ZK and some logic, a URL list of replicas is constructed that looks like this:{code}[http://replica_1_host:port/solr/collection_name/,
http://replica_2_host:port/solr/collection_name/,
http://replica_3_host:port/solr/collection_name/]{code}
Then this code is called:{code}LBHttpSolrClient.Req req = new LBHttpSolrClient.Req(request, theUrlList);
LBHttpSolrClient.Rsp rsp = lbClient.request(req);
return rsp.getResponse();{code}
The second line fails with the exception.
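To reproduce the JSON variant of the update request outside SolrJ, something like the following could be used. It is only a sketch under assumptions: it relies on the JDK 11+ `java.net.http` API rather than on SolrJ, the class and method names are hypothetical, `localhost:8983` and the `title` field are placeholders, and the `Content-Type: application/json` header is assumed to match what the web interface sends. The request is built but not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of SolrJ.
final class JsonUpdateRequestSketch {

    // Builds (but does not send) a JSON POST to <baseUrl>/update?wt=json.
    static HttpRequest buildJsonUpdate(String baseUrl, String jsonBody) {
        // Drop a trailing slash so both ".../collection_name/" and ".../collection_name" work.
        String base = baseUrl.endsWith("/")
                ? baseUrl.substring(0, baseUrl.length() - 1)
                : baseUrl;
        return HttpRequest.newBuilder(URI.create(base + "/update?wt=json"))
                .header("Content-Type", "application/json") // assumed content type for JSON updates
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody, StandardCharsets.UTF_8))
                .build();
    }

    public static void main(String[] args) {
        // Document deliberately has no "id" so the server-side UUID processor has to supply one.
        String body = "{\"add\":{\"doc\":{\"title\":\"no id here\"},"
                + "\"overwrite\":true,\"commitWithin\":1000}}";
        HttpRequest request =
                buildJsonUpdate("http://localhost:8983/solr/collection_name/", body);
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Sending the built request with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())` against each replica URL would show whether the JSON path generates the id on every node.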
Debugging the second line further, it ends up calling HttpClient.execute (from HttpSolrClient.executeMethod) for:{code}POST http://replica_1_host:port/solr/collection_name/update?wt=javabin&version=2 HTTP/1.1
POST http://replica_2_host:port/solr/collection_name/update?wt=javabin&version=2 HTTP/1.1
POST http://replica_3_host:port/solr/collection_name/update?wt=javabin&version=2 HTTP/1.1{code}
The very first request returns 400 Bad Request, with replica 1 logging "Document is missing mandatory uniqueKey field: id" in the logs.

The funny thing is that when I execute the same request using POSTMAN (but with a JSON payload instead of a binary one), it works! Am I doing something wrong here? I assume it is something in the way the request is made...

UPDATE:

I have used a local proxy to compare the two requests sent by my application. It looks like the only difference is the content type. In cloud mode the payload for POSTing the document is sent as "application/javabin", while in standalone mode it is sent as "application/xml; charset=UTF-8". Everything else is the same. The first request results in 400 while the second results in 200.

was:
I have run into a weird issue where the same application code (using SolrJ) fails to index a document without a unique key (which should be auto-generated by SOLR) in SolrCloud, but succeeds in indexing it in a standalone SOLR instance (or even in cloud mode, but from the web interface of one of the replicas). The only differences are the client (CloudSolrClient vs HttpSolrClient) and the SOLR URL (ZooKeeper hostname and port vs standalone SOLR instance hostname and port). The failure is reported as "org.apache.solr.client.solrj.SolrServerException: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Document is missing mandatory uniqueKey field: id".

I am using SOLR 5.1. In cloud mode I have 1 shard and 3 replicas.

After a lot of debugging and investigation (see my [StackOverflow post|http://stackoverflow.com/questions/39401792/uniquekey-generation-does-not-work-in-solrcloud-but-works-if-standalone]) I came to the conclusion that the difference between the failing and succeeding calls is simply the content type of the POST requests. A local proxy clearly shows that the request fails if the content is sent as "application/javabin" (see attached) and succeeds if the content is sent as "application/xml; charset=UTF-8" (see attached).

Would you be able to please assist?
Thank you very much in advance!


> uniqueKey generation fails if content POSTed as "application/javabin".
> ----------------------------------------------------------------------
>
> Key: SOLR-9493
> URL: https://issues.apache.org/jira/browse/SOLR-9493
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Reporter: Yury Kartsev
> Attachments: 200.png, 400.png
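
One possible stopgap for the "Document is missing mandatory uniqueKey field: id" failure, until the server-side cause is understood, is to assign the id on the client before the document is handed to SolrJ, so the request no longer depends on the update processor chain. The following is a minimal sketch under assumptions, not a confirmed fix: a plain `Map` stands in for `SolrInputDocument`, and `ClientSideIdSketch` and `ensureId` are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical helper; a Map is used here in place of SolrInputDocument.
final class ClientSideIdSketch {

    // Returns a copy of the document with a random UUID stored under keyField
    // when that field is absent, null, or blank. An existing id is left untouched.
    static Map<String, Object> ensureId(Map<String, Object> doc, String keyField) {
        Map<String, Object> copy = new LinkedHashMap<>(doc);
        Object current = copy.get(keyField);
        if (current == null || current.toString().trim().isEmpty()) {
            copy.put(keyField, UUID.randomUUID().toString());
        }
        return copy;
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("title", "document without an id");
        // Prints the document with a freshly generated "id" entry added.
        System.out.println(ensureId(doc, "id"));
    }
}
```

This trades the server-side UUIDUpdateProcessorFactory for client-generated ids, so it only makes sense if every indexing client applies the same rule.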

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org