lucene-solr-user mailing list archives

From Lance Norskog <goks...@gmail.com>
Subject Re: Which version of Solr?
Date Tue, 15 Feb 2011 03:54:49 GMT
AAAAIIIIIIRRRRGGGG! I feel your pain!

On Mon, Feb 14, 2011 at 3:27 PM, Jeff Schmidt <jas@535consulting.com> wrote:
> Wow,  okay, it's Cassandra's fault... :)
>
> I created unit tests using HttpClient and even HttpURLConnection; the former got no response from the server, and the latter got an unexpected end of file. But if I used curl or telnet, things would work. Anyway, I noticed (Mac OS X 10.6.6):
>
> [imac:apache/cassandra/apache-cassandra-0.7.0] jas% netstat -an | grep 8080
> tcp4       0      0  *.8080                 *.*                    LISTEN
> tcp46      0      0  *.8080                 *.*                    LISTEN
> [imac:apache/cassandra/apache-cassandra-0.7.0] jas%
>
> After shutting down Tomcat, the tcp4 line would still show up. Only after also shutting down Cassandra were there no listeners on port 8080. Whichever order I started Tomcat and Cassandra in, neither failed to bind to 8080. Why my Java programs tried to talk to Cassandra while telnet, Firefox, curl, etc. managed to reach Solr, I don't know.
>
> I moved tomcat to port 8090 and things are good... Sigh..  What a big waste of time.
>
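For anyone who hits the same symptom: the netstat output above shows one tcp4 and one tcp46 listener on port 8080. One possible explanation, which this thread does not confirm, is that clients picking 127.0.0.1 and clients picking ::1 ended up at different processes. A minimal sketch for checking this from Java is to try a plain TCP connect to every address that `localhost` resolves to. The class name `PortProbe` and the 1-second timeout are assumptions made for illustration; only standard `java.net` calls are used.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic helper, not part of Solr, SolrJ, or Cassandra.
class PortProbe {

    /**
     * Tries a TCP connect to every address that 'host' resolves to and
     * returns one report line per address ("<address>:<port> -> open" or
     * "<address>:<port> -> closed (<exception>)").
     */
    static List<String> probe(String host, int port, int timeoutMs) throws IOException {
        List<String> report = new ArrayList<String>();
        for (InetAddress addr : InetAddress.getAllByName(host)) {
            String status;
            Socket socket = new Socket();
            try {
                socket.connect(new InetSocketAddress(addr, port), timeoutMs);
                status = "open";
            } catch (IOException ex) {
                status = "closed (" + ex.getClass().getSimpleName() + ")";
            } finally {
                socket.close();
            }
            report.add(addr.getHostAddress() + ":" + port + " -> " + status);
        }
        return report;
    }

    public static void main(String[] args) throws IOException {
        // Port 8080 is the port discussed in this thread.
        for (String line : probe("localhost", 8080, 1000)) {
            System.out.println(line);
        }
    }
}
```

A successful connect only shows that something is listening on that address. It does not say which process owns the socket, so it is worth pairing this with the netstat check above.
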
> Cheers,
>
> Jeff
>
> On Feb 14, 2011, at 2:29 PM, Jeff Schmidt wrote:
>
>> I figured that instead of trying to index content, I'd simply issue a query via SolrJ. This seems related to my problem below. I create a CommonsHttpSolrServer instance in the manner already described, and in a new method:
>>
>>       @Override
>>       public List<String> getNodeIdsForProductId(final String productId, final String partnerId) {
>>
>>               final List<String> nodes = new ArrayList<String>();
>>
>>               final CommonsHttpSolrServer solrServer = (CommonsHttpSolrServer)getSolrServer(partnerId);
>>               final SolrQuery query = new SolrQuery();
>>               query.setQuery("productId:" + productId);
>>               query.addField("nodeId");
>>               try {
>>                       final QueryResponse response = solrServer.query(query);
>>                       final SolrDocumentList docs = response.getResults();
>>                       log.info(String.format("getNodeIdsForProductId - got %d nodes for productId: %s",
>>                                       docs.getNumFound(), productId));
>>                       for (SolrDocument doc : docs) {
>>                               log.info(doc);
>>                       }
>>               } catch (SolrServerException ex) {
>>                       final String msg = String.format("Unable to query Solr server %s, for query: %s", solrServer.getBaseURL(), query);
>>                       log.error(msg);
>>                       throw new ServiceException(msg, ex);
>>               }
>>
>>               return nodes;
>>       }
>>
>> When issuing the query I get:
>>
>> 2011-02-14 13:13:28 INFO  solr.SolrProductIndexService - getSolrServer - Solr url: http://localhost:8080/solr/partner-tmo
>> 2011-02-14 13:13:28 INFO  solr.SolrProductIndexService - getSolrServer - construct server for url: http://localhost:8080/solr/partner-tmo
>> 2011-02-14 13:13:28 ERROR solr.SolrProductIndexService - Unable to query Solr server http://localhost:8080/solr/partner-tmo, for query: q=productId%3Aproduct4&fl=nodeId
>> ...
>> Caused by: org.apache.solr.client.solrj.SolrServerException: org.apache.commons.httpclient.NoHttpResponseException: The server localhost failed to respond
>>       at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:484)
>> ...
>> Caused by: org.apache.commons.httpclient.NoHttpResponseException: The server localhost failed to respond
>>       at org.apache.commons.httpclient.HttpMethodBase.readStatusLine(HttpMethodBase.java:1976)
>>
>> If I run this through the proxy again, I can see the request being made as:
>>
>> GET /solr/partner-tmo/select?q=productId%3Aproduct4&fl=nodeId&wt=xml&version=2.2 HTTP/1.1
>> User-Agent: Solr[org.apache.solr.client.solrj.impl.CommonsHttpSolrServer] 1.0
>> Host: localhost:8080
>>
>> And I get no response from Solr.  If instead I use this URL in Firefox:
>>
>> http://localhost:8080/solr/partner-tmo/select?q=productId%3Aproduct4&fl=nodeId&wt=xml&version=2.2
>>
>> I get search results.  What is it about SolrJ that is just not working out?  What basic thing am I missing? Using Firefox here, or curl below, I can talk to Solr (running in Tomcat 6) just fine. But when going via SolrJ, I cannot update or query.  All of this stuff is running on a single system.  I guess I'll try a simpler app/unit test to see what happens...
>>
>> This is really a big problem for me. Any suggestions are greatly appreciated.
>>
>> Thanks,
>>
>> Jeff
>>
>> On Feb 13, 2011, at 9:15 PM, Jeff Schmidt wrote:
>>
>>> Hello again:
>>>
>>> Back to the javabin issue:
>>>
>>> On Feb 12, 2011, at 6:07 PM, Lance Norskog wrote:
>>>
>>>> --- But I'm unable to get SolrJ to work due to the 'javabin' version
>>>> mismatch. I'm using the 1.4.1 version of SolrJ, but I always get an
>>>> HTTP response code of 200, but the return entity is simply a null
>>>> byte, which does not match the version number of 1 defined in Solr
>>>> common.  ---
>>>>
>>>> I've never seen this problem. At this point you are better off
>>>> starting with 3.x instead of chasing this problem down.
>>>
>>> I'm now using Solr and SolrJ built from the latest branch_3x. Elsewhere I've seen the message:
>>>
>>> Caused by: java.lang.RuntimeException: Invalid version (expected 2, but 0) or the data in not in 'javabin' format
>>>      at org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:99)
>>>      at org.apache.solr.client.solrj.impl.BinaryResponseParser.processResponse(BinaryResponseParser.java:41)
>>>
>>> The advice there was to make sure the versions of Solr and SolrJ are compatible and that the schema is valid. Unlike 1.4, 3.1 actually outputs the expected and received version numbers, which is helpful. The invalid version of 0 it reports is the zero byte I receive in the response.
>>>
>>> I have Solr running within Tomcat by following the wiki.  I have the conf/Catalina/localhost/solr.xml file set as:
>>>
>>> <?xml version="1.0" encoding="utf-8"?>
>>> <Context docBase="/usr/local/ingenuity/isec/solr/apache-solr-3.1-SNAPSHOT.war" debug="0" crossContext="true">
>>> <Environment name="solr/home" type="java.lang.String"
>>>     value="/Users/jas/535Consulting/Clients/Ingenuity/ProfServices/svn/trunk/ing/isec/src/main/solr/multicore"
>>>     override="true"/>
>>> </Context>
>>>
>>> With that, I'm able to use my browser to index some content (DIH, curl, etc.) and issue queries, so it seems Solr is running okay in Tomcat (apache-tomcat-6.0.30). To index some Products, I have this simple method:
>>>
>>>      @Override
>>>      public void addProducts(final Collection<Product> products, final String indexName) {
>>>
>>>              log.info(String.format("addProducts - indexing %d products to Solr core: %s",
>>>                              products.size(), indexName));
>>>
>>>              Assert.notNull(indexName);
>>>
>>>              final Collection<SolrInputDocument> docs = new ArrayList<SolrInputDocument>();
>>>              for (Product product : products) {
>>>
>>>                      final SolrInputDocument doc = createDocumentForProduct(product);
>>>                      docs.add(doc);
>>>                      log.info("addProduct: document to index: " + doc);
>>>              }
>>>
>>>              final SolrServer solrServer = getSolrServer(indexName);
>>>              try {
>>>                      solrServer.add(docs);
>>>                      solrServer.commit(commitWaitFlush, commitWaitSearcher);
>>>              } catch (Exception ex) {
>>>                      final String msg = String.format("Unable to add and commit %d documents to core: %s",
>>>                                      products.size(), indexName);
>>>                      log.error(msg);
>>>                      throw new ServiceException(msg, ex);
>>>              }
>>>      }
>>>
>>> And I have:
>>>
>>>      protected SolrServer getSolrServer(final String indexName) {
>>>
>>>              final String url = solrServerBaseUrl + indexName;
>>>              log.info("getSolrServer - construct server for url: " + url);
>>>              try {
>>>                      final CommonsHttpSolrServer solrServer = new CommonsHttpSolrServer(solrServerBaseUrl + indexName);
>>>                      //solrServer.setParser(new BinaryResponseParser());
>>>                      //solrServer.setParser(new XMLResponseParser());
>>>                      solrServer.setRequestWriter(new BinaryRequestWriter());
>>>                      return solrServer;
>>>              } catch (Exception ex) {
>>>                      final String msg = String.format("Unable to create Solr server for url: %s", url);
>>>                      log.error(msg);
>>>                      throw new ServiceException(msg, ex);
>>>              }
>>>      }
>>>
>>> Note that this is code for prototyping. :)
>>>
>>> As you can see, in getSolrServer() I'm trying various settings.  http://wiki.apache.org/solr/Solrj is tagged Solr 1.4, but I'm assuming it's at least very similar in 3.1. For the core in question, solrconfig.xml does have:
>>>
>>> <requestHandler name="/update" class="solr.XmlUpdateRequestHandler" />
>>> <requestHandler name="/update/javabin" class="solr.BinaryUpdateRequestHandler" />
>>>
>>> I can see in the Solr log:
>>>
>>> Feb 13, 2011 3:35:14 PM org.apache.solr.core.RequestHandlers initHandlersFromConfig
>>> INFO: created /update/javabin: solr.BinaryUpdateRequestHandler
>>>
>>> Running this through the Burp proxy to try to see what's going on, I can see my application making the following request to Solr via SolrJ:
>>>
>>> ------------------------------------------
>>> POST /solr/partner-tmo/update/javabin?wt=javabin&version=2 HTTP/1.1
>>> User-Agent: Solr[org.apache.solr.client.solrj.impl.CommonsHttpSolrServer] 1.0
>>> Host: localhost:8090
>>> Content-Type: application/octet-stream
>>> Content-Length: 543
>>>
>>>  Äà&paramsÀà'delByIdà&delByQà$docs …Áà%boostÃà$name"idà#val-ING:afa|08520åÃæ&nodeIdç'ING:afaåÃæ)productIdç%08520åÃæ0nodeSourceIdTypeç"EGåÃæ,nodeSourceIdç#672åÃæ+productTypeç≠(chemical%shRNAåÃæ-description_tç?
>>> This is the description for product 08520åÃæ'brand_sç%FLUKAåÃæ%sku_sç(A980-852å…ÁåÃæ"idç-ING:afa|08530åÃæ&nodeIdç'ING:afaåÃæ)productIdç%08530åÃæ0nodeSourceIdTypeç"EGåÃæ,nodeSourceIdç#672åÃæ+productTypeç™(chemicalåÃæ-description_tç?
>>> This is the description for product 08530åÃæ'brand_sç%FLUKAåÃæ%sku_sç(A980-853å
>>> ------------------------------------------
>>>
>>> That looks pretty binary. In response I see:
>>>
>>> ------------------------------------------
>>> HTTP/1.0 200 OK
>>> Content-type: application/octet-stream
>>> Content-length: 1
>>>
>>>
>>> ------------------------------------------
>>>
>>> Looking at the hex view, I can see the one byte of data is 0x00.
>>>
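The "expected 2, but 0" in the stack trace fits that single 0x00 byte: the javabin reader takes the first byte of the body as the format version and rejects anything else. A toy sketch of that kind of check follows. `JavabinHeaderCheck` and `check` are hypothetical names made up for illustration and are not the actual JavaBinCodec code. The expected version is passed in as a parameter; 2 is the value the 3.x error message reports.

```java
// Hypothetical illustration of a first-byte version check; not Solr code.
class JavabinHeaderCheck {

    /** Compares the first byte of a response body with the expected version. */
    static String check(byte[] body, int expectedVersion) {
        if (body == null || body.length == 0) {
            return "empty body";
        }
        int actual = body[0];
        if (actual != expectedVersion) {
            return "invalid version (expected " + expectedVersion + ", but " + actual + ")";
        }
        return "version byte ok";
    }

    public static void main(String[] args) {
        // The one-byte response seen through the proxy.
        System.out.println(check(new byte[] {0x00}, 2));
        // prints: invalid version (expected 2, but 0)
    }
}
```

So the exception says nothing about a SolrJ/Solr version mismatch as such. It only says that whatever answered did not send javabin.
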
>>> My other approach was to go the XML route. To do this, I commented out the setting of the request writer and went with the default, which the wiki says is XML. Running this through the proxy I see:
>>>
>>> ------------------------------------------
>>> POST /solr/partner-tmo/update?wt=javabin&version=2 HTTP/1.1
>>> User-Agent: Solr[org.apache.solr.client.solrj.impl.CommonsHttpSolrServer] 1.0
>>> Host: localhost:8090
>>> Content-Type: text/xml; charset=utf-8
>>> Content-Length: 856
>>>
>>> <add><doc boost="1.0"><field name="id">ING:afa|08520</field><field name="nodeId">ING:afa</field><field name="productId">08520</field><field name="nodeSourceIdType">EG</field><field name="nodeSourceId">672</field><field name="productType">chemical</field><field name="productType">shRNA</field><field name="description_t">This is the description for product 08520</field><field name="brand_s">FLUKA</field><field name="sku_s">A980-852</field></doc><doc boost="1.0"><field name="id">ING:afa|08530</field><field name="nodeId">ING:afa</field><field name="productId">08530</field><field name="nodeSourceIdType">EG</field><field name="nodeSourceId">672</field><field name="productType">chemical</field><field name="description_t">This is the description for product 08530</field><field name="brand_s">FLUKA</field><field name="sku_s">A980-853</field></doc></add>
>>> ------------------------------------------
>>>
>>> I can see it has specified the proper update handler in the URI, and XML is being uploaded. The response, though, is exactly the same as in the binary case, including the application/octet-stream content type. I end up with the same javabin version mismatch stack trace as well, even though I'm trying to talk XML. Then again, the presence of wt=javabin&version=2 in the request is not very encouraging when going the default XML route.
>>>
>>> Just for grins, I added:
>>>
>>> solrServer.setParser(new XMLResponseParser());
>>>
>>> And now the exception is:
>>>
>>> Caused by:
>>> com.ctc.wstx.exc.WstxUnexpectedCharException: Illegal character (NULL, unicode 0) encountered: not valid in any content| at [row,col {unknown-source}]: [1,1]
>>>      at com.ctc.wstx.sr.StreamScanner.constructNullCharException(StreamScanner.java:640)
>>>
>>> So, apparently javabin is out of the way, and now the zero byte returned is mucking up the XML parser. According to the proxy, the request is similar to the previous one, but:
>>>
>>> POST /solr/partner-tmo/update?wt=xml&version=2.2 HTTP/1.1
>>>
>>> So, great, we are going the XML route, but the response is the same HTTP 200 and a zero byte...
>>>
>>> So, I'm not sure what's going on. I can say, though, that I am not seeing any log activity in solr.2011-*.log once Solr has started and I attempt to issue these requests. Maybe it's Tomcat? But if I go directly to Solr outside of SolrJ and add some content to that same core, I do see log activity, and I get a valid response:
>>>
>>> [imac:solr/input-data/tmo-products] jas% curl --header "Content-type: text/xml; charset=utf-8" --request POST -w "\nhttp code: %{http_code}\n" -d @nodes-products.xml "http://localhost:8080/solr/partner-tmo/update"
>>> <?xml version="1.0" encoding="UTF-8"?>
>>> <response>
>>> <lst name="responseHeader"><int name="status">0</int><int name="QTime">11</int></lst>
>>> </response>
>>>
>>> http code: 200
>>> [imac:solr/input-data/tmo-products] jas%
>>>
>>> Much better than a single null byte.
>>>
>>> Any idea of what is going on?  Apologies for the long email, but I'm trying to provide all the details. This problem must reside in something I'm doing. I'm sure others are using SolrJ successfully.
>>>
>>> Thanks!
>>>
>>> Jeff
>>>
>>>>
>>>> On Sat, Feb 12, 2011 at 1:37 PM, Jeff Schmidt <jas@535consulting.com> wrote:
>>>>> Hello:
>>>>>
>>>>> I'm working on incorporating Solr into a SaaS-based life sciences semantic search project. This will be released in about six months. I'm trying to determine which version of Solr makes the most sense. On the Solr download page, there are 1.3.0, 1.4.0, and 1.4.1. I've been using 1.4.1 while going through some examples in my Packt book ("Solr 1.4 Enterprise Search Server").
>>>>>
>>>>> But, I also see that Solr 3.1 and 4.0 are in the works.  According to:
>>>>>
>>>>>      https://issues.apache.org/jira/browse/#selectedTab=com.atlassian.jira.plugin.system.project%3Aroadmap-panel
>>>>>
>>>>> there is a high degree of progress on both of those releases, including a slew of bug fixes, new features, performance enhancements, etc. Should I be making use of one of the newer versions? The hierarchical faceting seems like it could be quite useful. Are there any guesses on when either 3.1 or 4.0 will be officially released?
>>>>>
>>>>> So far, 1.4.1 has been good. But I'm unable to get SolrJ to work due to the 'javabin' version mismatch. I'm using the 1.4.1 version of SolrJ, and I always get an HTTP response code of 200, but the return entity is simply a null byte, which does not match the version number of 1 defined in Solr common. Anyway, I can follow up on that issue if 1.4.1 is still the most appropriate version to use these days. Otherwise, I'll try again with whatever version you suggest.
>>>>>
>>>>> Thanks a lot!
>>>>>
>>>>> Jeff
>>>>> --
>>>>> Jeff Schmidt
>>>>> 535 Consulting
>>>>> jas@535consulting.com
>>>>> (650) 423-1068
>>>>
>>>> --
>>>> Lance Norskog
>>>> goksron@gmail.com
>>>
>>> --
>>> Jeff Schmidt
>>> 535 Consulting
>>> jas@535consulting.com
>>> (650) 423-1068
>>> http://www.535consulting.com



-- 
Lance Norskog
goksron@gmail.com
