lucene-solr-user mailing list archives

From dhamu <>
Subject How to send web pages(urls) to solr cell via solrj?
Date Thu, 04 Feb 2010 09:47:47 GMT

I am new to Solr and have been exploring it for the last few days.
I am using Solr Cell with Tika for parsing, indexing and searching,
and I post rich text documents via SolrJ.
My actual requirement is that, instead of local documents (pdf, doc & docx),
I want to index web pages (URLs), for example:

req.addFile(new File("docs/mailing_lists.html"));
req.url(new urlconnection("")
Is there anything like the above in SolrJ?

At the moment I am using curl for testing, and it works fine:

-F "stream.url=" 

But I need to use something other than curl.
The code below works fine for indexing and searching local documents, but
I want to post URLs instead.

Here is my code:

    String url = "http://localhost:8983/solr";
    SolrServer server = new CommonsHttpSolrServer(url);
    ContentStreamUpdateRequest req = new ContentStreamUpdateRequest(
    req.addFile(new File("docs/mailing_lists.html"));
    req.setParam("", "index1");
    req.setParam("uprefix", "attr_");
    req.setParam("fmap.content", "attr_content");
    req.setAction(AbstractUpdateRequest.ACTION.COMMIT, true, true);
    NamedList result = server.request(req);
    assertNotNull("Couldn't upload index.pdf", result);
    QueryResponse rsp = server.query(new SolrQuery("*:*"));
    Assert.assertEquals(1, rsp.getResults().getNumFound());

Any suggestion or answer will be appreciated.
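
Since the curl test relies on the stream.url parameter, one option is to send that same parameter from Java. The following is a minimal sketch using only the JDK, not a SolrJ API: it builds the request URL for the extracting handler and optionally sends it. The handler path /update/extract, the literal.id field, the page URL http://example.com/page.html, and the class and method names are all assumptions for illustration. stream.url is only honoured when remote streaming is enabled in solrconfig.xml.

```java
import java.io.IOException;
import java.io.UnsupportedEncodingException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class; the name is not part of Solr or SolrJ.
class StreamUrlSketch {

    // URL-encodes one query-string component as UTF-8.
    static String enc(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every JVM, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    // Builds <solrBase>/update/extract?key=value&... with encoded parameters.
    // The /update/extract path is an assumption about the handler mapping.
    static String buildExtractUrl(String solrBase, Map<String, String> params) {
        StringBuilder sb = new StringBuilder(solrBase).append("/update/extract");
        char sep = '?';
        for (Map.Entry<String, String> e : params.entrySet()) {
            sb.append(sep).append(enc(e.getKey())).append('=').append(enc(e.getValue()));
            sep = '&';
        }
        return sb.toString();
    }

    // Sends the request and returns the HTTP status code reported by the server.
    static int send(String requestUrl) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(requestUrl).openConnection();
        conn.setRequestMethod("GET");
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        Map<String, String> params = new LinkedHashMap<String, String>();
        params.put("stream.url", "http://example.com/page.html"); // placeholder page
        params.put("literal.id", "index1");       // assumes the unique key field is "id"
        params.put("uprefix", "attr_");
        params.put("fmap.content", "attr_content");
        params.put("commit", "true");

        String requestUrl = buildExtractUrl("http://localhost:8983/solr", params);
        System.out.println(requestUrl);

        // Only contact the server when explicitly asked to.
        if (args.length > 0 && "send".equals(args[0])) {
            System.out.println("HTTP " + send(requestUrl));
        }
    }
}
```

Run without arguments, it only prints the request URL; run with the argument send, it issues the request against the local Solr instance.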
