lucene-solr-user mailing list archives

From Matteo Grolla <matteo.gro...@gmail.com>
Subject Re: scanning all documents in the collection
Date Mon, 02 Feb 2015 17:08:56 GMT
Wow!!!
	thanks Joe!

On 02 Feb 2015, at 15:05, Joseph Obernberger wrote:

> I have a similar use case.  Check out the export capability and cursorMark.
> 
> -Joe
> 
> On 2/2/2015 8:14 AM, Matteo Grolla wrote:
>> Hi,
>> 	I'm thinking about having an instance of Solr (SolrA) with all fields stored and
>> just id indexed, in addition to a normal production instance of Solr (SolrB) that is
>> used for the searches.
>> This would allow me to read only what changed since the previous crawl, update SolrA and
>> send the full document to SolrB, without forcing SolrB to have all fields stored.
>> In addition, I have some batch jobs that work on the whole collection, and making them
>> work on SolrA would allow me to detect the documents that changed and submit only those
>> to SolrB.
>> The point is that to run this job I'll need to scan through all documents from SolrA:
>> I'll query on *:* and then go through all pages, which is not the typical usage of Solr.
>> SolrA will contain a few tens of GB of data coming from hundreds of thousands of docs.
>> Do you think I'm going to run into trouble using Solr this way?
>> I'd like to use Solr (for SolrA) for ease of maintenance, because the sysadmins are
>> already trained on Solr.
>> 
>> thanks
> 
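
For anyone finding this thread later: the cursorMark approach Joe mentions could look roughly like the sketch below. It is only an illustration, not code from either poster. It assumes Solr 4.7 or later (where cursorMark is available), a uniqueKey field named id (the sort must include the uniqueKey), and a JSON response; the helper names solr_fetch and scan_all are made up for this example.

```python
import json
import urllib.parse
import urllib.request


def solr_fetch(base_url, params):
    """Run one /select request against Solr and return the decoded JSON body."""
    query = urllib.parse.urlencode(params)
    with urllib.request.urlopen(f"{base_url}/select?{query}") as resp:
        return json.load(resp)


def scan_all(fetch, rows=1000, sort="id asc"):
    """Yield every document, following cursorMark until it stops advancing.

    `fetch` is any callable that takes a dict of Solr parameters and returns
    the parsed response; the sort is assumed to include the uniqueKey field.
    """
    cursor = "*"  # starting value for a new cursor
    while True:
        body = fetch({"q": "*:*", "rows": rows, "sort": sort,
                      "cursorMark": cursor, "wt": "json"})
        yield from body["response"]["docs"]
        next_cursor = body["nextCursorMark"]
        if next_cursor == cursor:  # same mark returned: nothing left to read
            break
        cursor = next_cursor
```

Usage would be something like `for doc in scan_all(lambda p: solr_fetch("http://localhost:8983/solr/SolrA", p)): ...`, where the URL is a placeholder for wherever SolrA lives. Unlike paging with start/rows, the cost of each request should stay roughly constant however deep into the collection the scan gets, which is why it tends to suit full scans over *:*.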

