lucene-solr-user mailing list archives

From souravm <SOUR...@infosys.com>
Subject Using Solr with Hadoop ....
Date Fri, 28 Nov 2008 18:31:09 GMT
Hi All,

I have a huge number of documents to index every hour, and a single machine cannot complete
the job within the hour. Distributing them across multiple boxes and indexing them in parallel
is not an option, as my target document volume per hour can itself be very large (3-6M). So I am
considering using HDFS and MapReduce to finish the indexing job in time.

In that regard, I have the following questions about using Solr with Hadoop.

1. After creating the index using Hadoop, would storing it in HDFS again for querying
mean additional performance overhead (compared to storing it on an actual disk on
one machine)?

2. What kind of change is needed to make Solr queries read from an index that is stored in
HDFS?

Regards,
Sourav

