lucene-solr-user mailing list archives

From Lance Norskog <goks...@gmail.com>
Subject Re: Is Solr right for my business situation ?
Date Thu, 30 Sep 2010 03:09:31 GMT
Some of these are big questions. Try asking them in separate emails.
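
On question 1 quoted below (one tab-delimited flat file per table), the "single table" advice further down the thread can be sketched roughly as follows. This is only a minimal illustration of denormalizing two files into flat documents before indexing. The function name `denormalize`, the `sale_` prefix, and the sample columns are hypothetical; nothing here is a Solr API.

```python
import csv
import io
from collections import defaultdict


def denormalize(parent_tsv, child_tsv, key, prefix):
    """Join two tab-delimited tables on `key`; return one flat dict per parent row.

    Child columns are copied onto the parent as lists named prefix + column,
    which is the shape a multi-valued field would take. Assumes the prefixed
    names do not collide with any parent column.
    """
    children = defaultdict(list)
    for row in csv.DictReader(io.StringIO(child_tsv), delimiter="\t"):
        children[row[key]].append(row)

    docs = []
    for row in csv.DictReader(io.StringIO(parent_tsv), delimiter="\t"):
        doc = dict(row)  # parent columns become fields
        for child in children.get(row[key], []):
            for col, value in child.items():
                if col != key:
                    doc.setdefault(prefix + col, []).append(value)
        docs.append(doc)
    return docs


# Hypothetical sample data standing in for two of the flat files.
properties = "id\taddress\n1\t10 Main St\n2\t5 Oak Ave\n"
sales = "id\tprice\n1\t100\n1\t120\n"

for doc in denormalize(properties, sales, "id", "sale_"):
    print(doc)
```

Each printed dict corresponds to one document; rows with no matching child rows simply lack the prefixed fields.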

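On question 4 quoted below (storing lat and long and making them searchable), one option that does not depend on the spatial module at all is a plain bounding-box filter built from two numeric range clauses. The sketch below only builds the query string. The field names `lat` and `lon` are assumptions, the 111 km per degree figure is a rough approximation, and the code ignores the poles and the date line.

```python
import math


def bbox_query(lat, lon, radius_km, lat_field="lat", lon_field="lon"):
    """Build a range-query string covering a box around (lat, lon).

    Uses roughly 111 km per degree of latitude, and scales longitude by
    cos(latitude). Not valid near the poles or across the date line.
    """
    dlat = radius_km / 111.0
    dlon = radius_km / (111.0 * math.cos(math.radians(lat)))
    return (
        f"{lat_field}:[{lat - dlat:.4f} TO {lat + dlat:.4f}] AND "
        f"{lon_field}:[{lon - dlon:.4f} TO {lon + dlon:.4f}]"
    )


# Hypothetical example: a box of about 10 km around a point.
print(bbox_query(42.36, -71.06, 10.0))
```

Both fields would need a numeric type that supports range queries. The box over-selects at the corners, so an exact distance check would still have to happen on the client.
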
On Wed, Sep 29, 2010 at 9:40 AM, Sharma, Raghvendra
<sraghvendra@corelogic.com> wrote:
> Some questions.
>
> 1. I have about 3-5 tables. Designing schema.xml for a single table looks OK, but
> I am not sure how to handle multiple table structures. Would it be one big XML file
> in which those three tables (assuming it's three) show up as three different
> tag trees, each nullable?
>
> My source provides me a single flat file per table (tab delimited).
>
> Do you think having multiple indexes could be a solution for this case? Or do I really
> need to spend effort in denormalizing the data?
>
> 2. Further, loading into Solr could use some performance tuning. Any tips or best practices?
>
> 3. Also, is there a way to specify an XSLT on the server side and make it the default, i.e.
> whenever a response is returned, that XSLT is applied to the response automatically?
>
> 4. And last question for the day :) There was one post saying that the spatial support
> in Solr is really basic and is going to be improved in the next versions. Can you people
> help me get a definitive yes or no on spatial support? In its current form, does it work
> or not? I would store lat and long, and would need to make them searchable.
>
> --raghav..
>
> -----Original Message-----
> From: Sharma, Raghvendra [mailto:sraghvendra@corelogic.com]
> Sent: Tuesday, September 28, 2010 11:45 AM
> To: solr-user@lucene.apache.org
> Subject: RE: Is Solr right for my business situation ?
>
> Thanks for the responses people.
>
> @Grant
>
> 1. Can you point me in some direction on that, i.e. loading data from an incoming stream?
> Do I need some third-party tools, or do I need to build something myself?
>
> 4. I am basically attempting to build a very fast search interface for the existing data.
> The volume I mentioned is more or less static (the data is already there). The SQL
> statements I mentioned are the daily updates coming in. The good thing is that the history
> is not there, so the overall volume is not growing, but I need to apply the update statements.
>
> One workaround I had in mind (though not with great performance) is to apply the updates
> to a copy of the RDBMS and then feed the RDBMS extract to Solr. It sounds like overkill,
> but I don't have another idea right now. Perhaps business discussions would yield something.
>
> @All -
>
> Some more questions guys.
>
> 1. I have about 3-5 tables. Designing schema.xml for a single table looks OK, but
> I am not sure how to handle multiple table structures. Would it be one big XML file
> in which those three tables (assuming it's three) show up as three different
> tag trees, each nullable?
>
> My source provides me a single flat file per table (tab delimited).
>
> 2. Further, loading into Solr could use some performance tuning. Any tips or best practices?
>
> 3. Also, is there a way to specify an XSLT on the server side and make it the default, i.e.
> whenever a response is returned, that XSLT is applied to the response automatically?
>
> 4. And last question for the day :) There was one post saying that the spatial support
> in Solr is really basic and is going to be improved in the next versions. Can you people
> help me get a definitive yes or no on spatial support? In its current form, does it work
> or not? I would store lat and long, and would need to make them searchable.
>
> Looks like I'm close to my solution. :)
>
> --raghav
>
> -----Original Message-----
> From: Grant Ingersoll [mailto:gsingers@apache.org]
> Sent: Tuesday, September 28, 2010 1:05 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Is Solr right for my business situation ?
>
> Inline.
>
> On Sep 27, 2010, at 1:26 PM, Walter Underwood wrote:
>
>> When do you need to deploy?
>>
>> As I understand it, the spatial search in Solr is being rewritten and is slated for
>> Solr 4.0, the release after next.
>
> It will be in 3.x, the next release
>
>>
>> The existing spatial search has some serious problems and is deprecated.
>>
>> Right now, I think the only way to get spatial search in Solr is to deploy a nightly
>> snapshot from the active development on trunk. If you are deploying a year from now, that
>> might change.
>>
>> There is not any support for SQL-like statements or for joins. The best practice
>> for Solr is to think of your data as a single table, essentially creating a view from your
>> database. The rows become Solr documents, the columns become Solr fields.
>
> There are now group-by capabilities in trunk as well, which may or may not help.
>
>>
>> wunder
>>
>> On Sep 27, 2010, at 9:34 AM, Sharma, Raghvendra wrote:
>>
>>> I am sure these kinds of questions keep coming to you guys, but I want to raise
>>> the same question in a different context: my own business situation.
>>> I am very new to Solr, and though I have tried to read through the documentation,
>>> I am nowhere near finishing it.
>>>
>>> The need is like this -
>>>
>>> We have a huge RDBMS database/table. A single table houses perhaps 100+ million
>>> rows. Though Oracle is doing a fine job of handling inserts and updates,
>>> querying is where our main concerns lie. Since we have spatial data, index building
>>> takes hours and hours for such tables.
>>>
>>> That's when we thought of moving away from a standard RDBMS and trying
>>> something different and fast.
>>> My last week has been spent on a journey reading through Bigtable, Hadoop,
>>> HBase, and Hive, before finally landing on Solr. As far as my tests have gone, it looks
>>> pretty good, but I still have a few unanswered questions. Trying this group for them :)
>>> (I am sure I could find some answers if I read/googled more on the topic, but I'm being
>>> lazy and feel that asking the people who already use it, or perhaps develop it, is a
>>> better bet.)
>>>
>>> 1. Can I get my Solr instance to load data (fresh data for indexing) from a stream
>>> (imagine an MQ kind of queue, or similar)?
>
> Yes, with a little bit of work.
>
>>> 2. Can I host my Solr instance so that it uses HBase as the database/file system
>>> (read: HDFS)?
>
> Probably, but I doubt it will be fast. Local disk is usually the best. 100+ M rows
> is large but not unreasonable.
>
>>> 3. Are there any reports (as in benchmarks) available anywhere on a Solr instance's
>>> performance?
>
> You can probably search the web for these. I've personally seen several installs w/
> 1B+ docs and subsecond search and faceting, and heard of others. You might look at the
> stuff the HathiTrust has put up.
>
>>> 4. Are there any APIs available which might help me apply ANSI SQL kind of statements
>>> to my Solr data?
>
> No. Question back: what kinds of things are you trying to do?
>
>>>
>>> It would be great if people could share their experience in the area.
>>> If it's too much trouble writing all of it, perhaps a URL would be easier. I welcome
>>> all kinds of help here; any advice/suggestions are good.
>>>
>>> Looking forward to your viewpoints..
>>>
>>> --raghav..
>>>
>>
>>
>>
>>
>
> --------------------------
> Grant Ingersoll
> http://lucenerevolution.org Apache Lucene/Solr Conference, Boston Oct 7-8
>
>



-- 
Lance Norskog
goksron@gmail.com
