lucene-solr-user mailing list archives

From Otis Gospodnetic <otis.gospodne...@gmail.com>
Subject Re: Pointing to HBase for Documents or Directly Saving Documents at HBase
Date Tue, 16 Apr 2013 22:33:00 GMT
Use Solr.  It's pretty clear you don't yet have any problems that
would make you think about alternatives.  Using Solr to store and not
just index will make your life simpler (and your app simpler and
likely faster).

Otis
--
Solr & ElasticSearch Support
http://sematext.com/





On Tue, Apr 16, 2013 at 6:31 PM, Furkan KAMACI <furkankamaci@gmail.com> wrote:
> Thanks again for your answer. If I find any documents on such comparisons,
> I would like to read them.
>
> By the way, is there any advantage to using Lucene rather than anything
> else, such as this:
>
> Lucene is natively supported by Solr, so if I use anything else I
> may face compatibility or communication problems?
>
>
> 2013/4/17 Otis Gospodnetic <otis.gospodnetic@gmail.com>
>
>> People do use other data stores to retrieve data sometimes. e.g. Mongo
>> is popular for that.  Like I hinted in another email, I wouldn't
>> necessarily recommend this for common cases.  Don't do it unless you
>> really know you need it.  Otherwise, just store in Solr.
>>
>> Otis
>> --
>> Solr & ElasticSearch Support
>> http://sematext.com/
>>
>>
>>
>>
>>
>> On Tue, Apr 16, 2013 at 5:32 PM, Furkan KAMACI <furkankamaci@gmail.com>
>> wrote:
>> > Hi Otis and Jack;
>> >
>> > I did some research on highlights and debugged the code. I see that
>> > highlights are query-dependent and not stored. Why does Solr use Lucene
>> > for storing text, e.g. the content of a web page? Is there any comparison
>> > of storing text in HBase or other databases versus Lucene?
>> >
>> > Also, has anybody on the solr-user list used anything other than Lucene
>> > to store the text of documents?
>> >
>> > 2013/4/11 Otis Gospodnetic <otis.gospodnetic@gmail.com>
>> >
>> >> Source code is your best bet.  Wiki has info about how to use it, but
>> >> not how highlighting is implemented.  But you don't need to understand
>> >> the implementation details to understand that they are dynamic,
>> >> computed specifically for each query for each matching document, so
>> >> you cannot store them anywhere ahead of time.
>> >>
>> >> Otis
>> >> --
>> >> Solr & ElasticSearch Support
>> >> http://sematext.com/
>> >>
>> >>
>> >>
>> >>
>> >>
>> >> On Thu, Apr 11, 2013 at 11:22 AM, Furkan KAMACI <furkankamaci@gmail.com> wrote:
>> >> > Hi Otis;
>> >> >
>> >> > It seems that I should read more about highlights. Is there anywhere
>> >> > that explains in detail how highlights are generated in Solr?
>> >> >
>> >> > 2013/4/11 Otis Gospodnetic <otis.gospodnetic@gmail.com>
>> >> >
>> >> >> Hi,
>> >> >>
>> >> >> You can't store highlights ahead of time because they are query
>> >> >> dependent.  You could store documents in HBase and use Solr just for
>> >> >> indexing.  Is that what you want to do?  If so, a custom
>> >> >> SearchComponent executed after QueryComponent could fetch data from
>> >> >> external store like HBase.  I'm not sure if I'd recommend that.
>> >> >>
>> >> >> Otis
>> >> >> --
>> >> >> Solr & ElasticSearch Support
>> >> >> http://sematext.com/
>> >> >>
>> >> >>
>> >> >>
>> >> >>
>> >> >>
>> >> >> On Thu, Apr 11, 2013 at 10:01 AM, Furkan KAMACI <furkankamaci@gmail.com> wrote:
>> >> >> > Actually, I am not planning to store documents in Solr. I want to
>> >> >> > store just the highlights (snippets) in HBase and retrieve them from
>> >> >> > HBase when needed.
>> >> >> > What do you think about separating just the highlights from Solr and
>> >> >> > storing them in HBase with SolrCloud? Also, I would welcome an
>> >> >> > explanation of where in the process, and how, highlights are
>> >> >> > generated in Solr.
>> >> >> >
>> >> >> >
>> >> >> > 2013/4/9 Otis Gospodnetic <otis.gospodnetic@gmail.com>
>> >> >> >
>> >> >> >> You may also be interested in looking at things like solrbase (on GitHub).
>> >> >> >>
>> >> >> >> Otis
>> >> >> >> --
>> >> >> >> Solr & ElasticSearch Support
>> >> >> >> http://sematext.com/
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> On Sat, Apr 6, 2013 at 6:01 PM, Furkan KAMACI <furkankamaci@gmail.com> wrote:
>> >> >> >> > Hi;
>> >> >> >> >
>> >> >> >> > First of all, I should mention that I am new to Solr and am
>> >> >> >> > researching it. My plan is to crawl some websites with Nutch
>> >> >> >> > and then index them with Solr. (Nutch 2.1, Solr-SolrCloud 4.2)
>> >> >> >> >
>> >> >> >> > I have a question. I have a cloud of machines that crawls websites
>> >> >> >> > and stores the documents. I then send those documents to SolrCloud.
>> >> >> >> > Solr indexes the documents, generates the indexes, and saves them.
>> >> >> >> > I know from Information Retrieval theory that it *may* not be
>> >> >> >> > efficient to store indexes in a NoSQL database (they are something
>> >> >> >> > like linked lists, and if you store them in that kind of database
>> >> >> >> > you *may* have a sparse representation. There may be some solutions
>> >> >> >> > for this; if you can explain them, you are welcome to.)
>> >> >> >> >
>> >> >> >> > However, Solr stores some documents too (i.e. highlights), so some
>> >> >> >> > of my documents will effectively be duplicated. Since I will have
>> >> >> >> > many documents, those duplicated documents may cause a problem for
>> >> >> >> > me. So is there any way to avoid storing the documents in Solr and
>> >> >> >> > point to them in HBase (where I save my crawled documents), or,
>> >> >> >> > instead of pointing, to store them directly in HBase (is that
>> >> >> >> > efficient or not)?
>> >> >> >>
>> >> >>
>> >>
>>
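
For readers following the thread, the two ideas discussed above (an index that returns only document ids, with the text fetched from an external store, and highlights computed per query rather than stored) can be illustrated with a small self-contained sketch. This is not Solr or HBase code: the class `ExternalStoreSketch`, its methods, and the in-memory maps standing in for the index and for HBase are all hypothetical, and the highlighter is a naive substring match that assumes plain ASCII text.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical illustration only: none of these names are Solr or HBase APIs.
final class ExternalStoreSketch {
    // Stand-in for the search index: term -> ids of documents containing it.
    private final Map<String, TreeSet<String>> index = new HashMap<>();
    // Stand-in for the external store (HBase in the thread): id -> full text.
    private final Map<String, String> store = new HashMap<>();

    // Index the terms, but keep the text only in the "external store".
    void add(String id, String text) {
        store.put(id, text);
        for (String term : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!term.isEmpty()) {
                index.computeIfAbsent(term, k -> new TreeSet<>()).add(id);
            }
        }
    }

    // Step 1: the index answers a single-term query with ids only.
    List<String> matchingIds(String term) {
        TreeSet<String> ids = index.get(term.toLowerCase(Locale.ROOT));
        return ids == null ? new ArrayList<>() : new ArrayList<>(ids);
    }

    // Step 2: build a snippet around the first occurrence of the query term.
    // The result depends on the term, so it cannot be computed before the
    // query is known.
    static String highlight(String text, String term, int context) {
        int pos = text.toLowerCase(Locale.ROOT).indexOf(term.toLowerCase(Locale.ROOT));
        if (pos < 0) {
            return "";
        }
        int start = Math.max(0, pos - context);
        int matchEnd = pos + term.length();
        int end = Math.min(text.length(), matchEnd + context);
        return text.substring(start, pos)
                + "<em>" + text.substring(pos, matchEnd) + "</em>"
                + text.substring(matchEnd, end);
    }

    // Search = look up ids, fetch each text from the store, highlight per query.
    List<String> search(String term, int context) {
        List<String> snippets = new ArrayList<>();
        for (String id : matchingIds(term)) {
            snippets.add(id + ": " + highlight(store.get(id), term, context));
        }
        return snippets;
    }
}
```

Searching the same document for two different terms yields two different snippets, which is why the snippets cannot be precomputed. The extra lookup in `search` is the cost of keeping the text outside the index; as noted in the replies above, storing the text in Solr itself avoids that round trip and is likely the simpler choice for common cases.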
