lucene-solr-user mailing list archives

From Jack Park <jackp...@topicquests.org>
Subject Re: Flow Chart of Solr
Date Wed, 03 Apr 2013 16:56:56 GMT
Jack,

Is that new book up to date with the 4.x series?

Thanks
The other Jack

On Wed, Apr 3, 2013 at 9:19 AM, Jack Krupansky <jack@basetechnology.com> wrote:
> And another one on the way:
> http://www.amazon.com/Lucene-Solr-Definitive-comprehensive-realtime/dp/1449359957
>
> Hopefully that will help a lot as well. Plenty of diagrams. Lots of examples.
>
> -- Jack Krupansky
>
> -----Original Message----- From: Jack Park
> Sent: Wednesday, April 03, 2013 11:25 AM
>
> To: solr-user@lucene.apache.org
> Subject: Re: Flow Chart of Solr
>
> There are three books on Solr, two with that in the title, and one,
> Taming Text, each of which has been very valuable in understanding
> Solr.
>
> Jack
>
> On Wed, Apr 3, 2013 at 5:25 AM, Jack Krupansky <jack@basetechnology.com>
> wrote:
>>
>> Sure, yes. But... it comes down to what level of detail you want and need
>> for a specific task. In other words, there are probably a dozen or more
>> levels of detail. The reality is that if you are going to work at the Solr
>> code level, that is very, very different than being a "user" of Solr, and at
>> that point your first step is to become familiar with the code itself.
>>
>> When you talk about "parsing" and "stemming", you are really talking about
>> the user-level, not the Solr code level. Maybe what you really need is a
>> cheat sheet that maps each user-visible feature to the main Solr code
>> component that implements it.
>>
>> There are a number of different forms of "parsing" in Solr - parsing of
>> what? Queries? Requests? Solr documents? Function queries?
>>
>> Stemming? Well, in truth, Solr doesn't even do stemming - Lucene does that.
>> Lucene does all of the "token filtering". Are you asking for details on how
>> Lucene works? Maybe you meant to ask how "term analysis" works, which is
>> split between Solr and Lucene. Or maybe you simply wanted to know when and
>> where term analysis is done. Tell us your specific problem or specific
>> question and we can probably quickly give you an answer.
>>
>> In truth, NOBODY uses "flow charts" anymore. Sure, there are some user-level
>> diagrams, but not down to the code level.
>>
>> If you could focus on specific questions, we could give you specific answers.
>>
>> "Main steps"? That depends on what level you are working at. Tell us what
>> problem you are trying to solve and we can point you to the relevant areas.
>>
>> In truth, if you become generally familiar with Solr at the user level
>> (study the wikis), you will already know what the "main steps" are.
>>
>> So, it is not "main steps of Solr", but main steps of some specific
>> "request" of Solr, and for a specified level of detail, and for a specified
>> area of Solr if greater detail is needed. Be more specific, and then we can
>> be more specific.
>>
>> For now, the general advice for people who need or want to go far beyond the
>> user level is to "get familiar with the code" - just LOOK at it - a lot of
>> the package and class names are OBVIOUS, really, and follow the class
>> hierarchy and code flow using the standard features of any modern Java IDE.
>> If you are wondering where to start for some specific user-level feature,
>> please ask specifically about that feature. But... make a diligent effort to
>> discover and learn on your own before asking open-ended questions.
>>
>> Sure, there are lots of things in Lucene and Solr that are rather complex
>> and seemingly convoluted, and not obvious, but people are more than willing
>> to help you out if you simply ask a specific question. I mean, not everybody
>> needs to know the fine detail of query parsing, analysis, building a
>> Lucene-level stemmer, etc. If we tried to put all of that in a diagram, most
>> people would be more confused than enlightened.
>>
>> At which step are scores calculated? That's more of a Lucene question. Or,
>> are you really asking what code in Solr invokes Lucene search methods that
>> calculate basic scores?
>>
>> In short, you need to be more specific. Don't force us to guess what problem
>> you are trying to solve.
>>
>> -- Jack Krupansky
>>
>> -----Original Message----- From: Furkan KAMACI
>> Sent: Wednesday, April 03, 2013 6:52 AM
>> To: solr-user@lucene.apache.org
>> Subject: Re: Flow Chart of Solr
>>
>>
>> So, all in all, is there anybody who can write down just the main steps of
>> Solr (including parsing, stemming, etc.)?
>>
>>
>> 2013/4/2 Furkan KAMACI <furkankamaci@gmail.com>
>>
>>> Take me as an example. I have been researching Solr for only a few weeks.
>>> I have learned about Solr and its related projects. My next step is
>>> writing down the main steps of Solr. We have separated the learning curve
>>> of Solr into two main categories. The first is for those who use it as
>>> out-of-the-box components. The second is the developer side.
>>>
>>> Actually, the developer side branches in two ways.
>>>
>>> The first is the general steps, e.g. a document comes into Solr (say,
>>> crawled data from Nutch): which analysis processes are going to be done
>>> (stemming, hamming, etc.), and what is done after parsing, step by step.
>>> When a search query arrives, what happens step by step, at which step
>>> scores are calculated, and so on.
>>> The second is more code-specific, e.g. which handlers take in the data
>>> that is going to be indexed (no need to explain every handler at this
>>> step), which are the analyzer and tokenizer classes and what is the flow
>>> between them, and how the response handlers work and what they are.
>>>
>>> Explaining the cloud side is a separate piece of work.
>>>
>>> Some of the explanations are currently present in the wiki (but some of
>>> them are in very deep places in the wiki and it is not easy to find their
>>> parent topic; maybe starting the wiki from a top page and branching all
>>> other topics from it as far as possible would be better).
>>>
>>> If we could show the big picture, and beside it the smaller pictures
>>> within it, it would be great (if you know the main parts it will be easy to
>>> go deep into the code, i.e. you don't need to explain every handler; if you
>>> show the way to the developer, he/she can debug and find what is needed).
>>>
>>> Taking myself as an example, I have to write down the steps of Solr in a
>>> bit of detail. Even though I have read many wiki pages and a book about it,
>>> I see that it is not easy even to write down the big picture of the
>>> developer side.
>>>
>>>
>>> 2013/4/2 Alexandre Rafalovitch <arafalov@gmail.com>
>>>
>>>> Yago,
>>>>
>>>> My point - perhaps lost in too much text - was that Solr is presented -
>>>> and can function - as a black box, which makes it different from more
>>>> traditional open-source projects. So, stage 2 happens exactly when the
>>>> non-programmers have to cross the boundary from the black box into a
>>>> code-first approach, and the hand-off is not particularly smooth. Or even
>>>> when - say - a PHP or .NET programmer tries to get beyond the basic
>>>> operations of their client library and has to understand the server-side
>>>> aspects of Solr.
>>>>
>>>> Regards,
>>>>    Alex.
>>>>
>>>> On Tue, Apr 2, 2013 at 1:19 PM, Yago Riveiro <yago.riveiro@gmail.com>
>>>> wrote:
>>>>
>>>> > Alexandre,
>>>> >
>>>> > You describe the normal path when a beginner tries to use source code
>>>> > that he doesn't understand: black box, reading code, hacking, ok now I
>>>> > know 10% of the project, with luck :p.
>>>> >
>>>>
>>>>
>>>> Personal blog: http://blog.outerthoughts.com/
>>>> LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
>>>> - Time is the quality of nature that keeps events from happening all at
>>>> once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
>>>> book)
>>>>
>>>
>>>
>>
>
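
For readers following the thread: the "term analysis" discussed above (a tokenizer followed by a chain of token filters, of which stemming is one) can be pictured with a toy sketch. This is only an illustration under loud assumptions: every function name, the stop-word list, and the suffix rules below are hypothetical, and none of it is Lucene's or Solr's actual API or stemming algorithm.

```python
import re

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"a", "an", "the", "of", "and"}


def tokenize(text):
    """Split raw text into word tokens (stand-in for a tokenizer)."""
    return re.findall(r"[A-Za-z0-9]+", text)


def lowercase(tokens):
    """Token filter: normalize case."""
    return [t.lower() for t in tokens]


def remove_stop_words(tokens):
    """Token filter: drop very common words."""
    return [t for t in tokens if t not in STOP_WORDS]


def naive_stem(tokens):
    """Token filter: strip a few English suffixes.

    Far cruder than a real stemmer; it only shows where stemming
    sits in the chain.
    """
    stemmed = []
    for t in tokens:
        for suffix in ("ing", "es", "s"):
            # Keep at least three characters of the original token.
            if t.endswith(suffix) and len(t) - len(suffix) >= 3:
                t = t[: -len(suffix)]
                break
        stemmed.append(t)
    return stemmed


def analyze(text):
    """Run the tokenizer, then each token filter in order."""
    tokens = tokenize(text)
    for token_filter in (lowercase, remove_stop_words, naive_stem):
        tokens = token_filter(tokens)
    return tokens


if __name__ == "__main__":
    print(analyze("The Parsing of Queries"))
    print(analyze("Indexing documents"))
```

Running the same chain over both the indexed text and the query text is what lets the two meet on the same terms; in this toy version, "Indexing documents" and "index document" reduce to the same token list.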
