lucene-solr-user mailing list archives

From Jack Park <>
Subject Re: Flow Chart of Solr
Date Wed, 03 Apr 2013 16:56:56 GMT

Is that new book up to the 4.+ series?

The other Jack

On Wed, Apr 3, 2013 at 9:19 AM, Jack Krupansky <> wrote:
> And another one on the way:
> Hopefully that will help a lot as well. Plenty of diagrams. Lots of examples.
> -- Jack Krupansky
> -----Original Message----- From: Jack Park
> Sent: Wednesday, April 03, 2013 11:25 AM
> To:
> Subject: Re: Flow Chart of Solr
> There are three books on Solr, two with that in the title, and one,
> Taming Text, each of which has been very valuable in understanding
> Solr.
> Jack
> On Wed, Apr 3, 2013 at 5:25 AM, Jack Krupansky <>
> wrote:
>> Sure, yes. But... it comes down to what level of detail you want and need
>> for a specific task. In other words, there are probably a dozen or more
>> levels of detail. The reality is that if you are going to work at the Solr
>> code level, that is very, very different than being a "user" of Solr, and
>> at that point your first step is to become familiar with the code itself.
>> When you talk about "parsing" and "stemming", you are really talking about
>> the user-level, not the Solr code level. Maybe what you really need is a
>> cheat sheet that maps a user-visible feature to the main Solr code
>> component that implements that user feature.
>> There are a number of different forms of "parsing" in Solr - parsing of
>> what? Queries? Requests? Solr documents? Function queries?
>> Stemming? Well, in truth, Solr doesn't even do stemming - Lucene does that.
>> Lucene does all of the "token filtering". Are you asking for details on how
>> Lucene works? Maybe you meant to ask how "term analysis" works, which is
>> split between Solr and Lucene. Or maybe you simply wanted to know when and
>> where term analysis is done. Tell us your specific problem or specific
>> question and we can probably quickly give you an answer.
>> In truth, NOBODY uses "flow charts" anymore. Sure, there are some user-level
>> diagrams, but not down to the code level.
>> If you could focus on specific questions, we could give you specific answers.
>> "Main steps"? That depends on what level you are working at. Tell us what
>> problem you are trying to solve and we can point you to the relevant areas.
>> In truth, if you become generally familiar with Solr at the user level
>> (study the wikis), you will already know what the "main steps" are.
>> So, it is not "main steps of Solr", but main steps of some specific
>> "request" of Solr, and for a specified level of detail, and for a specified
>> area of Solr if greater detail is needed. Be more specific, and then we can
>> be more specific.
>> For now, the general advice for people who need or want to go far beyond the
>> user level is to "get familiar with the code" - just LOOK at it - a lot of
>> the package and class names are OBVIOUS, really, and follow the class
>> hierarchy and code flow using the standard features of any modern Java IDE.
>> If you are wondering where to start for some specific user-level feature,
>> please ask specifically about that feature. But... make a diligent effort to
>> discover and learn on your own before asking open-ended questions.
>> Sure, there are lots of things in Lucene and Solr that are rather complex
>> and seemingly convoluted, and not obvious, but people are more than willing
>> to help you out if you simply ask a specific question. I mean, not everybody
>> needs to know the fine detail of query parsing, analysis, building a
>> Lucene-level stemmer, etc. If we tried to put all of that in a diagram, most
>> people would be more confused than enlightened.
>> At which step are scores calculated? That's more of a Lucene question. Or,
>> are you really asking what code in Solr invokes Lucene search methods that
>> calculate basic scores?
>> In short, you need to be more specific. Don't force us to guess what problem
>> you are trying to solve.
>> -- Jack Krupansky
>> -----Original Message----- From: Furkan KAMACI
>> Sent: Wednesday, April 03, 2013 6:52 AM
>> To:
>> Subject: Re: Flow Chart of Solr
>> So, all in all, is there anybody who can write down just the main steps of
>> Solr (including parsing, stemming etc.)?
>> 2013/4/2 Furkan KAMACI <>
>>> Take me as an example. I have been researching Solr for only a few weeks.
>>> I have learned about Solr and its related projects. My next step is writing
>>> down the main steps of Solr. We have separated the learning curve of Solr
>>> into two main categories.
>>> The first is for those who use it as out-of-the-box components. The second
>>> is the developer side.
>>> The developer side actually branches in two ways.
>>> The first is its general steps, i.e. a document comes into Solr (e.g.
>>> crawled data from Nutch): which analysis processes are going to be done
>>> (stemming, hamming etc.), and what is done after parsing, step by step.
>>> When a search query happens, what happens step by step, at which step
>>> scores are calculated, and so on and so forth.
>>> The second is more code specific, i.e. which handlers take in the data
>>> that is going to be indexed (no need to explain every handler at this
>>> step), which are the analyzer and tokenizer classes and what the flow
>>> between them is, and how response handlers work and what they are.
>>> Explaining the cloud side is a separate piece of work.
>>> Some of these explanations are currently present in the wiki (but some of
>>> them are at very deep places in the wiki and it is not easy to find their
>>> parent topic; maybe starting the wiki from a top page and branching all
>>> other topics from it as far as possible would be better).
>>> If we could show the big picture, and beside it the smaller pictures
>>> within it, that would be great (if you know the main parts it is easy to
>>> go deep into the code, i.e. you don't need to explain every handler; if you
>>> show the developer the way, he/she can debug and find what is needed).
>>> Taking myself as an example again: I have to write down the steps of Solr
>>> in a bit of detail, and even though I have read many wiki pages and a book
>>> about it, I see that it is not easy to write down even the big picture of
>>> the developer side.
>>> 2013/4/2 Alexandre Rafalovitch <>
>>>> Yago,
>>>> My point - perhaps lost in too much text - was that Solr is presented -
>>>> and can function - as a black box. That makes it different from more
>>>> traditional open-source projects. So, stage 2 happens exactly when the
>>>> non-programmers have to cross the boundary from the black box into a
>>>> code-first approach, and the hand-off is not particularly smooth. Or even
>>>> when - say - a PHP or .NET programmer tries to get beyond the basic
>>>> operations of their client library and has to understand the server-side
>>>> aspects of Solr.
>>>> Regards,
>>>>    Alex.
>>>> On Tue, Apr 2, 2013 at 1:19 PM, Yago Riveiro <>
>>>> wrote:
>>>> > Alexandre,
>>>> >
>>>> > You describe the normal path when a beginner tries to use source code
>>>> > that they don't understand: black box, reading code, hacking, ok now I
>>>> > know 10% of the project, with luck :p.
>>>> >
>>>> - Time is the quality of nature that keeps events from happening all at
>>>> once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
>>>> book)
