lucene-solr-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Jason Cater <>
Subject Re: Commented: (SOLR-216) Improvements to
Date Tue, 29 May 2007 19:41:37 GMT

I've had my in production use internally for about a month now. 
So, as you can imagine, I've worked through a few oddball bugs that 
occasionally pop up.  It seems pretty stable for me.

I'm planning to upload a new file attachment to this issue containing my 
changes, plus fixing the bug reports that were filed against my open 
ticket.  But first, a few quick questions....

I would prefer to have a complete directory structure (i.e.,, 
unit tests, samples, etc) instead of just the file.  Would 
anyone see a problem with this?

Also, on some of your comments:

>  - list comprehensions solely to perform looped execution are harder to parse and slower
than explicitly writing a for loop

List comprehensions seem to be a matter of contention for some.  
However, it's a battle I'm not interested in fighting, so changed it to 
a for loop.

>  - shadowing builtins is generally a bad idea

Any shadowing of builtins was unintentional.  Did you see specific 
examples?  I run the code through pychecker and pylint to try to avoid 
such cases.

>  - SolrConnection is an old-style class, but Response is new-style 

This was a holdover from the old SolrConnection class I copied from. I'm 
fixing this.

> functionality:
>  - why are 'status'/'QTime' returned as floats?

This was just a misunderstanding on my part of what QTime was actually 
returning.  Fixing.

>  - all NamedList's appearing in the output are converted to dicts--this loses information
(in particular, it will be unnecessarily hard for the user to use highlighting/debug data).
 Using the python/json response format would prevent this.  Not returning highlight/debug
data in the standard response format (and yet providing said parameters in the query() method)
seems odd.  Am I missing something?  Oh, they are set as dynamic attributes of Response, I
see.  Definitely needs documentation.

Yes, this needs to be documented.  (Please c.f. to my question about 
allowing a complete directory structure.)

>  - passing fields='' to query() will return all fields, when the desired return is likely
no fields

I've changed the default for fields= to be '*', instead of None or "".  
This way, passing in 'fields=""' will result in 'fl=' being passed to 
the backend.  However, I still don't see the point, as passing both 
'fl=' and 'fl=*' return the exact same set of fields (i.e., "all") on my 
test setup.

>  - it might be better to settle on an api that permits doc/field boosts.  How about using
a tuple as the field name in the field dict?
> conn.add_many([{'id': 1, ('field2', 2.33): u"some text"}])
> doc boosts could be handled by optionally providing the fielddict as a (<fielddict>,
boost) tuple.

I agree. I was not aware of field boosts at the time. I'll code this change.

> - for 2.5+, a cool addition might be:
> if sys.version > 2.5
>    import contextlib   
>    def batched(solrconn):
>           solrconn.begin_batch()
> 	yield solrconn
> 	solrconn.end_batch()
>   batched = contextlib.contextmanager(batched)
> Use as:
> with batched(solrconn):
>        solrconn.add(...)
>        solrconn.add(...)
>        solrconn.add(...)


View raw message