lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Simon Willnauer (JIRA)" <j...@apache.org>
Subject [jira] [Issue Comment Edited] (LUCENE-2793) Directory createOutput and openInput should take an IOContext
Date Mon, 30 May 2011 00:19:47 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-2793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13040906#comment-13040906
] 

Simon Willnauer edited comment on LUCENE-2793 at 5/30/11 12:19 AM:
-------------------------------------------------------------------

bq. I already posted a patch to this issue a while back...

Jason I appreciate that you are still around on this issue. Varun is doing his GSoC project
on this issue and others so there might be some duplication here and there or similarities
with you patch but in this case this is ok though. He needs to get started as well as getting
a feeling how things work here. So I hope you don't mind. I can only encourage you to help
reviewing and commenting!


bq. If this is the right way to go ?

Thanks for the patch man! Good to get started eventually. here are some comments for you patch:

* When I look at IOContext I think it should be a little more sophisticated. What I have in
mind is something similar to ScorerContext in org.apache.lucene.search.Weight.java. Some kind
of a copy-on-write builder pattern that lets us provide some defaults for Merging, Searching
and Indexing so by default we can reuse the same instance in many places. If we follow a this
copy on write model we could also make this class non-final to let people customize the context
if they needs to.
* I am not sure if we really need the enumeration in IO Context unless me make decisions based
on what an input / output is used for. IMO it might make more sense to have default instances
for Searching Indexing and Merging and set the flags like SEQUENTIAL and the buffers to good
defaults and only tweak them when we are aware of the context ie. when we pull the input /
output. The most of the usecases need good defaults and only some need tweaks but if we hold
the use-case information on IOContext some directories might want to be very smart and this
might duplicate code. 
* I wonder if it makes sense to bind the IO Context instance to the directory and add a factory
to the directory like Directory#getIOContext(Merge|Search|Index) to enable the directory impl
to set platform specific defaults. I think that would make things easier and customization
would be straight forward.
* You should add a space after a comma in the arguments list like here: int bufferSize,IOContext
context) 
* Enum elements should be CamelCase and start with a leading Uppercase character




      was (Author: simonw):
    bq. I already posted a patch to this issue a while back...

Jason I appreciate that you are still around on this issue. Varun is doing his GSoC project
on this issue and others so there might be some duplication here and there or similarities
with you patch but in this case this is ok though. He needs to get started as well as getting
a feeling how things work here. So I hope you don't mind. I can only encourage you to help
reviewing and commenting!


bq. If this is the right way to go ?

Thanks for the patch man! Good to get started eventually. here are some comments for you patch:

* When I look at IOContext I think it should be a little more sophisticated. What I have in
mind is something similar to ScorerContext in org.apache.lucene.search.Weight.java. Some kind
of a copy-on-write builder pattern that lets us provide some defaults for Merging, Searching
and Indexing so by default we can reuse the same instance in many places. If we follow a this
copy on write model we could also make this class non-final to let people customize the context
if they needs to.

* I am not sure if we really need the enumeration in IO Context unless me make decisions based
on what an input / output is used for. IMO it might make more sense to have default instances
for Searching Indexing and Merging and set the flags like SEQUENTIAL and the buffers to good
defaults and only tweak them when we are aware of the context ie. when we pull the input /
output. The most of the usecases need good defaults and only some need tweaks but if we hold
the use-case information on IOContext some directories might want to be very smart and this
might duplicate code. 
* I wonder if it makes sense to bind the IO Context instance to the directory and add a factory
to the directory like Directory#getIOContext(Merge|Search|Index) to enable the directory impl
to set platform specific defaults. I think that would make things easier and customization
would be straight forward.
* You should add a space after a comma in the arguments list like here: int bufferSize,IOContext
context) 
* Enum elements should be CamelCase and start with a leading Uppercase character



  
> Directory createOutput and openInput should take an IOContext
> -------------------------------------------------------------
>
>                 Key: LUCENE-2793
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2793
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/store
>            Reporter: Michael McCandless
>            Assignee: Simon Willnauer
>              Labels: gsoc2011, lucene-gsoc-11, mentor
>         Attachments: LUCENE-2793.patch, LUCENE-2793.patch
>
>
> Today for merging we pass down a larger readBufferSize than for searching because we
get better performance.
> I think we should generalize this to a class (IOContext), which would hold the buffer
size, but then could hold other flags like DIRECT (bypass OS's buffer cache), SEQUENTIAL,
etc.
> Then, we can make the DirectIOLinuxDirectory fully usable because we would only use DIRECT/SEQUENTIAL
during merging.
> This will require fixing how IW pools readers, so that a reader opened for merging is
not then used for searching, and vice/versa.  Really, it's only all the open file handles
that need to be different -- we could in theory share del docs, norms, etc, if that were somehow
possible.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Mime
View raw message