lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Shai Erera <ser...@gmail.com>
Subject Re: IndexFileNames
Date Tue, 23 Feb 2010 11:46:09 GMT
I don't think performance is the issue here, but rather correctness. Someone
cannot just ask filename.endsWith(DELETION_EXT) as files like "file1del"
would match as well. So whenever you make such check, you need to add ".".
Again, not performance, but correctness.

If we make it public, then we can document that clearly. I don't mind if we
stay w/o the ".". I just thought that we can correct it, and gain few ticks
of performance as a side effect. But I don't want to argue about it :).

So do you agree to change it to public? I can do that and also make sure
Lucene's code always calls segmentFileName ...

Shai

On Tue, Feb 23, 2010 at 1:31 PM, Michael McCandless <
lucene@mikemccandless.com> wrote:

> Well, there are two issues:
>
>  * We don't always call IFN.segmentFileName -- often we simply
> concatenate strings directly throughout the sources.
>
>  * Should the extensions include '.' or not?
>
> I'd really like to fix the first issue.  If we do that, then there's
> only 1 place that performs string concatenation when building file
> names, and then we are free to alter whether extensions include '.' or
> not, but...
>
> On the 2nd issue, I think it's sort of trivial -- we could have gone
> either way -- kinda like how Intel and Sun could've gone either way
> when they picked byte order ;) -- but since we've been doing no dots
> for so long, I'm not sure we should suddenly change that "convention"
> now.  Performance cost should be miniscule right?  I mean creating
> file names shouldn't be happening in any hot spots...
>
> Mike
>
> On Tue, Feb 23, 2010 at 6:22 AM, Shai Erera <serera@gmail.com> wrote:
> > But segmentFileName performs the concatenation internally:
> >   static String segmentFileName(String segmentName, String ext) {
> >     return segmentName + "." + ext;
> >   }
> > So that would not avoid anything right?
> >
> > And still, if someone needs to know whether a certain file is a core
> Lucene
> > file or his own added file, he'll need to add the ".". I guess I don't
> > understand why not to define the constants with the ".'? Although I
> didn't
> > do a through scan through the code, I didn't find in places I checked
> that a
> > reference to *just* the extension name is needed.
> >
> > And thanks for correcting me on the package-private and back-compat
> thing.
> > In my mind it was already public :).
> >
> > Shai
> >
> > On Tue, Feb 23, 2010 at 1:16 PM, Michael McCandless
> > <lucene@mikemccandless.com> wrote:
> >>
> >> This class makes me somewhat nervous, with the changes coming in flex,
> >> because the extensions are no longer static but rather a function of
> >> the particular codec you're using in the index.  I've changed some of
> >> the constants accordingly (on flex).
> >>
> >> Still, I think it's OK to make it public (flex has in fact done so,
> >> since codecs, in different packages, need to reference the constants),
> >> but mark it as @lucene.internal.
> >>
> >> There is no back compat issue here -- it's package private currently.
> >> Even once it's public with @lucene.internal, there's still no back
> >> compat since it's internal.
> >>
> >> I would love to never do our own concatenation in Lucene's sources,
> >> and instead consistently use IndexFileNames.segmentFileName method
> >> whenever we need to build up a file.  Then it's OK that the extensions
> >> do not include the '.'?
> >>
> >> Mike
> >>
> >> On Tue, Feb 23, 2010 at 1:42 AM, Shai Erera <serera@gmail.com> wrote:
> >> > Hi,
> >> >
> >> > I've noticed that IFN is package private, including its constants and
> >> > methods. Any reason why not make it public? I need to create my own
> >> > Directory, and would like to query whether a file is 'my file' or
> >> > Lucene's
> >> > file. Getting access to the extensions can help. Currently I have to
> put
> >> > my
> >> > code in o.a.l package, which I prefer not to.
> >> >
> >> > Also, I think the constants could be wrong in some situations. They
> only
> >> > include the extension itself, w/o the "." part. That leads to many
> >> > places in
> >> > the code where I see this <code>fname + "." + IFN.extension</code>.
> Two
> >> > problems:
> >> > (1) Doing "." + ext is just a waste - it's really not needed.
> >> > (2) If someone only checks for the extension, without adding the "."
> >> > part,
> >> > he can accidentally think that fnamedel is a deletion file ..
> >> >
> >> > I'd like to patch this, but want to validate with the community before
> I
> >> > create an issue. Making it public is trivial. Changing the extension
> >> > name is
> >> > more problematic for back-compat, but I think currently it's just
> wrong
> >> > ...
> >> > Unless we come up w/ new names for the extensions, which is annoying.
> >> > Can we
> >> > sacrifice back-compat for this thing?
> >> >
> >> > Thanks,
> >> > Shai
> >> >
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> >> For additional commands, e-mail: java-dev-help@lucene.apache.org
> >>
> >
> >
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
>
>

Mime
View raw message