lucenenet-user mailing list archives
From Troy Howard <>
Subject Re: Lucene.NET Community Status
Date Thu, 04 Nov 2010 06:26:19 GMT
As a follow-up, this is as close as I can find in the policy to something
stating that the code must be hosted at ASF (although that is implied
heavily throughout the policy)...

Specifically under "initial code dump", and other code import procedures.


On Wed, Nov 3, 2010 at 11:16 PM, Troy Howard <> wrote:

> Phil,
> I've been unsuccessful at finding the specific reference in the ASF policy
> that covers this, but in a nutshell, yes, the code must be hosted by ASF, as
> well as the websites, docs, etc... This will prevent anything other than a
> mirror or branch existing on
> If the project leaves ASF this is not a concern, of course, but it will need
> to change its name.
> There are currently four committers (taken from
> ):
> George Aroush
> Doug Sale
> Michael Garski
> Based on SVN log commit activity, DIGY is the most recent committer.
> mgarski's last commit was in 03/2010, and aroush's in 12/2009.
> Currently only George and DIGY are showing interest in the project in the
> mailing lists. I would say that they are the people who would fit the bill
> of "core active project leaders".
> Thanks,
> Troy
> On Wed, Nov 3, 2010 at 9:52 PM, Phil Haack <> wrote:
>> I have a couple of naive questions, so forgive me. I see that Apache
>> projects use SVN
>> But is it required to host Apache projects in this svn? The reason I ask
>> is that a small change to hosting in a forge like would provide
>> the project huge exposure to more .NET developers. You could do this, but
>> keep all other processes the same.
>> Also, it makes keeping documentation really easy since they support
>> MetaWeblog API, and thus Windows Live Writer. It might seem like I'm trying
>> to hawk technology answers to organizational projects, but you'd be
>> surprised by how much reducing the friction of documentation makes people
>> more willing to write documentation. At least, for the projects I work on,
>> I'm more than willing to contribute documentation when I can just point a
>> blog client to it and publish.
>> The other question I ask is who are the core active project leaders for
>> Lucene.NET? I'd really like to understand what they want. I and others have
>> many ideas, but it'd be helpful to understand what direction they want to
>> take things and what things are non-negotiable so we have a framework to
>> work with.
>> Thanks,
>> Phil
>> ________________________________________
>> From: Troy Howard []
>> Sent: Wednesday, November 03, 2010 8:47 PM
>> To:
>> Subject: Re: Lucene.NET Community Status
>> All,
>> I'm entering this conversation late as well. I'll apologize in advance, as I
>> know this will be lengthy.
>> Briefly, I'll list my "credentials" and reasons for concern here:
>>  - I've been using Lucene.Net for many years since the early versions and
>> have built significant products for my company using it. Those products are
>> a core source of our revenue, which is measured in the millions of $$s. The
>> success of my company's products is directly dependent on the success of
>> the Lucene.Net project.
>>  - I run software development at my company and make the final decisions
>> about what we do and how we use our resources. The developers here work on
>> open source code on our clock. I would like to have them start doing this
>> for Lucene.Net. We have very smart and productive people who could be a huge
>> asset to this project. I hope that the opportunity to leverage my company's
>> team will not be bypassed by the people running this project.
>>  - I have hacked extensively on the Lucene.Net internals to improve
>> performance in our product and have been manually maintaining our local
>> branch, merging in changes from the main project. I feel I have enough
>> knowledge of both the CS theory behind search engines and in particular this
>> codebase to not be intimidated by any aspect of the needs of this project.
>>  - I started a similar kind of open source project, in that it is a .Net
>> implementation of an existing C++ open source project, and struggled with the
>> "syntactic port" vs "conceptual port" issue, and so have perspective to
>> provide on that discussion.
>> Relationship To ASF and Lucene
>> -----------------------------------------------
>> I'd like to address one thing upfront: This should definitely remain an
>> Apache Software Foundation project. As Grant and George have stated clearly
>> and accurately, this is a huge benefit for this project in terms of its
>> credibility. This is not just because the name is well respected. It's
>> because of WHY the Apache name is so well respected: the processes and
>> values of the Foundation set excellent standards which encourage excellent
>> code. This is not just my opinion, but can be objectively proven by the
>> enormous success of the Apache projects. Complying with ASF's standards may
>> be difficult, but it's extremely valuable.
>> I feel that Grant's recommendation of attempting to become a TLP at Apache
>> is the wrong direction. This should remain part of the Lucene project. It is
>> not distinct in any substantial way from Lucene and thus doesn't warrant
>> being separate.
>> Also, there was some mention of Lucene's file format and maintaining that
>> compatibility. This is essential. If this ever changes, Lucene.Net will be
>> useless. Being cross platform and having a very stable on-disk format are
>> among its most compelling aspects.
>> Microsoft's Interest and Involvement
>> ---------------------------------------------------
>> Another thing to mention: Phil Haack and Scott Hanselman, while both are
>> Microsoft employees, are more than just representatives of the company they
>> work for. They are both outstanding advocates of open source software and
>> have been instrumental in the change of attitude that Microsoft has shown in
>> recent years towards this community. The fact that they have shown interest
>> in this issue doesn't mean Microsoft is interested, it means that this is a
>> significant issue for the .Net open source community. The fact that they
>> work for Microsoft means that they may be able to leverage resources and
>> wield clout from that vantage point that can benefit our community greatly.
>> Regarding the question "What can Microsoft do to help?"... I'll take a
>> somewhat radical stance here.
>> We need Visual J# not to have been abandoned... We need IronJava, like
>> IronPython or IronRuby. We need a native, MS developed and supported, fully
>> optimized and performant compiler for plain old Java code that runs on the
>> .Net runtime and exposes Java libraries to other .Net languages like F#, C#,
>> VB, etc.
>> There is a huge wealth of open source Java code out there, much of it in the
>> Apache project archives, which would all be "ported" at once. Currently our
>> community only gets access to Lucene.Net and iTextSharp and a few other
>> libraries where dedicated people like George put in hard hours of direct
>> syntax porting to implement these things in C#.
>> We need more than that.
>> I need Hadoop to run in .Net, along with HDFS, HBase, Solr, Nutch, Tika, and
>> everything else in that ecosystem.
>> My company is actually at a critical point now, where we are considering
>> abandoning .Net/WCF as our service layer platform, and switching to Java, so
>> that we can leverage those excellent Java projects. Our business needs
>> demand that we have what Hadoop does. It will be easier for me to migrate my
>> application code to Java than to attempt to find equivalent functionality in
>> the existing .Net world or write my own framework, or port Hadoop.
>> So, if there was ONE thing that Microsoft could do to *significantly* help
>> the .Net developer community, it would be providing a *real* implementation
>> of IronJava which would obviate the need to port code completely, and simply
>> allow those libraries and applications to run in .Net natively.
>> That said, assuming that Visual J# remains "retired" (see:
>> ) this project is one of the
>> few things we .Net developers have to work with.
>> Java or .Net Code Idioms
>> -------------------------------------
>> I agree that moving to a codebase that is more .Net idiomatic will improve
>> both the experience of end users of Lucene.Net and the level of involvement
>> that we can get from the community. To put it simply, right now, hacking on
>> the Lucene.Net core code means you must understand Java idioms well, and how
>> to translate those to .Net. This is a skill set which is somewhat uncommon.
>> The "direct port" methodology also leads to code that is not fully optimized
>> for .Net. I have changed our local branch in a number of significant ways,
>> and improved performance significantly by doing so. I didn't change APIs, I
>> just changed the implementations to be more appropriate for .Net, and
>> included generics.
>> The test suite provided with Lucene/Lucene.Net is a great benefit in that
>> regard, and helped me ensure that my changes didn't break functionality.
>> That said, the project needs to improve in this regard. The classes
>> themselves need to be implemented in a more "testable" manner. Using abstract
>> base classes instead of interfaces makes the code less mockable and thus less
>> testable. It also makes it harder to implement customized components into
>> the system. There are a number of things that are sealed or internal that do
>> not need to be.
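
To make the mockability point concrete, here is a minimal sketch in Java (the language of the upstream codebase). Every name in it (TermSource, TermCounter, MockabilitySketch) is hypothetical and not taken from Lucene or Lucene.Net; it only assumes that the code under test depends on an interface, so a test can supply a hand-written stub without inheriting any base-class state or constructors.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical interface: NOT a Lucene or Lucene.Net type.
interface TermSource {
    List<String> terms(String field);
}

// Code under test depends only on the interface, never on a concrete class.
class TermCounter {
    private final TermSource source;

    TermCounter(TermSource source) {
        this.source = source;
    }

    // Counts the distinct terms the source reports for a field.
    int countDistinct(String field) {
        return (int) source.terms(field).stream().distinct().count();
    }
}

class MockabilitySketch {
    public static void main(String[] args) {
        // A hand-written stub: no index, no disk access, no base-class setup.
        TermSource stub = field -> Arrays.asList("apache", "lucene", "apache");
        System.out.println(new TermCounter(stub).countDistinct("body")); // prints 2
    }
}
```

Had TermSource been a sealed or final class, or an abstract base class whose constructor needs real resources, the stub above would be harder or impossible to write.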
>> Lucene (for Java) was awesome because it ran well as managed code and was
>> elegant and efficient in Java's environment. Any port of Lucene should
>> *retain those features* as well. The library should make sense and be
>> implemented in the most elegant and efficient way that it can be on the
>> platform it's implemented on. Lucene.Net should not be a port of Java Lucene
>> to .Net, it should be an *implementation* of Lucene running in .Net. Porting
>> implies line-for-line similarity. Implementing just implies that the
>> features are all represented.
>> For that reason, I support moving to a more idiomatic .Net implementation,
>> verified by the unit tests. As for the argument that "it will require smart
>> people" to understand the core code -- that's a *GOOD* requirement. If you
>> don't understand how it works, conceptually, perhaps you should not be
>> attempting to implement it. Merely porting or auto-converting code that
>> "seems to be the same" and "passes the unit tests", without really
>> understanding the details, is not a safe way to ensure correct operation.
>> What if there was a subtle difference between the two syntaxes which led to
>> differing (i.e. incorrect) behaviour in some scenarios? What if the unit
>> tests didn't cover that scenario?
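
One well-known instance of such a subtle difference is substring extraction: Java's String.substring(begin, end) takes an exclusive end index, while .NET's String.Substring(startIndex, length) takes a length. The sketch below is in Java; dotNetStyleSubstring is a hypothetical helper that only emulates the .NET semantics for comparison, and is not part of any library.

```java
class SubstringPitfall {
    // Java semantics: the second argument is an exclusive END INDEX.
    static String javaSubstring(String s, int begin, int end) {
        return s.substring(begin, end);
    }

    // Hypothetical helper emulating .NET's String.Substring(startIndex, length),
    // where the second argument is a LENGTH.
    static String dotNetStyleSubstring(String s, int start, int length) {
        return s.substring(start, start + length);
    }

    public static void main(String[] args) {
        String s = "Lucene.Net";
        // Same argument list, different results:
        System.out.println(javaSubstring(s, 2, 6));        // prints cene
        System.out.println(dotNetStyleSubstring(s, 2, 6)); // prints cene.N
    }
}
```

A line-for-line conversion that leaves the arguments untouched compiles on both platforms, and a test that only calls it with a start index of 0 would not catch the difference, since end index and length coincide there.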
>> Regarding the help and support provided by the Lucene community, and the
>> books and examples that provide code samples... Changing to a more .Net
>> idiomatic codebase, even if that meant top level API changes, would not be a
>> substantial issue that would prevent a .Net developer from understanding
>> example code written in Java. If the API is *basically* the same, but uses
>> foo.Size instead of foo.getSize()/foo.setSize() or List<T> instead of
>> ArrayList... those differences are minor and will not cause significant
>> issues for grokking cross-language examples. People will still get it... and
>> .Net developers will be much happier.
>> So, the takeaway is:
>> - My team and I will help hack on Lucene.Net and get paid to do it
>> - Lucene.Net should not change project status
>> - Microsoft should implement IronJava
>> - Moving towards idiomatic .Net code is the direction the project should go
>> and is not that big of a deal
>> Also, as a side note: we're hiring in the Portland, Oregon area, and could
>> use developers who know Lucene.Net, and want to hack on it on the clock.
>> Send me your resume.
>> Thanks,
>> Troy Howard
>> Director of Software Development | discover-e Legal, LLC |
