lucenenet-user mailing list archives

From Troy Howard <thowar...@gmail.com>
Subject Re: Lucene.NET Community Status
Date Thu, 04 Nov 2010 06:26:19 GMT
As a follow up, this is as close as I can find in the policy to something
stating that the code must be hosted at ASF (although that is implied
heavily throughout the policy)...

http://incubator.apache.org/guides/mentor.html#initial-ip-clearance

Specifically under "initial code dump", and other code import procedures.

<http://incubator.apache.org/guides/mentor.html#initial-import-code-dump>
Thanks,
Troy


On Wed, Nov 3, 2010 at 11:16 PM, Troy Howard <thoward37@gmail.com> wrote:

> Phil,
>
> I've been unsuccessful at finding the specific reference in the ASF policy
> that covers this, but in a nutshell, yes, the code must be hosted by ASF, as
> well as the websites, docs, etc... This will prevent anything other than a
> mirror or branch existing on CodePlex.com.
>
> If the project leaves ASF this is not a concern, of course, but it will need
> to change its name.
>
> There are currently four committers (taken from
> http://lucene.apache.org/lucene.net/ ):
>
> George Aroush george@aroush.net
> Işık YİĞİT (DIGY) digydigy@gmail.com
> Doug Sale dsale@myspace-inc.com
> Michael Garski mgarski@myspace-inc.com
>
>
> Based on commit activity in the SVN logs, DIGY is the most recent committer.
> mgarski's last commit was in 03/2010, and aroush's in 12/2009.
>
> Currently only George and DIGY are showing interest in the project in the
> mailing lists. I would say that they are the people who would fit the bill
> of "core active project leaders".
>
> Thanks,
> Troy
>
>
> On Wed, Nov 3, 2010 at 9:52 PM, Phil Haack <philha@microsoft.com> wrote:
>
>> I have a couple of naive questions, so forgive me. I see that Apache
>> projects use SVN http://www.apache.org/dev/version-control.html
>>
>> But is it required to host Apache projects in this svn? The reason I ask
>> is that a small change to hosting in a forge like CodePlex.com would provide
>> the project huge exposure to more .NET developers. You could do this, but
>> keep all other processes the same.
>>
>> Also, it makes keeping documentation really easy since they support
>> MetaWeblog API, and thus Windows Live Writer. It might seem like I'm trying
>> to hawk technology answers to organizational projects, but you'd be
>> surprised by how much reducing the friction of documentation makes people
>> more willing to write documentation. At least, for the projects I work on,
>> I'm more than willing to contribute documentation when I can just point a
>> blog client to it and publish.
>>
>> The other question I ask is who are the core active project leaders for
>> Lucene.NET? I'd really like to understand what they want. I and others have
>> many ideas, but it'd be helpful to understand what direction they want to
>> take things and what things are non-negotiable so we have a framework to
>> work with.
>>
>> Thanks,
>> Phil
>>
>>
>>
>>
>> ________________________________________
>> From: Troy Howard [thoward37@gmail.com]
>> Sent: Wednesday, November 03, 2010 8:47 PM
>> To: lucene-net-user@lucene.apache.org
>> Subject: Re: Lucene.NET Community Status
>>
>> All,
>>
>> I'm entering this conversation late as well. I'll apologize in advance, as I
>> know this will be lengthy.
>>
>> Briefly, I'll list my "credentials" and reasons for concern here:
>>
>>  - I've been using Lucene.Net for many years, since the early versions, and
>> have built significant products for my company using it. Those products are
>> a core source of our revenue, which is measured in the millions of dollars.
>> The success of my company's products is directly dependent on the success of
>> the Lucene.Net project.
>>
>>  - I run software development at my company and make the final decisions
>> about what we do and how we use our resources. The developers here work on
>> open source code on our clock. I would like to have them start doing this
>> for Lucene.Net. We have very smart and productive people who could be a huge
>> asset to this project. I hope that the opportunity to leverage my company's
>> team will not be bypassed by the people running this project.
>>
>>  - I have hacked extensively on the Lucene.Net internals to improve
>> performance in our product and have been manually maintaining our local
>> branch, merging in changes from the main project. I feel I have enough
>> knowledge of both the CS theory behind search engines and this codebase in
>> particular to not be intimidated by any aspect of the needs of this project.
>>
>>  - I started a similar kind of open source project, in that it is a .Net
>> implementation of an existing C++ open source project, and struggled with
>> the "syntactic port" vs "conceptual port" issue, so I have perspective to
>> provide on that discussion.
>>
>>
>> Relationship To ASF and Lucene
>> -----------------------------------------------
>>
>> I'd like to address one thing upfront: This should definitely remain an
>> Apache Software Foundation project. As Grant and George have stated clearly
>> and accurately, this is a huge benefit for this project in terms of its
>> credibility. This is not just because the name is well respected. It's
>> because of WHY the Apache name is so well respected: the processes and
>> values of the Foundation set excellent standards which encourage excellent
>> code. This is not just my opinion, but can be objectively proven by the
>> enormous success of the Apache projects. Complying with ASF's standards may
>> be difficult, but it's extremely valuable.
>>
>> I feel that Grant's recommendation of attempting to become a TLP at Apache
>> is the wrong direction. This should remain part of the Lucene project. It
>> does not differ from Lucene in any substantial way and thus doesn't warrant
>> being separate.
>>
>> Also, there was some mention of Lucene's file format and maintaining that
>> compatibility. This is essential. If this ever changes, Lucene.Net will be
>> useless. Being cross-platform and having a very stable on-disk format are
>> among its most compelling aspects.
>>
>>
>> Microsoft's Interest and Involvement
>> ---------------------------------------------------
>>
>> Another thing to mention: Phil Haack and Scott Hanselman, while both are
>> Microsoft employees, are more than just representatives of the company they
>> work for. They are both outstanding advocates of open source software and
>> have been instrumental in the change of attitude that Microsoft has shown in
>> recent years towards this community. The fact that they have shown interest
>> in this issue doesn't mean Microsoft is interested; it means that this is a
>> significant issue for the .Net open source community. The fact that they
>> work for Microsoft means that they may be able to leverage resources and
>> wield clout from that vantage point that can benefit our community greatly.
>>
>> Regarding the question "What can Microsoft do to help?"... I'll take a
>> somewhat radical stance here.
>>
>> We need Visual J# not to have been abandoned... We need IronJava, like
>> IronPython or IronRuby. We need a native, MS developed and supported, fully
>> optimized and performant compiler for plain old Java code that runs on the
>> .Net runtime and exposes Java libraries to other .Net languages like F#, C#,
>> VB, etc.
>>
>> There is a huge wealth of open source Java code out there, much of it in the
>> Apache project archives, which would all be "ported" at once. Currently our
>> community only gets access to Lucene.Net and iTextSharp and a few other
>> libraries where dedicated people like George put in hard hours of direct
>> syntax porting to implement these things in C#.
>>
>> We need more than that.
>>
>> I need Hadoop to run in .Net, along with HDFS, HBase, Solr, Nutch, Tika, and
>> everything else in that ecosystem.
>>
>> My company is actually at a critical point now, where we are considering
>> abandoning .Net/WCF as our service layer platform and switching to Java, so
>> that we can leverage those excellent Java projects. Our business needs
>> demand that we have what Hadoop does. It will be easier for me to migrate my
>> application code to Java than to attempt to find equivalent functionality in
>> the existing .Net world or write my own framework, or port Hadoop.
>>
>> So, if there was ONE thing that Microsoft could do to *significantly* help
>> the .Net developer community, it would be providing a *real* implementation
>> of IronJava, which would completely obviate the need to port code and simply
>> allow those libraries and applications to run in .Net natively.
>>
>> That said, assuming that Visual J# remains "retired" (see:
>> http://msdn.microsoft.com/en-us/vjsharp/default ), this project is one of
>> the few things we .Net developers have to work with.
>>
>>
>> Java or .Net Code Idioms
>> -------------------------------------
>>
>> I agree that moving to a codebase that is more .Net idiomatic will improve
>> both the experience of end users of Lucene.Net and the level of involvement
>> that we can get from the community. To put it simply, right now, hacking on
>> the Lucene.Net core code means you must understand Java idioms well, and how
>> to translate those to .Net. This is a skill set which is somewhat uncommon.
>>
>> The "direct port" methodology also leads to code that is not fully optimized
>> for .Net. I have changed our local branch in a number of significant ways,
>> and improved performance significantly by doing so. I didn't change APIs; I
>> just changed the implementations to be more appropriate for .Net, and
>> included generics.
>>
>> The test suite provided with Lucene/Lucene.Net is a great benefit in that
>> regard, and helped me ensure that my changes didn't break functionality.
>> That said, the project needs to improve in this regard. The classes
>> themselves need to be implemented in a more "testable" manner. Using
>> abstract base classes instead of interfaces makes the code less mockable and
>> thus less testable. It also makes it harder to plug customized components
>> into the system. There are a number of things that are sealed or internal
>> that do not need to be.
>>
>> Lucene (for Java) was awesome because it ran well as managed code and was
>> elegant and efficient in Java's environment. Any port of Lucene should
>> *retain those features* as well. The library should make sense and be
>> implemented in the most elegant and efficient way that it can be on the
>> platform it's implemented on. Lucene.Net should not be a port of Java Lucene
>> to .Net; it should be an *implementation* of Lucene running in .Net. Porting
>> implies line-for-line similarity. Implementing just implies that the
>> features are all represented.
>>
>> For that reason, I support moving to a more idiomatic .Net implementation,
>> verified by the unit tests. As for the argument that "it will require smart
>> people" to understand the core code -- that's a *GOOD* requirement. If you
>> don't understand how it works, conceptually, perhaps you should not be
>> attempting to implement it. Merely porting or auto-converting code that
>> "seems to be the same" and "passes the unit tests", without really
>> understanding the details, is not a safe way to ensure correct operation.
>> What if there was a subtle difference between the two syntaxes which led to
>> differing (i.e. incorrect) behaviour in some scenarios? What if the unit
>> tests didn't cover that scenario?
>>
>> Regarding the help and support provided by the Lucene community, and the
>> books and examples that provide code samples: changing to a more .Net
>> idiomatic codebase, even if that meant top level API changes, would not be a
>> substantial issue that would prevent a .Net developer from understanding
>> example code written in Java. If the API is *basically* the same, but uses
>> foo.Size instead of foo.getSize()/foo.setSize() or List<T> instead of
>> ArrayList... those differences are minor and will not cause significant
>> issues for grokking cross-language examples. People will still get it... and
>> .Net developers will be much happier.
>>
>>
>> So, the takeaway is:
>> - My team and I will help hack on Lucene.Net and get paid to do it
>> - Lucene.Net should not change project status
>> - Microsoft should implement IronJava
>> - Moving towards idiomatic .Net code is the direction the project should go
>> and is not that big of a deal
>>
>>
>> Also, as a side-note. We're hiring in the Portland, Oregon area, and could
>> use developers who know Lucene.Net, and want to hack on it on the clock.
>> Send me your resume.
>>
>>
>> Thanks,
>>
>> Troy Howard
>> Director of Software Development | discover-e Legal, LLC |
>> thoward37@gmail.com
>>
>
>
