incubator-blur-dev mailing list archives

From Patrick Hunt <ph...@apache.org>
Subject Re: [VOTE] Merge lucene-4.0.0 branch to master
Date Wed, 24 Oct 2012 18:24:04 GMT
Hi Gagan, I did find the cause, but not a good solution. Relying on
everyone to set their umask is going to be onerous. It would be great
if you could provide a proper solution - the one you suggested sounds
good.

Regards,

Patrick
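
For context on why the umask matters here: a directory created with a requested mode of 0777 ends up with that mode masked by the process umask. The sketch below is plain arithmetic, not Hadoop or Blur code, and the class and method names are made up for illustration. A umask of 002 yields 775, the value Gagan observed, while 022 yields 755, the default he reports for "dfs.datanode.data.dir.perm".

```java
class UmaskDemo {
    // Mode of a directory created with requested mode 0777 under the given umask.
    static int dirMode(int umask) {
        return 0777 & ~umask;
    }

    public static void main(String[] args) {
        System.out.println(Integer.toOctalString(dirMode(0002))); // 775
        System.out.println(Integer.toOctalString(dirMode(0022))); // 755
    }
}
```

This is why the test outcome depends on each developer's shell settings unless the expected permission is set explicitly in the test configuration.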

On Tue, Oct 23, 2012 at 11:53 PM, Gagan Juneja
<gagandeepjuneja@gmail.com> wrote:
> Oops! I missed Patrick's last post.
>
> On Wed, Oct 24, 2012 at 12:07 PM, Gagan Juneja <gagandeepjuneja@gmail.com> wrote:
>
>> I have reproduced this issue on an Ubuntu box. By default Ubuntu creates
>> directories with *775* permissions, while the Hadoop Configuration property
>> "dfs.datanode.data.dir.perm" defaults to *755*. Somewhere in the code the
>> permissions of the data directories are verified, and the check fails there.
>>
>> If we set this property to *775* in the Configuration object, all the test
>> cases pass and the build is successful.
>>
>> We can set this in the *startDfs* method of the *org.apache.blur.MiniCluster*
>> class. Please verify this; if it resolves the problem at your end, I can
>> provide a patch.
>>
>> Regards,
>> Gagan
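
The mismatch Gagan describes can be illustrated without Hadoop. The following is a minimal sketch that assumes the check is an exact comparison between a directory's actual permissions and the expected ones; the thread does not show Hadoop's actual code, and the class and helper names here are hypothetical. It needs a POSIX file system.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

class DataDirPermCheck {
    // Converts an octal string such as "755" into a POSIX permission set.
    static Set<PosixFilePermission> fromOctal(String octal) {
        int mode = Integer.parseInt(octal, 8);
        StringBuilder sb = new StringBuilder();
        String rwx = "rwx";
        for (int bit = 8; bit >= 0; bit--) {
            boolean set = ((mode >> bit) & 1) == 1;
            sb.append(set ? rwx.charAt((8 - bit) % 3) : '-');
        }
        return PosixFilePermissions.fromString(sb.toString());
    }

    // Hypothetical stand-in for the check: true only when the directory's
    // actual permissions equal the expected ones.
    static boolean matches(Path dir, String expectedOctal) throws IOException {
        return Files.getPosixFilePermissions(dir).equals(fromOctal(expectedOctal));
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("data1");
        // Force rwxrwxr-x so the result does not depend on the current umask.
        Files.setPosixFilePermissions(dir, fromOctal("775"));
        System.out.println("expected 755: " + matches(dir, "755")); // false
        System.out.println("expected 775: " + matches(dir, "775")); // true
        Files.delete(dir);
    }
}
```

In the thread's terms, setting "dfs.datanode.data.dir.perm" to 775 corresponds to changing the expected value so that it agrees with what the developer's environment actually creates.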
>>
>>
>>
>> On Wed, Oct 24, 2012 at 4:32 AM, Patrick Hunt <phunt@apache.org> wrote:
>>
>>> Pushed a small cleanup to move all test file output into respective
>>> target directories and use absolute paths for test file locations.
>>>
>>> I thought this might fix the BlurClusterTest however that's not the case:
>>>
>>> Starting DataNode 0 with dfs.data.dir:
>>>
>>> /home/phunt/dev/blur/src/blur-core/target/tmp/cluster/dfs/data/data1,/home/phunt/dev/blur/src/blur-core/target/tmp/cluster/dfs/data/data2
>>> ERROR 20121023_15:58:10:010_PDT [main] datanode.DataNode: All
>>> directories in dfs.data.dir are invalid.
>>> ERROR 20121023_15:58:10:010_PDT [main] datanode.DataNode: All
>>> directories in dfs.data.dir are invalid.
>>> ERROR 20121023_15:58:10:010_PDT [main] blur.MiniCluster: error opening
>>> file system
>>> java.lang.NullPointerException
>>>         at org.apache.hadoop.hdfs.MiniDFSCluster.startDataNodes(MiniDFSCluster.java:422)
>>>         at org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:280)
>>>         at org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:124)
>>>
>>> Patrick
>>>
>>> On Tue, Oct 23, 2012 at 2:43 PM, Patrick Hunt <phunt@apache.org> wrote:
>>> > I pushed a small cleanup to versioning in the poms.
>>> >
>>> > Patrick
>>> >
>>> > On Tue, Oct 23, 2012 at 2:38 PM, Patrick Hunt <phunt@apache.org> wrote:
>>> >> I'll work on fixing the tmp issue, that's something I can handle. ;-)
>>> >> Everything should be in target.
>>> >>
>>> >> Patrick
>>> >>
>>> >> On Tue, Oct 23, 2012 at 2:37 PM, Aaron McCurry <amccurry@gmail.com> wrote:
>>> >>> Hmm, I will take a look at that one next.
>>> >>>
>>> >>> Aaron
>>> >>>
>>> >>> On Tue, Oct 23, 2012 at 5:20 PM, Patrick Hunt <phunt@apache.org> wrote:
>>> >>>> Thanks Aaron. The other failing test, "BlurClusterTest", is somehow due
>>> >>>> to the directory used, "./tmp/cluster". If I change it to
>>> >>>> "file://tmp/cluster" the test passes. Any ideas? Seems somehow related
>>> >>>> to using relative paths?
>>> >>>>
>>> >>>> Patrick
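
One detail that may be relevant to the two path forms above: under generic URI syntax, the text after "file://" up to the next slash is the authority, not part of the path. The sketch below uses only java.net.URI and java.nio.file from the JDK; the class name is made up, and the thread does not confirm how Hadoop's own path handling treats these strings.

```java
import java.net.URI;
import java.nio.file.Paths;

class PathForms {
    // Shows which part of a location string is parsed as authority and which as path.
    static String describe(String location) {
        URI uri = URI.create(location);
        return "authority=" + uri.getAuthority() + " path=" + uri.getPath();
    }

    public static void main(String[] args) {
        System.out.println(describe("file://tmp/cluster"));  // authority=tmp path=/cluster
        System.out.println(describe("file:///tmp/cluster")); // authority=null path=/tmp/cluster
        // A relative path resolves against the current working directory.
        System.out.println(Paths.get("./tmp/cluster").toAbsolutePath().normalize());
    }
}
```

So "file://tmp/cluster" and "./tmp/cluster" may not name the same directory at all, which could explain a different test outcome independently of relative versus absolute paths.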
>>> >>>>
>>> >>>> On Tue, Oct 23, 2012 at 2:13 PM, Aaron McCurry <amccurry@gmail.com> wrote:
>>> >>>>> Found it, the test did not set up the indexing options correctly. I
>>> >>>>> have committed a fix for the test.
>>> >>>>>
>>> >>>>> Aaron
>>> >>>>>
>>> >>>>> On Tue, Oct 23, 2012 at 5:08 PM, Aaron McCurry <amccurry@gmail.com> wrote:
>>> >>>>>> After cleaning up the test, I have gotten the same NPE. Strange
>>> >>>>>> behavior, still working on why.
>>> >>>>>>
>>> >>>>>> Aaron
>>> >>>>>>
>>> >>>>>> On Tue, Oct 23, 2012 at 3:06 PM, Patrick Hunt <phunt@apache.org> wrote:
>>> >>>>>>> NP. here's the output. I'm on ubuntu 12.04. 1.6.0_26
>>> >>>>>>>
>>> >>>>>>> "mvn clean test" results in: (I also removed the tmp directories
>>> >>>>>>> manually, btw, we should move this to mvn target dir)
>>> >>>>>>>
>>> >>>>>>>
>>> >>>>>>> -------------------------------------------------------------------------------
>>> >>>>>>> Test set: org.apache.blur.utils.TermDocIterableTest
>>> >>>>>>> -------------------------------------------------------------------------------
>>> >>>>>>> Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.005 sec <<< FAILURE!
>>> >>>>>>> testTermDocIterable(org.apache.blur.utils.TermDocIterableTest)  Time elapsed: 0.005 sec  <<< ERROR!
>>> >>>>>>> java.lang.NullPointerException
>>> >>>>>>>         at org.apache.blur.utils.TermDocIterable.getNext(TermDocIterable.java:82)
>>> >>>>>>>         at org.apache.blur.utils.TermDocIterable.access$000(TermDocIterable.java:29)
>>> >>>>>>>         at org.apache.blur.utils.TermDocIterable$1.<init>(TermDocIterable.java:48)
>>> >>>>>>>         at org.apache.blur.utils.TermDocIterable.iterator(TermDocIterable.java:47)
>>> >>>>>>>         at org.apache.blur.utils.TermDocIterableTest.testTermDocIterable(TermDocIterableTest.java:65)
>>> >>>>>>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>> >>>>>>>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>> >>>>>>>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>> >>>>>>>         at java.lang.reflect.Method.invoke(Method.java:597)
>>> >>>>>>>         at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
>>> >>>>>>>         at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
>>> >>>>>>>         at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
>>> >>>>>>>         at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
>>> >>>>>>>         at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
>>> >>>>>>>         at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:76)
>>> >>>>>>>         at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
>>> >>>>>>>         at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
>>> >>>>>>>         at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)
>>> >>>>>>>         at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)
>>> >>>>>>>         at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42)
>>> >>>>>>>         at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184)
>>> >>>>>>>         at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
>>> >>>>>>>         at org.apache.maven.surefire.junit4.JUnit4TestSet.execute(JUnit4TestSet.java:53)
>>> >>>>>>>         at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:123)
>>> >>>>>>>         at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:104)
>>> >>>>>>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>> >>>>>>>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>> >>>>>>>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>> >>>>>>>         at java.lang.reflect.Method.invoke(Method.java:597)
>>> >>>>>>>         at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:164)
>>> >>>>>>>         at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:110)
>>> >>>>>>>         at org.apache.maven.surefire.booter.SurefireStarter.invokeProvider(SurefireStarter.java:175)
>>> >>>>>>>         at org.apache.maven.surefire.booter.SurefireStarter.runSuitesInProcessWhenForked(SurefireStarter.java:107)
>>> >>>>>>>         at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:68)
>>> >>>>>>>
>>> >>>>>>>
>>> >>>>>>> On Tue, Oct 23, 2012 at 12:02 PM, Aaron McCurry <amccurry@gmail.com> wrote:
>>> >>>>>>>> Sorry, just missed that message.  Hmm, I will look around and try to
>>> >>>>>>>> see if I can find something.  Thanks.
>>> >>>>>>>>
>>> >>>>>>>> Aaron
>>> >>>>>>>>
>>> >>>>>>>> On Tue, Oct 23, 2012 at 2:59 PM, Patrick Hunt <phunt@apache.org> wrote:
>>> >>>>>>>>> This is null in TermDocIterableTest:
>>> >>>>>>>>>
>>> >>>>>>>>>         DocsEnum termDocs = atomicReader.termDocsEnum(new Term("id", Integer.toString(id)));
>>> >>>>>>>>>
>>> >>>>>>>>> due to fields() being null in the termDocsEnum method.
>>> >>>>>>>>>
>>> >>>>>>>>> I don't see why yet though, given the segment file exists on the
>>> >>>>>>>>> filesystem, etc...
>>> >>>>>>>>>
>>> >>>>>>>>> Patrick
>>> >>>>>>>>>
>>> >>>>>>>>> On Tue, Oct 23, 2012 at 11:50 AM, Aaron McCurry <amccurry@gmail.com> wrote:
>>> >>>>>>>>>> Trying to reproduce on Ubuntu.
>>> >>>>>>>>>>
>>> >>>>>>>>>> On Tue, Oct 23, 2012 at 1:58 PM, Patrick Hunt <phunt@apache.org> wrote:
>>> >>>>>>>>>>> Hm, I just updated and I'm seeing two errors (which is 1 less issue
>>> >>>>>>>>>>> than before):
>>> >>>>>>>>>>>
>>> >>>>>>>>>>>   testTermDocIterable(org.apache.blur.utils.TermDocIterableTest)
>>> >>>>>>>>>>>   org.apache.blur.thrift.BlurClusterTest: java.lang.NullPointerException
>>> >>>>>>>>>>>
>>> >>>>>>>>>>> Let me look and see if I can at least determine what the underlying
>>> >>>>>>>>>>> problems are.
>>> >>>>>>>>>>>
>>> >>>>>>>>>>> Patrick
>>> >>>>>>>>>>>
>>> >>>>>>>>>>> On Tue, Oct 23, 2012 at 10:12 AM, Aaron McCurry <amccurry@gmail.com> wrote:
>>> >>>>>>>>>>>> I ran into some errors with the ZookeeperClusterStatusTest tests and have
>>> >>>>>>>>>>>> resolved the issues I found.  All unit tests pass on OSX; I have not
>>> >>>>>>>>>>>> had a chance to run them on Linux yet.  I also fixed the nasty NPE
>>> >>>>>>>>>>>> on the BlurClusterTest (it was affecting the functional
>>> >>>>>>>>>>>> tests as well).  I ran a few burn-in tests on a VM running a 2
>>> >>>>>>>>>>>> controller + 3 shard server Blur cluster.  The tests loaded
>>> >>>>>>>>>>>> data as fast as possible while running searches against that data as
>>> >>>>>>>>>>>> fast as possible.  The tests ran without issue (basically like they
>>> >>>>>>>>>>>> did before the upgrade to Lucene 4).  I feel like the code is in a
>>> >>>>>>>>>>>> good state at this point.  I'm going to merge this code to master and
>>> >>>>>>>>>>>> create another branch to begin modifying the RPC API.
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> Anyone have any objections?
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> Aaron
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> On Mon, Oct 22, 2012 at 8:29 PM, Patrick Hunt <phunt@apache.org> wrote:
>>> >>>>>>>>>>>>> On Mon, Oct 22, 2012 at 5:23 PM, Aaron McCurry <amccurry@gmail.com> wrote:
>>> >>>>>>>>>>>>>> Hmm.
>>> >>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>> On Mon, Oct 22, 2012 at 8:17 PM, Patrick Hunt <phunt@apache.org> wrote:
>>> >>>>>>>>>>>>>>> Sounds good to me.
>>> >>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>> Not sure if anyone else is seeing this but the unit tests are not
>>> >>>>>>>>>>>>>>> passing for me on ubuntu. I see one failure and two errors.
>>> >>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>> Failed tests:
>>> >>>>>>>>>>>>>>>   testSafeModeSetInFuture(org.apache.blur.manager.clusterstatus.ZookeeperClusterStatusTest)
>>> >>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>> Haven't seen this.
>>> >>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>> Tests in error:
>>> >>>>>>>>>>>>>>>   testTermDocIterable(org.apache.blur.utils.TermDocIterableTest)
>>> >>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>> This either.
>>> >>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>   org.apache.blur.thrift.BlurClusterTest: java.lang.NullPointerException
>>> >>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>> I think I have been seeing this one during some functional tests.
>>> >>>>>>>>>>>>>> Haven't figured out the cause yet, but it seems like it's a nasty
>>> >>>>>>>>>>>>>> threading problem, because when I drop the mutate threads back to 1
>>> >>>>>>>>>>>>>> everything works fine.  However the test was passing on OSX.
>>> >>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>> Just me or is this expected?
>>> >>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>> Not expected.  I'm going to set up a VM on my computer to run tests in
>>> >>>>>>>>>>>>>> Linux as well.
>>> >>>>>>>>>>>>>
>>> >>>>>>>>>>>>> Ok. Let me know how it goes and I can try and debug it a bit, although
>>> >>>>>>>>>>>>> you're running much faster than I can at this point. ;-) Definitely
>>> >>>>>>>>>>>>> let me know if you can't reproduce it and I'll dig into it for sure.
>>> >>>>>>>>>>>>>
>>> >>>>>>>>>>>>> Patrick
>>> >>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>> Patrick
>>> >>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>> On Sun, Oct 21, 2012 at 10:38 AM, Aaron McCurry <amccurry@gmail.com> wrote:
>>> >>>>>>>>>>>>>>>> We can fix the jira issues.
>>> >>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>> On Sun, Oct 21, 2012 at 1:36 PM, Garrett Barton
>>> >>>>>>>>>>>>>>>> <garrett.barton@gmail.com> wrote:
>>> >>>>>>>>>>>>>>>>> Sounds good to me Aaron, call it 0.2. Does that mess up Jira if you have
>>> >>>>>>>>>>>>>>>>> things scheduled against releases?
>>> >>>>>>>>>>>>>>>>> On Oct 21, 2012 9:44 AM, "Aaron McCurry" <amccurry@gmail.com> wrote:
>>> >>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>> Ok, I think it will be some time before all the changes for the new
>>> >>>>>>>>>>>>>>>>>> api are in place and fully functional.  So perhaps we should merge the
>>> >>>>>>>>>>>>>>>>>> lucene-4.0.0 branch into master and fix whatever bugs are found.  I
>>> >>>>>>>>>>>>>>>>>> did some system testing yesterday and only found one big issue.  There
>>> >>>>>>>>>>>>>>>>>> seems to be a threading problem with the BlurAnalyzer.  If a single
>>> >>>>>>>>>>>>>>>>>> instance is in use across multiple threads some weird behaviors
>>> >>>>>>>>>>>>>>>>>> happen.  Otherwise everything else seems to work normally (I will
>>> >>>>>>>>>>>>>>>>>> create a jira issue).
>>> >>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>> If we do merge the lucene-4.0.0 branch, I feel like we should change
>>> >>>>>>>>>>>>>>>>>> the version to 0.2.  The reason is, the indexes in 0.1.x are not going
>>> >>>>>>>>>>>>>>>>>> to be backwards compatible (at least not without some work).  Does
>>> >>>>>>>>>>>>>>>>>> anyone have any strong feelings on this?
>>> >>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>> Aaron
>>> >>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>> On Sat, Oct 20, 2012 at 10:10 PM, Gagan Juneja
>>> >>>>>>>>>>>>>>>>>> <gagandeepjuneja@gmail.com> wrote:
>>> >>>>>>>>>>>>>>>>>> > I agree with Garrett. We can merge this branch to the place from where we
>>> >>>>>>>>>>>>>>>>>> > cut it. Again, as Garrett said, if we want to keep only the new api thing then we
>>> >>>>>>>>>>>>>>>>>> > can merge it to master as well.
>>> >>>>>>>>>>>>>>>>>> >
>>> >>>>>>>>>>>>>>>>>> > Regards,
>>> >>>>>>>>>>>>>>>>>> > Gagan
>>> >>>>>>>>>>>>>>>>>> >
>>> >>>>>>>>>>>>>>>>>> > On Sat, Oct 20, 2012 at 9:50 PM, Garrett Barton <garrett.barton@gmail.com> wrote:
>>> >>>>>>>>>>>>>>>>>> >
>>> >>>>>>>>>>>>>>>>>> >> I guess it depends on if you're planning a 1.4 release with lucene 4. If yes
>>> >>>>>>>>>>>>>>>>>> >> then merge and work towards making everything functional. If not then leave
>>> >>>>>>>>>>>>>>>>>> >> the 1.3.x in master for bug fixing or whatnot and merge this branch into
>>> >>>>>>>>>>>>>>>>>> >> the new api one.
>>> >>>>>>>>>>>>>>>>>> >> On Oct 20, 2012 11:03 AM, "Aaron McCurry" <amccurry@gmail.com> wrote:
>>> >>>>>>>>>>>>>>>>>> >>
>>> >>>>>>>>>>>>>>>>>> >> > I think that we can merge the lucene-4.0.0 branch back into the
>>> >>>>>>>>>>>>>>>>>> >> > master, since tests and code are compiling.  I haven't done any
>>> >>>>>>>>>>>>>>>>>> >> > functional testing yet, but if much of the RPC and internals are going
>>> >>>>>>>>>>>>>>>>>> >> > to change I think that it may be a waste of time to test and fix
>>> >>>>>>>>>>>>>>>>>> >> > everything that we are about to change.  What do others think?
>>> >>>>>>>>>>>>>>>>>> >> >
>>> >>>>>>>>>>>>>>>>>> >> > Aaron
>>> >>>>>>>>>>>>>>>>>> >> >
>>> >>>>>>>>>>>>>>>>>> >>
>>> >>>>>>>>>>>>>>>>>>
>>>
>>
>>
