lucene-solr-user mailing list archives

From Dmitry Kan <solrexp...@gmail.com>
Subject Re: MapReduceIndexerTool does not respect Lucene version in solrconfig Was: converting 4.7 index to 4.3.1
Date Fri, 11 Apr 2014 06:42:15 GMT
Thanks! So Solr 4.7 does not seem to respect luceneMatchVersion at the
binary (index) level. Or perhaps I misunderstand the meaning of
luceneMatchVersion.

This is what I see when loading the index from HDFS via Luke and launching the
Index Checker tool:

[clip]
Segments file=segments_2 numSegments=1 version=4.7 format= userData={commitTimeMSec=1397157712399}
  1 of 1: name=_0 docCount=82
    codec=Lucene46
    compound=false
    numFiles=10
    size (MB)=0.027
    diagnostics = {timestamp=1397157712512, os=Linux, os.version=3.2.0-61-generic, source=flush, lucene.version=4.7.0 1570806 - simon - 2014-02-22 08:25:23, os.arch=amd64, java.version=1.7.0_51, java.vendor=Oracle Corporation}
    no deletions
    test: open reader.........OK
    test: fields..............OK [11 fields]
    test: field norms.........OK [0 fields]
    test: terms, freq, prox...OK [1161 terms; 2949 terms/docs pairs; 2768 tokens]
    test: stored fields.......OK [902 total field count; avg 11 fields per doc]
    test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc]
    test: docvalues...........OK [1 docvalues fields; 0 BINARY; 0 NUMERIC; 1 SORTED; 0 SORTED_SET]

No problems were detected with this index.
[/clip]

I wonder whether it is possible to define the codec version in the Solr
config or schema.
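
To make the "read the version from solrconfig" idea from earlier in the thread
concrete, here is a minimal sketch that pulls luceneMatchVersion out of a
solrconfig.xml using only the JDK XML parser. The class name
MatchVersionReader and both helper methods are hypothetical and are not part of
Solr or the map-reduce contrib; the normalization to a LUCENE_xy name is my own
assumption about how the value might be mapped to a Version constant. As far
as I understand, luceneMatchVersion mainly governs analysis compatibility, so
this alone would probably not change the codec that gets written.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Solr: reads <luceneMatchVersion> from a config stream.
class MatchVersionReader {

    /** Returns the trimmed text of the first luceneMatchVersion element, or null if absent. */
    public static String readLuceneMatchVersion(InputStream solrconfig) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(solrconfig);
        NodeList nodes = doc.getElementsByTagName("luceneMatchVersion");
        if (nodes.getLength() == 0) {
            return null;
        }
        return nodes.item(0).getTextContent().trim();
    }

    /** Normalizes "4.3", "4.3.1" or "LUCENE_43" to the form "LUCENE_43" (assumed mapping). */
    public static String toVersionConstantName(String raw) {
        String v = raw.trim().toUpperCase(Locale.ROOT);
        if (v.startsWith("LUCENE_")) {
            return v;
        }
        String[] parts = v.split("\\.");
        if (parts.length < 2 || parts[0].isEmpty() || parts[1].isEmpty()) {
            throw new IllegalArgumentException("Unrecognized version: " + raw);
        }
        // Keep only major and minor, since the 4.x constants carry no bugfix digit.
        return "LUCENE_" + parts[0] + parts[1];
    }

    public static void main(String[] args) throws Exception {
        // Inline stand-in for a real solrconfig.xml.
        String xml = "<config><luceneMatchVersion>4.3</luceneMatchVersion></config>";
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        String raw = readLuceneMatchVersion(in);
        System.out.println(raw + " -> " + toVersionConstantName(raw));
    }
}
```

In the contrib code, the resulting name would still have to be turned into a
real Version value and passed to IndexWriterConfig in place of the hard-coded
constant; that wiring is not shown here.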

Dmitry


On Thu, Apr 10, 2014 at 11:58 PM, Wolfgang Hoschek <whoschek@cloudera.com> wrote:

> There's no such other location in there. BTW, you can disable the mtree
> merge via --reducers=-2 (or --reducers=0 in old versions).
>
> Wolfgang.
>
> On Apr 10, 2014, at 3:44 PM, Dmitry Kan <solrexpert@gmail.com> wrote:
>
> > A correction: when I tested the above change, I actually had so little data
> > that it didn't trigger sub-shard slicing, and thus no merging of the slices.
> > Still, it looks as if somewhere in the map-reduce contrib code there is a
> > "link" to which Lucene version to use.
> >
> > Wolfgang, do you happen to know where that other Version.* is specified?
> >
> >
> > On Thu, Apr 10, 2014 at 12:59 PM, Dmitry Kan <solrexpert@gmail.com> wrote:
> >
> >> Thanks for responding, Wolfgang.
> >>
> >> Changing to LUCENE_43:
> >>
> >> IndexWriterConfig writerConfig = new IndexWriterConfig(Version.LUCENE_43, null);
> >>
> >> didn't affect the index format version because, I believe, if the index
> >> being merged is in a higher format version (4.1 in this case), the merge
> >> keeps that version rather than writing a lower one (4.0). But the format
> >> version could certainly be read from solrconfig, you are right.
> >>
> >> Dmitry
> >>
> >>
> >> On Wed, Apr 9, 2014 at 11:51 PM, Wolfgang Hoschek <whoschek@cloudera.com> wrote:
> >>
> >>> There is a current limitation in that the code doesn't actually look into
> >>> solrconfig.xml for the version. We should fix this, indeed. See
> >>>
> >>> https://github.com/apache/lucene-solr/blob/trunk/solr/contrib/map-reduce/src/java/org/apache/solr/hadoop/TreeMergeOutputFormat.java#L100-101
> >>>
> >>> Wolfgang.
> >>>
> >>> On Apr 8, 2014, at 11:49 AM, Dmitry Kan <solrexpert@gmail.com> wrote:
> >>>
> >>>> Hello,
> >>>>
> >>>> When we instantiate the MapReduceIndexerTool with the collections' conf
> >>>> directory, we expect that the Lucene version is respected and the index
> >>>> gets generated in a format compatible with the defined version.
> >>>>
> >>>> This does not seem to happen, however.
> >>>>
> >>>> Checking with luke:
> >>>>
> >>>> the expected Lucene index format: Lucene 4.0
> >>>> the output Lucene index format: Lucene 4.1
> >>>>
> >>>> Can anybody shed some light on the semantics behind specifying the Lucene
> >>>> version in this context? Does this have something to do with which version
> >>>> of Solr core is used by the morphline library?
> >>>>
> >>>> Thanks,
> >>>>
> >>>> Dmitry
> >>>>
> >>>> ---------- Forwarded message ----------
> >>>>
> >>>> Dear list,
> >>>>
> >>>> We have been generating Solr indices with the solr-hadoop contrib module
> >>>> (SOLR-1301). The Solr we currently use is version 4.3.1. Is there any tool
> >>>> that could do the backward conversion, i.e. 4.7->4.3.1? Or is upgrading
> >>>> the only way to go?
> >>>>
> >>>> --
> >>>> Dmitry
> >>>> Blog: http://dmitrykan.blogspot.com
> >>>> Twitter: http://twitter.com/dmitrykan
> >>>>
> >>>>
> >>>>
> >>>> --
> >>>> Dmitry
> >>>> Blog: http://dmitrykan.blogspot.com
> >>>> Twitter: http://twitter.com/dmitrykan
> >>>
> >>>
> >>
> >>
> >> --
> >> Dmitry
> >> Blog: http://dmitrykan.blogspot.com
> >> Twitter: http://twitter.com/dmitrykan
> >>
> >
> >
> >
> > --
> > Dmitry
> > Blog: http://dmitrykan.blogspot.com
> > Twitter: http://twitter.com/dmitrykan
>
>


-- 
Dmitry
Blog: http://dmitrykan.blogspot.com
Twitter: http://twitter.com/dmitrykan
