cocoon-users mailing list archives

From "Diego Mateos" <coc...@inidis.com>
Subject Index LUCENE problem with others analyzers.
Date Wed, 20 Jun 2007 06:52:41 GMT
Hi.
I have an XML file with web content to be indexed through
LuceneIndexTransformer. An example:

<?xml version="1.0" encoding="ISO-8859-1"?>
<lucene:index xmlns:lucene="http://apache.org/cocoon/lucene/1.0"
analyzer="org.apache.lucene.analysis.standard.StandardAnalyzer"
directory="../LUCENE" create="true" merge-factor="20">
<lucene:document url="/quienes-somos/index">
<title lucene:store="true">Quiénes Somos</title>
<content lucene:store="true">
  &iquest;Qui&eacute;nes Somos?
  El mundialmente conocido grupo editorial &#8220;Pepe Iglesias.net&#8221;, est&aacute; compuesto por, ... yo.
  &iquest; Para qu&eacute; vamos a enga&ntilde;arnos?
  Ustedes se preguntar&aacute;n: &#8220;&iquest;Y qui&eacute;n co&ntilde;o es usted?&#8221;, pues para quien le interese,
  a continuaci&oacute;n lo contar&eacute;, aunque les aseguro que es un rollo
  aburrid&iacute;simo, una estupidez que no merece la menor pena leer.
  Si me aceptan un consejo, vayan a los botones de Art&iacute;culos, Vinos, Recetas,
  Asturias gastron&oacute;mica, etc., all&iacute; si hay chicha.
</content>
</lucene:document>

<!-- more here -->

</lucene:index>
 
The standard analyzer builds the index correctly, but I have a problem: the
index does not ignore special characters (accents) when users run queries, so
"gastronomía" and "gastronomia" do not return the same results. I then tried
changing the analyzer attribute to
org.apache.lucene.analysis.standard.StrandardTokenizer, which seemed better
suited to what I'm looking for, but when I build the index I get the
following error:

java.lang.NullPointerException
	at org.apache.lucene.index.DocumentWriter.invertDocument(DocumentWriter.java:141)
	at org.apache.lucene.index.DocumentWriter.addDocument(DocumentWriter.java:81)
	at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:307)
	at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:294)
	at org.apache.cocoon.transformation.LuceneIndexTransformer.reindexDocument(LuceneIndexTransformer.java:429)
	at org.apache.cocoon.transformation.LuceneIndexTransformer.endElement(LuceneIndexTransformer.java:323)
	at org.apache.xerces.parsers.AbstractSAXParser.endElement(Unknown Source)
	at org.apache.xerces.impl.dtd.XMLNSDTDValidator.endNamespaceScope(Unknown Source)
	at org.apache.xerces.impl.dtd.XMLDTDValidator.handleEndElement(Unknown Source)
	at org.apache.xerces.impl.dtd.XMLDTDValidator.endElement(Unknown Source)
	at org.apache.xerces.impl.XMLNSDocumentScannerImpl.scanEndElement(Unknown Source)
	at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
	at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
	at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
	at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
	at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
	at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
	at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
	at org.apache.excalibur.xml.impl.JaxpParser.parse(JaxpParser.java:315)
	at org.apache.excalibur.xml.impl.JaxpParser.parse(JaxpParser.java:334)
	at org.apache.cocoon.components.source.SourceUtil.parse(SourceUtil.java:325)
	at org.apache.cocoon.generation.FileGenerator.generate(FileGenerator.java:115)
	at org.apache.cocoon.components.pipeline.impl.AbstractCachingProcessingPipeline.processXMLPipeline(AbstractCachingProcessingPipeline.java:369)
	at org.apache.cocoon.components.pipeline.AbstractProcessingPipeline.process(AbstractProcessingPipeline.java:480)
	at org.apache.cocoon.components.treeprocessor.sitemap.SerializeNode.invoke(SerializeNode.java:120)
	at org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:46)
	at org.apache.cocoon.components.treeprocessor.sitemap.PreparableMatchNode.invoke(PreparableMatchNode.java:130)
	at org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:46)
	at org.apache.cocoon.components.treeprocessor.sitemap.ActTypeNode.invoke(ActTypeNode.java:138)
	at org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:68)
	at org.apache.cocoon.components.treeprocessor.sitemap.PipelineNode.invoke(PipelineNode.java:142)
	at org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:68)
	at org.apache.cocoon.components.treeprocessor.sitemap.PipelinesNode.invoke(PipelinesNode.java:92)
	at org.apache.cocoon.components.treeprocessor.ConcreteTreeProcessor.process(ConcreteTreeProcessor.java:234)
	at org.apache.cocoon.components.treeprocessor.ConcreteTreeProcessor.process(ConcreteTreeProcessor.java:176)
	at org.apache.cocoon.components.treeprocessor.TreeProcessor.process(TreeProcessor.java:252)
	at org.apache.cocoon.components.treeprocessor.sitemap.MountNode.invoke(MountNode.java:117)
	at org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:46)
	at org.apache.cocoon.components.treeprocessor.sitemap.PreparableMatchNode.invoke(PreparableMatchNode.java:130)
	at org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:68)
	at org.apache.cocoon.components.treeprocessor.sitemap.PipelineNode.invoke(PipelineNode.java:142)
	at org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:68)
	at org.apache.cocoon.components.treeprocessor.sitemap.PipelinesNode.invoke(PipelinesNode.java:92)
	at org.apache.cocoon.components.treeprocessor.ConcreteTreeProcessor.process(ConcreteTreeProcessor.java:234)
	at org.apache.cocoon.components.treeprocessor.ConcreteTreeProcessor.process(ConcreteTreeProcessor.java:176)
	at org.apache.cocoon.components.treeprocessor.TreeProcessor.process(TreeProcessor.java:252)
	at org.apache.cocoon.Cocoon.process(Cocoon.java:686)
	at org.apache.cocoon.servlet.CocoonServlet.service(CocoonServlet.java:1153)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:252)
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:173)
	at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
	at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:178)
	at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:126)
	at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:105)
	at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:107)
	at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:148)
	at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:869)
	at org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:664)
	at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:527)
	at org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:80)
	at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:684)
	at java.lang.Thread.run(Thread.java:595)
	
I have also tried, by analogy with Spanish, the
org.apache.lucene.analysis.br.BrazilianAnalyzer analyzer, but I get the same
error.
What am I doing wrong?
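
One thing that may be worth checking, offered as a guess rather than a diagnosis: whether the class named in the analyzer attribute can actually be used as an analyzer. In Lucene, StandardTokenizer extends Tokenizer, not Analyzer, and BrazilianAnalyzer lives in a separate contrib jar that may or may not be on the classpath. The sketch below assumes (this is not confirmed here) that the transformer creates the analyzer by reflection with a no-argument constructor. The class AnalyzerClassCheck and its check method are hypothetical names, and the demo uses JDK types so it runs without Lucene; with Lucene available, the second argument would be org.apache.lucene.analysis.Analyzer.class.

```java
import java.lang.reflect.Modifier;
import java.util.List;

public class AnalyzerClassCheck {

    // Hypothetical helper: reports whether className could be instantiated
    // by reflection and used where requiredType is expected.
    public static String check(String className, Class<?> requiredType) {
        Class<?> candidate;
        try {
            // Load without running static initializers.
            candidate = Class.forName(className, false,
                    AnalyzerClassCheck.class.getClassLoader());
        } catch (ClassNotFoundException e) {
            return "not found on classpath";
        }
        if (!requiredType.isAssignableFrom(candidate)) {
            return "not a subtype of " + requiredType.getName();
        }
        if (Modifier.isAbstract(candidate.getModifiers())) {
            return "abstract";
        }
        try {
            candidate.getConstructor(); // public no-argument constructor
        } catch (NoSuchMethodException e) {
            return "no public no-argument constructor";
        }
        return "ok";
    }

    public static void main(String[] args) {
        // JDK stand-ins for the analyzer type and candidate class names.
        System.out.println(check("java.util.ArrayList", List.class));
        System.out.println(check("java.lang.String", List.class));
        System.out.println(check("no.such.Clazz", List.class));
    }
}
```

Any result other than "ok" for the configured class name would be consistent with the analyzer never being created, although the stack trace alone does not prove that this is the cause.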

I use Cocoon 2.1.9 built with Java 1.5.0_10.
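
To make the goal concrete, here is a minimal sketch of the accent folding I am after, written with the JDK only. It assumes java.text.Normalizer, which needs Java 6 or later (newer than the 1.5.0_10 mentioned above), and the class and method names are made up for illustration. It is not how Lucene does it: inside Lucene the same effect would normally come from an analyzer whose token stream includes an accent-removing filter such as ISOLatin1AccentFilter, and the same folding has to be applied both when indexing and when parsing queries.

```java
import java.text.Normalizer;
import java.util.Locale;

public class AccentFolding {

    // Decompose accented characters into base letter + combining mark (NFD),
    // drop the combining marks, then lowercase.
    public static String fold(String text) {
        String decomposed = Normalizer.normalize(text, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "").toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // Unicode escapes keep the source independent of the file encoding.
        String accented = "gastronom\u00eda"; // gastronomía
        String plain = "gastronomia";
        System.out.println(fold(accented).equals(fold(plain)));
        System.out.println(fold("Qui\u00e9nes Somos"));
    }
}
```

Note that this folding also turns "ñ" into "n", which may or may not be wanted for Spanish text.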

Thanks for any help.
Diego Mateos

