From: Noble Paul നോബിള്‍ नोब्ळ्
To: solr-user@lucene.apache.org
Subject: Re: Regex Transformer Error
Date: Thu, 6 Nov 2008 08:59:50 +0530
Message-ID: <5e76b0ad0811051929w3a7bb62bp452c88a728106b94@mail.gmail.com>

Did you try w/o escaping the '<' characters?

On Wed, Nov 5, 2008 at 11:48 PM, Ahmed Hammad wrote:
> Hi,
>
> I am using the Solr 1.3 data import handler. One of my table fields contains
> HTML tags, and I want to strip them from the field text, so obviously I need
> the RegexTransformer.
>
> I added the transformer="RegexTransformer" attribute to my entity and a new
> field with:
>
> replaceWith="XXXXX"/>
>
> Everything works fine; the text is replaced without any problem. The problem
> happens with my regular expression to strip HTML tags. I use
> regex="<(.|\n)*?>". Of course, the characters '<' and '>' are not allowed in
> XML. I tried the following:
> regex="<(.|\n)*?>" and regex="C;(.|\n)*?E;", but I get the
> following error:
>
> The value of attribute "regex" associated with an element type "field" must
> not contain the '<' character. at
> com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(Unknown Source)
> ...
>
> The full stack trace follows:
>
> *FATAL: Could not create importer.
> DataImporter config invalid
> org.apache.solr.common.SolrException: FATAL: Could not create importer. DataImporter config invalid
>     at org.apache.solr.handler.dataimport.DataImportHandler.inform(DataImportHandler.java:114)
>     at org.apache.solr.handler.dataimport.DataImportHandler.handleRequestBody(DataImportHandler.java:206)
>     at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
>     at org.apache.solr.core.SolrCore.execute(SolrCore.java:1204)
>     at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:303)
>     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:232)
>     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
>     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
>     at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
>     at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
>     at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
>     at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
>     at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
>     at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286)
>     at org.apache.coyote.http11.Http11AprProcessor.process(Http11AprProcessor.java:857)
>     at org.apache.coyote.http11.Http11AprProtocol$Http11ConnectionHandler.process(Http11AprProtocol.java:565)
>     at org.apache.tomcat.util.net.AprEndpoint$Worker.run(AprEndpoint.java:1509)
>     at java.lang.Thread.run(Unknown Source)
> Caused by: org.apache.solr.handler.dataimport.DataImportHandlerException: Exception occurred while initializing context Processing Document #
>     at org.apache.solr.handler.dataimport.DataImporter.loadDataConfig(DataImporter.java:176)
>     at org.apache.solr.handler.dataimport.DataImporter.<init>(DataImporter.java:93)
>     at org.apache.solr.handler.dataimport.DataImportHandler.inform(DataImportHandler.java:106)
>     ... 17 more
> Caused by: org.xml.sax.SAXParseException: The value of attribute "regex" associated with an element type "field" must not contain the '<' character.
>     at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(Unknown Source)
>     at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(Unknown Source)
>     at org.apache.solr.handler.dataimport.DataImporter.loadDataConfig(DataImporter.java:166)
>     ... 19 more *
>
> *description* *The server encountered an internal error (FATAL: Could not
> create importer. DataImporter config invalid ...) that prevented it from
> fulfilling this request.*
>
> I appreciate your help.
>
> Regards,
> ahmd
>

--
--Noble Paul
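
For readers of the archive, here is a minimal sketch of the XML rule behind the SAXParseException quoted above. It is written in Python rather than Java purely for brevity, and it is not the DataImportHandler code: the column name body_text and the helper names read_regex, is_well_formed and strip_tags are invented for illustration. It assumes only that the application hands the parsed attribute value to a regex engine. A literal '<' inside an attribute value makes the document not well-formed, whereas &lt; and &gt; are accepted and are turned back into '<' and '>' by the parser before the application sees the value.

```python
import re
from xml.dom import minidom
from xml.parsers.expat import ExpatError

# Hypothetical field definitions; "body_text" is a made-up column name.
# Raw strings keep \n as backslash + n, exactly as it would sit in the config file.
ESCAPED = r'<field column="body_text" regex="&lt;(.|\n)*?&gt;" replaceWith=""/>'
RAW = r'<field column="body_text" regex="<(.|\n)*?>" replaceWith=""/>'


def read_regex(xml_text):
    """Parse one element and return its regex attribute as the application sees it."""
    element = minidom.parseString(xml_text).documentElement
    return element.getAttribute("regex")


def is_well_formed(xml_text):
    """Return True if the XML parser accepts the text, False otherwise."""
    try:
        minidom.parseString(xml_text)
        return True
    except ExpatError:
        return False


def strip_tags(pattern, text):
    """Remove every match of the pattern, as a replaceWith="" rule would."""
    return re.sub(pattern, "", text)


pattern = read_regex(ESCAPED)
print(pattern)                                          # <(.|\n)*?>
print(is_well_formed(RAW))                              # False
print(strip_tags(pattern, "<p>Hello <b>Solr</b></p>"))  # Hello Solr
```

The entity-escaped form is the one an XML parser will load; the regex engine then receives the plain '<' and '>' characters.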