From: "Tarala, Magesh"
To: "solr-user@lucene.apache.org"
Subject: Solr Encoding Issue?
Date: Wed, 8 Jul 2015 17:49:21 +0000

I’m ingesting a .TXT file with HTML content into Solr. The content contains the character highlighted below:

The file we get from CRM (also attached):

[Inline image: image001.png]

After ingesting into Solr, I see a different character. This is the query response from the Solr management console:

 

[Inline image: image003.png]

Anybody know how I can prevent this from happening?

 

Thanks!
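
For context, one common way a character can change between a source file and an index is a charset mismatch: the file is written in one encoding and read in another. The sketch below only illustrates that general effect. The sample characters, the two encodings, and the helper name `simulate_mismatch` are all assumptions made for illustration, since the actual character appears only in the screenshots and the encodings in use are not stated in this message.

```python
# Hypothetical illustration only: the real character and encodings from the
# message are unknown, so the sample text and charsets here are assumptions.

def simulate_mismatch(text: str, written_as: str, read_as: str) -> str:
    """Encode text with one charset, then decode the bytes with another."""
    return text.encode(written_as).decode(read_as, errors="replace")

# Sample text: 'e' with acute accent and a right single quotation mark.
original = "caf\u00e9 \u2019"

# Case 1: bytes written as UTF-8 but read as Windows-1252.
# Each multi-byte character turns into several unrelated characters.
garbled = simulate_mismatch(original, "utf-8", "cp1252")

# Case 2: bytes written as Windows-1252 but read as UTF-8.
# The invalid byte sequences become U+FFFD replacement characters.
replaced = simulate_mismatch(original, "cp1252", "utf-8")

# Reading with the same charset that was used for writing is lossless.
round_trip = simulate_mismatch(original, "utf-8", "utf-8")

print(repr(garbled))
print(repr(replaced))
print(round_trip == original)
```

If something like this is the cause, comparing the raw bytes of the CRM file with what is sent to Solr would show which of the two cases applies, but that is a guess rather than a diagnosis.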
