lucene-solr-user mailing list archives

From Dan Davis <dansm...@gmail.com>
Subject Re: Odp.: solr issue with pdf forms
Date Wed, 22 Apr 2015 15:39:18 GMT
+1 - I like Erick's answer.  Let me know if that turns out to be the
problem - I'm interested in this problem and would be happy to help.

On Wed, Apr 22, 2015 at 11:11 AM, Erick Erickson <erickerickson@gmail.com>
wrote:

> Are they not _indexed_ correctly or not being displayed correctly?
> Take a look at admin UI >> schema browser >> your field and press the
> "load terms" button. That'll show you what is _in_ the index as
> opposed to what the raw data looked like.
>
> When you return the field in a Solr search, you get a verbatim,
> un-analyzed copy of your original input. My guess is that your browser
> isn't using a compatible character encoding for display.
>
> Best,
> Erick
>
> On Wed, Apr 22, 2015 at 7:08 AM,  <Steve.Scholl@t-systems.com> wrote:
> > Thanks for your answer. Maybe my English is not good enough; what are
> > you trying to say? Sorry, I didn't get the point.
> > :-(
> >
> >
> > -----Original Message-----
> > From: LAFK [mailto:tomasz.borek@gmail.com]
> > Sent: Wednesday, 22 April 2015 14:01
> > To: solr-user@lucene.apache.org; solr-user@lucene.apache.org
> > Subject: Odp.: solr issue with pdf forms
> >
> > Off the top of my head, I'd look into how writable PDFs are created and encoded.
> >
> > @LAFK_PL
> >   Original message
> > From: Steve.Scholl@t-systems.com
> > Sent: Wednesday, 22 April 2015 12:41
> > To: solr-user@lucene.apache.org
> > Reply-To: solr-user@lucene.apache.org
> > Subject: solr issue with pdf forms
> >
> > Hi guys,
> >
> > Hopefully you can help me with my issue. We are using a Solr setup and
> > have the following problem:
> > - ordinary PDF files are indexed just fine
> > - PDF files with writable form fields look like this:
> >
> > Ich�bestätige�mit�meiner�Unterschrift,�dass�alle�Angaben�korrekt�und�vollständig�sind
> >
> > Somehow the space character is not indexed correctly.
> >
> > Is this a known issue? Does anybody have an idea?
> >
> > Thanks a lot
> > Best
> > Steve
>
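
The symptom described in the thread (words joined by a character that renders as a replacement glyph) can also be examined outside Solr. The following is a minimal Python sketch, assuming the separator in the extracted text is a non-ASCII space such as U+00A0 NO-BREAK SPACE. The thread does not confirm which character is actually involved, and the helper names `describe_separators` and `normalize_spaces` are hypothetical.

```python
import unicodedata


def describe_separators(text):
    """List each distinct character that is neither alphanumeric nor an
    ASCII space, with its code point and Unicode name."""
    seen = []
    for ch in text:
        if ch != " " and not ch.isalnum() and ch not in seen:
            seen.append(ch)
    return [(ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN")) for ch in seen]


def normalize_spaces(text):
    """Replace every Unicode space separator (category Zs) with an ASCII space."""
    return "".join(" " if unicodedata.category(ch) == "Zs" else ch for ch in text)


# Assumption: a no-break space stands in for whatever the PDF form produced.
sample = "Ich\u00a0bestätige\u00a0mit\u00a0meiner\u00a0Unterschrift,"

print(describe_separators(sample))
print(normalize_spaces(sample))
```

If the reported code point turns out to be a space separator, normalizing it before the text reaches the index (or handling it in the field's analysis chain) may be enough. If it is something else, the names printed by `describe_separators` should at least say what to look for.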
