lucene-solr-user mailing list archives

From Huiying Ma <mahuiying...@gmail.com>
Subject Re: posting html files
Date Mon, 03 Aug 2015 18:00:49 GMT
Thanks Erik,

I'm trying to index some HTML files that all share the same format, and I
need to index them by specific classes and tags. I tried
data_driven_schema_configs, but I only get the title and id, not the other
tags and classes I want. So now I'm trying to edit the schema in
basic_configs instead, but that is what produced the error. Do you have any
suggestions? I also used bin/post to post an XML file to that same core and
it worked, so I'm wondering why the HTML file doesn't. Thank you so much!
Since I don't know much about Solr, it's really good that someone can help!

Best,
Huiying

On Mon, Aug 3, 2015 at 1:54 PM, Erik Hatcher <erik.hatcher@gmail.com> wrote:

> My hunch is that the basic_configs is *too* basic for your needs here.
> basic_configs does not include /update/extract - it’s very basic - stripped
> of all the “extra” components.
>
> Try using the default, data_driven_schema_configs instead.
>
> If you’re still having issues, please provide full details of what you’ve
> tried.
>
> —
> Erik Hatcher, Senior Solutions Architect
> http://www.lucidworks.com
>
>
>
>
> > On Aug 3, 2015, at 1:43 PM, Huiying Ma <mahuiying713@gmail.com> wrote:
> >
> > Hi everyone,
> >
> > I created a core with the basic config set and schema. When I use bin/post
> > to post one HTML file, I get this error:
> >
> > SimplePostTool: WARNING: IOException while reading response:
> > java.io.FileNotFoundException......
> > HTTP ERROR 404
> >
> > When I go to localhost:8983/solr/core/update, I get:
> > <response>
> > <lst name="responseHeader">
> > <int name="status">400</int>
> > <int name="QTime">3</int>
> > </lst>
> > <lst name="error">
> > <str name="msg">missing content stream</str>
> > <int name="code">400</int>
> > </lst>
> > </response>
> >
> > I'm really new to Solr and wondering if anyone knows how to index HTML files
> > according to my own schema, and how to configure schema.xml or the
> > solrconfig file. Thank you so much!
> >
> > Thanks,
> > Huiying
>
>
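
For the goal described above (indexing HTML files by specific tags and classes), one possible approach is to pull those values out on the client side and send Solr a plain JSON document. This is not what was suggested in the thread, since it bypasses /update/extract entirely. The sketch below uses only the Python standard library. The tag/class selectors and the field names (headline_t, summary_t, author_s) are hypothetical; the fields would need to exist in the core's schema or match one of its dynamic field rules.

```python
import json
from html.parser import HTMLParser

# Hypothetical mapping from (tag, class) to Solr field names.
# Both the selectors and the field names are assumptions for illustration.
FIELD_MAP = {
    ("h1", "headline"): "headline_t",
    ("div", "summary"): "summary_t",
    ("span", "author"): "author_s",
}

# Elements that never have a closing tag, so they must not be pushed on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class TagClassExtractor(HTMLParser):
    """Collect text inside elements whose (tag, class) appears in FIELD_MAP."""

    def __init__(self):
        super().__init__()
        self.fields = {}   # field name -> list of text fragments
        self._stack = []   # open elements as (tag, field name or None)

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        field = next((FIELD_MAP[(tag, c)] for c in classes
                      if (tag, c) in FIELD_MAP), None)
        self._stack.append((tag, field))

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating unclosed children.
        for i in range(len(self._stack) - 1, -1, -1):
            if self._stack[i][0] == tag:
                del self._stack[i:]
                break

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        # Attribute the text to the innermost enclosing mapped element, if any.
        for _, field in reversed(self._stack):
            if field:
                self.fields.setdefault(field, []).append(text)
                break


def html_to_solr_doc(doc_id, html):
    """Return a dict shaped like a Solr JSON document for one HTML page."""
    parser = TagClassExtractor()
    parser.feed(html)
    parser.close()
    doc = {"id": doc_id}
    for field, fragments in parser.fields.items():
        doc[field] = " ".join(fragments)
    return doc


if __name__ == "__main__":
    sample = ('<html><body><h1 class="headline">Hello <b>World</b></h1>'
              '<div class="summary">Short<br>text</div><p>ignored</p>'
              '</body></html>')
    # Solr's JSON update format accepts a list of documents.
    print(json.dumps([html_to_solr_doc("page-1", sample)], indent=2))
```

The printed JSON array can be saved to a file and sent to the core's /update handler with Content-Type application/json. Because it is ordinary structured data, it goes through the same handler that already accepted the XML file, so no extraction handler is needed.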
