commons-user mailing list archives

From Simone Tripodi <simonetrip...@apache.org>
Subject Re: [digester] How to deal with flexible XML ?
Date Sun, 27 Feb 2011 08:55:59 GMT
Hi Patrick,
I had a quick look at your code and didn't see anything wrong; the
Digester should work whether the <geo> tag is empty or not.

When you have documents such as

<doc>
..
<geo></geo>
</doc>

the `collection/doc/geo/(latitude|longitude)` pattern will never
match, so the set(Latitude|Longitude) methods won't be invoked and the
corresponding properties are left unset.
I can suggest two options:

 * quick solution: when building the Lucene document, check if the
latitude/longitude is not null before setting it

    if (flickrDoc.getLatitude() != null) {
        document.add(new Field("latitude", flickrDoc.getLatitude(),
                Field.Store.YES, Field.Index.ANALYZED));
    }

 * a slightly more complex, but more efficient, solution that I wrote
for you and pasted at [1]: it parses the XML and indexes it into a
Lucene Document in one shot; the LuceneFieldRule is parameterized in
case you need to configure the Lucene Field depending on the matching
pattern.
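
To illustrate why a field is simply absent when <geo> is empty, here is
a minimal sketch that uses only the JDK SAX parser. It is not the
Digester API and not the code pasted at [1]: the class name
GeoPathSketch, the FIELD_PATHS set and the Map standing in for a Lucene
Document are all assumptions made for illustration. It tracks the
current element path and stores a value only when the path matches one
of the configured patterns, in a single pass.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class GeoPathSketch {

    // Hypothetical patterns: element paths whose text becomes a field.
    static final Set<String> FIELD_PATHS = Set.of(
            "collection/doc/geo/latitude",
            "collection/doc/geo/longitude");

    /** Parses the XML and returns one field map per <doc> element. */
    public static List<Map<String, String>> parse(String xml) throws Exception {
        List<Map<String, String>> docs = new ArrayList<>();

        DefaultHandler handler = new DefaultHandler() {
            private final Deque<String> path = new ArrayDeque<>();
            private final StringBuilder text = new StringBuilder();
            private Map<String, String> current; // stand-in for a Lucene Document

            private String currentPath() {
                return String.join("/", path); // root-first, e.g. collection/doc/geo
            }

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                path.addLast(qName);
                text.setLength(0);
                if (currentPath().equals("collection/doc")) {
                    current = new LinkedHashMap<>();
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                String p = currentPath();
                // Only matching paths produce a field; an empty <geo>
                // never reaches this branch, so nothing is stored.
                if (current != null && FIELD_PATHS.contains(p)) {
                    current.put(qName, text.toString().trim());
                }
                if (p.equals("collection/doc")) {
                    docs.add(current);
                    current = null;
                }
                path.removeLast();
                text.setLength(0);
            }
        };

        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return docs;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<collection>"
                + "<doc><geo></geo></doc>"
                + "<doc><geo><latitude>2432</latitude>"
                + "<longitude>2342</longitude></geo></doc>"
                + "</collection>";
        // prints [{}, {latitude=2432, longitude=2342}]
        System.out.println(parse(xml));
    }
}
```

The first <doc> ends up with no fields at all, which is the same
situation the null check in the quick solution guards against.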

HTH,
Simo

[1] http://pastie.org/1612471

http://people.apache.org/~simonetripodi/
http://www.99soft.org/



On Fri, Feb 25, 2011 at 9:21 PM, Patrick Diviacco
<patrick.diviacco@gmail.com> wrote:
> hi,
>
> I need to understand how to deal with changing XML fields such as these:
>
> <doc>
> ..
> <geo></geo>
> </doc>
>
> <doc>
> ..
> <geo>
>  <latitude>2432</latitude>
>  <longitude>2342</longitude>
> </geo>
> </doc>
>
> As you can see, the geo element can be empty or a parent element. I need to
> build an apposite parser to deal with it. This is my current code, but
> I get an error since latitude doesn't always work...
> http://codepad.org/jpKXmGZq
>


