lucene-solr-user mailing list archives

From Pulkit Singhal <pulkitsing...@gmail.com>
Subject Re: How to skip fields when using DIH?
Date Tue, 20 Sep 2011 22:50:04 GMT
OMG, I'm so sorry, please ignore.

It's so simple, I just had to use:
row.remove( 'salesRankShortTerm' );
because the script runs at the end, after the entire entity has been
processed (I suppose), rather than per field.
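
For anyone finding this later, here is a rough sketch of what the whole transformer function might look like with that change. The function name dropEmptySalesRank is made up for illustration; in the DIH config it would be referenced from the entity as transformer="script:dropEmptySalesRank". The makeRow helper is not part of DIH at all; it is only a stand-in for the Java Map that DIH passes in, so the function can be tried outside Solr.

```javascript
// Hypothetical function name; DIH would call it once per row via
// transformer="script:dropEmptySalesRank" on the entity.
function dropEmptySalesRank(row) {
    var salesRankShortTerm = row.get('salesRankShortTerm');
    // A missing or empty value cannot be parsed as a long, so drop
    // only this key and leave the rest of the row untouched.
    if (salesRankShortTerm == null || String(salesRankShortTerm) === '') {
        row.remove('salesRankShortTerm');
    }
    return row;
}

// Stand-in for the java.util.Map row (assumption, for testing outside Solr).
function makeRow(values) {
    return {
        get: function (key) {
            return Object.prototype.hasOwnProperty.call(values, key) ? values[key] : null;
        },
        remove: function (key) { delete values[key]; },
        containsKey: function (key) {
            return Object.prototype.hasOwnProperty.call(values, key);
        }
    };
}

// Mirrors the sample product from the original question.
var row = dropEmptySalesRank(makeRow({
    regularPrice: '349.99',
    salesRankShortTerm: ''
}));
```

Unlike setting $skipRow, this keeps the document and only leaves out the one empty field.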

Thanks!

On Tue, Sep 20, 2011 at 5:42 PM, Pulkit Singhal <pulkitsinghal@gmail.com> wrote:
> The data I'm running through the DIH looks like:
>
> <products>
>  <product>
>    <new>false</new>
>    <active>false</active>
>    <regularPrice>349.99</regularPrice>
>    <salesRankShortTerm/>
>  </product>
> </products>
>
> As you can see, in this particular instance of a product, there is no
> value for "salesRankShortTerm" which happens to be defined in my
> schema like so:
> <field name="salesRankShortTerm" type="slong"  indexed="true"  stored="true" />
>
> Having an empty value in the incoming DIH data leads to an exception:
> Caused by: java.lang.NumberFormatException: For input string: ""
>
> 1) How can I skip this field if it's empty?
>
> If I use script transformer like so:
>  <script>
>        <![CDATA[
>        function skipRow(row) {
>            var salesRankShortTerm = row.get( 'salesRankShortTerm' );
>            if ( salesRankShortTerm == null || salesRankShortTerm == '' ) {
>                row.put( '$skipRow', 'true' );
>            }
>            return row;
>        }
>        ]]>
>  </script>
> THEN, I will end up skipping the entire document :(
>
> 2) So please help me understand how I can configure it to only skip a
> field and not the document?
>
> Thanks,
> - Pulkit
>
