lucene-dev mailing list archives

From "James Dyer (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-4047) dataimporter.functions.encodeUrl throughs Unable to encode expression: field.name with value: null
Date Tue, 27 Nov 2012 17:01:59 GMT

    [ https://issues.apache.org/jira/browse/SOLR-4047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13504762#comment-13504762
] 

James Dyer commented on SOLR-4047:
----------------------------------

Igor, I just committed a fix for SOLR-2141 & SOLR-3842 that includes a test exercising
this scenario. However, the test passes, and I'm not sure anything is actually broken, at
least not on the latest revision of Trunk or Branch_4x. Note that this test does not use
Tika. However, the code for resolving the Tika URL is similar to the code for the other
entity processors, so it should work the same.

See TestVariableResolverEndToEnd, which generates a data-config.xml like this:

{code}
<dataConfig> 
<dataSource name="hsqldb" driver="org.hsqldb.jdbcDriver" url="jdbc:hsqldb:mem:." />

<document name="TestEvaluators"> 
<entity name="FIRST" processor="SqlEntityProcessor" dataSource="hsqldb"
        query="select 1 as id, 'SELECT' as SELECT_KEYWORD, CURRENT_TIMESTAMP as FIRST_TS from DUAL">
  <field column="SELECT_KEYWORD" name="select_keyword_s" />
  <entity name="SECOND" processor="SqlEntityProcessor" dataSource="hsqldb" transformer="TemplateTransformer"
          query="${dataimporter.functions.encodeUrl(FIRST.SELECT_KEYWORD)} 1 as SORT, CURRENT_TIMESTAMP as SECOND_TS,
                 '${dataimporter.functions.formatDate(FIRST.FIRST_TS, 'yyyy', 'ms_MY')}' as SECOND1_S,
                 'PORK' AS MEAT, 'GRILL' AS METHOD, 'ROUND' AS CUTS, 'BEEF_CUTS' AS WHATKIND
                 from DUAL WHERE 1=${FIRST.ID}
                 UNION
                 ${dataimporter.functions.encodeUrl(FIRST.SELECT_KEYWORD)} 2 as SORT, CURRENT_TIMESTAMP as SECOND_TS,
                 '${dataimporter.functions.formatDate(FIRST.FIRST_TS, 'yyyy', 'ms_MY')}' as SECOND1_S,
                 'FISH' AS MEAT, 'FRY' AS METHOD, 'SIRLOIN' AS CUTS, 'BEEF_CUTS' AS WHATKIND
                 from DUAL WHERE 1=${FIRST.ID}
                 ORDER BY SORT">
   <field column="SECOND_S" name="second_s" />
   <field column="SECOND1_S" name="second1_s" />
   <field column="second2_s" template="${dataimporter.functions.formatDate(SECOND.SECOND_TS, 'yyyy', 'ms_MY')}" />
   <field column="second3_s" template="${dih.functions.formatDate(SECOND.SECOND_TS, 'yyyy', 'ms_MY')}" />
   <field column="METHOD" name="${SECOND.MEAT}_s"/>
   <field column="CUTS" name="${SECOND.WHATKIND}_mult_s"/>
  </entity>
</entity>
</document> 
</dataConfig> 
{code}

As you can see, the SQL query on the child entity does not contain a literal "select".
Instead it uses ${dataimporter.functions.encodeUrl(FIRST.SELECT_KEYWORD)}, which gets the
word "SELECT" from the data in the parent entity.
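
To make that substitution concrete, here is a minimal Java sketch of how the child
query ends up starting with "SELECT". It is not the DIH code: it assumes encodeUrl
behaves like java.net.URLEncoder with UTF-8, the class and method names are hypothetical,
and the query text is abbreviated from the config in this message.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical illustration class; not part of Solr or DataImportHandler.
final class EncodeUrlSketch {

  // Assumption: encodeUrl is equivalent to URLEncoder.encode(value, "UTF-8").
  static String encodeUrl(String value) {
    try {
      return URLEncoder.encode(value, "UTF-8");
    } catch (UnsupportedEncodingException e) {
      // UTF-8 is always available on a conforming JVM.
      throw new IllegalStateException(e);
    }
  }

  // Stand-in for resolving the placeholders in the child entity's query attribute.
  // selectKeyword plays the role of FIRST.SELECT_KEYWORD, firstId of FIRST.ID.
  static String resolveChildQuery(String selectKeyword, int firstId) {
    return encodeUrl(selectKeyword) + " 1 as SORT from DUAL WHERE 1=" + firstId;
  }

  public static void main(String[] args) {
    // Letters pass through URL encoding unchanged, so the keyword survives intact.
    System.out.println(resolveChildQuery("SELECT", 1));
    // prints: SELECT 1 as SORT from DUAL WHERE 1=1
  }
}
```

Because "SELECT" contains only letters, encoding leaves it untouched, which is why the
resolved string is still valid SQL.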

The response shows it is correctly executing the inner entity:
{code}
  "response":{"numFound":1,"start":0,"docs":[
      {
        "select_keyword_s":"SELECT",
        "id":"1",
        "second3_s":"2012",
        "second2_s":"2012",
        "PORK_s":"GRILL",
        "BEEF_CUTS_mult_s":["ROUND",
          "SIRLOIN"],
        "second1_s":"2012",
        "FISH_s":"FRY",
        "timestamp":"2012-11-27T16:55:39.409Z"}]
  }
{code}

Unless someone can demonstrate this is an actual problem (once again, a good failing unit
test would help a lot), I will close this as "not a problem" in the next week or so.
                
> dataimporter.functions.encodeUrl throughs Unable to encode expression: field.name with
value: null
> --------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-4047
>                 URL: https://issues.apache.org/jira/browse/SOLR-4047
>             Project: Solr
>          Issue Type: Bug
>          Components: contrib - DataImportHandler
>    Affects Versions: 4.0
>         Environment: Windows 7
>            Reporter: Igor Dobritskiy
>            Priority: Critical
>         Attachments: db-data-config.xml, db.sql, schema.xml, solrconfig.xml
>
>
> For some reason dataimporter.functions.encodeUrl stopped working after updating to Solr 4.0 from 3.5.
> Here is the error:
> {code}
> Full Import failed:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to encode expression: attach.name with value: null Processing Document # 1
> {code}
> Here is the data import config snippet:
> {code}
> ...
>             <entity name="account"
>                     query="select name from accounts where account_id = '${attach.account_id}'">
>                     <entity name="img_index" processor="TikaEntityProcessor" 
>                             dataSource="bin"
>                             format="text" 
>                             url="http://example.com/data/${account.name}/attaches/${attach.item_id}/${dataimporter.functions.encodeUrl(attach.name)}">
>                             <field column="text" name="body" />
>                     </entity> 
>             </entity>
> ...
> {code}
> When I change it to *not* use dataimporter.functions.encodeUrl it works, but I need
> to URL-encode the file names because they contain special characters.
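
For reference, a small Java sketch of the two behaviours described in the report:
encoding a file name that contains special characters, and failing when the resolved
value is null. This is an illustration only. It assumes encodeUrl is equivalent to
java.net.URLEncoder with UTF-8; the class name, method name, and sample file name are
hypothetical, and the exception message simply mirrors the wording of the reported error.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical illustration class; not part of Solr or DataImportHandler.
final class FileNameEncodingSketch {

  // Encodes an already-resolved value, rejecting null explicitly.
  // expression is the variable name (for example "attach.name"); value is what it resolved to.
  static String encodeOrFail(String expression, String value) {
    if (value == null) {
      throw new IllegalArgumentException(
          "Unable to encode expression: " + expression + " with value: null");
    }
    try {
      return URLEncoder.encode(value, "UTF-8");
    } catch (UnsupportedEncodingException e) {
      // UTF-8 is always available on a conforming JVM.
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    // Space becomes '+', while '#', '(' and ')' are percent-encoded.
    System.out.println(encodeOrFail("attach.name", "report #1 (final).pdf"));
    // prints: report+%231+%28final%29.pdf

    try {
      encodeOrFail("attach.name", null);
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

The sketch only shows that a null input cannot be encoded; it does not explain why
attach.name would resolve to null in the reported configuration.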

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

