lucene-solr-dev mailing list archives

From "Dallan Quass (JIRA)" <j...@apache.org>
Subject [jira] Issue Comment Edited: (SOLR-1069) CSV document and field boosting support
Date Thu, 25 Feb 2010 20:34:28 GMT

    [ https://issues.apache.org/jira/browse/SOLR-1069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12838526#action_12838526 ]

Dallan Quass edited comment on SOLR-1069 at 2/25/10 8:33 PM:
-------------------------------------------------------------

FWIW, I made a few changes to CSVRequestHandler.java, which mainly involve extracting CSVLoader
into a separate public class and making a few variables/functions visible outside the package.
The attached files show the changes I made.

Doing this allowed me to create a subclass of CSVLoader that does boosting:

{code}
public class BoostingCSVRequestHandler extends ContentStreamHandlerBase {
   protected ContentStreamLoader newLoader(SolrQueryRequest req, UpdateRequestProcessor processor) {
      return new BoostingCSVLoader(req, processor);
   }

   //////////////////////// SolrInfoMBeans methods //////////////////////
   @Override
   public String getDescription() {
     return "boost CSV documents";
   }

   @Override
   public String getVersion() {
     return "";
   }

   @Override
   public String getSourceId() {
     return "";
   }

   @Override
   public String getSource() {
     return "";
   }
}

class BoostingCSVLoader extends CSVLoader {
   int boostFieldNum;

   BoostingCSVLoader(SolrQueryRequest req, UpdateRequestProcessor processor) {
      super(req, processor);
   }

   // Returns a copy of a without the element at index pos.
   private String[] removeElement(String[] a, int pos) {
      String[] n = new String[a.length-1];
      if (pos > 0) System.arraycopy(a, 0, n, 0, pos);
      if (pos < n.length) System.arraycopy(a, pos+1, n, pos, n.length - pos);
      return n;
   }

   @Override
   protected void prepareFields() {
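      // Locate the "boost" column (if any) and remove it from fieldnames before
      // CSVLoader prepares the fields, so it is not treated as a regular field.
      boostFieldNum = -1;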
      for (int i = 0; i < fieldnames.length; i++) {
         if (fieldnames[i].equals("boost")) {
            boostFieldNum = i;
            break;
         }
      }
      if (boostFieldNum >= 0) {
         fieldnames = removeElement(fieldnames, boostFieldNum);
      }

      super.prepareFields();
   }

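   public void addDoc(int line, String[] vals) throws IOException {
      templateAdd.indexedId = null;
      SolrInputDocument doc = new SolrInputDocument();
      if (boostFieldNum >= 0) {
         // Read the boost from the column found in prepareFields(), then drop that
         // column so the remaining values line up with fieldnames again.
         // Note: Float.parseFloat throws NumberFormatException on an empty or
         // non-numeric value.
         float boost = Float.parseFloat(vals[boostFieldNum]);
         doc.setDocumentBoost(boost);
         vals = removeElement(vals, boostFieldNum);
      }

      doAdd(line, vals, doc, templateAdd);
   }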
}
{code}
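
For anyone who wants to try the column handling without the patched CSVLoader, here is a minimal sketch with no Solr dependencies. Only removeElement mirrors the code in this comment; the class name BoostColumnSketch, the indexOf helper, the neutral default of 1.0 and the sample header/row are made up for illustration.

```java
import java.util.Arrays;

public class BoostColumnSketch {

    // Same logic as removeElement in the comment: copy everything except index pos.
    static String[] removeElement(String[] a, int pos) {
        String[] n = new String[a.length - 1];
        if (pos > 0) System.arraycopy(a, 0, n, 0, pos);
        if (pos < n.length) System.arraycopy(a, pos + 1, n, pos, n.length - pos);
        return n;
    }

    // Hypothetical helper: index of the first column called name, or -1 if absent.
    static int indexOf(String[] names, String name) {
        for (int i = 0; i < names.length; i++) {
            if (names[i].equals(name)) return i;
        }
        return -1;
    }

    public static void main(String[] args) {
        // Made-up sample data: a header line and one row of values.
        String[] header = {"id", "boost", "title"};
        String[] row = {"42", "2.5", "Solr in Action"};

        int boostCol = indexOf(header, "boost");
        float boost = 1.0f; // assumed neutral default when no boost column exists
        if (boostCol >= 0) {
            // Parse the boost, then strip the column from both header and row
            // so that names and values stay aligned.
            boost = Float.parseFloat(row[boostCol]);
            header = removeElement(header, boostCol);
            row = removeElement(row, boostCol);
        }

        System.out.println("boost=" + boost);
        System.out.println(Arrays.toString(header));
        System.out.println(Arrays.toString(row));
    }
}
```

The point of the sketch is that the boost column has to be removed from the field names once and from every row of values, at the same index, which is what prepareFields() and addDoc() do respectively.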

  
> CSV document and field boosting support
> ---------------------------------------
>
>                 Key: SOLR-1069
>                 URL: https://issues.apache.org/jira/browse/SOLR-1069
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Grant Ingersoll
>            Priority: Minor
>         Attachments: CSVLoader.java, CSVRequestHandler.java.diff
>
>
> It would be good if the CSV loader could do document and field boosting.
> I believe this could be handled via additional "special" columns that are tacked on, such
> as "doc.boost" and <field.name>.boost, which are then filled in with boost values on
> a per-row basis.  Obviously, this approach would prevent someone from having an actual column
> named <field.name>.boost, so maybe we can make that configurable as well.
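
The special-column idea in the issue description could be sketched roughly as follows. This is only an illustration of how a header line might be classified, not code from the issue or from Solr: the class BoostHeaderSketch, both method names, and the choice to pass the document-boost column name and the field-boost suffix as parameters (to keep them configurable, as the description suggests) are all assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class BoostHeaderSketch {

    // Index of the document-boost column (e.g. "doc.boost"), or -1 if absent.
    static int docBoostColumn(String[] header, String docBoostName) {
        for (int i = 0; i < header.length; i++) {
            if (header[i].equals(docBoostName)) return i;
        }
        return -1;
    }

    // Maps each boosted field name to the index of its "<field><suffix>" column.
    // The document-boost column itself is skipped, as is a column whose name
    // consists of the suffix alone.
    static Map<String, Integer> fieldBoostColumns(String[] header, String docBoostName, String suffix) {
        Map<String, Integer> boosts = new LinkedHashMap<>();
        for (int i = 0; i < header.length; i++) {
            String name = header[i];
            if (name.equals(docBoostName)) continue;
            if (!name.endsWith(suffix) || name.length() == suffix.length()) continue;
            boosts.put(name.substring(0, name.length() - suffix.length()), i);
        }
        return boosts;
    }

    public static void main(String[] args) {
        // Made-up header for illustration.
        String[] header = {"id", "title", "title.boost", "doc.boost"};
        System.out.println(docBoostColumn(header, "doc.boost"));
        System.out.println(fieldBoostColumns(header, "doc.boost", ".boost"));
    }
}
```

Making the names parameters means a data set that really has a column called title.boost can choose a different suffix instead of losing that column.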

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

