lucene-dev mailing list archives

From "Hoss Man (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-2731) CSVResponseWriter should optionally return numfound
Date Fri, 26 Aug 2011 01:05:30 GMT

    [ https://issues.apache.org/jira/browse/SOLR-2731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13091483#comment-13091483 ]

Hoss Man commented on SOLR-2731:
--------------------------------

i think yonik's 1st example would be the best for people loading the data into a spreadsheet
tool or parsing with conventional CSV tools.  (even better than #2, because it's easy to cut/paste
that data into a different sheet and still have clean separation between headers/data.)

but i would suggest that if we're at the point of thinking about having a "metadata" section
and a "results" section we shouldn't limit ourselves to two sections.

instead of just including metadata about the main doclist, we could allow arbitrary sections
of arbitrary lengths (like facet counts) ... i haven't thought hard about what the params
should look like, but i would suggest that for easy output parsing, a simple single-row,
single-column row count prefix telling you the number of (csv) rows in each "section", followed
by the (csv) rows of data (including a header row for each section if "csv.header=true"), would
be easy for people to parse (assuming they were expecting it because they asked for it)

ie...

{noformat}
2
numFound,maxScore,start
103,1.414,100
4
id,score
doc1,1.3
doc2,1.1
doc3,1.05
{noformat}

..or if csv.header=false ...

{noformat}
1
103,1.414,100
3
doc1,1.3
doc2,1.1
doc3,1.05
{noformat}
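
For illustration, here is a minimal sketch of how a client might read output in this row-count-prefixed layout. It is written in Python purely as an example; the function name `parse_sections` and the `has_header` flag are hypothetical, and it assumes the client already knows whether it requested "csv.header=true".

```python
import csv
import io


def parse_sections(text, has_header=True):
    """Split row-count-prefixed CSV text into a list of sections.

    Hypothetical helper: each section starts with a single-cell row holding
    the number of CSV rows that follow (header row included, if present).
    """
    # Parse everything as CSV first, so the prefix counts CSV rows
    # rather than raw text lines.
    rows = list(csv.reader(io.StringIO(text)))
    sections = []
    i = 0
    while i < len(rows):
        if not rows[i]:
            i += 1  # skip blank lines
            continue
        count = int(rows[i][0])
        block = rows[i + 1 : i + 1 + count]
        if len(block) != count:
            raise ValueError("section is shorter than its row count prefix")
        if has_header and block:
            header, data = block[0], block[1:]
        else:
            header, data = None, block
        sections.append({"header": header, "rows": data})
        i += 1 + count
    return sections


if __name__ == "__main__":
    sample = (
        "2\n"
        "numFound,maxScore,start\n"
        "103,1.414,100\n"
        "4\n"
        "id,score\n"
        "doc1,1.3\n"
        "doc2,1.1\n"
        "doc3,1.05\n"
    )
    for section in parse_sections(sample):
        print(section["header"], section["rows"])
```

Because the prefix says exactly how many rows belong to each section, the reader never has to guess where one section ends and the next begins.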

We can worry about what other "sections" might be supported later as long as the basic param
syntax gets fleshed out ... i would suggest maybe something like:

* multivalued "csv.section" param
* sections are written out in the order that they are passed as params
* default is "csv.section=results"
* if only one value is specified for csv.section, then no row count prefix is used for that
section
* only one other value for csv.section supported initially: "csv.section=results.meta" 
** adds the numFound,maxScore,start for the results
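
The rules in the list above can be sketched as follows. This is not the CSVResponseWriter API, just a rough Python model of the proposed behavior; `write_sections` and the `available` mapping (section name to column names plus rows) are hypothetical names made up for the example.

```python
import csv
import io


def write_sections(available, section_names=None, header=True):
    """Render the requested sections as CSV text.

    Hypothetical model of the proposal:
    - sections are written in the order requested
    - the default is just the "results" section
    - the row count prefix is only written when more than one section is requested
    """
    if not section_names:
        section_names = ["results"]
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    use_prefix = len(section_names) > 1
    for name in section_names:
        columns, rows = available[name]
        # The header row counts toward the section's row count.
        block = ([columns] if header else []) + list(rows)
        if use_prefix:
            writer.writerow([len(block)])
        writer.writerows(block)
    return out.getvalue()


if __name__ == "__main__":
    available = {
        "results.meta": (["numFound", "maxScore", "start"], [["103", "1.414", "100"]]),
        "results": (["id", "score"], [["doc1", "1.3"], ["doc2", "1.1"], ["doc3", "1.05"]]),
    }
    # Two sections requested: each gets a row count prefix.
    print(write_sections(available, ["results.meta", "results"]))
    # Default single section: plain CSV with no prefix, same as today.
    print(write_sections(available))
```

With a single section the output is ordinary CSV, so existing clients that never pass "csv.section" would see no change.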



> CSVResponseWriter should optionally return numfound
> ---------------------------------------------------
>
>                 Key: SOLR-2731
>                 URL: https://issues.apache.org/jira/browse/SOLR-2731
>             Project: Solr
>          Issue Type: Improvement
>          Components: Response Writers
>    Affects Versions: 3.1, 3.3, 4.0
>            Reporter: Jon Hoffman
>              Labels: patch
>             Fix For: 3.1.1, 3.3, 4.0
>
>         Attachments: SOLR-2731.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> an optional parameter "csv.numfound=true" can be added to the request which causes the
> first line of the response to be the numfound.  This would have no impact on existing behavior,
> and those who are interested in that value can simply read off the first line before sending
> to their usual csv parser.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

