lucene-solr-dev mailing list archives

From "Noble Paul (JIRA)" <j...@apache.org>
Subject [jira] Issue Comment Edited: (SOLR-829) replication Compression
Date Fri, 21 Nov 2008 03:48:44 GMT

    [ https://issues.apache.org/jira/browse/SOLR-829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12648149#action_12648149 ]

noble.paul edited comment on SOLR-829 at 11/20/08 7:48 PM:
-----------------------------------------------------------

After a lot of discussion on SOLR-856, I realize that it is not straightforward to provide
a 'container independent' means of compression. We have to document different ways
for different containers to ensure that it works properly.

 * How important is it to use HTTP standards to achieve this? Consider that nothing
else in the whole solution complies with any standard.
 * For this feature, compression is critical. It can mean huge differences in replication
time.
 * I am not very comfortable with complex configuration documentation saying do this if you
use Jetty, do that if you use Resin, and something else for GlassFish, etc.
 * How about giving users both options and letting them choose what they want? This also
gives them the flexibility of doing compression only for replication.
 * Power users can use their own favorite configuration to do the compression.
Something like:
{code}
<lst name="slave">
  <!-- values can be internal|external . --> 
  <str name="compression">internal</str>
</lst>
{code}
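
As a rough illustration of what the 'internal' option could amount to, the sketch below gzips a payload on the sending side and decompresses it on the receiving side using only java.util.zip, with no help from the servlet container. The class and method names are hypothetical and are not taken from the SOLR-829 patch; this is a minimal sketch of the idea, not the replication handler's actual code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper: application-level gzip on both ends of the pipe.
final class ReplicationGzipSketch {

    // Sending side: gzip the raw bytes before they go on the wire.
    static byte[] compress(byte[] raw) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (OutputStream gz = new GZIPOutputStream(buf)) {
            gz.write(raw);
        } // closing the stream writes the gzip trailer
        return buf.toByteArray();
    }

    // Receiving side: restore the original bytes from the gzip stream.
    static byte[] decompress(byte[] packed) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (InputStream gz = new GZIPInputStream(new ByteArrayInputStream(packed))) {
            byte[] chunk = new byte[8192];
            int n;
            while ((n = gz.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for an index file; real index data will compress differently.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 1000; i++) {
            sb.append("term doc freq position ");
        }
        byte[] raw = sb.toString().getBytes(StandardCharsets.UTF_8);

        byte[] packed = compress(raw);
        byte[] restored = decompress(packed);

        System.out.println("raw bytes:        " + raw.length);
        System.out.println("compressed bytes: " + packed.length);
        System.out.println("round trip ok:    " + Arrays.equals(raw, restored));
    }
}
```

With the 'external' value, none of this would run inside Solr; the container or a proxy in front of it would be expected to handle the encoding instead.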



      was (Author: noble.paul):
    After a lot of discussion on SOLR-856, I realize that it is not straightforward to provide
a 'container independent' means of compression. We have to document different ways
for different containers to ensure that it works properly.

 * How important is it to use HTTP standards to achieve this? Consider that nothing
else in the whole solution complies with any standard.
 * For this feature, compression is critical. It can mean huge differences in replication
time.
 * I am not very comfortable with complex configuration documentation saying do this if you
use Jetty, do that if you use Resin, and something else for GlassFish, etc.
 * How about giving users both options and letting them choose what they want? This also
gives them the flexibility of doing compression only for replication.
 * Power users can use their own favorite configuration to do the compression.
Something like:
{code}
<lst name="slave">
  <!-- values can be internal|external . --> 
  <str name="compression">internal</str>
  <!-- values can be gzip|deflate; default is gzip -->
  <str name="encoding">gzip</str>
</lst>
{code}


  
> replication Compression
> -----------------------
>
>                 Key: SOLR-829
>                 URL: https://issues.apache.org/jira/browse/SOLR-829
>             Project: Solr
>          Issue Type: Improvement
>          Components: replication (java)
>            Reporter: Simon Collins
>            Assignee: Shalin Shekhar Mangar
>         Attachments: email discussion.txt, solr-829.patch, solr-829.patch, solr-829.patch
>
>
> From a discussion on the solr-user mailing list, it would be useful to have an option
> to compress the files sent between servers for replication purposes.
> Files sent between indexes can be compressed by a large margin, allowing for easier
> replication between sites.
> ...Noted by Noble Paul 
> We will use gzip on both ends of the pipe. On the slave side you can say <str name="zip">true</str>
> as an extra option to compress and send data from the server.
> Other thoughts on issue: 
> Do keep in mind that compression is a CPU-intensive process, so it is a trade-off between
> CPU utilization and network bandwidth. I have seen cases where compressing the data before
> a network transfer ended up being slower than sending it uncompressed, because the cost of compression
> and decompression was more than the gain in network transfer.
> Why invent something when compression is standard in HTTP? --wunder
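
The CPU versus bandwidth trade-off described above can be put into rough numbers. The sketch below assumes compression, transfer, and decompression happen one after another with no overlap, which a streaming implementation would not strictly obey. All sizes, timings, and link speeds are made-up illustrative values, not measurements from Solr, and the class and method names are hypothetical.

```java
// Hypothetical back-of-the-envelope check: does compressing before transfer save time?
final class CompressionTradeoffSketch {

    // Seconds needed to move `bytes` over a link that carries `bytesPerSecond`.
    static double transferSeconds(long bytes, double bytesPerSecond) {
        return bytes / bytesPerSecond;
    }

    // True when CPU time plus the smaller transfer beats sending the raw bytes.
    // cpuSeconds is the combined cost of compressing and decompressing.
    static boolean compressionPaysOff(long rawBytes, long compressedBytes,
                                      double cpuSeconds, double bytesPerSecond) {
        double plain = transferSeconds(rawBytes, bytesPerSecond);
        double compressed = cpuSeconds + transferSeconds(compressedBytes, bytesPerSecond);
        return compressed < plain;
    }

    public static void main(String[] args) {
        long raw = 1_000_000_000L;    // assumed 1 GB of index files
        long packed = 400_000_000L;   // assumed 400 MB after compression
        double cpu = 30.0;            // assumed 30 s to compress and decompress

        // Slow link (10 MB/s): 100 s raw vs 30 s + 40 s compressed
        System.out.println(compressionPaysOff(raw, packed, cpu, 10_000_000.0));  // prints true
        // Fast link (100 MB/s): 10 s raw vs 30 s + 4 s compressed
        System.out.println(compressionPaysOff(raw, packed, cpu, 100_000_000.0)); // prints false
    }
}
```

Under these assumptions the same index benefits from compression across a slow inter-site link and loses on a fast local network, which is one argument for making it a per-slave option.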

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

