ambari-issues mailing list archives

From Olivér Szabó (JIRA) <j...@apache.org>
Subject [jira] [Updated] (AMBARI-21810) Create Utility Script to support Solr Collection Data Retention/Purging/Archiving
Date Sun, 24 Sep 2017 13:53:02 GMT

     [ https://issues.apache.org/jira/browse/AMBARI-21810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Olivér Szabó updated AMBARI-21810:
----------------------------------
    Description: 
In Ambari 3.0, LogSearch will include more fully featured support in this area; for Ambari 2.6.0, this script is provided to simplify the customer's use cases around log data retention, log purging, and log archiving.

The script solrDataManager.py (located in the /usr/lib/ambari-infra-solr-client folder) accepts a mode parameter, which may be either delete or save. In both modes the user may specify the filter field, an end value or the number of days to keep, and, for a kerberized Solr, a Kerberos keytab and principal. In "save" mode the user must also specify a destination: arguments for HDFS, S3, or a local path to save to. The user may additionally set the size of the read block (the number of documents returned by one Solr query) and the write block (the number of documents written into one output file).
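
For illustration, here is a minimal sketch of that retention logic, assuming the requests library and Solr's standard /select handler; the constant names mirror the script's flags, and the query shape is an assumption rather than the script's actual implementation:
{code:python}
# Hypothetical sketch, not the actual solrDataManager.py implementation:
# turn a days-to-keep value into a Solr range query on the filter field.
from datetime import datetime, timedelta

import requests

SOLR_URL = "http://c6401.ambari.apache.org:8886/solr"  # -s
COLLECTION = "hadoop_logs"                              # -c
FILTER_FIELD = "logtime"                                # -f
DAYS_TO_KEEP = 1                                        # -d
READ_BLOCK = 10                                         # -r

# Documents older than the cutoff are the ones to save or delete.
cutoff = datetime.utcnow() - timedelta(days=DAYS_TO_KEEP)
end_value = cutoff.strftime("%Y-%m-%dT%H:%M:%S.000Z")

params = {
    "q": "%s:[* TO %s]" % (FILTER_FIELD, end_value),
    "sort": "%s asc" % FILTER_FIELD,
    "rows": READ_BLOCK,
    "wt": "json",
}
response = requests.get("%s/%s/select" % (SOLR_URL, COLLECTION), params=params)
docs = response.json()["response"]["docs"]
print("Fetched %d documents older than %s" % (len(docs), end_value))
{code}
In save mode such read blocks would be accumulated into write-block-sized output files; in delete mode a delete-by-query over the same range would be issued instead (see the sketch after the last example below).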

Examples:

Save data from the Solr collection hadoop_logs (accessible at http://c6401.ambari.apache.org:8886/solr) based on the field logtime: save everything older than 1 day, read 10 documents at once, write 100 documents into each file, and copy the resulting zip files into the local directory /tmp, running in verbose mode:
{code:bash}
/usr/bin/python solrDataManager.py -m save -s http://c6401.ambari.apache.org:8886/solr -c hadoop_logs -f logtime -d 1 -r 10 -w 100 -x /tmp -v
{code}
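
The example above mentions zip files landing in /tmp; here is a hedged sketch of how one write block might be packaged, assuming JSON payloads and a hypothetical file naming scheme (neither is documented behavior of the script):
{code:python}
# Hypothetical write-block packaging: WRITE_BLOCK documents per json
# file, each wrapped in a zip archive. File naming and format are
# assumptions, not the script's documented behavior.
import json
import zipfile

WRITE_BLOCK = 100   # -w
OUT_DIR = "/tmp"    # -x

def write_block(docs, block_index):
    """Write one block of documents as a json file inside a zip archive."""
    name = "hadoop_logs_%05d" % block_index
    zip_path = "%s/%s.zip" % (OUT_DIR, name)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(name + ".json", json.dumps(docs))
    return zip_path

# e.g. write_block(docs, 0) -> "/tmp/hadoop_logs_00000.zip"
{code}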

Save everything older than 3 days from hadoop_logs into the HDFS path "/" as the user hdfs, fetching the data from a kerberized Solr:
{code:bash}
/usr/bin/python solrDataManager.py -m save -s http://c6401.ambari.apache.org:8886/solr \
  -c hadoop_logs -f logtime -d 3 -r 10 -w 100 \
  -k /etc/security/keytabs/ambari-infra-solr.service.keytab \
  -n infra-solr/c6401.ambari.apache.org@AMBARI.APACHE.ORG -u hdfs -p /
{code}
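
Accessing a kerberized Solr typically means obtaining a ticket from the keytab and then sending SPNEGO-authenticated requests. A sketch under that assumption, using the requests_kerberos library (whether the script authenticates exactly this way is not shown here):
{code:python}
# Hypothetical SPNEGO access to a kerberized Solr; assumes the
# "requests" and "requests_kerberos" libraries are installed.
import subprocess

import requests
from requests_kerberos import HTTPKerberosAuth, OPTIONAL

KEYTAB = "/etc/security/keytabs/ambari-infra-solr.service.keytab"   # -k
PRINCIPAL = "infra-solr/c6401.ambari.apache.org@AMBARI.APACHE.ORG"  # -n

# Obtain a Kerberos ticket from the keytab before talking to Solr.
subprocess.check_call(["kinit", "-kt", KEYTAB, PRINCIPAL])

# Subsequent Solr requests authenticate via SPNEGO.
auth = HTTPKerberosAuth(mutual_authentication=OPTIONAL)
response = requests.get(
    "http://c6401.ambari.apache.org:8886/solr/hadoop_logs/select",
    params={"q": "*:*", "rows": 0, "wt": "json"},
    auth=auth,
)
print(response.json()["response"]["numFound"])
{code}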

Delete the data before 2017-08-29T12:00:00.000Z:
{code:bash}
/usr/bin/python solrDataManager.py -m delete -s http://c6401.ambari.apache.org:8886/solr -c hadoop_logs -f logtime -e 2017-08-29T12:00:00.000Z
{code}
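
The delete above corresponds to a delete-by-query over the same range, sketched here against Solr's standard JSON update API (whether the script issues exactly this request is an assumption):
{code:python}
# Hypothetical equivalent delete-by-query via Solr's JSON update API;
# not necessarily the exact request solrDataManager.py sends.
import requests

response = requests.post(
    "http://c6401.ambari.apache.org:8886/solr/hadoop_logs/update",
    params={"commit": "true"},
    json={"delete": {"query": "logtime:[* TO 2017-08-29T12:00:00.000Z]"}},
)
response.raise_for_status()
{code}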

> Create Utility Script to support Solr Collection Data Retention/Purging/Archiving
> ---------------------------------------------------------------------------------
>
>                 Key: AMBARI-21810
>                 URL: https://issues.apache.org/jira/browse/AMBARI-21810
>             Project: Ambari
>          Issue Type: Bug
>          Components: ambari-infra
>    Affects Versions: 2.6.0
>            Reporter: Miklos Gergely
>            Assignee: Miklos Gergely
>             Fix For: 2.6.0
>
>         Attachments: AMBARI-21810.patch
>



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
