cassandra-commits mailing list archives

From "Anubhav Kale (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (CASSANDRA-10580) When mutations are dropped, the column family should be printed / have a counter per column family
Date Wed, 28 Oct 2015 03:50:27 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-10580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anubhav Kale updated CASSANDRA-10580:
-------------------------------------
    Description: 
In our production cluster, we are seeing a large number of dropped mutations. At a minimum,
we should log how long the thread waited to be scheduled before the mutation was dropped. This
will help in choosing the right value for write_request_timeout_in_ms.

The change will need to be done in StorageProxy.java and MessagingTask.java. It is easy, and
I will submit a patch shortly.
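
A minimal sketch of the delay measurement described above. SchedulingDelay and its
methods are hypothetical names used only for illustration and are not Cassandra classes;
the timeout is passed in as a plain parameter rather than read from configuration.

```java
// Hypothetical helper (not part of Cassandra): decides whether a queued task
// has waited longer than the write timeout and builds the log line for it.
final class SchedulingDelay
{
    private SchedulingDelay() {}

    // Time the task spent waiting between construction and the start of run().
    static long waitedMillis(long constructedAtMillis, long startedAtMillis)
    {
        return startedAtMillis - constructedAtMillis;
    }

    // A task is dropped only when it waited strictly longer than the timeout.
    static boolean shouldDrop(long constructedAtMillis, long startedAtMillis, long timeoutMillis)
    {
        return waitedMillis(constructedAtMillis, startedAtMillis) > timeoutMillis;
    }

    // Log message that includes the measured delay and the timeout, with units.
    static String dropMessage(long constructedAtMillis, long startedAtMillis, long timeoutMillis)
    {
        return "Dropping mutation: waited " + waitedMillis(constructedAtMillis, startedAtMillis)
               + " ms before being scheduled (timeout " + timeoutMillis + " ms)";
    }
}
```

In a runnable, constructedAtMillis would be captured in the constructor and startedAtMillis
at the top of run(), both with System.currentTimeMillis().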



  was:
In our production cluster, we are seeing a large number of dropped mutations. It would be
really helpful to see which column families are really affected by this (either through logs
or through a dedicated counter for every column family).

I have made a hack in StorageProxy (below) to help us with this. I am happy to extend this
into a better solution (log the affected CF via logger.debug and then grep for it manually) if
experts agree that this additional detail would be helpful in general. Any other suggestions are
welcome.

    private static abstract class LocalMutationRunnable implements Runnable
    {
        private final long constructionTime = System.currentTimeMillis();

        private IMutation mutation;

        public final void run()
        {
            if (System.currentTimeMillis() > constructionTime + 2000L)
            {
                long timeTaken = System.currentTimeMillis() - constructionTime;
                logger.warn("Anubhav LocalMutationRunnable thread ran after " + timeTaken);

                try
                {
                    for (ColumnFamily family : this.mutation.getColumnFamilies())
                    {
                        if (family.toString().toLowerCase().contains("udsuserdailysnapshot"))
                        {
                            MessagingService.instance().incrementDroppedMessages(MessagingService.Verb.USERDAILY);
                        }
                        else if (family.toString().toLowerCase().contains("udsuserhourlysnapshot"))
                        {
                            MessagingService.instance().incrementDroppedMessages(MessagingService.Verb.USERHOURLY);
                        }
                        else if (family.toString().toLowerCase().contains("udstenantdailysnapshot"))
                        {
                            MessagingService.instance().incrementDroppedMessages(MessagingService.Verb.TENANTDAILY);
                        }
                        else if (family.toString().toLowerCase().contains("udstenanthourlysnapshot"))
                        {
                            MessagingService.instance().incrementDroppedMessages(MessagingService.Verb.TENANTHOURLY);
                        }
                        else if (family.toString().toLowerCase().contains("userdatasetraw"))
                        {
                            MessagingService.instance().incrementDroppedMessages(MessagingService.Verb.USERDSRAW);
                        }
                        else if (family.toString().toLowerCase().contains("tenants"))
                        {
                            MessagingService.instance().incrementDroppedMessages(MessagingService.Verb.TENANTS);
                        }
                        else if (family.toString().toLowerCase().contains("users"))
                        {
                            MessagingService.instance().incrementDroppedMessages(MessagingService.Verb.USERS);
                        }
                        else if (family.toString().toLowerCase().contains("tenantactivity"))
                        {
                            MessagingService.instance().incrementDroppedMessages(MessagingService.Verb.TENANTACTIVITY);
                        }
                        else if (family.getKeySpaceName().toLowerCase().contains("system"))
                        {
                            MessagingService.instance().incrementDroppedMessages(MessagingService.Verb.SYSTEMKS);
                        }
                        else
                        {
                            logger.warn("Anubhav LocalMutationRunnable updating mutations for " + family.toString().toLowerCase());
                            MessagingService.instance().incrementDroppedMessages(MessagingService.Verb.OTHERTBL);
                        }
                    }
                }
                catch (Exception e)
                {
                    logger.error("Anubhav LocalMutationRunnable Exception ", e);
                }

                MessagingService.instance().incrementDroppedMessages(MessagingService.Verb.MUTATION);

                HintRunnable runnable = new HintRunnable(FBUtilities.getBroadcastAddress())
                {
                    protected void runMayThrow() throws Exception
                    {
                        LocalMutationRunnable.this.runMayThrow();
                    }
                };
                submitHint(runnable);
                return;
            }

            try
            {
                runMayThrow();
            }
            catch (Exception e)
            {
                throw new RuntimeException(e);
            }
        }

        public LocalMutationRunnable(IMutation mutation)
        {
            this.mutation = mutation;
        }

        abstract protected void runMayThrow() throws Exception;
    }
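
The "dedicated counter for every column family" idea could be sketched without one
hard-coded verb per table, roughly as below. This is an illustration only:
DroppedMutationCounters and its methods are hypothetical names, not Cassandra APIs, and the
sketch assumes Java 8 or later (computeIfAbsent with a lambda, LongAdder).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical registry (not part of Cassandra): one dropped-mutation counter
// per "keyspace.table" name, created lazily the first time a drop is recorded.
final class DroppedMutationCounters
{
    private final Map<String, LongAdder> counters = new ConcurrentHashMap<>();

    private static String key(String keyspace, String table)
    {
        return keyspace + "." + table;
    }

    // Safe to call from many threads: computeIfAbsent creates the counter
    // atomically and LongAdder handles concurrent increments.
    void recordDrop(String keyspace, String table)
    {
        counters.computeIfAbsent(key(keyspace, table), k -> new LongAdder()).increment();
    }

    // Returns 0 for tables that never had a dropped mutation.
    long droppedCount(String keyspace, String table)
    {
        LongAdder counter = counters.get(key(keyspace, table));
        return counter == null ? 0L : counter.sum();
    }
}
```

With something like this, the if/else chain above would reduce to a single recordDrop call
per column family, and new tables would be counted without code changes.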




> When mutations are dropped, the column family should be printed / have a counter per column family
> --------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-10580
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10580
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>         Environment: Production
>            Reporter: Anubhav Kale
>            Priority: Minor
>             Fix For: 2.1.x
>
>
> In our production cluster, we are seeing a large number of dropped mutations. At a minimum,
> we should log how long the thread waited to be scheduled before the mutation was dropped. This
> will help in choosing the right value for write_request_timeout_in_ms.
> The change will need to be done in StorageProxy.java and MessagingTask.java. It is easy,
> and I will submit a patch shortly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
