Archive for Attila

Oftentimes a few lines of bash can go a long way. I have been using the following few lines, and variations of them, for many years, and they saved me a lot of time last week on a task I needed to perform in high volume. All the script does is read comma-separated…»
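The excerpt above is cut off before the script itself, so the following is only a minimal sketch of the general pattern it describes (reading comma-separated records line by line), not the author's script. The function name `process_csv` and the two-column `name,value` layout are assumptions made for illustration.

```shell
# process_csv: read comma-separated records from stdin, one per line,
# and print one reformatted line per record.
# Hypothetical layout: name,value[,anything else]
process_csv() {
  # IFS=, splits each line on commas; -r keeps backslashes literal.
  # The "|| [ -n "$name" ]" part also handles a last line with no newline.
  while IFS=, read -r name value rest || [ -n "$name" ]; do
    if [ -z "$name" ]; then
      continue  # skip blank lines
    fi
    printf '%s -> %s\n' "$name" "$value"
  done
}
```

It could be fed from a file with `process_csv < input.csv`, where `input.csv` is a placeholder name. Note that this simple splitting does not handle quoted fields that contain commas.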

While looking into a threading-related issue the other day, I used the following commands for diagnostics.
Collecting paging activity information
To collect paging data, use the following command:
vmstat {time_between_samples_in_seconds} {number_of_samples} \
  > vmstat.txt
vmstat 10 10 > vmstat.txt
If you start vmstat when the problem occurs, a value of 10 for time_between_samples_in_seconds and…»

The other day I had 110 GB of compressed log files that I wanted to import into Impala (Cloudera). Currently, Impala does not support compressed files, so I had to decompress them all. I created this handy script and thought you might find it useful. I mounted the S3 bucket using s3fs…»
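The script mentioned above is not part of this excerpt. As a rough sketch of the task it describes, assuming the logs are gzip-compressed (the excerpt does not say which format was used), a bulk decompression over a mounted directory might look like this. The function name `decompress_all` and the directory argument are hypothetical.

```shell
# decompress_all: decompress every *.gz file under a directory in place.
# $1 is the directory to scan, for example an s3fs mount point (assumption).
decompress_all() {
  # gzip -d replaces each file.gz with the decompressed file;
  # "-exec ... {} +" passes many files to each gzip invocation.
  find "$1" -type f -name '*.gz' -exec gzip -d -- {} +
}
```

A call such as `decompress_all /mnt/logs` (a placeholder path) walks the tree once. Decompressing in place means the directory needs enough free space for the expanded data.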

s3fs is an open-source project that lets you mount your S3 storage locally, giving you file-system-level access to your files so that you can actually work with them. I use this method to mount S3 buckets on my EC2 instances. Below, I go through the installation steps and also document some of…»

I often come across situations where a server comes under high CPU load. There is a simple way to find out which application threads are responsible for the load. To find the thread with the highest CPU load, issue:
ps -mo pid,lwp,stime,time,cpu -p <pid>
LWP stands for Light-Weight Process and typically refers to kernel…»
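To pick the busiest thread out of a long listing automatically, a small filter can help. The sketch below is an assumption-laden variation, not the command from the post: it expects three columns (PID, LWP, %CPU) with a header row, such as the output of `ps -L -o pid,lwp,pcpu -p <pid>` on Linux with procps. The function name `busiest_thread` is hypothetical.

```shell
# busiest_thread: read "PID LWP %CPU" rows (plus a header line) on stdin
# and print the LWP id of the row with the highest %CPU.
busiest_thread() {
  awk '
    # Skip the header and any row whose third column is not a number.
    NR > 1 && $3 ~ /^[0-9.]+$/ && ($3 + 0 > max || lwp == "") {
      max = $3 + 0
      lwp = $2
    }
    END { if (lwp != "") print lwp }
  '
}
```

Usage would be along the lines of `ps -L -o pid,lwp,pcpu -p <pid> | busiest_thread`.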

I wanted to create a simple yet flexible way to parse command-line arguments in bash. I used a case statement and some parameter expansion techniques to read the arguments in a simple manner. I find this very handy, and I hope you will find it useful in solving or simplifying your task as well. Whether it is…»
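The post's own code is not included in this excerpt. The following is a minimal sketch of the approach it names, a case statement combined with parameter expansion, under the assumption of `--key=value` style options. The option names (`--name`, `--count`, `--verbose`) and the function name `parse_args` are hypothetical.

```shell
# parse_args: parse --key=value options into shell variables.
# Sets NAME, COUNT and VERBOSE; returns 1 on an unknown option.
parse_args() {
  NAME=""
  COUNT=1
  VERBOSE=0
  while [ $# -gt 0 ]; do
    case "$1" in
      # ${1#*=} strips everything up to and including the first "=".
      --name=*)     NAME=${1#*=} ;;
      --count=*)    COUNT=${1#*=} ;;
      -v|--verbose) VERBOSE=1 ;;
      --)           shift; break ;;
      *)            echo "unknown option: $1" >&2; return 1 ;;
    esac
    shift
  done
}
```

Inside a script this would typically be called as `parse_args "$@"`. Adding an option only takes one more case branch, which is what keeps the pattern flexible.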

Set innodb_stats_on_metadata=0, which prevents statistics updates when you query information_schema.
mysql> select count(*),sum(data_length) from information_schema.tables;
+----------+------------------+
| count(*) | sum(data_length) |
+----------+------------------+
|     5581 |    3051148872493 |
+----------+------------------+
1 row in set (3 min 21.82 sec)
mysql> show variables like '%metadata'
+--------------------------+-------+
| Variable_name            | Value |
+--------------------------+-------+
| innodb_stats_on_metadata | ON    |
+--------------------------+-------+…»

1) Start the import of data.sql into a dummy db while both instances are running.
2) Run pt-stalk --collect --collect-oprofile --no-stalk for the duration of the import; oprofile will show where MySQL spends most of its time during the import.
3) Run pt-diskstats -g all --devices-regex sdb1 for the duration of the import.
4) Run poor-man-profiler for…»

I often find this script, published at java.net (http://java.net/projects/printgcstats/sources/svn/show), very useful for evaluating GC behavior and for RCA.
# PrintGCStats - summarize statistics about garbage collection, in particular gc
# pause time totals, averages, maximum and standard deviations.
#
# Attribution: written by John Coomes, based on earlier work by Peter Kessler,
# Ross Knippel…»