Archive for Attila

I measured the memory bandwidth of a server using the popular STREAM benchmark tool. I compiled the STREAM code with the working array size set to 500 MB. The number of threads accessing memory was controlled by setting the environment variable OMP_NUM_THREADS to 1, 2, 5, 10, 20, 30, and 50. To compile STREAM, I…»
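The thread sweep described above can be sketched as a small bash loop. This is only an illustration: the binary path ./stream and the names STREAM_BIN and run_stream_sweep are assumptions, not taken from the original post.

```bash
#!/usr/bin/env bash
# Hypothetical path to the compiled STREAM binary; override with STREAM_BIN.
STREAM_BIN="${STREAM_BIN:-./stream}"

# Run STREAM once per thread count, setting OMP_NUM_THREADS for each run only.
run_stream_sweep() {
  local threads
  for threads in 1 2 5 10 20 30 50; do
    echo "=== OMP_NUM_THREADS=${threads} ==="
    OMP_NUM_THREADS="${threads}" "${STREAM_BIN}"
  done
}

# Usage (assumes the binary exists at STREAM_BIN):
#   run_stream_sweep | tee stream_results.txt
```

Setting the variable as a prefix to the command keeps each value scoped to a single run, so nothing leaks into the surrounding shell.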

Oftentimes a few lines of bash script can go a long way. I have been using the following few lines and their variations for many years, and they saved me a lot of time last week when I needed to perform a task in high volume. All it does is read comma-separated…»
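A minimal sketch of this kind of loop, reading a comma-separated file record by record, might look like the following. The function name process_csv and the three column names are hypothetical; the original script is not shown in this excerpt.

```bash
#!/usr/bin/env bash
# Read a comma-separated file line by line and act on each record.
# The columns (name, host, port) are placeholders for illustration.
process_csv() {
  local file="$1" name host port
  # The "|| [ -n ... ]" part also handles a last line without a trailing newline.
  while IFS=, read -r name host port || [ -n "${name}" ]; do
    # Skip blank lines and lines starting with "#".
    [ -z "${name}" ] && continue
    case "${name}" in \#*) continue ;; esac
    echo "name=${name} host=${host} port=${port}"
  done < "${file}"
}

# Usage:
#   process_csv servers.csv
```

Replacing the echo with the real per-record command turns this into a simple batch runner.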

While looking at a threading-related issue the other day, I used the following commands for diagnostics. Collecting paging activity information: to collect paging data, use the following command: vmstat {time_between_samples_in_seconds} {number_of_samples} > vmstat.txt (for example, vmstat 10 10 > vmstat.txt). If you start vmstat when the problem occurs, a value of 10 for time_between_samples_in_seconds and…»

I had a task the other day where I had 110 GB of compressed log files that I wanted to import into Impala (Cloudera). Currently, Impala does not support compressed files, so I had to decompress them all. I created this handy script and thought you might find it useful. I mounted the S3 bucket using s3fs…»
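As a rough illustration of the decompression step, the sketch below walks a directory and writes an uncompressed copy of every file it finds. It assumes the logs are gzip files ending in .gz, which the excerpt does not state; the function name decompress_all and both directory arguments are hypothetical.

```bash
#!/usr/bin/env bash
# Decompress every *.gz file under src_dir into dest_dir, keeping the originals.
# Assumption: gzip-compressed logs; files with the same base name would collide.
decompress_all() {
  local src_dir="$1" dest_dir="$2" f
  mkdir -p "${dest_dir}"
  find "${src_dir}" -type f -name '*.gz' -print0 |
    while IFS= read -r -d '' f; do
      gunzip -c "${f}" > "${dest_dir}/$(basename "${f}" .gz)"
    done
}

# Usage (paths are placeholders):
#   decompress_all /mnt/s3/logs /data/logs_uncompressed
```

Using gunzip -c leaves the compressed originals untouched, which is helpful when the source is a mounted bucket.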

s3fs is an open-source project that lets you mount your S3 storage locally so that you can access and work with your files at the system level. I use this method to mount S3 buckets on my EC2 instances. Below, I go through the installation steps and also document some of…»

I often come across situations where a server comes under high CPU load. There is a simple way to find out which application thread(s) are responsible for the load. To get the thread which has the highest CPU load, issue: ps -mo pid,lwp,stime,time,cpu -p {pid}. LWP stands for Light-Weight Process and typically refers to kernel threads….»
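A variation on the same idea, sketched below, lists the busiest threads of one process first. It assumes the Linux procps version of ps (the -L and --sort options); the function name top_threads is hypothetical and this is not the exact command from the post.

```bash
#!/usr/bin/env bash
# Show the threads (LWPs) of a process, sorted by CPU usage, highest first.
# Assumption: procps ps on Linux, which supports -L and --sort.
top_threads() {
  local pid="$1" count="${2:-10}"
  # One header line plus the requested number of threads.
  ps -L -o pid,lwp,pcpu,stime,time,comm -p "${pid}" --sort=-pcpu |
    head -n "$((count + 1))"
}

# Usage (the PID is a placeholder):
#   top_threads 12345 5
```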

I wanted to create a simple yet flexible way to parse command-line arguments in bash. I used a case statement and some expansion techniques to read arguments in a simple manner. I find this very handy, and I hope you will find it useful in solving or simplifying your task as well. Whether it is…»
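One way to combine a case statement with parameter expansion is sketched below. It is an illustration rather than the script from the post: the function name parse_args, the --key=value argument style, and the opt_ variable prefix are all assumptions.

```bash
#!/usr/bin/env bash
# Parse "--key=value" and "--flag" arguments into variables named opt_<key>.
parse_args() {
  local arg key value
  for arg in "$@"; do
    case "${arg}" in
      --*=*)
        key="${arg%%=*}"    # drop everything from the first "=" onward
        key="${key#--}"     # drop the leading "--"
        value="${arg#*=}"   # drop everything up to and including the first "="
        ;;
      --*)
        key="${arg#--}"
        value=1             # bare flags are stored as 1
        ;;
      *)
        echo "unrecognized argument: ${arg}" >&2
        return 1
        ;;
    esac
    key="${key//-/_}"       # dashes are not valid in variable names
    case "${key}" in
      ''|*[!A-Za-z0-9_]*) echo "invalid option name: ${arg}" >&2; return 1 ;;
    esac
    printf -v "opt_${key}" '%s' "${value}"
  done
}

# Usage:
#   parse_args --input-file=data.csv --verbose
#   echo "${opt_input_file} ${opt_verbose}"
```

Because the value is taken after the first "=" only, values that themselves contain "=" are preserved intact.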

Set innodb_stats_on_metadata=0 to prevent statistics updates when you query information_schema.
mysql> select count(*),sum(data_length) from information_schema.tables;
+----------+------------------+
| count(*) | sum(data_length) |
+----------+------------------+
|     5581 |    3051148872493 |
+----------+------------------+
1 row in set (3 min 21.82 sec)
mysql> show variables like '%metadata'
+--------------------------+-------+
| Variable_name            | Value |
+--------------------------+-------+
| innodb_stats_on_metadata | ON    |
+--------------------------+-------+…»

1) Start the import of data.sql into a dummy db while both instances are running.
2) Run pt-stalk --collect --collect-oprofile --no-stalk for the duration of the import; oprofile will show where MySQL spends most of its time during the import.
3) Run pt-diskstats -g all --devices-regex sdb1 for the duration of the import.
4) Run poor-man-profiler for…»