I measured the memory bandwidth of a server using the popular STREAM benchmark tool. I compiled the STREAM code with the working array size set to 500 MB. The number of threads accessing memory was controlled by setting the environment variable OMP_NUM_THREADS to 1, 2, 5, 10, 20, 30 and 50.

To compile STREAM, I used the following compile command:

gcc -m64 -mcmodel=medium -O -fopenmp stream.c \
-DSTREAM_ARRAY_SIZE=500000000 -DNTIMES=[10-1000] \
-o stream_multi_threaded_500MB_[10-1000]TIMES

# Compiled Stream packages
# stream_multi_threaded_500MB_10TIMES
# stream_multi_threaded_500MB_100TIMES
# stream_multi_threaded_500MB_1000TIMES
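The [10-1000] placeholder above stands for the three separate compiles listed. A small loop can generate all of the variants; here is a sketch written as a dry run (it echoes the gcc command for each NTIMES value rather than running it — drop the echo to actually build):

```shell
# Dry run: print the gcc command for each NTIMES variant.
# Remove the leading "echo" to perform the actual compiles.
for n in 10 100 1000; do
  echo gcc -m64 -mcmodel=medium -O -fopenmp stream.c \
       -DSTREAM_ARRAY_SIZE=500000000 -DNTIMES=$n \
       -o stream_multi_threaded_500MB_${n}TIMES
done
```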

Above, I compiled multiple versions of STREAM so I could see the effect of varying the iteration count from 10 to 1000. Then I created a wrapper bash script to execute STREAM and collect its output:

#!/bin/bash
#################################################
# STREAM Harness to analyze memory bandwidth
#################################################
bench_home=$HOME/stream
out_home=$bench_home/out
bench_exec=stream_multi_threaded_500MB_1000TIMES
host=`hostname`
echo "Running Test: $bench_exec"
# Timer
elapsed() {
   (( seconds  = SECONDS ))
   "$@"
   (( seconds = SECONDS - seconds ))
   (( etime_seconds = seconds % 60 ))
   (( etime_minutes = ( seconds - etime_seconds ) / 60 % 60 ))
   (( etime_hours   = seconds / 3600 ))
   (( verif = etime_seconds + (etime_minutes * 60) + (etime_hours * 3600) ))
   echo "Elapsed time: ${etime_hours}h ${etime_minutes}m ${etime_seconds}s"
 }
mem_stream() {
 for n in 1 2 5 10 20 30 50
  do
   export OMP_NUM_THREADS=$n
   $bench_home/$bench_exec  > $out_home/$host.memory.$n.txt
   echo "Thread $OMP_NUM_THREADS complete"
  done
}
# Main
elapsed mem_stream
exit 0
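Once the harness has run, a one-liner can pull the headline number out of each per-thread-count output file. The helper below is a sketch that assumes the standard STREAM result line format (e.g. "Triad:          45123.4     0.0222     0.0221     0.0224"); the file names match those written by the harness above.

```shell
# Extract the Triad "Best Rate MB/s" from a STREAM output file.
# Assumes the standard STREAM result line layout (column 2 is the rate).
triad_rate() {
  awk '/^Triad:/ { print $2 }' "$1"
}

# e.g. summarize bandwidth vs. thread count:
# for n in 1 2 5 10 20 30 50; do
#   echo "$n threads: $(triad_rate $out_home/$host.memory.$n.txt) MB/s"
# done
```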


I hope this writeup helps you get on the right path with STREAM.

Often a few lines of bash script can go a long way. I have been using the following few lines (and variations of them) for many years, and they saved me a lot of time last week while addressing a task I needed to perform in high volume. All the script does is read comma-separated values from an input file and perform an operation for each row. I believe any task repeated more than once should be automated. Besides, I have never been a typing wiz. Below is a simplified skeleton that reads values from an input file, passed as an argument to the script, and echoes them.

#!/bin/bash
###################################
# Takes param1 and param2 from file
# csv file:
###################################
# Set variables
INPUT_FILE=$1
OLDIFS=$IFS
IFS=,

[ ! -f "$INPUT_FILE" ] && { echo "$INPUT_FILE file not found"; exit 1; }
while read dbhost dbname
do
echo "DB Host : ${dbhost} DB Name : ${dbname}"
# Do parametrized command here
# Add logic for processing below....

# Conclude your processing and add error handling as needed
echo "Task complete for ${dbhost}"
done < "$INPUT_FILE"
IFS=$OLDIFS

exit 0

Paste the above into a script, set the file permission to be executable and have fun.
chmod +x readfile.sh
./readfile.sh input.csv
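To make the skeleton concrete, here is a minimal end-to-end sketch with a made-up input.csv (the host and database names are placeholders) and the read loop inlined:

```shell
# Create a hypothetical two-column input file (dbhost,dbname)
cat > input.csv <<'EOF'
db01.example.com,sales
db02.example.com,inventory
EOF

# Same loop as in the script above: split each line on the comma
while IFS=, read -r dbhost dbname; do
  echo "DB Host : ${dbhost} DB Name : ${dbname}"
done < input.csv
```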

If you are a Linux sysadmin or sysadmin “wannabe”, a DBA, or just a software developer .. or simply tired of typing and want to do things at scale, this simple script will save you a lot of time!

While looking at some threading related issue the other day, I used the following commands for diagnostics.

Collecting paging activity information

To collect paging data, use the following command:

vmstat {time_between_samples_in_seconds} {number_of_samples} \
> vmstat.txt
vmstat 10 10 > vmstat.txt

If you start vmstat when the problem occurs, a value of 10 for time_between_samples_in_seconds and 10 for number_of_samples usually ensures that enough data is collected during the problem. Collect the vmstat.txt file after about 100 seconds (10 samples, 10 seconds apart).
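Once collected, a quick awk pass gives a first read on paging pressure. This sketch averages the swap-in/swap-out columns; the column positions (7 and 8 for si/so) assume the classic vmstat layout, so check them against the header line of your own output:

```shell
# Average the si/so (swap-in/swap-out) columns of a vmstat capture.
# NR > 2 skips the two header lines; columns 7/8 assume classic vmstat layout.
swap_summary() {
  awk 'NR > 2 { si += $7; so += $8; n++ }
       END { if (n) printf "avg si=%.1f so=%.1f over %d samples\n", si/n, so/n, n }' "$1"
}

# swap_summary vmstat.txt
```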

Collecting system CPU usage information

You can gather CPU usage information using the following command. Note that in batch mode top runs until interrupted, so bound the capture with -n and -d (here, 12 samples taken 5 seconds apart):

top -b -n 12 -d 5 > top.txt

You can then collect the top.txt file.

Collecting process CPU usage information
Gather process and thread-level CPU activity information at the point at which the problem occurs, using the following command:

top -H -b -c > top_threads.txt
cat top_threads.txt

top - 06:22:10 up 192 days, 19:00,  1 user,  load average: 0.00, 0.00, 0.00
Tasks: 542 total,   1 running, 541 sleeping,   0 stopped,   0 zombie
Cpu(s):  1.0%us,  0.0%sy,  0.0%ni, 99.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  164936028k total, 160272700k used,  4663328k free,        0k buffers
Swap:        0k total,        0k used,        0k free, 64188236k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
24741 xxx      22   0 xxxg  xxxg  11m S  0.0 50.0   0:00.00 java

Or, if you would like to look at a specific process by PID, issue:

top -H -b -c -p <pid> > top_threads_pid.txt

Allow this command to run for a short time. It produces a file called top_threads_pid.txt.

I had a task the other day where I had 110 GB of compressed log files that I wanted to import into Impala (Cloudera). Currently, Impala does not support compressed files, so I had to decompress them all. I created this handy script and thought you might find it useful. I mounted the S3 bucket using s3fs, which I covered in an earlier post.

#!/bin/bash
# Utils
elapsed()
{
   (( seconds  = SECONDS ))
   "$@"
   (( seconds = SECONDS - seconds ))
   (( etime_seconds = seconds % 60 ))
   (( etime_minutes = ( seconds - etime_seconds ) / 60 % 60 ))
   (( etime_hours   = seconds / 3600 ))
   (( verif = etime_seconds + (etime_minutes * 60) + (etime_hours * 3600) ))

   echo "Elapsed time: ${etime_hours}h ${etime_minutes}m ${etime_seconds}s"
 }

convert()
{
# Remove the .gz extension from the compressed file name
UFILE=`echo ${FILE:0:${#FILE}-3}`

# Decompress gz file
sudo -u hdfs hdfs dfs -cat /user/hdfs/oms/logs/$FILE | \
sudo -u hdfs gunzip -d | sudo -u hdfs hdfs dfs -put - /user/hdfs/oms/logs/$UFILE

# Discard original gz file
sudo -u hdfs hdfs dfs -rm -skipTrash /user/hdfs/oms/logs/$FILE
sudo -u hdfs hdfs dfs -ls /user/hdfs/oms/logs/$UFILE
}

for FILE in `ls /media/ephemeral0/logs/`
  do
    elapsed convert $FILE
    echo "Decompressed $FILE to $UFILE on hdfs"
  done

exit 0
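As an aside, the substring arithmetic used above to drop the ".gz" can also be written with suffix removal, which is arguably clearer and only strips the extension when it is actually present (the file name here is a made-up example):

```shell
# Suffix removal: ${FILE%.gz} strips a trailing ".gz" if present,
# and leaves the name untouched otherwise.
FILE=access_log.2014-01-01.gz
UFILE=${FILE%.gz}
echo "$UFILE"
```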

s3fs is an open-source project that lets you mount your S3 storage locally so you can access your files at the filesystem level and actually work with them. I use this method to mount S3 buckets on my EC2 instances. Below, I go through the installation steps and also document some of the problems and their workarounds.

Download the s3fs source code to your EC2 instance and decompress it:

[ec2-user@ip-10-xx-xx-xxx ~]$ wget http://s3fs.googlecode.com/files/s3fs-1.63.tar.gz
[ec2-user@ip-10-xx-xx-xxx ~]$ tar -xzf s3fs-1.63.tar.gz
--Make sure your libraries are installed/up-to-date
[ec2-user@ip-10-xx-xx-xxx ~]$ sudo yum install gcc libstdc++-devel gcc-c++ fuse fuse-devel curl-devel libxml2-devel openssl-devel mailcap
[ec2-user@ip-10-xx-xx-xxx ~]$ cd s3fs-1.63
[ec2-user@ip-10-xx-xx-xxx ~]$ ./configure --prefix=/usr

At this point you might get the following error, indicating that s3fs requires a newer version of FUSE (http://fuse.sourceforge.net/):

configure: error: Package requirements (fuse >= 2.8.4 libcurl >= 7.0 libxml-2.0 >= 2.6 libcrypto >= 0.9) were not met:
Requested 'fuse >= 2.8.4' but version of fuse is 2.8.3
Consider adjusting the PKG_CONFIG_PATH environment variable if you
installed software in a non-standard prefix.

Alternatively, you may set the environment variables DEPS_CFLAGS
and DEPS_LIBS to avoid the need to call pkg-config.
See the pkg-config man page for more details.

Follow the steps to upgrade your Fuse posted at http://fuse.sourceforge.net/.

[ec2-user@ip-10-xx-xx-xxx ~]$ wget http://downloads.sourceforge.net/project/fuse/fuse-2.X/2.8.4/fuse-2.8.4.tar.gz
[ec2-user@ip-10-xx-xx-xxx ~]$ tar -xvf fuse-2.8.4.tar.gz
[ec2-user@ip-10-xx-xx-xxx ~]$ cd fuse-2.8.4
[ec2-user@ip-10-xx-xx-xxx ~]$ sudo  yum -y install "gcc*" make libcurl-devel libxml2-devel openssl-devel
[ec2-user@ip-10-xx-xx-xxx ~]$ sudo ./configure --prefix=/usr
[ec2-user@ip-10-xx-xx-xxx ~]$ sudo make && sudo make install
[ec2-user@ip-10-xx-xx-xxx ~]$ sudo ldconfig
--Verify that the new version is now in place
[ec2-user@ip-10-xx-xx-xxx ~]$ pkg-config --modversion fuse
2.8.4

Now we can return to the s3fs installation (re-run ./configure, then make and make install) and add the AWS credentials in the following format: AccessKeyId:SecretAccessKey

[ec2-user@ip-10-xx-xx-xxx ~]$ sudo vi /etc/passwd-s3fs
-- Set file permission
[ec2-user@ip-10-xx-xx-xxx ~]$ sudo chmod 640 /etc/passwd-s3fs

Now you should be able to successfully mount your AWS S3 bucket onto your local folder as such:

[ec2-user@ip-10-xx-xx-xxx ~]$ sudo s3fs <AWS S3 Bucket Name> <Path to local dir on EC2 Instance>

That is about it and thanks for reading!

Often I come across situations where a server comes under high CPU load. There is a simple way to find out which application thread(s) are responsible for the load. To see per-thread CPU usage for a process, issue:

ps -mo pid,lwp,stime,time,cpu -p <pid>

LWP stands for Light-Weight Process and typically refers to a kernel thread. To identify the thread, take the LWP with the highest CPU load and convert its number (xxxx) into a hexadecimal number (0xxxx).
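The conversion is a one-liner with printf. The LWP value here is made up for illustration; substitute the one ps reported:

```shell
# Convert a decimal LWP id into the nid=0x... form that appears
# in a jstack thread dump (24745 is a hypothetical example).
printf 'nid=0x%x\n' 24745
```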

Get the Java stack dump using jstack -l <pid> and find the thread whose nid matches the hex number identified above.

I wanted to create a simple yet flexible way to parse command-line arguments in bash. I used a case statement and parameter expansion to read arguments in a simple manner. I find this very handy and hope you will find it useful in solving or simplifying your tasks as well. Whether it is a serious script or a quick hack, clean programming makes your script more efficient and also easier to understand.

usage() {
      echo -e "No command-line argument\n"
      echo "Usage: $0 <command line arguments>"
      echo "Arguments:"
      echo -e " --copy-from-hdfs\tcopy data set resides in HDFS"
      echo -e " --copy-to-s3\t\tcopy files to S3 in AWS"
      echo -e " --gzip\t\t\tcompress source files, recommended before sending data set to S3"
      echo -e " --remote-dir=\t\tpath to input directory (HDFS directory)"
      echo -e " --local-dir=\t\tlocal tmp directory (local directory)"
      echo -e " --s3-bucket-dir=\ts3 bucket directory in AWS"
      exit 1
}

# Check command line args
if [ -z "$1" ]
 then
  usage
 else
 # Parsing commandline args
 for i in "$@"
 do
  case $i in
  -r=*|--remote-dir=*)
      #DM_DATA_DIR=`echo $i | sed 's/[-a-zA-Z0-9]*=//'`  # this works too, but the parameter expansion below is nicer and more compact
      DM_DATA_DIR=${i#*=}
      ;;
  -l=*|--local-dir=*)
      #AMAZON_DATA_DIR=`echo $i | sed 's/[-a-zA-Z0-9]*=//'`
      AMAZON_DATA_DIR=${i#*=}
      ;;
  -s3=*|--s3-bucket-dir=*)
      #S3_DIR=`echo $i | sed 's/[-a-zA-Z0-9]*=//'`
      S3_DIR=${i#*=}
      ;;
  --copy-from-hdfs)
      COPY_FROM_HDFS=YES
      ;;
  --copy-to-s3)
      COPY_TO_S3=YES
      ;;
  -c|--gzip)
      COMPRESS=YES
      ;;
           *)
      # Unknown option
      ;;
   esac
 done
fi
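For readers new to the ${i#*=} expansion used above: it removes the shortest prefix matching the pattern *=, i.e. everything up to and including the first equals sign, leaving just the option value:

```shell
# ${i#*=} strips the shortest leading match of "*=",
# so "--remote-dir=/user/hdfs/data" yields "/user/hdfs/data".
i='--remote-dir=/user/hdfs/data'
echo "${i#*=}"
```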

Thoughts, and suggestions are welcome!

Set innodb_stats_on_metadata=0 to prevent InnoDB from updating statistics every time you query information_schema.

mysql> select count(*),sum(data_length) from information_schema.tables;
+----------+------------------+
| count(*) | sum(data_length) |
+----------+------------------+
|     5581 |    3051148872493 |
+----------+------------------+
1 row in set (3 min 21.82 sec)
mysql> show variables like '%metadata'
+--------------------------+-------+
| Variable_name            | Value |
+--------------------------+-------+
| innodb_stats_on_metadata | ON    |
+--------------------------+-------+
mysql> set global innodb_stats_on_metadata=0;
mysql> show variables like '%metadata'
+--------------------------+-------+
| Variable_name            | Value |
+--------------------------+-------+
| innodb_stats_on_metadata | OFF   |
+--------------------------+-------+

mysql> select count(*),sum(data_length) from information_schema.tables;
+----------+------------------+
| count(*) | sum(data_length) |
+----------+------------------+
|     5581 |    3051148872493 |
+----------+------------------+
1 row in set (0.49 sec)
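SET GLOBAL only lasts until the server restarts. To make the change permanent, the same setting can go in the server config file (location varies by distribution, commonly /etc/my.cnf):

```ini
[mysqld]
innodb_stats_on_metadata = 0
```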

1) start import of data.sql into a dummy db when both instances are running
2) run pt-stalk --collect --collect-oprofile --no-stalk for the duration of the import

  • oprofile will show where MySQL spends most of its time during the import

3) run pt-diskstats -g all --devices-regex sdb1 for the duration of the import
4) run poor-man's-profiler for the duration of the import

I often find it very useful to use the PrintGCStats script, published at java.net (http://java.net/projects/printgcstats/sources/svn/show), to evaluate GC behavior and do root-cause analysis.

# PrintGCStats - summarize statistics about garbage collection, in particular gc
# pause time totals, averages, maximum and standard deviations.
#
# Attribution: written by John Coomes, based on earlier work by Peter Kessler,
# Ross Knippel and Jon Masamitsu.
# This version is based off of a version posted on the OpenJDK
# mailing list on 04/20/2007 and available at:-
# http://article.gmane.org/gmane.comp.java.openjdk.hotspot.gc.devel/51/match=gc+log+reader
# Modifications by Y. Srinivas Ramakrishna
#
# The input to this script should be the output from the HotSpot(TM)
# Virtual Machine when run with one or more of the following flags:
#
# -verbose:gc # produces minimal output so statistics are
# # limited, but available in all VMs
#
# -XX:+PrintGCTimeStamps # enables time-based statistics (e.g.,
# # allocation rates, intervals), but only
# # available in JDK 1.4.0 and later.
#
# -XX:+PrintGCDetails # enables more detailed statistics gathering,
# # but only available in JDK 1.4.1 and later.
#
# -XX:-TraceClassUnloading # [1.5.0 and later] disable messages about class
# # unloading, which are enabled as a side-effect
# # by -XX:+PrintGCDetails. The class unloading
# # messages confuse this script and will cause
# # some GC information in the log to be ignored.
# #
# # Note: This option only has an effect in 1.5.0
# # and later. Prior to 1.5.0, the option is
# # accepted, but is overridden by
# # -XX:+PrintGCDetails. In 1.4.2 and earlier
# # releases, use -XX:-ClassUnloading instead (see
# # below).
#
# -XX:-ClassUnloading # disable class unloading, since PrintGCDetails
# # turns on TraceClassUnloading, which cannot be
# # overridden from the command line until 1.5.0.
#
# Recommended command-line with JDK 1.5.0 and later:
#
# java -verbose:gc -XX:+PrintGCTimeStamps -XX:+PrintGCDetails \
# -XX:-TraceClassUnloading ...
#
# Recommended command-line with JDK 1.4.1 and 1.4.2:
#
# java -verbose:gc -XX:+PrintGCTimeStamps -XX:+PrintGCDetails \
# -XX:-ClassUnloading ...
#
# ------------------------------------------------------------------------------
#
# Usage:
#
# PrintGCStats -v cpus=<n> [-v interval=<seconds>] [-v verbose=1] [file ...]
# PrintGCStats -v plot=name [-v plotcolumns=<n>] [-v verbose=1] [file ...]
#
# cpus - number of cpus on the machine where java was run, used to
# compute cpu time available and gc 'load' factors. No default;
# must be specified on the command line (defaulting to 1 is too
# error prone).
#
# ncpu - synonym for cpus, accepted for backward compatibility
#
# interval - print statistics at the end of each interval; requires
# output from -XX:+PrintGCTimeStamps. Default is 0 (disabled).
#
# plot - generate data points useful for plotting one of the collected
# statistics instead of the normal statistics summary. The name
# argument is the name of one of the output statistics, e.g.,
# "gen0t(s)", "cmsRM(s)", "commit0(MB)", etc.
#
# The default output format for time-based statistics such as
# "gen0t(s)" includes four columns, described below. The
# default output format for size-based statistics such as
# "commit0(MB)" includes just the first two columns. The
# number of columns in the output can be set on the command
# line with -v plotcolumns=<N>.
#
# The output columns are:
#
# 1) the starting timestamp if timestamps are present, or a
# simple counter if not
#
# 2) the value of the desired statistic (e.g., the length of a
# cms remark pause).
#
# 3) the ending timestamp (or counter)
#
# 4) the value of the desired statistic (again)
#
# The last column is to make plotting start & stop events
# easier.
#
# plotcolumns - the number of columns to include in the plot data.
#
# verbose - if non-zero, print each item on a separate line in addition
# to the summary statistics
#
# Typical usage:
#
# PrintGCStats -v cpus=4 gc.log > gc.stats
#
# ------------------------------------------------------------------------------
#
# Basic Output statistics:
#
# gen0(s) - young gen collection time, excluding gc_prologue & gc_epilogue.
# gen0t(s) - young gen collection time, including gc_prologue & gc_epilogue
# gen1i(s) - train generation incremental collection
# gen1t(s) - old generation collection/full GC
# cmsIM(s) - CMS initial mark pause
# cmsRM(s) - CMS remark pause
# cmsRS(s) - CMS resize pause
# GC(s) - all stop-the-world GC pauses
# cmsCM(s) - CMS concurrent mark phase
# cmsCP(s) - CMS concurrent preclean phase
# cmsCS(s) - CMS concurrent sweep phase
# cmsCR(s) - CMS concurrent reset phase
# alloc(MB) - object allocation in MB (approximate***)
# promo(MB) - object promotion in MB (approximate***)
# used0(MB) - young gen used memory size (before gc)
# used1(MB) - old gen used memory size (before gc)
# used(MB) - heap space used memory size (before gc) (excludes perm gen)
# commit0(MB) - young gen committed memory size (after gc)
# commit1(MB) - old gen committed memory size (after gc)
# commit(MB) - heap committed memory size (after gc) (excludes perm gen)
# apptime(s) - amount of time application threads were running
# safept(s) - amount of time the VM spent at safepoints (app threads stopped)
#
# *** - these values are approximate because there is no way to track
# allocations that occur directly into older generations.
#
# Some definitions:
#
# 'mutator' or 'mutator thread': a gc-centric term referring to a non-GC
# thread that modifies or 'mutates' the heap by allocating memory and/or
# updating object fields.
#
# promotion: when an object that was allocated in the young generation has
# survived long enough, it is copied, or promoted, into the old generation.
#
# Time-based Output Statistics (require -XX:+PrintGCTimeStamps):
#
# alloc/elapsed_time - allocation rate, based on elapsed time
# alloc/tot_cpu_time - allocation rate, based on total cpu time
# alloc/mut_cpu_time - allocation rate, based on cpu time available to mutators
# promo/elapsed_time - promotion rate, based on elapsed time
# promo/gc0_time - promotion rate, based on young gen gc time
# gc_seq_load - the percentage of cpu cycles used by gc 'serially'
# (i.e., while java application threads are stopped)
# gc_conc_load - the percentage of cpu cycles used by gc 'concurrently'
# (i.e., while java application threads are also running)
# gc_tot_load - the percentage of cpu cycles spent in gc