Archive for September 2012

I wanted to create a simple yet flexible way to parse command line arguments in bash. I used case statement, and some expression expansion technique to read arguments in a simple manner. I find this very handy, and hoping you will find it useful in solving or simplifying your task as well. Whether it is a serious script or a quick hack, clean programming makes your script more efficient and also easier to understand.

usage() {
      echo -e "No command-line argument\n"
      echo "Usage: $0 "
      echo "Arguments:"
      echo -e " --copy-from-hdfs\tcopy data set resides in HDFS"
      echo -e " --copy-to-s3\t\tcopy files to S3 in AWS"
      echo -e " --gzip\t\t\tcompress source files, recommended before sending data set to S3"
      echo -e " --remote-dir=\t\tpath to input directory (HDFS directory)"
      echo -e " --local-dir=\t\tlocal tmp directory (local directory)"
      echo -e " --s3-bucket-dir=\ts3 bucket directory in AWS"
      exit 1
}

# Check command line args
if [ -z $1 ]
 then
  usage
 else
 # Parsing commandline args
 for i in $*
 do
  case $i in
  -r=*|--remote-dir=*)
      #DM_DATA_DIR=`echo $i | sed 's/[-a-zA-Z0-9]*=//'`  -- > this work but using expression expansion below is a much nicer and compact way 
      DM_DATA_DIR=${i#*=}
      ;;
  -l=*|--local-dir=*)
      #AMAZON_DATA_DIR=`echo $i | sed 's/[-a-zA-Z0-9]*=//'`
      AMAZON_DATA_DIR=${i#*=}
      ;;
  -s3=*|--s3-bucket-dir=*)
      #S3_DIR=`echo $i | sed 's/[-a-zA-Z0-9]*=//'`
      S3_DIR=${i#*=}
      ;;
  --copy-from-hdfs)
      COPY_FROM_HDFS=YES
      ;;
  --copy-to-s3)
      COPY_TO_S3=YES
      ;;
  -c|--gzip)
      COMPRESS=YES
      ;;
           *)
      # Unknown option
      ;;
   esac
 done

Thoughts, and suggestions are welcome!

Set innodb_stats_on_metadata=0 which will prevent statistic update when you query information_schema.

mysql> select count(*),sum(data_length) from information_schema.tables;
+----------+------------------+
| count(*) | sum(data_length) |
+----------+------------------+
|     5581 |    3051148872493 |
+----------+------------------+
1 row in set (3 min 21.82 sec)
mysql> show variables like '%metadata'
+--------------------------+-------+
| Variable_name            | Value |
+--------------------------+-------+
| innodb_stats_on_metadata | ON    |
+--------------------------+-------+
mysql> set global innodb_stats_on_metadata=0;
mysql> show variables like '%metadata'
+--------------------------+-------+
| Variable_name            | Value |
+--------------------------+-------+
| innodb_stats_on_metadata | OFF   |
+--------------------------+-------+

mysql> select count(*),sum(data_length) from information_schema.tables;
+----------+------------------+
| count(*) | sum(data_length) |
+----------+------------------+
|     5581 |    3051148872493 |
+----------+------------------+
1 row in set (0.49 sec)