Building a spell-checker using soundex

One of the little-known facts about MySQL is that it provides a SOUNDEX() function. Soundex is a well-known phonetic algorithm that indexes words so that ‘homophones’ get the same encoding. So, let's see it in action on a MySQL server.


mysql> SELECT SOUNDEX('hello');
+------------------+
| SOUNDEX('hello') |
+------------------+
| H400             |
+------------------+
1 row in set (0.01 sec)

mysql> SELECT SOUNDEX('hellow');
+-------------------+
| SOUNDEX('hellow') |
+-------------------+
| H400              |
+-------------------+
1 row in set (0.00 sec)

Great! As we can see, the Soundex encoding has a fixed format, Lnnn: the first letter of the supplied word followed by a 3-digit number. (For longer words MySQL's SOUNDEX() can return more than four characters; the first four are the standard code.) The output should be the same for most homophones (words with similar pronunciation). Let us now harness this fact to build a ‘not-so-perfect spell checker’.

First, create a table that stores every valid word along with its Soundex code. For brevity, I will just use ‘hello’.


mysql> CREATE TABLE `spell_checker` (`word` VARCHAR(50), `code` VARCHAR(4));
Query OK, 0 rows affected (0.13 sec)

mysql> INSERT INTO `spell_checker` VALUES ('hello', SOUNDEX('hello'));
Query OK, 1 row affected (0.00 sec)

Now, once we have the soundex store in place, we can use it as the spell-checker backend to provide the correct spelling for an invalid word.


# hello
mysql> SELECT `word` FROM `spell_checker` WHERE SOUNDEX('hello') = `code`;
+-------+
| word  |
+-------+
| hello |
+-------+
1 row in set (0.00 sec)

# hellow
mysql> SELECT `word` FROM `spell_checker` WHERE SOUNDEX('hellow') = `code`;
+-------+
| word  |
+-------+
| hello |
+-------+
1 row in set (0.00 sec)

# hallow
mysql> SELECT `word` FROM `spell_checker` WHERE SOUNDEX('hallow') = `code`;
+-------+
| word  |
+-------+
| hello |
+-------+
1 row in set (0.01 sec)

# hallo
mysql> SELECT `word` FROM `spell_checker` WHERE SOUNDEX('hallo') = `code`;
+-------+
| word  |
+-------+
| hello |
+-------+
1 row in set (0.00 sec)

As we can see, the output is the correct word even if we feed in a misspelled one.
Sounds interesting, doesn't it?
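
The same lookup can also be sketched outside the database. The Go program below is only an illustration: it assumes the classic American Soundex rules (MySQL's variant may encode some words differently), and the names digit, soundex, buildIndex and suggest are made up for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// digit returns the Soundex digit for a lowercase letter,
// or 0 for letters that carry no code (vowels, h, w, y).
func digit(c byte) byte {
	switch c {
	case 'b', 'f', 'p', 'v':
		return '1'
	case 'c', 'g', 'j', 'k', 'q', 's', 'x', 'z':
		return '2'
	case 'd', 't':
		return '3'
	case 'l':
		return '4'
	case 'm', 'n':
		return '5'
	case 'r':
		return '6'
	}
	return 0
}

// soundex encodes word as a letter followed by three digits. It follows the
// classic rules (an assumption), not necessarily MySQL's SOUNDEX().
func soundex(word string) string {
	var out []byte
	var prev byte
	lower := strings.ToLower(word)
	for i := 0; i < len(lower) && len(out) < 4; i++ {
		c := lower[i]
		if c < 'a' || c > 'z' {
			continue // ignore anything that is not an ASCII letter
		}
		d := digit(c)
		switch {
		case len(out) == 0:
			out = append(out, c-'a'+'A') // keep the first letter, upper-cased
		case c == 'h' || c == 'w':
			continue // h and w do not separate repeated codes
		case d != 0 && d != prev:
			out = append(out, d)
		}
		prev = d // a vowel resets prev to 0, so a repeated code counts again
	}
	if len(out) == 0 {
		return ""
	}
	for len(out) < 4 {
		out = append(out, '0') // pad short codes with zeros
	}
	return string(out)
}

// buildIndex groups the dictionary words by their Soundex code.
func buildIndex(words []string) map[string][]string {
	index := make(map[string][]string)
	for _, w := range words {
		code := soundex(w)
		index[code] = append(index[code], w)
	}
	return index
}

// suggest returns the dictionary words that share the input's code.
func suggest(index map[string][]string, input string) []string {
	return index[soundex(input)]
}

func main() {
	index := buildIndex([]string{"hello"})
	for _, w := range []string{"hello", "hellow", "hallow", "hallo"} {
		fmt.Println(w, "->", suggest(index, w))
	}
}
```

Unrelated words can share a code too, so with a real dictionary the suggestions would probably need a second ranking step (edit distance, for example) before being shown to a user.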

Experience Ajax

Lately, I thought of trying my hand at classic Ajax alongside the Go language. The requirement is a simple web server that returns the current timestamp, which the page requests once every second. I have written a small program that does exactly that. The Ajax part is taken care of by jQuery.

/* tick-tock.go */
package main

import (
	"fmt"
	"log"
	"net/http"
	"time"
)

// Content for the main html page..
var page = `<html>
<head>
<script type="text/javascript" src="http://ajax.googleapis.com/ajax/libs/jquery/1.3.2/jquery.min.js"></script>
</head><body>
<h1>Go Timer (ticks every second!)</h1>
<div id="output"></div>
<script type="text/javascript">
$(document).ready(function () {
    // Pass the function itself rather than a string to be evaluated.
    setInterval(delayedPost, 1000);
});
function delayedPost() {
    $.post("http://localhost:9999/gettime", "", function(data, status) {
      $("#output").append("<br>");
      $("#output").append(data);
    });
}
</script>
</body></html>`

// handler for the main page.
func handler(w http.ResponseWriter, r *http.Request) {
	fmt.Fprintf(w, "%s", page)
}

// handler that serves the AJAX requests
func handlerGetTime(w http.ResponseWriter, r *http.Request) {
	// Format already returns a string, so no conversion is needed.
	fmt.Fprint(w, time.Now().Format("20060102-15:04:05"))
}

func main() {
	http.HandleFunc("/time", handler)
	http.HandleFunc("/gettime", handlerGetTime)
	log.Fatal(http.ListenAndServe(":9999", nil))
}

To see it in action, run it with “go run tick-tock.go”. While the program is running, it is a tiny web server listening for requests on port 9999. Now all you need to do is fire up a browser and request the page from the tick-tock server: http://localhost:9999/time

Go is awesome, isn't it?

A bare-bones logger for Go

While working on a Go library, I was looking for an elegant solution for logging. After following many discussions on the forum, I came up with the following bare-bones logger, which I think is decently performant and clean. Check it out!

package main

import (
        "flag"
        "log"
)

// Logger holds the function that every log call is routed through.
type Logger struct {
        logFunc func(string, ...interface{})
}

// doLog forwards the message to the standard logger.
func doLog(format string, args ...interface{}) {
        log.Printf(format, args...)
}

// doNotLog discards the message.
func doNotLog(format string, args ...interface{}) {
        // Do nothing..
}

// initLogger picks the logging function once, so later calls need no check.
func (logger *Logger) initLogger(enableLog bool) {
        if enableLog {
                logger.logFunc = doLog
        } else {
                logger.logFunc = doNotLog
        }
}

func main() {
        logger := new(Logger)

        // command line option
        enableLog := flag.Bool("enable_log", false, "Enable logging")
        flag.Parse()

        logger.initLogger(*enableLog)

        logger.logFunc("%s", "log me!")
        logger.logFunc("%s", "log me again..!")
}

Output:

$ ./logger
$ ./logger --enable_log
2013/01/28 11:41:19 log me!
2013/01/28 11:41:19 log me again..!
$

Locating current position on map

HTML5 introduced a new set of specifications, which includes the geolocation API. Using it, one can find the location of a device. While exploring it further, I found a simple JavaScript program that neatly demonstrates the overall idea (reference (i)). The following code has been modified a bit from the original to make it more verbose.

  if (navigator.geolocation) {

    navigator.geolocation.getCurrentPosition(

      // success callback, where 'position' holds the found coordinates.
      function(position) {
        // Instantiate a Google Maps LatLng object using the position coordinates.
        var initialLocation = new google.maps.LatLng(position.coords.latitude,
                                                     position.coords.longitude);
        /*..
          Now 'initialLocation' can be used to place a marker on the map or
          other coordinates related stuff.
        ..*/
      },

      // error callback, where 'positionError' holds various error related
      // attributes (see reference (ii) for details).
      function(positionError) {
        // log the error message
        console.log("Geolocation service failed. MSG: " + positionError.message);
      },

      // options: accept a cached position of any age, give up after 10 seconds.
      {maximumAge: Infinity, timeout: 10000});
  }

  // Browser doesn't support Geolocation
  else {
    console.log("Your browser doesn't support geolocation.");
  }

References:
i) https://developers.google.com/maps/articles/geolocation
ii) http://dev.w3.org/geo/api/spec-source.html

Shell script: recursive find/replace

Lately, I needed to find and replace a specific string in all the files under a directory tree while keeping the file permissions intact. Here is the shell script that I eventually came up with.

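#!/bin/bash

cd target_dir || exit 1

# Keep the temp file outside the tree so find never sees it.
tmp=$(mktemp) || exit 1

# -type f skips directories; -print0 with read -d '' copes with odd file names.
find . -type f -print0 | while IFS= read -r -d '' _file; do
  # cp (rather than mv) overwrites the contents in place, so the original
  # file keeps its permissions.
  sed "s/SEARCH_STR/REPLACE_STR/g" "$_file" > "$tmp" && cp "$tmp" "$_file"
done

rm -f "$tmp"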

Hope this helps somebody. Enjoy!

Shell: test: argument expected

The following shell script works fine on many shell interpreters, but it fails with the ‘test: argument expected’ error if you execute it under sh on the Solaris platform.

#!/usr/bin/sh

touch /tmp/i_exist
if [ -e /tmp/i_exist ]; then
  echo "file exists!"
fi

The script above checks whether the given file exists using the shell’s [ condition ] construct.
As per Solaris test(1):


-e file — True if file exists. (Not available in sh.)

A simple solution to this issue is to use ‘-f’ instead. It checks for the existence of a regular file and is supposed to work on most existing shell interpreters.