Saliya's Blogs

Mostly technical stuff with some interesting moments of life

K-Means Clustering with Chapel

No comments
K-Means clustering is simple to comprehend, yet is a good candidate to benefit from parallelization. The idea is to partition N data points into K clusters by assigning points to the nearest cluster center. The program starts with a random K number of centers. These centers are then refined iteratively by taking the mean of nearest points to each center. The refinement continues until the change in centers falls below a predefined threshold or until a fixed number of iterations have completed.

Full working code is at
  • Declaration of points, dimension of a point, iterations, and number of clusters. Note. These are configurable at start time by passing them as command line arguments. For example passing --numPoints 50000 will set the number of points to 50000 instead of 2000.
     config var numDim: int = 2,
                numClusters: int = 4,
                numPoints: int  = 2000,
                numIter: int = 50,
                threshold: real = 0.0001;
  • Definitions of domains. The {0 .. #N} notation indicates the integer range starting at zero and counting up to N number of values, i.e. 0, 1, 2 ... N-1
     const PointsSpace = {0..#numPoints,0..#numDim};
     const ClusterSpace = {0..#numClusters,0..#numDim};
     const ClusterNumSpace = {0..#numClusters};
  • Block distribute points along zeroth dimension across locales. The array Locales and the numLocales are made available to programs by the Chapel runtime and we have reshaped the locales into a two dimensional array of numLocales x 1 size. This ensures components of each point stay in the same locale when blocked. 
     var blockedLocaleView = {0..#numLocales,1..1};
     var blockedLocales: [blockedLocaleView] locale = reshape(Locales, blockedLocaleView);
     const BlockedPointsSpace = PointsSpace 
       dmapped  Block(boundingBox=PointsSpace, targetLocales=blockedLocales);
     var points: [BlockedPointsSpace] real;
  • Arrays to hold current centers. Chapel’s rich domain operations make it possible to assign a subset of points as current centers by specifying a range as shown in second line. 
     var currCenters: [ClusterSpace] real;
     currCenters = points[ClusterSpace];  
  • Replicated arrays to keep local centers and updates. Chapel allows transparent access to array elements on remote locales; however, performance may suffer when data is transferred over network. Therefore, it is beneficial to use local arrays when operating on points in a particular locale. Chapel provides a convenient distribution to achieve this, ReplicatedDist, which creates a copy of the array in each locale. The array reference resolves to the local copy in the particular locale where the referring code runs. Also note the use of atomic real and integer variables, which is done as a workaround to non-implemented atomic blocks. 
     const ReplClusterSpace = ClusterSpace dmapped ReplicatedDist();
     var localCurrCenters: [ReplClusterSpace] real;
     // using atomic primitives as a work around to not implemented atomic blocks
     var localCenterUpdates: [ReplClusterSpace] atomic real;
     const ReplClusterNumSpace = ClusterNumSpace dmapped ReplicatedDist();
     var localCenterPCounts: [ReplClusterNumSpace] atomic int;
  • The next steps happen iteratively while refining centers.
    • We start by resetting local arrays as follows. The first line copies the current centers to localCurrCenters array in each locale. The ReplicatedDist in Chapel guarantees the array assignment in the first line happens in each locale. The next two lines initialize the two local arrays, i.e. localCenterUpdates and localCenterPCounts, to zero. Again, the distribution guarantees the two forall loops happen for each local copy of the arrays. These three statements are run in parallel by wrapping inside a cobegin clause.          
          cobegin {
            localCurrCenters = currCenters;
            forall lcu in localCenterUpdates do lcu.write(0.0);
            forall lcpc in localCenterPCounts do lcpc.write(0);
    • Next is to compare the distance for each point against all cluster centers and decide the cluster it belongs to.  Note the shifting of locales using the on clause in line 2, which in turn guarantee the access of current centers, center updates, and center point counts arrays local to the particular locale. We have placed the atomic construct to show where the updates should be done atomically, but it is not implemented in Chapel yet. However, the use of atomic variables, overcomes this and guarantees proper atomic updates. Also note if the number of clusters was large, the for loop could be changed to the parallel forall version with slight modification to the code. Moreover, if parallel task creation was unnecessarily expensive one could easily change the code to use serial execution.
          forall p in {0..#numPoints} {
            on points[p,0] {
              var closestC: int = -1;
              var closestDist: real = MaxDist;
              for c in {0..#numClusters} { // possibility to parallelize
                var dist: atomic real;
                forall d in {0..#numDim} {
                  var tmp = points[p,d] - localCurrCenters[c,d];
                if ( < closestDist) {
                  closestDist =;
                  closestC = c;
              forall d in {0..#numDim} {
                atomic { // here's where we need atomic
    • Then we collate centers in each locale. Note the use of nested forall and cobegin constrcuts. Also, the tmpLCU and tmpLCPC arrays are not distributed, yet Chapel gives seamless access to their elements even when the referring code runs on remote locales.
          var tmpLCU: [ClusterSpace] atomic real;
          forall tlcu in tmpLCU do tlcu.write(0.0);
          var tmpLCPC: [0..#numClusters] atomic int;
          forall tlcpc in tmpLCPC do tlcpc.write(0);

          forall loc in Locales {
            on loc do {
              cobegin {
                forall (i,j) in ClusterSpace {
                forall i in {0..#numClusters} {
    • Finally, we test for convergence. If it has converged or the number of iterations has exceeded the given limit we will stop and will print results.
          var b: atomic bool;
          forall (i,j) in ClusterSpace {
            var center: real = tmpLCU[i,j].read()/tmpLCPC[i].read();
            if (abs(center - currCenters[i,j]) > threshold){
            currCenters[i,j] = center;
          converged =;

P.S. I thought I should mention here that this code may not be the most optimal due to the fact that for each point it'll do a locale shift. Ideally, I'd like to implement this in a fashion as "for each locale do for each point", but this requires you knowing the points assigned to a particular locale. Currently, Chapel doesn't have a readily available method to retrieve this info. However, Chapel developers suggested some workaround to my mail on Identifying Local Indices of Distributed Arrays

No comments :

Post a Comment

Automate Phylogenetic Tree Coloring: Dendroscope
Dendroscope ( is a neat tool that we (SALSA group in IU) have been using in our research work related to phylogenetic trees. Recently we encountered a scenario where leaf nodes need to be colored according to a corresponding cluster number. We already had generated the mapping from leaf node labels to RGB color strings. To complicate it further, we wanted to color the “lowest single ancestor” (LSA) induced sub trees for each cluster in the same color.

Hard Labor

Open tree file using Dendroscope –> select leaf nodes for a particular cluster –> click “Select –> Advanced Selection –> LSA Induced Network” –> Ctrl+J –> check “Label Color” and “Line  Color” –> pick the desired color –> hit “close” –> click somewhere to unselect nodes/edges
Repeat this (except opening file) for each cluster.

Hard Labor with Pain Killer

You can avoid the step in bold above by having a text file with label names for the particular cluster. Then after opening the tree file you can hit Ctrl+f and specify your text file there (using the open folder icon in that). This will go through each line of the text file and select the labeled leaf nodes. This avoids having to click numerous times just to select :). Still you have to go through the pain of selecting the LSA sub tree and modifying colors.

No Pain, YES! gain

The clean way is to automate this process and Dendroscope facilitates this through its command line. Here’s what you need to generate using a simple script written by you.
open file=<path-to-your-tree-file>;select all;set edgewidth=2;set color=0 0 0;set labelcolor=0 0 0;deselect all;
find searchtext=abc;find searchtext=def
select LSA induced network;set color=r g b;set labelcolor=r g b;deselect all;
Some explanation please …
Line 1 will simply open your tree file and set all the edges to a width of 2 and everything to black
For each label in a particular cluster you need to add find searchtext=<label>; as in Line 2 (remember to separate with semi-colon)
Line 3 will select the LSA sub tree for the labels in Line 2 and line and label color to whatever color specified by r g b values
You need to include corresponding Line 2 and 3 for each cluster.
Once you generate this file simply open Dendroscope and go to “Window –> Command Input”. In the pop-up command window type source file=<path-to-your-generated-file> and hit “Apply”. That’s it enjoy!!
A good reference is available in Dendroscope’s manual at


Post a Comment

How to Compute The Sum of Squares for First N Integers

No comments
We need to find,
How to get that?
Let’s consider the expansion of clip_image002[9]
Now let’s substitute values for clip_image002[15]
if you sum them from top to bottom
Taking the sum of arithmetic series,
Special thanks to my friend Jayampathi Rathnayake ( for his insight on the approach.

Another good resource

No comments :

Post a Comment

Drilling Straight with a Hand Drill

No comments

If you don’t have a drill guide like then drilling straight using a hand drill will be a pain and will often result angled holes.

Here’s a simple and elegant (though not precise) solution. You will need a square ruler like The trick is that all drills have visible center line. Use the square ruler to guide this in a straight line.


When you drill without a guide as shown above, the drill is free to move along both blue and red axes. You can restrict this by using it as shown below with the square ruler.


The square ruler has a flat edge, so when you keep it as above, it will stay perpendicular to the drilling surface. You can rotate the drill to have its center line and the square ruler to be in the same vertical plane as blue axis is. Then, with bit of patience, you can guide the drill while keeping the center line in the same plane. Still the drill is free to move in both directions, but you can use the ruler to guide it very well restricting any movement along red axis.

OK, what about movement along blue axis? This is where you need to have some practice. You can use your eyesight to keep the distance between drill bit and the vertical edge of the ruler constant while drilling to make sure you are not moving along blue axis.

As I said earlier, this is not perfect or highly precise, but when you don’t have any the fancy guiding tools, this works like a charm.


Image source for the drill, hand, and wooden piece

No comments :

Post a Comment

Weekend Carpentry: Monitor Stand


I’ve been wanting a monitor stand for sometime now as the Dell monitor that I am having doesn’t have an adjustable stand. Apparently, these stands cost around $30-40 and it didn’t seem worth the money.

So I ended up building one out of some old furniture parts I had. In the end it turned out to be better than I expected :). In fact comparable to what you can buy online

Here are some photos.

IMG_4733 IMG_4735

Finished product

It has enough space underneath to store my files

IMG_4743 IMG_4742

Top view

Bottom view

IMG_4740 IMG_4746

Just before I cleaned up the place


My old monitor stand; just a cardboard box from Amazon :)


Post a Comment

Lexus Aux/IPod Input: iSimple Gateway

No comments

I’ve been wanting to use an auxiliary input with our 2006 Lexus ES for sometime and found this very neat solution from iSimple After doing some research on other available products on the market I decided to go with this one as it can control an IPod (or any other I* music device) right from the car stereo controls.

Setting up is pretty easy and sound quality is very well indeed! Here are the things  you will need and steps to follow.


  • A socket wrench and 10mm socket. It’s helpful if you have an extension as well.
image image
  • [Optional] A plastic pry bar like the one on below (left). I simply used the one on right and trust me it’s very nice and does no harm to the leather.
image image



  • You may need to think where to run your cables. I ran them to the center console box and with some effort managed to make nice installation without any drilling. Just post a comment if you need more information. I wish I took some pictures.
  • I used Velcro tape to keep the device attached to the car behind AC control panel.

No comments :

Post a Comment

Recursion is Natural

No comments



“Yay! I can see my bu** !! … wait, it’s not my bu**, it’s myself !”

If you can come to the last realization that you are seeing yourself then you already understand recursion is natural. If not, the following few examples may help.

Example 1: Is Even?

Zero is even. Any integer N > 0 is even if N-1 is not even.

Example 2: Factorial

Factorial of zero is 1. Factorial of any integer N > 0 is N times factorial of N-1.

Example 2: Length of a List

An empty list has a length zero. Any other list has one head element and a sub list called the tail. So length is 1 more than the length of the tail.

Example 3: Map f() to List S

If S is empty then nothing to do just return an empty list. If not map f() to the tail and get a mapped list. Then add f(head) to the front of that list.


P.S. Many thanks to Dan Friedman ( for his great class of B521 ( at IU, 2009.


About the image; it’s an InkScape sketching of the image I found at

No comments :

Post a Comment

Taming Wild Horses: Chapel Asynchronous Tasks

No comments

Chapel supports nesting data parallel and task parallel code arbitrary as desired. This allows you, for example, to spawn asynchronous tasks inside a forall loop. The code snippet below shows code for a case like this where a forall is run on an array of 3 elements. The work to be done for second element is time consuming, hence a new task is spawned to run the timeeater(). Seems straightforward isn’t it? What if timeeater() takes more time than the forall loop? You’d expect forall to wait till all the dynamically spawned tasks to complete, but unfortunately it’s not the case. So if you want everything to be done when you exit forall loop use the construct sync to synchronize.

Try running the code with and without sync and observe the value of result, which should be 500500 if forall exit only after all the tasks have completed.

var d : domain(1) = [1..3];
var a : [d] int = d;
var result : int;
sync forall i in a{
    if (i == 2) then {
       begin result = timeeater();
writeln("end of forall");
writeln("result ", result);

proc timeeater(){
    var tmp : int = 0;
    for i in 1 .. 1000{
        tmp = tmp + i;
        if (i%25 == 0) then {
            writeln("eating time ", i);
    return tmp;

No comments :

Post a Comment

Chapel is Sweet

No comments

It has been a little while since I started playing around Chapel ( language, but could not run anything fun and large until recently. As part of the B524 – Parallelism in Programming Languages and Systems class from Prof. Lumsdaine (, we had to implement Single Source Shortest Path (SSSP) of Graph500 ( specification. Only then I could realize the easiness of many of the high-level abstractions provided in Chapel compared to other parallel languages or paradigms. Honestly, I did not expect it to work in the first run across a set of machines, but surprisingly it did!

No comments :

Post a Comment

Download a Set of URLs with GNU Wget

No comments

I had a list of URLs that I wanted to download and it was a pain to do it manually. So end up writing a simple shell script and downloading all of them using GNU Wget. Here’s the shell script (modified the one at

# Set the field seperator to a newline
# Loop through the file
for line in `cat file.txt`;do
wget $line

No comments :

Post a Comment

Blogging: Images in Comments

1 comment

Finally, an awesome solution to a problem that I’ve been searching for quite a while: how to add an image in a comment to blog post?

Look no further, just visit Spice Up Your Blog on this at

Apparently it has just more than adding images like colored text and scrolling text.

See the test comments I made for fun

1 comment :

Post a Comment

Windows Live Writer: Life Made Easy for Blogging

I've been using Blogger for quite some time and over the years they have improved their Web based blog editor a lot, yet there was some uneasiness always when thinking about editing or adding a post. Anyway, after I got to know about Windows Live Writer I wanted to get it setup with Blogger but it wasn’t successful simply for some reason I couldn't understand “Username or password is incorrect”! I remembered only today that I was using two step verification with Google and that I have to create an application specific password to connect to Blogger. Wish it came to my mind sooner! Anyway, it’s now working fine and if you can read this online that means I made a post successfully with Live Writer.
Few places to note if you are having trouble connecting to Blogger with Live Writer as I did.
  1. Blog URL: Don’t forget to use https instead of http
  2. Username: Remember to add to your user id
  3. Password: As mentioned above, if you are using a two step verification with Google you need to generate application specific password to connect (see


Post a Comment

A Small Contribution: Substitution in The World of Lambda

No comments
Feeling a humble joy for being able to make a small contribution to a great resource: Semantics Engineering with PLT Redex, errata on substitution

Thank you Professor Matthias Felleisen for posting it and Professor Amr Sabry at Indiana University for the inspiring course on B522 - Foundations in Programming Languages, which made this possible.

No comments :

Post a Comment

SugarSync Public Link: Direct Download

SugarSync public links are great, but unfortunately they've added an intermediate download page instead of the direct download that used to be there (see more on this at

May be it was done with good intentions, but it broke all the image links we had in our Web site. Anyway, it seems there's a way around to get images working back in your Web pages without much hassle.

The solution is just add the following to the end of each image link (I know it's bit work too, but way better than having copy images to a local folder and linking them again manually).


Update 3/27/2012:

I tried the same trick to put an image to a blog post using a SugarSync public link, but it wasn’t successful. As it seems Blogger’s image retrieving service couldn’t handle the directDownload=true.


Post a Comment