For a recent project, I had the need to cluster groups of Twitter users. After researching several algorithms, I decided to implement Markov Clustering to see how it would go.

I wrote this in PHP as the rest of the project was in PHP and it was a good test to see if I could implement the algorithm from scratch.

The algorithm is detailed here: http://micans.org/mcl/ and an example is available here which explain it much better than I could.

The basic steps are:

- Create matrix representing the nodes in your graph as rows and columns with the elements representing the paths between them.
- Add self loops to the nodes
- Normalise the Matrix
- Expand by taking the (Expansion Parameter)th power of the matrix
- Inflate by taking inflation of the resulting matrix with the inflation parameter
- Repeat steps 5 and 6 until a steady state is reached (convergence).
- Interpret resulting matrix to discover clusters.

The algorithm takes two parameters in addition to the matrix, an expansion parameter (which the matrix will be raised to the power of) and an inflation parameter (which all elements of the matrix will be raised to the power of).

Once the algorithm has completed, you should end up with a sparse Matrix (full of mostly 0) and then can interpret groups by looking at rows and columns that have values at their intersection (implying they are in the same cluster).

Code is available on GitHub at https://github.com/jrowlandsnz/mcl-php