{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"%matplotlib inline"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n# Enhancing QuickBundles with different metrics and features\n\nQuickBundles [Garyfallidis12]_ is a flexible algorithm that requires only a\ndistance metric and an adjacency threshold to perform clustering. There is a\nwide variety of metrics that could be used to cluster streamlines.\n\nThe purpose of this tutorial is to show how to easily create new ``Feature``\nand new ``Metric`` classes that can be used by QuickBundles.\n\n\n## Clustering framework\nDIPY_ provides a simple, flexible and fast framework to do clustering of\nsequential data (e.g. streamlines).\n\nA *sequential datum* in DIPY is represented as a numpy array of size\n$(N imes D)$, where each row of the array represents a $D$ dimensional\npoint of the sequence. A set of these sequences is represented as a list of\nnumpy arrays of size $(N_i imes D)$ for $i=1:M$ where $M$ is the\nnumber of sequences in the set.\n\nThis clustering framework is modular and divided in three parts:\n\n#. Feature extraction\n\n#. Distance computation\n\n#. Clustering algorithm\n\nThe **feature extraction** part includes any preprocessing needed to be done on\nthe data before computing distances between them (e.g. resampling the number of\npoints of a streamline). To define a new way of extracting features, one has to\nsubclass ``Feature`` (see below).\n\nThe **distance computation** part includes any metric capable of evaluating a\ndistance between two sets of features previously extracted from the data. To\ndefine a new way of extracting features, one has to subclass ``Metric`` (see\nbelow).\n\nThe **clustering algorithm** part represents the clustering algorithm itself\n(e.g. QuickBundles, K-means, Hierarchical Clustering). More precisely, it\nincludes any algorithms taking as input a list of sequential data and\noutputting a ``ClusterMap`` object.\n\n\n## Extending `Feature`\nThis section will guide you through the creation of a new feature extraction\nmethod that can be used in the context of this clustering framework. For a\nlist of available features in DIPY see `example_segment_clustering_features`.\n\nAssuming a set of streamlines, the type of features we want to extract is the\narc length (i.e. the sum of the length of each segment for a given streamline).\n\nLet's start by importing the necessary modules.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"import numpy as np\n\nfrom dipy.data import get_fnames\nfrom dipy.io.streamline import load_tractogram\nfrom dipy.tracking.streamline import Streamlines\nfrom dipy.viz import window, actor, colormap\nfrom dipy.segment.clustering import QuickBundles\nfrom dipy.segment.featurespeed import Feature, VectorOfEndpointsFeature\nfrom dipy.segment.metric import Metric, SumPointwiseEuclideanMetric\nfrom dipy.tracking.streamline import length"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We now define the class ``ArcLengthFeature`` that will perform the desired\nfeature extraction. When subclassing ``Feature``, two methods have to be\nredefined: ``infer_shape`` and ``extract``.\n\nAlso, an important property about feature extraction is whether or not\nits process is invariant to the order of the points within a streamline.\nThis is needed as there is no way one can tell which extremity of a\nstreamline is the beginning and which one is the end.\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"class ArcLengthFeature(Feature):\n \"\"\" Computes the arc length of a streamline. \"\"\"\n def __init__(self):\n # The arc length stays the same even if the streamline is reversed.\n super(ArcLengthFeature, self).__init__(is_order_invariant=True)\n\n def infer_shape(self, streamline):\n \"\"\" Infers the shape of features extracted from `streamline`. \"\"\"\n # Arc length is a scalar\n return 1\n\n def extract(self, streamline):\n \"\"\" Extracts features from `streamline`. \"\"\"\n # return np.sum(np.sqrt(np.sum((streamline[1:] - streamline[:-1]) ** 2)))\n # or use a DIPY's function that computes the arc length of a streamline.\n return length(streamline)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The new feature extraction ``ArcLengthFeature`` is ready to be used. Let's use\nit to cluster a set of streamlines by their arc length. For educational\npurposes we will try to cluster a small streamline bundle known from\nneuroanatomy as the fornix.\n\nWe start by loading the fornix streamlines.\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"fname = get_fnames('fornix')\nfornix = load_tractogram(fname, 'same',\n bbox_valid_check=False).streamlines\n\nstreamlines = Streamlines(fornix)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Perform QuickBundles clustering using the metric\n``SumPointwiseEuclideanMetric`` and our ``ArcLengthFeature``.\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"metric = SumPointwiseEuclideanMetric(feature=ArcLengthFeature())\nqb = QuickBundles(threshold=2., metric=metric)\nclusters = qb.cluster(streamlines)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We will now visualize the clustering result.\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# Color each streamline according to the cluster they belong to.\ncmap = colormap.create_colormap(np.ravel(clusters.centroids))\ncolormap_full = np.ones((len(streamlines), 3))\nfor cluster, color in zip(clusters, cmap):\n colormap_full[cluster.indices] = color\n\nscene = window.Scene()\nscene.SetBackground(1, 1, 1)\nscene.add(actor.streamtube(streamlines, colormap_full))\nwindow.record(scene, out_path='fornix_clusters_arclength.png', size=(600, 600))\n\n# Enables/disables interactive visualization\ninteractive = False\nif interactive:\n window.show(scene)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
".. figure:: fornix_clusters_arclength.png\n :align: center\n\n Showing the different clusters obtained by using the arc length.\n\n\n## Extending `Metric`\nThis section will guide you through the creation of a new metric that can be\nused in the context of this clustering framework. For a list of available\nmetrics in DIPY see `example_segment_clustering_metrics`.\n\nAssuming a set of streamlines, we want a metric that computes the cosine\ndistance giving the vector between endpoints of each streamline (i.e. one\nminus the cosine of the angle between two vectors). For more information\nabout this distance check [](http://en.wikipedia.org/wiki/Cosine_similarity).\n\nLet's start by importing the necessary modules.\n\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We now define the class ``CosineMetric`` that will perform the desired\ndistance computation. When subclassing ``Metric``, two methods have to be\nredefined: ``are_compatible`` and ``dist``. Moreover, when implementing the\n``dist`` method, one needs to make sure the distance returned is symmetric\n(i.e. `dist(A, B) == dist(B, A)`).\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"class CosineMetric(Metric):\n \"\"\" Computes the cosine distance between two streamlines. \"\"\"\n def __init__(self):\n # For simplicity, features will be the vector between endpoints of a streamline.\n super(CosineMetric, self).__init__(feature=VectorOfEndpointsFeature())\n\n def are_compatible(self, shape1, shape2):\n \"\"\" Checks if two features are vectors of same dimension.\n\n Basically this method exists so that we don't have to check\n inside the `dist` method (speedup).\n \"\"\"\n return shape1 == shape2 and shape1[0] == 1\n\n def dist(self, v1, v2):\n \"\"\" Computes a the cosine distance between two vectors. \"\"\"\n norm = lambda x: np.sqrt(np.sum(x**2))\n cos_theta = np.dot(v1, v2.T) / (norm(v1)*norm(v2))\n\n # Make sure it's in [-1, 1], i.e. within domain of arccosine\n cos_theta = np.minimum(cos_theta, 1.)\n cos_theta = np.maximum(cos_theta, -1.)\n return np.arccos(cos_theta) / np.pi # Normalized cosine distance"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The new distance ``CosineMetric`` is ready to be used. Let's use\nit to cluster a set of streamlines according to the cosine distance of the\nvector between their endpoints. For educational purposes we will try to\ncluster a small streamline bundle known from neuroanatomy as the fornix.\n\nWe start by loading the fornix streamlines.\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"fname = get_fnames('fornix')\nfornix = load_tractogram(fname, 'same', bbox_valid_check=False)\nstreamlines = fornix.streamlines"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Perform QuickBundles clustering using our metric ``CosineMetric``.\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"metric = CosineMetric()\nqb = QuickBundles(threshold=0.1, metric=metric)\nclusters = qb.cluster(streamlines)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We will now visualize the clustering result.\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# Color each streamline according to the cluster they belong to.\ncmap = colormap.create_colormap(np.arange(len(clusters)))\ncolormap_full = np.ones((len(streamlines), 3))\nfor cluster, color in zip(clusters, cmap):\n colormap_full[cluster.indices] = color\n\nscene = window.Scene()\nscene.SetBackground(1, 1, 1)\nscene.add(actor.streamtube(streamlines, colormap_full))\nwindow.record(scene, out_path='fornix_clusters_cosine.png', size=(600, 600))\nif interactive:\n window.show(scene)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
".. figure:: fornix_clusters_cosine.png\n :align: center\n\n Showing the different clusters obtained by using the cosine metric.\n\n.. include:: ../links_names.inc\n\n### References\n\n.. [Garyfallidis12] Garyfallidis E. et al., QuickBundles a method for\n tractography simplification, Frontiers in Neuroscience, vol 6, no 175,\n 2012.\n\n\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.13"
}
},
"nbformat": 4,
"nbformat_minor": 0
}