create_dendrogram
A dendrogram, also known as a hierarchical tree, represents the output of a hierarchical clustering algorithm as a tree. The create_dendrogram function from the figure_factory module performs a hierarchical clustering of your data and draws the result.
Figure factory is a module for complex charts that predates plotly express. It still contains some legacy functions, along with others, such as create_dendrogram, that have no equivalent in plotly express.
The create_dendrogram function takes a matrix of observations as a two-dimensional array (an array of arrays), with one row per sample. In the following example we have 20 samples with 5 dimensions each.
import plotly.figure_factory as ff
import numpy as np
np.random.seed(3)
# 20 samples, with 5 dimensions each
X = np.random.rand(20, 5)
fig = ff.create_dendrogram(X)
fig.update_layout(autosize = True)
fig.show()
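To make the clustering step more concrete, the sketch below reproduces it with SciPy alone. It assumes, based on the function's documented defaults, that create_dendrogram uses SciPy's pdist for the distances and complete linkage for the merges (both can be overridden through its distfun and linkagefun arguments). The variable names are illustrative only.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage
from scipy.spatial.distance import pdist

np.random.seed(3)
# 20 samples, with 5 dimensions each
X = np.random.rand(20, 5)

# Condensed vector of pairwise Euclidean distances: 20 * 19 / 2 = 190 entries
distances = pdist(X)

# Complete-linkage clustering (assumed to match the create_dendrogram default).
# Each row of Z describes one merge: the two clusters joined, the height
# of the merge, and the number of samples in the new cluster.
Z = linkage(distances, method="complete")

print(distances.shape)  # (190,)
print(Z.shape)          # (19, 4)
```

With 20 samples there are always 19 merges, and the third column of Z holds the heights at which the branches of the dendrogram join.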
Color threshold
The number of clusters can be selected by choosing the height at which the dendrogram is cut. By default, the function picks a threshold that gives a reasonable number of clusters, but you can set a custom height with color_threshold, as in the example below.
import plotly.figure_factory as ff
import numpy as np
np.random.seed(3)
# 20 samples, with 5 dimensions each
X = np.random.rand(20, 5)
fig = ff.create_dendrogram(X, color_threshold = 1.2)
fig.update_layout(autosize = True)
fig.show()
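The link between a cut height and the number of clusters can be checked numerically. The following is a minimal sketch that uses SciPy's fcluster on a complete-linkage tree, which is assumed to match what create_dendrogram builds by default. The helper count_clusters is hypothetical, and the count may differ slightly from the number of colors in the figure, since single-sample clusters have no colored branch of their own.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist

np.random.seed(3)
# 20 samples, with 5 dimensions each
X = np.random.rand(20, 5)
Z = linkage(pdist(X), method="complete")


def count_clusters(Z, height):
    """Number of flat clusters obtained by cutting the tree at `height`."""
    cluster_ids = fcluster(Z, t=height, criterion="distance")
    return len(np.unique(cluster_ids))


# A lower cut height can only produce the same number of clusters or more
for height in (0.8, 1.2):
    print(height, "->", count_clusters(Z, height), "clusters")

# SciPy's dendrogram documents its default threshold as 0.7 times the largest
# merge height; this sketch assumes create_dendrogram inherits that default.
default_threshold = 0.7 * Z[:, 2].max()
print("assumed default threshold:", round(default_threshold, 3))
```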
Labels
Each observation or sample can be labeled with the labels argument, which takes an array with as many strings as there are observations.
import plotly.figure_factory as ff
import numpy as np
np.random.seed(3)
# 20 samples, with 5 dimensions each
X = np.random.rand(20, 5)
text = [chr(x) for x in range(65, 85)] # Letters
fig = ff.create_dendrogram(X, labels = text)
fig.update_layout(autosize = True)
fig.show()
Color customization
The colorscale argument can be used to customize the colors. According to the function's own documentation, an array of 8 colors is required, although the 7th is ignored.
import plotly.figure_factory as ff
import numpy as np
np.random.seed(3)
# 20 samples, with 5 dimensions each
X = np.random.rand(20, 5)
colors = ['red', 'blue', 'green', 'darkorange',
          'gold', 'lightcoral', 'orangered', 'brown']
fig = ff.create_dendrogram(X, colorscale = colors)
fig.update_layout(autosize = True)
fig.show()
Orientation
The orientation of the dendrogram can be customized through the orientation argument, which defaults to 'bottom' but can also be 'left', 'right' or 'top', as shown below. This can be very useful if you want to add a dendrogram to the sides of a heatmap.
Left
import plotly.figure_factory as ff
import numpy as np
np.random.seed(3)
# 20 samples, with 5 dimensions each
X = np.random.rand(20, 5)
fig = ff.create_dendrogram(X, orientation = 'left')
fig.update_layout(autosize = True)
fig.show()
Right
import plotly.figure_factory as ff
import numpy as np
np.random.seed(3)
# 20 samples, with 5 dimensions each
X = np.random.rand(20, 5)
fig = ff.create_dendrogram(X, orientation = 'right')
fig.update_layout(autosize = True)
fig.show()
Top
import plotly.figure_factory as ff
import numpy as np
np.random.seed(3)
# 20 samples, with 5 dimensions each
X = np.random.rand(20, 5)
fig = ff.create_dendrogram(X, orientation = 'top')
fig.update_layout(autosize = True)
fig.show()