create_dendrogram
A dendrogram, also known as a hierarchical tree, represents the output of a hierarchical clustering algorithm as a tree. The create_dendrogram function from the figure_factory module runs a hierarchical clustering on your data and draws the result.
Figure factory is a Plotly module for complex charts that predates Plotly Express. It keeps some legacy functions, along with others, such as create_dendrogram, that have no equivalent in Plotly Express.
The create_dendrogram function takes a matrix of observations as an array of arrays, with one row per sample. In the following example we have 20 samples with 5 dimensions each.
import plotly.figure_factory as ff
import numpy as np
np.random.seed(3)
# 20 samples, with 5 dimensions each
X = np.random.rand(20, 5)
fig = ff.create_dendrogram(X)
fig.update_layout(autosize = True)
fig.show()
Color threshold
The number of clusters depends on the height at which the dendrogram is cut. By default, create_dendrogram leaves the threshold to SciPy, which uses 70% of the maximum linkage distance; you can set a custom height with color_threshold, as in the example below.
import plotly.figure_factory as ff
import numpy as np
np.random.seed(3)
# 20 samples, with 5 dimensions each
X = np.random.rand(20, 5)
fig = ff.create_dendrogram(X, color_threshold = 1.2)
fig.update_layout(autosize = True)
fig.show()
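To see what a given height means in terms of clusters, the clustering can be reproduced directly with SciPy. This is a minimal sketch that assumes create_dendrogram's defaults (Euclidean pdist distances and complete linkage); the variable names are illustrative. Note that fcluster also counts single-sample clusters, which have no colored branch in the plot, so the number of colors in the figure can be smaller.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist

np.random.seed(3)
# 20 samples, with 5 dimensions each
X = np.random.rand(20, 5)

# Assumption: same distance and linkage as create_dendrogram's defaults
Z = linkage(pdist(X), method="complete")

# SciPy's default coloring threshold: 70% of the largest merge height
default_threshold = 0.7 * Z[:, 2].max()

# Flat clusters obtained by cutting the tree at height 1.2
cluster_ids = fcluster(Z, t=1.2, criterion="distance")
n_clusters = len(np.unique(cluster_ids))

print("default threshold:", default_threshold)
print("clusters at height 1.2:", n_clusters)
```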
Labels
Each observation or sample can be labeled with the labels argument, which takes an array with one text label per observation.
import plotly.figure_factory as ff
import numpy as np
np.random.seed(3)
# 20 samples, with 5 dimensions each
X = np.random.rand(20, 5)
text = [chr(x) for x in range(65, 85)] # Letters
fig = ff.create_dendrogram(X, labels = text)
fig.update_layout(autosize = True)
fig.show()
Color customization
The colorscale argument can be used to customize the colors. According to the function's documentation, the array must contain 8 colors, the 7th of which is ignored.
import plotly.figure_factory as ff
import numpy as np
np.random.seed(3)
# 20 samples, with 5 dimensions each
X = np.random.rand(20, 5)
colors = ['red', 'blue', 'green', 'darkorange',
          'gold', 'lightcoral', 'orangered', 'brown']
fig = ff.create_dendrogram(X, colorscale = colors)
fig.update_layout(autosize = True)
fig.show()
Orientation
The orientation of the dendrogram can be customized through the orientation argument, which defaults to 'bottom' but can also be 'left', 'right' or 'top', as shown below. This can be very useful if you want to add a dendrogram to the sides of a heatmap.
Left
import plotly.figure_factory as ff
import numpy as np
np.random.seed(3)
# 20 samples, with 5 dimensions each
X = np.random.rand(20, 5)
fig = ff.create_dendrogram(X, orientation = 'left')
fig.update_layout(autosize = True)
fig.show()
Right
import plotly.figure_factory as ff
import numpy as np
np.random.seed(3)
# 20 samples, with 5 dimensions each
X = np.random.rand(20, 5)
fig = ff.create_dendrogram(X, orientation = 'right')
fig.update_layout(autosize = True)
fig.show()
Top
import plotly.figure_factory as ff
import numpy as np
np.random.seed(3)
# 20 samples, with 5 dimensions each
X = np.random.rand(20, 5)
fig = ff.create_dendrogram(X, orientation = 'top')
fig.update_layout(autosize = True)
fig.show()