The examples in this tutorial use the data created by the following block of code. The table on the right shows that data.
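As a minimal sketch of a suitable data set, the block below builds a data frame with one numerical column and two categorical columns. The column names (x, group, group2), the group labels and the random values are assumptions made for illustration and are not necessarily the tutorial's actual data.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: names and values are illustrative assumptions
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "x": rng.normal(size=60),                    # numerical variable
    "group": np.repeat(["G3", "G2", "G1"], 20),  # groups, G3 appears first
    "group2": np.tile(["A", "B"], 30),           # second grouping variable
})

print(df.head())
```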
swarmplot
The swarmplot function creates a bee swarm plot (also called a swarm plot) in Python with seaborn. Note that you can pass either a standalone variable or the name of a column of a data frame.
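The two ways of passing the data can be sketched as follows. The data frame df and its column x are hypothetical stand-ins created just for this example.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical data: the column name "x" and its values are assumptions
rng = np.random.default_rng(0)
df = pd.DataFrame({"x": rng.normal(size=60)})

fig, (ax1, ax2) = plt.subplots(1, 2)

# Option 1: pass the variable itself
sns.swarmplot(x=df["x"], ax=ax1)

# Option 2: pass the column name together with the data frame
sns.swarmplot(x="x", data=df, ax=ax2)
```

Both calls should draw the same swarm. Call plt.show() to display the figure when running this as a script.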
Vertical swarm plot
If you want to rotate your plot and create a vertical visualization, pass your variable to the y argument of the function instead of x.
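A minimal sketch of the vertical version, again using a hypothetical data frame df with an assumed column x:

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical data: the column name "x" and its values are assumptions
rng = np.random.default_rng(0)
df = pd.DataFrame({"x": rng.normal(size=60)})

# Assigning the variable to y draws the swarm along the vertical axis
ax = sns.swarmplot(y="x", data=df)
```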
Size of the points
The default size of the points is 5, but the size argument lets you choose a value that suits your data.
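For instance, smaller points can be requested as in the sketch below. The value 3 is an arbitrary choice and df is a hypothetical data frame.

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical data: the column name "x" and its values are assumptions
rng = np.random.default_rng(0)
df = pd.DataFrame({"x": rng.normal(size=60)})

# Points smaller than the default size of 5
ax = sns.swarmplot(x="x", data=df, size=3)
```

Smaller points can help when the data set is large and the points would otherwise not fit in the swarm.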
Fill color customization
By default, the points of the seaborn swarm plot are blue, but you can change their color with the color argument of the swarmplot function.
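A short sketch of setting the fill color. The color "red" is an arbitrary choice and df is a hypothetical data frame.

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical data: the column name "x" and its values are assumptions
rng = np.random.default_rng(0)
df = pd.DataFrame({"x": rng.normal(size=60)})

# Any color accepted by matplotlib can be used (name, hex code, RGB tuple)
ax = sns.swarmplot(x="x", data=df, color="red")
```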
Border width and color of the points
In addition, you can customize the width of the point borders (which defaults to 0) and their color with linewidth and edgecolor, respectively.
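The border settings can be sketched like this. The width 1 and the black edge color are arbitrary choices and df is a hypothetical data frame.

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical data: the column name "x" and its values are assumptions
rng = np.random.default_rng(0)
df = pd.DataFrame({"x": rng.normal(size=60)})

# A border only becomes visible when linewidth is greater than 0
ax = sns.swarmplot(x="x", data=df, linewidth=1, edgecolor="black")
```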
If you have a categorical variable representing groups, you can pass both the numerical and the categorical variable to the function to create a swarm plot by group in Python with seaborn.
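A minimal sketch of a grouped swarm plot, assuming a hypothetical data frame df with a numerical column x and a categorical column group:

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical data: column names, group labels and values are assumptions
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "x": rng.normal(size=60),
    "group": np.repeat(["G3", "G2", "G1"], 20),
})

# Categorical variable on x, numerical variable on y: one swarm per group
ax = sns.swarmplot(x="group", y="x", data=df)
```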
Orientation
Note that if you swap the variables assigned to x and y you can create a horizontal swarm plot by group.
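Swapping the two variables can be sketched as follows, with the same kind of hypothetical data frame df (assumed columns x and group):

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical data: column names, group labels and values are assumptions
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "x": rng.normal(size=60),
    "group": np.repeat(["G3", "G2", "G1"], 20),
})

# Numerical variable on x, categorical variable on y: horizontal swarms
ax = sns.swarmplot(x="x", y="group", data=df)
```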
Custom order
You might have noticed that the groups in the previous plot appear in the order G3, G2, G1, which is the order in which they first appear in the categorical variable (you can check this in the table of the first section). To change the order, pass a list of group names to the order argument.
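A sketch of reordering the groups. The data frame df is a hypothetical stand-in whose groups first appear in the order G3, G2, G1.

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical data: column names, group labels and values are assumptions
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "x": rng.normal(size=60),
    "group": np.repeat(["G3", "G2", "G1"], 20),
})

# List the groups in the order they should appear on the axis
ax = sns.swarmplot(x="group", y="x", data=df, order=["G1", "G2", "G3"])
```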
Color palette
The palette argument allows customizing the color palette of the plot. You can pass the name of a color palette or a dictionary with a color for each group.
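Both forms of the palette argument are sketched below on a hypothetical data frame df. The palette name "Set2" and the colors in the dictionary are arbitrary choices. The grouping column is also assigned to hue, because recent seaborn versions deprecate passing palette without hue; this is an adaptation and not necessarily how the original example was written.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical data: column names, group labels and values are assumptions
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "x": rng.normal(size=60),
    "group": np.repeat(["G3", "G2", "G1"], 20),
})

fig, (ax1, ax2) = plt.subplots(1, 2)

# Option 1: the name of a predefined color palette
sns.swarmplot(x="group", y="x", hue="group", data=df, palette="Set2", ax=ax1)

# Option 2: a dictionary mapping each group to a color
colors = {"G1": "red", "G2": "green", "G3": "blue"}
sns.swarmplot(x="group", y="x", hue="group", data=df, palette=colors, ax=ax2)
```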
Color based on a second categorical variable
If your data set contains a second grouping variable, you can pass it to the hue argument to color the points of each group by subgroup.
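A sketch of coloring by a second categorical variable, assuming a hypothetical data frame df with columns x, group and group2:

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical data: column names, labels and values are assumptions
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "x": rng.normal(size=60),
    "group": np.repeat(["G3", "G2", "G1"], 20),
    "group2": np.tile(["A", "B"], 30),
})

# Points stay in one swarm per group and are colored by group2
ax = sns.swarmplot(x="group", y="x", hue="group2", data=df)
```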
Dodged swarm plot
In the previous scenario you can also set the dodge argument to True, so the data points will be separated based on the second group.
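The dodged version can be sketched as follows, with the same kind of hypothetical data frame df (assumed columns x, group and group2):

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical data: column names, labels and values are assumptions
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "x": rng.normal(size=60),
    "group": np.repeat(["G3", "G2", "G1"], 20),
    "group2": np.tile(["A", "B"], 30),
})

# dodge=True draws a separate swarm for each level of group2 within a group
ax = sns.swarmplot(x="group", y="x", hue="group2", data=df, dodge=True)
```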