`scatter` function in matplotlib

Matplotlib provides a function named `scatter`, which creates fully customizable scatter plots in Python. To create a basic scatter plot, just pass arrays with your data to the `x` and `y` arguments.

```
import numpy as np
import matplotlib.pyplot as plt
# Data
x = np.array([3, 8, 5, 6, 1, 9, 6, 7, 2, 1, 8])
y = np.array([4, 5, 2, 4, 6, 1, 4, 6, 5, 2, 3])
# Plot
fig, ax = plt.subplots()
ax.scatter(x = x, y = y)
# plt.show()
```

The default marker or symbol of a scatter plot is a circle, but the `marker` argument allows customizing the markers. Possible options are the marker styles listed in the `matplotlib.markers` documentation.

```
import numpy as np
import matplotlib.pyplot as plt
# Data
x = np.array([3, 8, 5, 6, 1, 9, 6, 7, 2, 1, 8])
y = np.array([4, 5, 2, 4, 6, 1, 4, 6, 5, 2, 3])
# Plot
fig, ax = plt.subplots()
ax.scatter(x, y, marker = "*")
# plt.show()
```

**LaTeX markers**

Note that in addition to the matplotlib markers, you can also use LaTeX symbols by passing them between dollar signs in a raw string, as follows:

```
import numpy as np
import matplotlib.pyplot as plt
# Data
x = np.array([3, 8, 5, 6, 1, 9, 6, 7, 2, 1, 8])
y = np.array([4, 5, 2, 4, 6, 1, 4, 6, 5, 2, 3])
# Plot
fig, ax = plt.subplots()
ax.scatter(x, y, marker = r'$\clubsuit$')
# plt.show()
```

The `scatter` function provides several arguments to customize the markers. If you want to change the default blue color, you can set a new color using `c`.

```
import numpy as np
import matplotlib.pyplot as plt
# Data
x = np.array([3, 8, 5, 6, 1, 9, 6, 7, 2, 1, 8])
y = np.array([4, 5, 2, 4, 6, 1, 4, 6, 5, 2, 3])
# Plot
fig, ax = plt.subplots()
ax.scatter(x, y, c = "red")
# plt.show()
```

**Color by group**

You can also set a color by group by creating an array with colors, as in the example below.

```
import numpy as np
import matplotlib.pyplot as plt
# Data
x = np.array([3, 8, 5, 6, 1, 9, 6, 7, 2, 1, 8])
y = np.array([4, 5, 2, 4, 6, 1, 4, 6, 5, 2, 3])
color = np.where(x < 5, "yellow", "lightblue")
# Plot
fig, ax = plt.subplots()
ax.scatter(x, y, c = color)
# plt.show()
```
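With more than two groups, the same idea can be sketched with `np.select`, which evaluates a list of conditions in order. The thresholds (4 and 7) and the third color (`"salmon"`) are arbitrary choices made for this illustration only.

```python
import numpy as np
import matplotlib.pyplot as plt
# Data
x = np.array([3, 8, 5, 6, 1, 9, 6, 7, 2, 1, 8])
y = np.array([4, 5, 2, 4, 6, 1, 4, 6, 5, 2, 3])
# Three groups: x < 4, 4 <= x < 7 and the rest (illustrative thresholds)
color = np.select([x < 4, x < 7], ["yellow", "lightblue"], default = "salmon")
# Plot
fig, ax = plt.subplots()
ax.scatter(x, y, c = color)
# plt.show()
```

The conditions are checked in order, so a point with `x < 4` gets the first color even though it also satisfies `x < 7`.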

**Gradient color**

If you pass a numerical array to `c`, the points will be colored with a color palette, as shown below. The default color palette (viridis) can be changed with the `cmap` argument.

```
import numpy as np
import matplotlib.pyplot as plt
# Data
x = np.array([3, 8, 5, 6, 1, 9, 6, 7, 2, 1, 8])
y = np.array([4, 5, 2, 4, 6, 1, 4, 6, 5, 2, 3])
# Plot
fig, ax = plt.subplots()
ax.scatter(x, y, c = np.sqrt(x ** 2 + y ** 2))
# plt.show()
```
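As a minimal sketch of the `cmap` argument mentioned above, the following example switches the palette to `"plasma"` (an arbitrary choice among the built-in colormaps) and adds a colorbar so the mapping between values and colors can be read. The variable names `dist` and `points` and the colorbar label are assumptions made for this illustration.

```python
import numpy as np
import matplotlib.pyplot as plt
# Data
x = np.array([3, 8, 5, 6, 1, 9, 6, 7, 2, 1, 8])
y = np.array([4, 5, 2, 4, 6, 1, 4, 6, 5, 2, 3])
# Numerical variable used for the color: distance to the origin
dist = np.sqrt(x ** 2 + y ** 2)
# Plot with a non-default color palette
fig, ax = plt.subplots()
points = ax.scatter(x, y, c = dist, cmap = "plasma")
# Colorbar showing how the values map to colors
fig.colorbar(points, ax = ax, label = "Distance to origin")
# plt.show()
```

The object returned by `scatter` is kept in `points` because the colorbar needs it to know which values and palette to display.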

**Marker transparency**

In addition, the transparency of the markers can be set with `alpha`, which ranges from 0 (invisible) to 1 (completely opaque).

```
import numpy as np
import matplotlib.pyplot as plt
# Data
x = np.array([3, 8, 5, 6, 1, 9, 6, 7, 2, 1, 8])
y = np.array([4, 5, 2, 4, 6, 1, 4, 6, 5, 2, 3])
# Plot
fig, ax = plt.subplots()
ax.scatter(x, y, alpha = 0.5)
# plt.show()
```

**Border color of the markers**

Finally, you can also customize the border of the markers with the `edgecolors` argument, which defaults to the fill color of the symbol. You can also customize its width with `linewidths`.

```
import numpy as np
import matplotlib.pyplot as plt
# Data
x = np.array([3, 8, 5, 6, 1, 9, 6, 7, 2, 1, 8])
y = np.array([4, 5, 2, 4, 6, 1, 4, 6, 5, 2, 3])
# Plot
fig, ax = plt.subplots()
ax.scatter(x, y, c = "white",
           edgecolors = "black", linewidths = 1.5)
# plt.show()
```

The argument `s` allows customizing the size of the markers. The units are squared points (points ^ 2).

```
import numpy as np
import matplotlib.pyplot as plt
# Data
x = np.array([3, 8, 5, 6, 1, 9, 6, 7, 2, 1, 8])
y = np.array([4, 5, 2, 4, 6, 1, 4, 6, 5, 2, 3])
# Plot
fig, ax = plt.subplots()
ax.scatter(x, y, s = 200)
# plt.show()
```
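Because `s` is expressed in squared points, it scales with the area of the marker rather than with its width. As a rough sketch of the consequence, multiplying `s` by 4 should approximately double the width of the marker. The two values used below (100 and 400) are arbitrary.

```python
import matplotlib.pyplot as plt
# Plot two single points with different sizes
fig, ax = plt.subplots()
small = ax.scatter(1, 1, s = 100)
# Four times the area, so about twice the width
large = ax.scatter(2, 1, s = 400)
ax.set_xlim(0, 3)
# plt.show()
```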

**Size based on a variable**

An alternative is to set the size based on a numerical variable of the same length as the data. This type of chart is known as a bubble plot.

```
import numpy as np
import matplotlib.pyplot as plt
# Data
x = np.array([3, 8, 5, 6, 1, 9, 6, 7, 2, 1, 8])
y = np.array([4, 5, 2, 4, 6, 1, 4, 6, 5, 2, 3])
size = x * 25
# Plot
fig, ax = plt.subplots()
ax.scatter(x, y, s = size)
# plt.show()
```
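When the variable has a very different scale from the desired marker sizes, one option is to rescale it to a fixed range first. The sketch below uses `np.interp` to map `y` linearly to sizes between 20 and 400 squared points. Both the choice of `y` as the size variable and the range limits are assumptions made for this illustration.

```python
import numpy as np
import matplotlib.pyplot as plt
# Data
x = np.array([3, 8, 5, 6, 1, 9, 6, 7, 2, 1, 8])
y = np.array([4, 5, 2, 4, 6, 1, 4, 6, 5, 2, 3])
# Map the smallest value of y to 20 and the largest to 400 (points ^ 2)
size = np.interp(y, (y.min(), y.max()), (20, 400))
# Plot
fig, ax = plt.subplots()
ax.scatter(x, y, s = size, alpha = 0.5)
# plt.show()
```

Setting `alpha` keeps overlapping bubbles visible, which tends to matter more as the markers get larger.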

There are several ways to add a legend to a scatter plot in matplotlib. The choice between them depends on your use case. If you want to set a label for a single set of markers, set the name with `label` and place the legend with `legend`.

```
import numpy as np
import matplotlib.pyplot as plt
# Data
x = np.array([3, 8, 5, 6, 1, 9, 6, 7, 2, 1, 8])
y = np.array([4, 5, 2, 4, 6, 1, 4, 6, 5, 2, 3])
# Plot
fig, ax = plt.subplots()
ax.scatter(x, y, label = "Points")
plt.legend(loc = "upper right")
# plt.show()
```

**Splitting the data**

If you split the data into several groups and add the points of each group independently, you can add a legend for all the groups, as in the following example.

```
import numpy as np
import matplotlib.pyplot as plt
# Data 1
x1 = np.array([1, 1, 2, 3])
y1 = np.array([2, 6, 5, 4])
plt.scatter(x1, y1, c = "red", label = "Group 1")
# Data 2
x2 = np.array([5, 6, 7, 8, 8, 9])
y2 = np.array([2, 4, 6, 5, 3, 1])
plt.scatter(x2, y2, c = "blue", label = "Group 2")
# Add the legend
plt.legend()
# plt.show()
```

**Using mpatches**

Another way is to use `mpatches` from `matplotlib.patches`, but note that by default the legend shows rectangles rather than the marker symbols.

```
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
# Data
x = np.array([3, 8, 5, 6, 1, 9, 6, 7, 2, 1, 8])
y = np.array([4, 5, 2, 4, 6, 1, 4, 6, 5, 2, 3])
color = np.where(x < 5, "red", "green")
# Plot
fig, ax = plt.subplots()
ax.scatter(x, y, c = color)
# Legend labels
red = mpatches.Patch(color = "red", label = "Red points")
green = mpatches.Patch(color = "green", label = "Green points")
# Legend
plt.legend(handles = [red, green])
# plt.show()
```
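If the legend should show the marker symbols instead of rectangles, one possible approach is to build the handles with `Line2D` from `matplotlib.lines`, using empty lines that only draw a marker. This is a sketch of an alternative and not the only way to do it. The circle marker is an assumption chosen to match the default marker of `scatter`.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.lines import Line2D
# Data
x = np.array([3, 8, 5, 6, 1, 9, 6, 7, 2, 1, 8])
y = np.array([4, 5, 2, 4, 6, 1, 4, 6, 5, 2, 3])
color = np.where(x < 5, "red", "green")
# Plot
fig, ax = plt.subplots()
ax.scatter(x, y, c = color)
# Legend handles: lines with no data and no line style, only a marker
red = Line2D([], [], linestyle = "none", marker = "o",
             color = "red", label = "Red points")
green = Line2D([], [], linestyle = "none", marker = "o",
               color = "green", label = "Green points")
# Legend
legend = ax.legend(handles = [red, green])
# plt.show()
```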
