Adding annotations to visualizations using Matplotlib

In this post, I’ll show you how to add annotations to your visualizations built using Matplotlib.

Annotations allow you to put text labels, boxes and arrows anywhere you like on your graph. Combined with horizontal and vertical lines indicating certain thresholds or values in your data, annotations are an important tool for making your visualizations easy to understand.

Since creating visualizations in Matplotlib is a large topic, I’m going to focus on the annotations and other icing on the metaphorical cake, and I’ll assume you already know the basics of creating graphs. It’s worth noting that you can also add annotations to graphs built using other libraries, such as Seaborn, as long as they use Matplotlib under the hood. In fact, the examples here were created using Seaborn, to take advantage of its more modern look. Just use the appropriate technique to access the Matplotlib axes object, which I call ax in my examples.

Consider our first example below, showing the median worldwide box office revenue for a sample of films released in the US over the past 10 years. [1]

Median box office revenue graph, without annotations
Example 1, before annotation

It’s pretty easy to read, and I can definitely see some patterns I might find interesting. But let’s say my audience is concerned with whether movies being released through online streaming services have impacted box office revenues. An annotation would be a great way to show when that began.

Example 1 with annotation

Let’s walk through the code specific to the annotation to see how it works.

Example 1 annotation code
  • ax.annotate() : This is the function you use to add both the label text and arrow component to the appropriate Matplotlib axes.
  • label_text: The first parameter you pass to ax.annotate should be the annotation text itself. Note that you can add line breaks to it as I have in this example, if your text is long and you want to control how it wraps.
  • xy: Provide a tuple representing the x and y coordinates of the point being annotated, which is where the arrow ends. In our example I’m using a bracket rather than a pointy arrow, but this is still the coordinate the arrow points to.
  • xytext: Provide a tuple representing the x and y coordinates of the point where the text label will go.
  • arrowprops: Optional. Provide a dictionary of attributes to format the arrow. Here I’ve used color to make it black, lw to adjust the line width and make it a bit heavier than default, and arrowstyle to change the tip of the arrow to a bracket instead of a point. Matplotlib’s documentation lists the other arrowstyles you can use, along with examples of other properties you can set in arrowprops.
  • ha: Optional. Text is left-aligned by default, but you can center it using the horizontal alignment parameter.
  • size: Optional. Provide a number representing the desired font size to adjust up or down from the default.
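Putting those parameters together, a call might look roughly like the sketch below. The bar values, the label wording and the coordinates are all made up for illustration (they are not the movie data or the exact code behind the figure above), and I’ve used plain Matplotlib bars so the example runs without Seaborn.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs anywhere; omit in a notebook
import matplotlib.pyplot as plt

# Made-up values for illustration only; not the movie data from this post.
years = [str(y) for y in range(2011, 2021)]
revenue = [52, 48, 55, 60, 58, 41, 39, 36, 40, 12]

fig, ax = plt.subplots(figsize=(10, 5))
ax.bar(years, revenue)   # string categories are plotted at x = 0, 1, 2, ...
ax.set_ylim(0, 90)       # leave headroom for the label

# Hypothetical label wording; "\n" controls where the text wraps.
label_text = "Streaming services begin\nreleasing original films"

note = ax.annotate(
    label_text,
    xy=(5, 45),        # the point being annotated (just above the 6th bar)
    xytext=(5, 70),    # where the text goes
    arrowprops=dict(color="black", lw=1.5, arrowstyle="-["),  # bracket tip at xy
    ha="center",       # center the text on the xytext x coordinate
    size=12,
)
```

In a notebook the figure displays on its own; in a script you would finish with plt.show() or fig.savefig().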

A few notes about the coordinate system:

Matplotlib assumes you’re providing data coordinates, corresponding to the values plotted on the x and y axes. You can instead give either set of coordinates in a different system, such as fractions of the figure or axes size, or points, using the xycoords parameter (for xy) and the textcoords parameter (for xytext).
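As a small sketch of mixing coordinate systems (the data and label text here are invented for illustration), the first annotation below points at a data value but positions its text as a fraction of the axes, and the second offsets its text from the annotated point by a fixed number of points:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs anywhere; omit in a notebook
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0, 1, 2, 3], [10, 20, 15, 30])  # made-up data

# Arrow tip in data coordinates, text placed relative to the axes
# (0.5, 0.9 means halfway across and 90% of the way up the axes).
note = ax.annotate(
    "Peak",
    xy=(3, 30), xycoords="data",
    xytext=(0.5, 0.9), textcoords="axes fraction",
    arrowprops=dict(arrowstyle="->"),
    ha="center",
)

# Text shifted 20 points straight down from the annotated data point.
offset_note = ax.annotate(
    "15",
    xy=(2, 15),
    xytext=(0, -20), textcoords="offset points",
    ha="center",
)
```

Placing text with "axes fraction" keeps the label in the same spot on the page even if the data range changes, which can be handy for titles or callouts.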

In the example above, the x axis is categorical (even though the years are numbers). The major ticks are represented by a 0-based index, and I’ve placed my annotation above the 6th bar, so the x coordinate is 5. You can use ax.xaxis.get_majorticklocs() and ax.yaxis.get_majorticklocs() to have Matplotlib spit out the data coordinates for your x and y axes, if you’re in this situation and not sure what the data coordinates are in your figure.
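Here is a minimal sketch of that lookup on a categorical bar chart. The three bars are made up, and I’ve used plain Matplotlib rather than Seaborn to keep it self-contained:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs anywhere; omit in a notebook
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
years = ["2016", "2017", "2018"]   # categories, even though they look like numbers
ax.bar(years, [3, 5, 4])           # made-up heights
fig.canvas.draw()                  # make sure ticks and labels are laid out

x_locs = ax.xaxis.get_majorticklocs()                 # data coordinates of the x ticks
x_labels = [t.get_text() for t in ax.get_xticklabels()]
y_locs = ax.yaxis.get_majorticklocs()                 # data coordinates of the y ticks

# Pair each x position with the label drawn there.
print(list(zip(x_locs, x_labels)))
print(list(y_locs))
```

The x positions come back as 0, 1, 2, so an annotation over the "2018" bar would use an x coordinate of 2.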

A few notes about text alignment:

By default, annotation text is left-aligned, and the coordinates you provide in xytext represent where the left edge of the text will be positioned. Vertically, the text is anchored at its baseline by default, which is roughly its bottom left corner; you can change that with the va (or verticalalignment) parameter. Matplotlib then helpfully works out where the arrow should meet the text, whether the annotated point is above or below it.

To illustrate how alignment works, I created the example below using default left alignment.

Left-aligned text

Matplotlib places each text blurb with its left edge at the x coordinate specified, aligned with the point of the arrow. But it still draws the arrow from the center of the text blurb, so the arrows are angled. Thankfully, Matplotlib just draws the line for us at the correct angle.

Left-aligned annotation example

Below is the same example, except with ha set to center (or horizontalalignment, if you want to write it out in full).

Centered text

Now, you can see that the text blurbs are centered at the xytext x coordinates, and the angles of the arrows have been adjusted accordingly.

Centered annotation example
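To try the two alignments side by side, something like the following sketch should work. The bars and the blurb text are invented, and both annotations use identical xy and xytext values so that ha is the only thing that differs:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs anywhere; omit in a notebook
import matplotlib.pyplot as plt

fig, (ax_left, ax_center) = plt.subplots(1, 2, figsize=(10, 4), sharey=True)

notes = {}
for ax, ha in ((ax_left, "left"), (ax_center, "center")):
    ax.bar(["A", "B", "C"], [3, 5, 4])   # made-up data
    ax.set_ylim(0, 9)
    ax.set_title(f"ha='{ha}'")
    notes[ha] = ax.annotate(
        "A longer text blurb",
        xy=(1, 5),           # top of the middle bar
        xytext=(1, 7.5),     # same x for the text in both panels
        arrowprops=dict(arrowstyle="->"),
        ha=ha,               # the only difference between the two panels
    )
```

In the left panel the text starts at x = 1 and the arrow slants back toward the bar; in the right panel the text straddles x = 1 and the arrow is vertical.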

Let’s look at another example and see how we can leverage multiple annotations paired with horizontal and vertical lines. The graph below plots budgets against profits for the same movie data set.

Example 2, before annotation

Well, looking at this visualization I could say generally that as movie budgets increase, the profits also increase.

However, it may not be immediately clear to my audience whether there are any other takeaways from this. Let’s see if adding some annotations and lines might help.

Example 2, with annotation

Here, we’ve plotted a vertical line at the 75th percentile of the budgets, and a horizontal line at 0 to indicate the break-even point. These lines, along with their descriptive annotations of course, provide reference points so our audience can better understand the distribution of the data.

Let’s take a look at some of the new code compared to the first example.

Example 2 annotation code
  • ax.vlines() : I’ve created a red vertical line at the budget’s 75th percentile. The first parameter I passed is the x coordinate, and then I also specified ymin and ymax values so the line doesn’t go all the way to the edge of the figure.
  • ax.hlines(): I’ve similarly created a green horizontal line at y=0 to highlight the break-even point, showing that there definitely are some movies that don’t make back their budgets.
  • ax.text() : This is an alternative to using ax.annotate() if you want to draw a box around the text instead of adding an arrow. I’ve provided the x-coordinate, y-coordinate, and text to go in the box as my first 3 parameters, as well as the horizontal and vertical alignment, font size, and the font color. Finally, since my color is too light to show up well on the light grey background, I’ve indicated I want a darker grey background, along with other formatting options for the box, in the bbox parameter, which should be passed as a dictionary.
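The three calls above could be combined along these lines. This is only a sketch: the budget and profit numbers are randomly generated stand-ins for the movie data, and the colors, label wording and box styling are my own guesses rather than the exact settings used for the figure.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs anywhere; omit in a notebook
import matplotlib.pyplot as plt
import numpy as np

# Randomly generated stand-in data (in $ millions); not the real movie data.
rng = np.random.default_rng(0)
budget = rng.uniform(1, 200, size=100)
profit = budget * rng.normal(1.0, 1.0, size=100)

fig, ax = plt.subplots(figsize=(8, 5))
ax.scatter(budget, profit, alpha=0.6)

# Red vertical line at the 75th percentile of budgets, limited in height
# by ymin and ymax.
p75 = np.percentile(budget, 75)
vline = ax.vlines(p75, ymin=profit.min(), ymax=profit.max(), colors="red")

# Green horizontal line at y = 0, the break-even point.
hline = ax.hlines(0, xmin=0, xmax=budget.max(), colors="green")

# Boxed label: x, y and the text come first, then alignment, size and color.
# The bbox dictionary styles the box drawn around the text.
label = ax.text(
    p75, profit.max(), "75th percentile\nof budgets",
    ha="center", va="bottom", size=10, color="white",
    bbox=dict(facecolor="dimgrey", edgecolor="none", boxstyle="round,pad=0.4"),
)
```

Unlike ax.axvline() and ax.axhline(), which span the whole axes by default, vlines and hlines take explicit start and end values in data coordinates, which is what lets the lines stop short of the edges.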

We’ve only reviewed two examples here, but hopefully I’ve shown that adding annotations can make your Matplotlib visualizations much more accessible and easier to read for a variety of audiences. And although Matplotlib can seem complicated at first, annotations are actually quite simple and intuitive to implement.

Check out Matplotlib’s own documentation for many more options to style your annotations.

Sources:

[1]: Movie data sourced from The Movie Database via their API in March 2021.

[2]: https://en.wikipedia.org/wiki/List_of_Amazon_Studios_films; accessed March 18, 2021

[3]: https://en.wikipedia.org/wiki/List_of_Netflix_original_films_(2015%E2%80%932017); accessed March 18, 2021

Student of Data Science
