Stacked Bar Charts with Python’s Matplotlib
An excellent way to visualize proportions and composition
Bar charts are by far my favourite visualization technique. They are very versatile, usually easy to read, and relatively straightforward to build.
Just like any visualization, they do have some disadvantages as well. For example, they struggle with scalability.
Too many bars in a bar chart make it confusing and hard to read. That is especially common when we’re working with hierarchical categories, in other words, when we have groups and subgroups that we need to visualize.
Stacked bars are a great alternative in those cases, allowing us to compare and analyze those groups’ composition.
In this article, we’ll explore how to build those visualizations with Python’s Matplotlib.
I’ll be using a simple dataset that holds data on video game copies sold worldwide. The dataset is quite outdated, but it’s suitable for the following examples.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
Let’s read the dataset and take a look at it.
df = pd.read_csv('../data/vgsales.csv')
df.head()
I want to visualize the total number of copies sold by platform and analyze the regions where they were sold.
Having the regions already separated into columns helps a lot; we only need to group the records by ‘Platform’ and sum the values from NA_Sales to Global_Sales.
Groupby → Sum → Select Fields
df_grouped = df.groupby('Platform').sum(numeric_only=True)[['NA_Sales','EU_Sales','JP_Sales','Other_Sales', 'Global_Sales']]
df_grouped
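As a rough sketch of where this is heading, the grouped regional columns can be stacked per platform by keeping a running bottom. The platform names and sales figures below are invented stand-ins for the real dataset; only the column names come from the code above. Global_Sales is left out of the stack because it is the total of the other columns.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical stand-in for df_grouped: invented numbers, same column layout
df_grouped = pd.DataFrame(
    {
        "NA_Sales": [10.0, 6.0, 3.0],
        "EU_Sales": [5.0, 4.0, 2.0],
        "JP_Sales": [2.0, 3.0, 4.0],
        "Other_Sales": [1.0, 1.0, 0.5],
    },
    index=pd.Index(["PlatformA", "PlatformB", "PlatformC"], name="Platform"),
)
df_grouped["Global_Sales"] = df_grouped.sum(axis=1)

# Regions to stack; Global_Sales is the total, so it is not stacked
regions = ["NA_Sales", "EU_Sales", "JP_Sales", "Other_Sales"]

fig, ax = plt.subplots()
bottom = np.zeros(len(df_grouped))  # running height of each platform's stack
for region in regions:
    ax.bar(df_grouped.index, df_grouped[region], bottom=bottom, label=region)
    bottom += df_grouped[region].to_numpy()

ax.set_ylabel("Copies sold")
ax.legend()
# plt.show()
```

After the loop, the running bottom equals the Global_Sales column, which is a quick sanity check that the stack adds up to the total.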
Stacked bar chart
This is an example of creating a stacked bar plot using bar.
import matplotlib.pyplot as plt
import numpy as np

# data from https://allisonhorst.github.io/palmerpenguins/

species = (
    "Adelie\n $\\mu=$3700.66g",
    "Chinstrap\n $\\mu=$3733.09g",
    "Gentoo\n $\\mu=5076.02g$",
)
weight_counts = {
    "Below": np.array([70, 31, 58]),
    "Above": np.array([82, 37, 66]),
}
width = 0.5

fig, ax = plt.subplots()
bottom = np.zeros(3)

for boolean, weight_count in weight_counts.items():
    p = ax.bar(species, weight_count, width, label=boolean, bottom=bottom)
    bottom += weight_count

ax.set_title("Number of penguins with above average body mass")
ax.legend(loc="upper right")

plt.show()
Stacked bar chart in matplotlib
Stacked bar charts show how the values of each group break down into subgroups. For instance, suppose that people living in five different cities (the groups, named ‘G1’, ‘G2’, ‘G3’, ‘G4’ and ‘G5’) answered Yes or No to one question, and the results (the counts of Yes and No for each city) were stored in the values1 and values2 variables.
This data can be represented in Python with the bar function from matplotlib making use of the bottom argument, which represents the Y coordinates used to draw the bottom of the bars, so you can stack several bars.
import matplotlib.pyplot as plt

# Data
groups = ['G1', 'G2', 'G3', 'G4', 'G5']
values1 = [12, 19, 14, 27, 16]
values2 = [21, 30, 15, 17, 20]

fig, ax = plt.subplots()

# Stacked bar chart
ax.bar(groups, values1)
ax.bar(groups, values2, bottom = values1)

# plt.show()
Stacked bar plot with three subgroups
If you have to stack more than two subgroups, you will need to pass the sum of the previous values to bottom, as in the example below.
import matplotlib.pyplot as plt
import numpy as np

# Data
groups = ['G1', 'G2', 'G3', 'G4', 'G5']
values1 = [12, 19, 14, 27, 16]
values2 = [21, 30, 15, 17, 20]
values3 = [15, 23, 12, 11, 15]

fig, ax = plt.subplots()

# Stacked bar chart
ax.bar(groups, values1)
ax.bar(groups, values2, bottom = values1)
ax.bar(groups, values3, bottom = np.add(values1, values2))

# plt.show()
Generalization with multiple subgroups
However, if you don’t want to compute the bottom of each subgroup manually, you can join the values into a numpy array and then use the following for loop:
import matplotlib.pyplot as plt
import numpy as np

# Data
groups = ['G1', 'G2', 'G3', 'G4', 'G5']
values = np.array([[12, 19, 14, 27, 16],
                   [21, 30, 15, 17, 20],
                   [15, 23, 12, 11, 15],
                   [2, 5, 1, 6, 8]])

fig, ax = plt.subplots()

# Stacked bar chart with loop
for i in range(values.shape[0]):
    ax.bar(groups, values[i], bottom = np.sum(values[:i], axis = 0))

# plt.show()
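To see what the loop passes to bottom, the short check below computes the same offsets in two ways: row by row with np.sum, as in the loop, and in one step with np.cumsum. The names loop_bottoms, tops and cumsum_bottoms are introduced here only for illustration.

```python
import numpy as np

values = np.array([[12, 19, 14, 27, 16],
                   [21, 30, 15, 17, 20],
                   [15, 23, 12, 11, 15],
                   [2, 5, 1, 6, 8]])

# Bottoms as computed inside the loop: sum of all rows before row i
loop_bottoms = np.array([np.sum(values[:i], axis=0)
                         for i in range(values.shape[0])])

# Same offsets in one step: running total down the rows, minus the row itself
tops = np.cumsum(values, axis=0)
cumsum_bottoms = tops - values

print(loop_bottoms)
print(np.array_equal(loop_bottoms, cumsum_bottoms))
```

The first row of bottoms is all zeros (the first subgroup sits on the axis), and each later row starts where the previous subgroup ended.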
The width argument controls the width of the bars. The default value is 0.8.
import matplotlib.pyplot as plt

# Data
groups = ['G1', 'G2', 'G3', 'G4', 'G5']
values1 = [12, 19, 14, 27, 16]
values2 = [21, 30, 15, 17, 20]
width = 0.25

fig, ax = plt.subplots()

# Stacked bar chart
ax.bar(groups, values1, width = width)
ax.bar(groups, values2, width = width, bottom = values1)

# plt.show()
You can add error bars to the stacked bar plot with the yerr argument. You will need to input the standard deviation for each subgroup. Note that you can customize the color of the error bars with ecolor and the size of the caps with capsize.
import matplotlib.pyplot as plt

# Data
groups = ['G1', 'G2', 'G3', 'G4', 'G5']
values1 = [12, 19, 14, 27, 16]
values2 = [21, 30, 15, 17, 20]
values1_std = [2, 1, 3, 0.5, 2]
values2_std = [1, 4, 0.25, 0.75, 1]

fig, ax = plt.subplots()

# Stacked bar chart
ax.bar(groups, values1, yerr = values1_std, ecolor = 'red')
ax.bar(groups, values2, yerr = values2_std, ecolor = 'green', bottom = values1)

# plt.show()
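The capsize argument mentioned above can be sketched as follows. The value 4 (in points) is an arbitrary choice for illustration, and bars1 and bars2 are names introduced here to hold the returned bar containers.

```python
import matplotlib.pyplot as plt

# Data
groups = ['G1', 'G2', 'G3', 'G4', 'G5']
values1 = [12, 19, 14, 27, 16]
values2 = [21, 30, 15, 17, 20]
values1_std = [2, 1, 3, 0.5, 2]
values2_std = [1, 4, 0.25, 0.75, 1]

fig, ax = plt.subplots()

# capsize sets the length of the error bar caps in points (4 is arbitrary)
bars1 = ax.bar(groups, values1, yerr=values1_std,
               ecolor='red', capsize=4)
bars2 = ax.bar(groups, values2, yerr=values2_std,
               ecolor='green', capsize=4, bottom=values1)

# plt.show()
```

Because the second call passes bottom, its error bars are drawn at the top of the stack rather than at the height of values2 alone.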
Color customization
The color of the bars for each subgroup can be customized through the color argument. Note that you can also customize the transparency of the colors with alpha.
import matplotlib.pyplot as plt

# Data
groups = ['G1', 'G2', 'G3', 'G4', 'G5']
values1 = [12, 19, 14, 27, 16]
values2 = [21, 30, 15, 17, 20]

fig, ax = plt.subplots()

# Stacked bar chart
ax.bar(groups, values1, color = "#024b7a", alpha = 0.5)
ax.bar(groups, values2, bottom = values1, color = "#44a5c2", alpha = 0.5)

# plt.show()
In addition, you can modify the border color and width of the bars with edgecolor and linewidth, respectively.
import matplotlib.pyplot as plt

# Data
groups = ['G1', 'G2', 'G3', 'G4', 'G5']
values1 = [12, 19, 14, 27, 16]
values2 = [21, 30, 15, 17, 20]

fig, ax = plt.subplots()

# Stacked bar chart
ax.bar(groups, values1, color = "#44a5c2", edgecolor = "black", linewidth = 2)
ax.bar(groups, values2, bottom = values1, color = "#ffae49", edgecolor = "black", linewidth = 2)

# plt.show()
Adding a legend for the subgroups
You can set a label for each subgroup with the label argument. Then, you can make use of the legend function to add the legend to the plot, as in the example below.
import matplotlib.pyplot as plt

# Data
groups = ['G1', 'G2', 'G3', 'G4', 'G5']
values1 = [12, 19, 14, 27, 16]
values2 = [21, 30, 15, 17, 20]

fig, ax = plt.subplots()

# Stacked bar chart
ax.bar(groups, values1, label = "Yes")
ax.bar(groups, values2, bottom = values1, label = "No")

ax.legend()
ax.set_ylabel('Number of answers')

# plt.show()
Stacked bar chart with labels
You can display the numerical values for each bar within the stack with the text function from matplotlib. In order to add them, you can use a for loop like the following:
import matplotlib.pyplot as plt

# Data
groups = ['G1', 'G2', 'G3', 'G4', 'G5']
values1 = [12, 19, 14, 27, 16]
values2 = [21, 30, 15, 17, 20]

fig, ax = plt.subplots()

# Stacked bar chart
ax.bar(groups, values1, label = "Yes")
ax.bar(groups, values2, bottom = values1, label = "No")

# Labels
for bar in ax.patches:
    ax.text(bar.get_x() + bar.get_width() / 2,
            bar.get_height() / 2 + bar.get_y(),
            round(bar.get_height()), ha = 'center',
            color = 'w', weight = 'bold', size = 10)

ax.legend()
ax.set_ylabel('Number of answers')

# plt.show()
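As a possible alternative to the manual text loop, recent matplotlib versions (3.4 and later) provide Axes.bar_label, which labels every bar in a container. The sketch below assumes such a version is installed; yes_bars, no_bars, yes_labels and no_labels are names introduced here for illustration.

```python
import matplotlib.pyplot as plt

# Data
groups = ['G1', 'G2', 'G3', 'G4', 'G5']
values1 = [12, 19, 14, 27, 16]
values2 = [21, 30, 15, 17, 20]

fig, ax = plt.subplots()

# Keep the containers returned by bar so they can be labelled afterwards
yes_bars = ax.bar(groups, values1, label="Yes")
no_bars = ax.bar(groups, values2, bottom=values1, label="No")

# label_type='center' writes each segment's own value in its middle
yes_labels = ax.bar_label(yes_bars, label_type='center',
                          color='w', weight='bold')
no_labels = ax.bar_label(no_bars, label_type='center',
                         color='w', weight='bold')

ax.legend()
ax.set_ylabel('Number of answers')

# plt.show()
```

With label_type='edge' (the default), the label is placed at the end of the bar instead and shows the position of the bar's top, which for the upper segment is the height of the whole stack.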
Total value labels
Alternatively, you can add labels representing the total value for each group or bar. For this purpose, you will need to add the values and then use a for loop to loop through each label, as in the following example.
import matplotlib.pyplot as plt
import numpy as np

# Data
groups = ['G1', 'G2', 'G3', 'G4', 'G5']
values1 = [12, 19, 14, 27, 16]
values2 = [21, 30, 15, 17, 20]

fig, ax = plt.subplots()

# Stacked bar chart
ax.bar(groups, values1, label = "Yes")
ax.bar(groups, values2, bottom = values1, label = "No")

# Sum of values
total_values = np.add(values1, values2)

# Total values labels
for i, total in enumerate(total_values):
    ax.text(i, total + 0.5, round(total),
            ha = 'center', weight = 'bold', color = 'black')

ax.legend()
ax.set_ylabel('Number of answers')

plt.show()
Stacked bar graph in python : Step By Step
We can draw stacked bar graphs in Python using the matplotlib library. We can create some dummy data and plot the chart from it. For the dummy data, we can either use a NumPy array or provide it in a pandas DataFrame. Alternatively, we can use the seaborn library to achieve the same result.
The stacked bar graph in python using Matplotlib –
Step 1: Importing & Dummy data creation
In this step, we will import the matplotlib package first, and then we will create the dummy data for visualization.
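A minimal sketch of this step could look like the following. The names y1 and y2 follow the ones used in Step 2; x and all the numbers are invented for illustration.

```python
import matplotlib.pyplot as plt

# Dummy data (values invented for illustration)
x = ['A', 'B', 'C', 'D']   # hypothetical category labels
y1 = [10, 20, 15, 25]      # first series, drawn at the bottom
y2 = [5, 15, 10, 20]       # second series, stacked on top of y1
```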
Step 2: Plotting the stacked bar chart
To plot the chart, follow the syntax below. We will first run the code and see the output, and then go over the important notes.
The most important thing here is the bottom of the second series of bars: the y2 bars take y1 as their bottom.
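The plotting step can be sketched as below, assuming dummy data in plain lists. Only y1 and y2 are named in the text; x, the numbers, and the variable names lower and upper are invented for this example.

```python
import matplotlib.pyplot as plt

# Dummy data (values invented for illustration)
x = ['A', 'B', 'C', 'D']
y1 = [10, 20, 15, 25]
y2 = [5, 15, 10, 20]

fig, ax = plt.subplots()

# First series sits on the axis
lower = ax.bar(x, y1, label='y1')
# Second series starts where the first one ends
upper = ax.bar(x, y2, bottom=y1, label='y2')

ax.legend()
# plt.show()
```

Without bottom=y1, the second call would draw y2 from zero and the two series would overlap instead of stacking.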
Important Alternatives –
1. Plotting multiple bar graphs in Python –
This is very similar to the section above: we only need to put the data into a single data object and adjust the bottom parameter accordingly. Let’s look at the example below.
The difference between the above section and this one is the structure of the underlying data. In the above section it was in list format, and for the multi-bar chart it is in a NumPy array. Here is the output of the matplotlib stacked bar chart code.
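A sketch of the NumPy variant, under the assumption that each row of a 2D array holds one series: the bottom is accumulated row by row. The array contents, the labels, and the name data are invented for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

# Dummy data: one row per series, one column per category (invented values)
x = ['A', 'B', 'C', 'D']
data = np.array([[10, 20, 15, 25],
                 [5, 15, 10, 20],
                 [8, 4, 12, 6]])

fig, ax = plt.subplots()

bottom = np.zeros(data.shape[1])  # running height of each stack
for i, row in enumerate(data):
    ax.bar(x, row, bottom=bottom, label=f'series {i + 1}')
    bottom = bottom + row         # next series starts on top of this one

ax.legend()
# plt.show()
```

Adding another series only requires adding another row to the array; the loop and the bottom handling stay the same.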
2. Pandas as a data source for the stacked bar chart –
Run the code below. Here we use a pandas DataFrame and convert it to a stacked bar chart.
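One way this might look is with the DataFrame's own plot method and stacked=True, which handles the bottoms internally. The DataFrame contents and the column names y1 and y2 are invented for illustration.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Dummy DataFrame: index = categories, columns = series (invented values)
df = pd.DataFrame(
    {'y1': [10, 20, 15, 25], 'y2': [5, 15, 10, 20]},
    index=['A', 'B', 'C', 'D'],
)

# stacked=True stacks the columns on top of each other for every index label
ax = df.plot(kind='bar', stacked=True)
ax.set_ylabel('Value')

# plt.show()
```

Each column becomes one layer of the stack and its name is used for the legend entry, so no explicit bottom argument is needed.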
3. Stacked Bar chart using Seaborn Library –
Let’s take a dummy dataframe and in one step, we will draw the stacked bar chart using the seaborn library.
Data Science Learner Team