Python – Data visualization tutorial
Last Updated: 26 Dec, 2024
Data visualization is a crucial aspect of data analysis, helping to transform analyzed data into meaningful insights through graphical representations. This comprehensive tutorial will guide you through the fundamentals of data visualization using Python. We’ll explore various libraries, including Matplotlib, Seaborn, Pandas, Plotly, Plotnine, Altair, Bokeh, Pygal, and Geoplotlib. Each library offers unique features and advantages, catering to different visualization needs and preferences.

Introduction to Data Visualization
After analyzing data, it is important to visualize it with charts, graphs, and maps to uncover patterns, trends, and outliers that may not be apparent in the raw numbers. Choosing the right type of chart is crucial for communicating your data effectively, because different charts serve different purposes and highlight different aspects of the data.
Equally important is selecting the right colors for your visualizations. Proper color choices highlight key information, improve readability, and make visuals more engaging. For expert advice on choosing the best colors for your charts, visit How to select Colors for Data Visualizations?
Python Libraries for Data Visualization
Python offers numerous libraries for data visualization, each with unique features and advantages. Below are some of the most popular libraries:
- Matplotlib
- Seaborn
- Pandas
- Plotly
- Plotnine
- Altair
- Bokeh
- Pygal
- Geoplotlib
Getting Started – Data Visualization with Matplotlib
Matplotlib is a great way to begin visualizing data in Python and is a staple of data science work. It is a versatile library designed to help users visualize data in a variety of formats, and it is well suited to creating a wide range of static, animated, and interactive plots.
Example: Plotting a Linear Relationship with Matplotlib
Python
# importing the required libraries
import matplotlib.pyplot as plt
import numpy as np
# define data values
x = np.array([1, 2, 3, 4]) # X-axis points
y = x*2 # Y-axis points
plt.plot(x, y) # Plot the chart
plt.show() # display
Output:

Effective Data Visualization With Seaborn
Seaborn is a Python library that simplifies the creation of attractive and informative statistical graphics. It integrates seamlessly with Pandas DataFrames and offers a range of functions tailored for visualizing statistical relationships and distributions. This chapter will guide you through using Seaborn to create effective data visualizations.
Example: Scatter Plot Analysis with Seaborn
Python
import seaborn as sns
import matplotlib.pyplot as plt
# Load the 'tips' dataset
tips = sns.load_dataset('tips')
# Create a scatter plot
plt.figure(figsize=(6, 4))
sns.scatterplot(x='total_bill', y='tip', data=tips, hue='time', style='time')
plt.title('Total Bill vs Tip')
plt.xlabel('Total Bill')
plt.ylabel('Tip')
plt.show()
Output:

Data Visualization with Seaborn
Data Visualization with Pandas
Pandas is a powerful data manipulation library in Python that also offers some basic data visualization capabilities. While it may not be as feature-rich as dedicated visualization libraries like Matplotlib or Seaborn, Pandas’ built-in plotting is convenient for quick and simple visualizations.
Example: Visualizing Spread and Outliers
Box plots are useful for visualizing the spread and outliers in your data. They provide a graphical summary of the data distribution, highlighting the median, quartiles, and potential outliers. Let’s create a box plot with Pandas:
Python
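# importing the required libraries
import pandas as pd
import matplotlib.pyplot as plt
# Sample data
data = {
    'Category': ['A']*10 + ['B']*10,
    'Value': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
}
df = pd.DataFrame(data)
# Box plot of 'Value', one box per category
df.boxplot(column='Value', by='Category')
plt.title('Box Plot Example')
plt.suptitle('')  # remove the automatic "Boxplot grouped by ..." title
plt.xlabel('Category')
plt.ylabel('Value')
plt.show()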
Output:

Box Plot
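To make the box plot easier to read, it can help to compute the numbers it is drawn from. The following is a minimal sketch, assuming the same sample data as above and the common convention that points beyond 1.5 times the interquartile range (IQR) from the box are treated as outliers. The helper name `box_stats` is hypothetical and not part of Pandas.

```python
import pandas as pd

data = {
    'Category': ['A'] * 10 + ['B'] * 10,
    'Value': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11],
}
df = pd.DataFrame(data)


def box_stats(values):
    """Return the quartiles and 1.5 * IQR fences for a Series (hypothetical helper)."""
    q1, median, q3 = values.quantile([0.25, 0.5, 0.75])
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Points outside the fences are the ones a box plot draws individually
    outliers = values[(values < lower_fence) | (values > upper_fence)]
    return pd.Series({
        'q1': q1,
        'median': median,
        'q3': q3,
        'lower_fence': lower_fence,
        'upper_fence': upper_fence,
        'n_outliers': len(outliers),
    })


# One row of statistics per category
summary = pd.DataFrame(
    {category: box_stats(group) for category, group in df.groupby('Category')['Value']}
).T
print(summary)
```

For this sample data neither category has points outside the fences, which is why the plot shows no individual outlier markers.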
Data Visualization with Plotly
Plotly is a versatile library for creating interactive and aesthetically pleasing visualizations. This chapter will introduce you to Plotly and guide you through creating basic visualizations.
We’ll create a simple bar plot. For this example, we’ll use the same ‘tips’ dataset we used with Seaborn.
Python
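import plotly.express as px
tips = px.data.tips()
# Average the bill per day first; px.bar would otherwise stack every row into a total
avg_bill = tips.groupby('day', as_index=False)['total_bill'].mean()
fig = px.bar(avg_bill, x='day', y='total_bill', title='Average Total Bill per Day')
fig.show()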
Output:

Bar Plot Plotly
Plotly allows for extensive customizations, including updating layouts, adding annotations, and incorporating dropdowns and sliders.
Data Visualization with Plotnine
Plotnine is a Python library that implements the Grammar of Graphics, inspired by R’s ggplot2. It provides a coherent and consistent way to create complex visualizations with minimal code. This chapter will introduce you to Plotnine and demonstrate how it can be used to create various types of plots.
Plotnine Example: Creating Line Plots
Python
from plotnine import ggplot, aes, geom_line, labs, theme_minimal
from plotnine.data import economics
# The 'economics' dataset ships with Plotnine and contains US economic
# indicators, including 'unemploy', the number of unemployed (in thousands)
# Create a line plot to visualize the number of unemployed over time
line_plot = (
    ggplot(economics, aes(x='date', y='unemploy'))
    + geom_line(color='blue')
    + labs(title='Unemployment Over Time',
           x='Date', y='Number of Unemployed (thousands)')
    + theme_minimal()
)
# Render the plot; older Plotnine releases without show() render on print(line_plot)
line_plot.show()
Output:

Line Plots
Data Visualizations with Altair
Altair is a declarative statistical visualization library for Python, designed to provide an intuitive way to create interactive and informative charts. Built on Vega and Vega-Lite, Altair allows users to build complex visualizations through simple and expressive syntax.
Altair Example: Creating Charts
Python
# Import necessary libraries
import altair as alt
from vega_datasets import data
iris = data.iris()
# Create a scatter plot
scatter_plot = alt.Chart(iris).mark_point().encode(
    x='sepalLength',
    y='petalLength',
    color='species'
)
scatter_plot
Output:

Creating Charts
Interactive Data Visualization with Bokeh
Bokeh is a powerful Python library for creating interactive, highly customizable visualizations. It is designed for modern web browsers and supports a wide range of plot types and interactivity features, making it a popular choice for interactive data visualization.
Example: Basic Plotting with Bokeh - Adding Hover Tool
Python
from bokeh.models import HoverTool
from bokeh.plotting import figure, show
from bokeh.io import output_notebook
output_notebook()
p = figure(title="Scatter Plot with Hover Tool",
           x_axis_label='X-Axis', y_axis_label='Y-Axis')
p.scatter(x=[1, 2, 3, 4, 5], y=[6, 7, 2, 4, 5],
          size=10, color="green", alpha=0.5)
# Add HoverTool
hover = HoverTool()
hover.tooltips = [("X", "@x"), ("Y", "@y")]
p.add_tools(hover)
# Show the plot
show(p)
Output:

Basic Plotting with Bokeh- Adding Hover Tool
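Hover tooltips are not limited to the plotted coordinates. The sketch below assumes a `ColumnDataSource` with an extra `label` column (made up for illustration) and shows it in the tooltip. It writes the plot to an HTML file with `output_file` and `save` so that it also runs outside a notebook.

```python
from bokeh.io import output_file, save
from bokeh.models import ColumnDataSource, HoverTool
from bokeh.plotting import figure

# A ColumnDataSource holds named columns that glyphs and tooltips can refer to
source = ColumnDataSource(data={
    'x': [1, 2, 3, 4, 5],
    'y': [6, 7, 2, 4, 5],
    'label': ['a', 'b', 'c', 'd', 'e'],  # extra column, made up for illustration
})

p = figure(title='Scatter Plot with Labelled Hover',
           x_axis_label='X-Axis', y_axis_label='Y-Axis')
p.scatter(x='x', y='y', source=source, size=10, color='green', alpha=0.5)

# '@name' in a tooltip is replaced by the value of that column for the hovered point
p.add_tools(HoverTool(tooltips=[('Label', '@label'), ('(X, Y)', '(@x, @y)')]))

# Write a standalone HTML file instead of rendering inline in a notebook
output_file('hover_scatter.html')
save(p)
```

Keeping the data in a `ColumnDataSource` means any column, not just the plotted ones, can be shown on hover.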
Mastering Advanced Data Visualization with Pygal
In this final chapter, we will look at data visualization with Pygal, a library known for its ease of use and its ability to create attractive, interactive SVG charts that can be embedded in web applications. With Pygal, you can create a wide range of charts, including line charts, bar charts, and pie charts, all with interactive capabilities.
Example: Creating Advanced Charts with Pygal
First, install Pygal using pip:
pip install pygal
Python
import pygal
from pygal.style import Style
# Create a custom style
custom_style = Style(
    background='transparent',
    plot_background='transparent',
    foreground='#000000',
    foreground_strong='#000000',
    foreground_subtle='#6e6e6e',
    opacity='.6',
    opacity_hover='.9',
    transition='400ms',
    colors=('#E80080', '#404040')
)
# Create a line chart
line_chart = pygal.Line(style=custom_style, show_legend=True,
                        x_title='Months', y_title='Values')
line_chart.title = 'Monthly Trends'
line_chart.add('Series 1', [1, 3, 5, 7, 9])
line_chart.add('Series 2', [2, 4, 6, 8, 10])
# Render the chart to a file
line_chart.render_to_file('line_chart.svg')
Output:

Advanced Line Charts with Pygal
Choosing the Right Data Visualization Library
Library | Best For | Strengths | Limitations |
---|---|---|---|
Matplotlib | Static plots | Highly customizable | Steep learning curve |
Seaborn | Statistical visualizations | Easy to use, visually appealing | Limited interactivity |
Plotly | Interactive visualizations | Web integration, modern designs | Requires browser rendering |
Bokeh | Web-based dashboards | Real-time interactivity | More complex setup |
Altair | Declarative statistical plots | Concise syntax | Limited customization |
Pygal | Scalable SVG charts | High-quality graphics | Less suited for complex datasets |
To create impactful and engaging data visualizations, start by selecting the appropriate chart type: bar charts for comparisons, line charts for trends, and pie charts for proportions.
- Simplify your visualizations to focus on key insights.
- Use annotations to guide the viewer’s attention.
- Strategically use color to differentiate categories or highlight important data, but avoid overuse to prevent confusion.
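The guidelines above can be sketched with Matplotlib as follows. The data is made up for illustration: the chart uses a bar chart for a comparison, a single accent color for the value that matters, and one annotation to direct attention to it.

```python
import matplotlib.pyplot as plt

# Made-up data for illustration
regions = ['North', 'South', 'East', 'West']
sales = [42, 58, 91, 37]

# Use color strategically: gray for context, one accent color for the key bar
top = sales.index(max(sales))
colors = ['lightgray'] * len(sales)
colors[top] = 'tab:blue'

fig, ax = plt.subplots(figsize=(6, 4))
ax.bar(regions, sales, color=colors)

# Annotate the key insight; categorical bars sit at x positions 0, 1, 2, ...
ax.annotate('Highest sales',
            xy=(top, sales[top]),
            xytext=(top - 1.5, sales[top] + 8),
            arrowprops={'arrowstyle': '->'})

# Simplify: remove chart elements that carry no information
for side in ('top', 'right'):
    ax.spines[side].set_visible(False)
ax.set_ylim(0, 110)
ax.set_title('Sales by Region')
ax.set_ylabel('Sales')

# Save to a file; plt.show() would display the chart interactively instead
fig.savefig('sales_by_region.png')
```

Limiting the chart to one accent color and one annotation keeps the viewer's attention on a single message.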