Python has become one of the most widely used programming languages of the past few years. Much of its popularity in industry comes from its very large collection of libraries.
In this tutorial we will discuss:
- Introduction to Matplotlib
- History of Matplotlib
- Architecture of Matplotlib
- Interfaces in Matplotlib
- Methods used in Matplotlib
- Toolkits in Matplotlib
- Style Reference in Matplotlib
- Plotting using Matplotlib
- Matplotlib: The future of Visualization
- Matplotlib Vs. Matlab
This Matplotlib tutorial takes you from the basics of Python data visualization toward real fluency with it. Humans are very visual creatures: we understand ideas better when we see them visualized. You might not know where to begin, or you might already have the right format in mind. Either way, this tutorial aims to clear up your doubts about data visualization with Matplotlib. Along the way we will cover the following topics:
- The anatomy of a Matplotlib plot: what is a figure, what are the Axes, and what exactly is a graph?
- Getting started with plotting: which library you need to import, how to think about the figure and the Axes of your plot, and how to import Matplotlib in Jupyter notebooks.
- Plotting methods, from simple ways to plot your data to more advanced ways of visualizing it.
- Basic plot generation, with a focus on legends, text, titles, axis labels, and plot layout.
- Saving, displaying, and clearing your plots: showing a plot, saving one or more figures to, for example, PDF files, removing the axes, clearing the figure, or closing the plot.
Introduction to Matplotlib
Matplotlib is one of the most widely used libraries in Python. It helps you visualize and plot many kinds of data quickly. Data is everywhere today, so being able to analyze it is essential. Matplotlib is a general-purpose plotting tool that works closely with NumPy, Python's numerical library. One of its most important features is that it works well across many operating systems and graphics backends.
matplotlib.pyplot is the plotting module most often used for two-dimensional graphics in Python. Framing numbers, percentages, statistics, differences, ratios, and other numerical data as a plot can instantly make the information easier to read. A picture is worth a thousand words, and with Matplotlib it takes far fewer than a thousand words of code to create a production-quality graphic.
Installation: Matplotlib is available as a wheel package for macOS, Windows, and Linux. It is not part of the Python standard library, although some distributions such as Anaconda bundle it. If it is not already installed, you can install it with pip by running: python -m pip install matplotlib
Matplotlib’s History
It is a multi-platform data visualization library built on NumPy. John Hunter conceived it in 2002, initially as a patch to IPython to enable interactive MATLAB-style plotting via gnuplot from the IPython command line. Fernando Perez was scrambling to finish his Ph.D. at the time and let John know he wouldn’t have time to review the patch for several months. John took that as his cue to set out on his own, and the matplotlib package was born, with version 0.1 released in 2003.
Matplotlib received an early boost when it was adopted as the plotting package of the Space Telescope Science Institute (STScI), which financially supported its development and led to greatly expanded capabilities.
Matplotlib became popular by outlasting dozens of competing packages. It could be used on any operating system through its array of backends; it had a familiar interface, similar to MATLAB; it had a coherent vision, which was to do 2D graphics and do them well; it found early institutional support from astronomers at STScI and JPL; and it had Hunter himself, who enthusiastically promoted the project within the Python world.
Matplotlib’s Architecture
Matplotlib has a three-layer architecture. From top to bottom, the layers are the scripting layer, the artist layer, and the backend layer, which can be viewed as a stack. Each layer knows how to interact with the layer below it, but a lower layer is not aware of the layers above it.
1. Scripting Layer: The scripting layer (pyplot) is the interface that simplifies working with the other layers. It is the better option for scientists’ daily use, data visualization, and exploratory interaction.
2. Artist Layer: The artist layer holds the objects that make up a figure, such as lines, shapes, axes, and text. It is also the layer to work with when integrating matplotlib into an application or an application server. The Artist subclasses define properties such as visibility, labels, and a clip box that bounds the paintable area.
The subclasses can be classified into:
- Primitives
- Containers
3. Backend Layer: The backend layer is the bottom-most layer, where plots are rendered to an output device. It provides a common interface to the GUI toolkits and renders the primitives and containers of the artist layer. It also supports interactive plots in a GUI.
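The three layers can be seen without pyplot at all. The following is a minimal sketch, assuming the Agg (raster) backend and made-up data points: it creates a Figure and a Line2D from the artist layer by hand and asks a backend canvas to render them into an in-memory PNG. The variable names are illustrative only.

```python
import io

from matplotlib.backends.backend_agg import FigureCanvasAgg  # backend layer
from matplotlib.figure import Figure                          # artist layer (container)
from matplotlib.lines import Line2D                           # artist layer (primitive)

# Artist layer: a Figure is the top-level container for everything drawn.
fig = Figure(figsize=(4, 3))

# Backend layer: the canvas knows how to render the figure (here, to pixels).
canvas = FigureCanvasAgg(fig)

# Containers hold primitives: an Axes (container) receives a Line2D (primitive).
ax = fig.add_subplot(111)
line = Line2D([0, 1, 2], [0, 1, 4], color="tab:blue")  # illustrative data
ax.add_line(line)
ax.set_xlim(0, 2)
ax.set_ylim(0, 4)

# Ask the backend to render the artists into an in-memory PNG.
png_buffer = io.BytesIO()
canvas.print_png(png_buffer)
png_bytes = png_buffer.getvalue()
```

In everyday use the scripting layer (pyplot) performs these steps for you, which is why most scripts never touch the canvas directly.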
Interfaces in Matplotlib
There are two interfaces for plotting in matplotlib:
- MATLAB style plotting using pyplot
- Object-Oriented Interface
The object-oriented interface is often more convenient for anything beyond a quick plot: every figure is made up of explicit objects, and it is always clear which object you are working on. We call methods on those objects to visualize the data and achieve the desired result. Matplotlib is a massive library, and getting a plot to look just right often takes some trial and error. Generating basic plots with one-liners is quite simple, but skilfully commanding the remaining 90% of the library can be daunting.
Matplotlib also provides an object-oriented API for embedding plots in applications that use general-purpose GUI toolkits such as Tkinter, wxPython, or GTK. A large amount of data presented as numbers in a spreadsheet is hard to understand, and it gets worse when there are many variables and time frames. Plotting and visualizing the information can make a vast difference.
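To make the difference between the two interfaces concrete, here is a small sketch that draws the same made-up data twice, once in the MATLAB style through pyplot and once through the object-oriented interface. The Agg backend is selected only so the sketch runs without a display; the data and titles are placeholders.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove to use your default one
import matplotlib.pyplot as plt

x = [0, 1, 2, 3]
y = [0, 1, 4, 9]  # illustrative data

# MATLAB-style: pyplot tracks a "current" figure and axes behind the scenes.
plt.figure()
plt.plot(x, y)
plt.title("pyplot style")
pyplot_ax = plt.gca()  # the axes that pyplot created implicitly

# Object-oriented style: we hold the Figure and Axes and call their methods.
fig, ax = plt.subplots()
ax.plot(x, y)
ax.set_title("object-oriented style")
```

Both halves produce the same kind of plot. The second keeps explicit references to the figure and axes, which tends to matter once a script manages several plots at once.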
Methods used in matplotlib
Matplotlib offers many methods for generating the desired graph or plot. Once we have an Axes object, we can call its methods to generate plots. Note that the method names are lowercase, since Python is case-sensitive. We will be using the following methods.
- plot(a, b): It plots b against a.
- set_xlabel(): It sets the label for the X-axis.
- set_ylabel(): It sets the label for the Y-axis.
- set_title(): It sets the title of the plot.
- legend(): It generates the legend of the graph.
- show(): It displays the plot.
- hist(): It generates a histogram plot.
- ax.plot(): It plots y vs. x as lines and markers.
- plt.plot(): It is a convenient way to get the current Axes of the current Figure and call its plot() method.
- pie(): It plots the pie chart of the given data.
- bar(): It is used to make a bar plot.
- barh(): It is used to create a horizontal bar plot.
- boxplot(): It is used to create a box and whisker plot.
- hist2d(): It is used to create a 2D histogram.
- polar(): It creates a polar plot.
- scatter(): It creates a scatter plot of x vs. y.
- stackplot(): It creates a stacked area plot.
- stem(): It creates a stem plot.
- step(): It makes a step plot.
- quiver(): It plots a 2D field of arrows.
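A few of the methods listed above can be combined as in the sketch below. It assumes two small made-up lists, a and b, and uses the Agg backend only so that it runs without a display; in an interactive session you would finish with plt.show().

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove to use your default one
import matplotlib.pyplot as plt

a = [1, 2, 3, 4]
b = [1, 4, 9, 16]  # illustrative data

fig, ax = plt.subplots()            # get a Figure and an Axes object
ax.plot(a, b, label="b = a squared")  # plot b against a
ax.set_xlabel("a")                  # label for the X-axis
ax.set_ylabel("b")                  # label for the Y-axis
ax.set_title("A first plot")        # title of the plot
legend = ax.legend()                # legend built from the label above
```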
There are many more methods, such as plt.scatter() and ax.boxplot(), and libraries built on Matplotlib, such as seaborn, add functions like displot(), kdeplot(), and jointplot() for other kinds of plots. Matplotlib presents a plot in terms of the anatomy of a figure rather than as an explicit hierarchy. pyplot is a Matplotlib module that provides a MATLAB-like interface. It is designed to be as usable as MATLAB, with the ability to use Python and the advantage of being free and open source.
Use of cla(), clf(), and close(): When you work with a data visualization library, you should be aware of these functions as well. At some point you have to tell matplotlib that you are done with a plot so that you can move on. These three functions help at that point.
- You can use plt.cla() to clear an axis.
- If you want to clear the entire figure, use plt.clf().
- And if you want to close a window that has popped up to show your plot, use plt.close().
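The effect of the three functions can be checked on a throwaway figure. This is a small sketch with placeholder data, run on the Agg backend so that no window is needed; the variable names are only for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove to use your default one
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1, 4])      # illustrative data
lines_before = len(ax.lines)        # one line has been drawn

plt.cla()                           # clear the current axes (the axes itself stays)
lines_after_cla = len(ax.lines)

plt.clf()                           # clear the whole current figure, axes included
axes_after_clf = len(fig.axes)

fig_number = fig.number
plt.close(fig)                      # close the figure and release its window
still_open = plt.fignum_exists(fig_number)
```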
Toolkits in Matplotlib
Several toolkits are available that extend matplotlib's functionality.
- Excel tools: This toolkit provides utilities for exchanging data with Microsoft Excel, so that data kept in a worksheet can be turned into charts and output tables. It shipped with older matplotlib releases and is not included in current ones.
- Mplot3d: The mplot3d toolkit adds simple 3D plotting capabilities to matplotlib by supplying an axes object that can create a 2D projection of a 3D scene. Mplot3d is not the fastest or most complete 3D library out there, but it ships with matplotlib and thus may be a lightweight solution for some use cases.
- Cartopy: It is a mapping library that provides object-oriented map projection definitions and arbitrary point, line, polygon, and image transformation capabilities. It builds on matplotlib and is designed for geospatial and geoscience data analysis, making it easy to draw maps for data visualization and analysis.
- Natgrid: It is an interface to the natgrid library for gridding irregularly spaced data. It requires natgrid to be installed separately; you can also download and compile it manually and install it locally.
- Basemap: It is a plotting toolkit with various map projections, coastlines, and boundaries. It is useful for creating maps with matplotlib: it projects your datasets onto a chosen map projection and can draw coastlines, countries, and similar features from its built-in data.
- GTK tools: mpl_toolkits.gtktools provides utilities for working with GTK, a cross-platform toolkit for creating GUIs. The toolkit shipped with older matplotlib releases and requires PyGTK. GTK offers a complete set of widgets, which makes it suitable for projects ranging from small tools to complete application suites. It is written in C but has been designed to support a wide range of languages.
Types of Plot
Many different types of plots can be created with Matplotlib. A few of them are as follows.
- Simple Line Plot: We use the plot() function to draw the line chart; it takes two variables, t and s. Calling plot() builds the graph internally, and to display it on screen we call show(). A simple plot of s against t can be drawn as follows.
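A minimal sketch of such a line plot is given here. The sine-wave values for t and s are an assumption chosen for illustration, and the Agg backend is selected only so the code runs without a display; in an interactive session, drop that line and call plt.show() at the end.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove to use your default one
import matplotlib.pyplot as plt

# Illustrative data: s is a sine wave sampled at the times in t.
t = np.linspace(0.0, 2.0, 201)
s = 1 + np.sin(2 * np.pi * t)

fig = plt.figure()
(line,) = plt.plot(t, s)     # builds the line internally
plt.xlabel("t")
plt.ylabel("s")
plt.title("Simple line plot")
plt.grid(True)
```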
- Histogram: It is an approximation of the probability distribution of a continuous variable, drawn as a kind of bar graph. It is used when the data can be grouped into ranges, for example when you have arrays or a long list of values. The X-axis shows the bin ranges, and the Y-axis shows the frequency.
Consider the following example:
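As a hedged illustration, the sketch below draws a histogram of 200 randomly generated values (a made-up "ages" sample, not data from this tutorial) split into 10 bins. The Agg backend is used only so the code runs without a display.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove to use your default one
import matplotlib.pyplot as plt

# Illustrative data: 200 values drawn from a normal distribution.
rng = np.random.default_rng(0)
ages = rng.normal(loc=40, scale=12, size=200)

fig, ax = plt.subplots()
# hist() returns the count per bin, the bin edges, and the drawn bars.
counts, bin_edges, bars = ax.hist(ages, bins=10, edgecolor="black")
ax.set_xlabel("Age (bin ranges)")
ax.set_ylabel("Frequency")
ax.set_title("Histogram")
```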
- Scatter Plot: It is a type of plot that shows the data as a collection of points. The position of each point is given by its two values, one along the horizontal axis and one along the vertical axis. We typically use scatter plots to compare variables, for example to see how one variable is affected by another and to establish a relationship between them.
Consider the following example:
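One possible scatter plot is sketched here, assuming two made-up variables where y depends roughly linearly on x plus some noise. The Agg backend is used only so the code runs without a display.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove to use your default one
import matplotlib.pyplot as plt

# Illustrative data: y follows x with random noise added.
rng = np.random.default_rng(1)
x = rng.uniform(0, 10, size=50)
y = 2 * x + rng.normal(scale=2.0, size=50)

fig, ax = plt.subplots()
points = ax.scatter(x, y, color="tab:blue", marker="o")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_title("Scatter plot")
```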
- 3D Plot: Data is plotted along the x, y, and z axes. It is an advanced plotting technique that gives a better view of data that varies along three dimensions.
Consider the following example:
Matplotlib was initially designed for two-dimensional plotting only. The mpl_toolkits.mplot3d toolkit included with matplotlib (imported with from mpl_toolkits.mplot3d import axes3d) provides the methods needed to create three-dimensional surface plots with Python.
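A surface plot built with this toolkit might look like the sketch below. The surface itself (a radial sine function) is an assumption chosen for illustration, and the Agg backend is used only so the code runs without a display.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove to use your default one
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import axes3d  # noqa: F401  (makes the "3d" projection available)

# Illustrative surface: z = sin(r), where r is the distance from the origin.
grid = np.linspace(-3, 3, 40)
X, Y = np.meshgrid(grid, grid)
Z = np.sin(np.sqrt(X**2 + Y**2))

fig = plt.figure()
ax = fig.add_subplot(projection="3d")   # a 3D axes instead of the usual 2D one
surface = ax.plot_surface(X, Y, Z, cmap="viridis")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_zlabel("z")
```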
- Image Plot: Besides line, histogram, bar, and scatter plots, matplotlib can also display images. Image plotting is one of its most useful features, as it lets you show an image with a few simple commands. Multiple images can be added; the code below shows how to work with PNG images using Matplotlib.
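Since no image file accompanies this text, the sketch here first writes a small synthetic PNG into memory and then reads it back, which stands in for loading a PNG from disk. With a real file you would pass its path to plt.imread() instead. The Agg backend is used only so the code runs without a display.

```python
import io

import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove to use your default one
import matplotlib.pyplot as plt

# Build a tiny 8x8 gradient and save it as a PNG held in memory (a stand-in for a file).
gradient = np.linspace(0, 1, 64).reshape(8, 8)
png_file = io.BytesIO()
plt.imsave(png_file, gradient, cmap="viridis", format="png")
png_file.seek(0)

# Read the PNG back into an array and display it.
image = plt.imread(png_file, format="png")
fig, ax = plt.subplots()
ax.imshow(image)
ax.axis("off")   # hide the axes around the picture
```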
- Bar Plot: A bar plot, or bar chart, presents categorical data with rectangular bars whose heights or lengths are proportional to the values they represent. It shows comparisons between discrete values. One axis of the chart shows the categories being compared, and the other axis represents a measured value.
Consider the following example :
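A bar plot along these lines is sketched here, assuming four made-up categories and values. The Agg backend is used only so the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove to use your default one
import matplotlib.pyplot as plt

# Illustrative categorical data.
languages = ["Python", "Java", "C++", "Go"]
users = [45, 30, 15, 10]

fig, ax = plt.subplots()
bars = ax.bar(languages, users, color="tab:green")  # one bar per category
ax.set_xlabel("Language")
ax.set_ylabel("Users")
ax.set_title("Bar plot")
```

Replacing ax.bar() with ax.barh() draws the same data as horizontal bars.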
- Stack Plot: It is a plot that shows an aggregate data set while making it clear how each part contributes to the whole. The elements of a stack plot are stacked on top of each other.
For example:
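The sketch below is one way to draw such a plot. The activity names and hours are hypothetical data made up for illustration, and the empty plt.plot() calls with a line width of 6 exist only to give each series a labelled legend entry. The Agg backend is used so the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove to use your default one
import matplotlib.pyplot as plt

# Hypothetical data: hours spent per activity on each of five days (24 per day).
days = [1, 2, 3, 4, 5]
sleeping = [7, 8, 6, 11, 7]
eating = [2, 3, 4, 3, 2]
working = [7, 8, 7, 2, 2]
playing = [8, 5, 7, 8, 13]

activities = ["Sleeping", "Eating", "Working", "Playing"]
colors = ["m", "c", "r", "k"]

fig = plt.figure()
# Empty lines act as legend handles: one label per series, line width 6.
for activity, color in zip(activities, colors):
    plt.plot([], [], color=color, label=activity, linewidth=6)

layers = plt.stackplot(days, sleeping, eating, working, playing, colors=colors)
plt.xlabel("Day")
plt.ylabel("Hours")
plt.title("Stack plot")
legend = plt.legend()
```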
We have plotted the above example with different labels, giving line width of 6 to each. We have used the plt.stackplot() method.
- Pie Chart: It is a circular statistical plot. The whole chart represents the total of the data, and the slices, called wedges, represent the percentage contributed by each part.
For example:
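A pie chart can be sketched as follows, assuming hypothetical hours for four activities in a single day; the numbers are made up for illustration. The Agg backend is used only so the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove to use your default one
import matplotlib.pyplot as plt

# Hypothetical data: hours spent on each activity in one day.
activities = ["Sleeping", "Eating", "Working", "Playing"]
hours = [7, 2, 2, 13]
colors = ["m", "c", "r", "b"]

fig, ax = plt.subplots()
plt.pie(
    hours,
    labels=activities,
    colors=colors,
    startangle=90,        # start the first wedge at the top
    autopct="%1.1f%%",    # print each wedge's share as a percentage
)
plt.title("Pie chart")
wedge_count = len(ax.patches)   # each slice is drawn as one wedge patch
```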
The above example reuses the data from the stack plot example. We have plotted the pie chart using the plt.pie() method.
Style reference
We can apply many styling techniques to create a better-looking graph, such as changing the width or color of a particular line or adding grid lines.
Matplotlib provides several pre-defined styles. For example, there’s a pre-defined style called “ggplot”, which emulates the aesthetics of ggplot (a popular plotting package for R). To use this style, call plt.style.use('ggplot') before creating your plot.
By default, matplotlib uses a sans-serif font for the text that labels and annotates a plot. Let’s see how to add style to a graph using matplotlib. First, import the style package from the matplotlib library, and then use the styling functions as shown in the code below:
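One way this could look is sketched here, assuming the built-in "ggplot" style and two made-up data series; the colors, line widths, and labels are arbitrary choices for illustration. The Agg backend is used only so the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove to use your default one
import matplotlib.pyplot as plt
from matplotlib import style

style.use("ggplot")   # apply a pre-defined style before creating the plot

# Illustrative data for two lines.
x = [5, 8, 10]
y1 = [12, 16, 6]
y2 = [6, 15, 7]

fig, ax = plt.subplots()
(line_one,) = ax.plot(x, y1, color="green", linewidth=3, label="line one")
(line_two,) = ax.plot(x, y2, color="crimson", linewidth=3, linestyle="--", label="line two")
ax.grid(True, color="black")      # darker grid lines on top of the style
ax.set_title("Styled plot")
ax.set_xlabel("X axis")
ax.set_ylabel("Y axis")
ax.legend()
```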
Styles provide better-looking charts, but they don’t get us all the way to beautiful plots. Some adjustments, particularly to font selection and font size, are also necessary. As you can see in the above example, we have used several functions to make our graph more colorful.
Matplotlib for plotting graphs: Advantages
Matplotlib is a visualization and plotting library. Many other visualization tools exist, and a number of them are commercial products, whereas matplotlib is open source. Its significant advantages are:
- It is an open-source library, so everyone can use it without paying any fees.
- It runs on Windows, Linux, and macOS.
- It is a complete package for data visualization. Data and information can be explored and plotted easily with it.
- It offers many types of graphs and plots, so it is used in many applications and can be customized to fit the use case at hand.
- It is used from Python, a language that is very popular among data scientists.
Matplotlib: The future of Visualization
Visualizations are one of the easiest ways to organize and absorb information. Visuals help in understanding complex problems and in identifying patterns, relationships, and outliers in data. They help us understand business problems better and more quickly. Visualization can be done with many packages; here we discuss it using the matplotlib library.
Visualization helps to build a coherent story from data. Insights collected from the visuals help in building business strategies. It is also a first step in much higher-level data analysis, such as exploratory data analysis and machine learning.
- It is a process of describing information in a graphical or pictorial format, which helps decision-makers analyze the data effectively.
- Data visualization not only makes data more beautiful but also provides insight into complex data sets.
- It helps in identifying areas that need attention or improvement.
- It helps to predict scenarios, and much more.
We have now had a glimpse of data visualization using matplotlib. Even if you are not directly involved in data science, it is useful to know what data visualization can do and how it is used in the real world.
Matplotlib Vs. Matlab
Matplotlib | Matlab |
Matplotlib is a plotting library for the Python programming language. It is a two-dimensional plotting library that generates figures in a variety of hardcopy formats and interactive environments across platforms. | Matlab is a high-level language and interactive environment for numerical computation, visualization, and programming. It helps in analyzing data, developing algorithms, and creating models and applications. |
It can be used in Python scripts, Python shells, Jupyter notebooks, web application servers, and four GUI toolkits. | Matlab lets you explore multiple approaches and reach a solution faster than with spreadsheets or other programming languages, such as C, C++, or Java. |
It is a free and open-source Python visualization library, and its plotting is often considered more flexible than Matlab's. | It is a vital asset for scientists, researchers, and engineers. Matlab is a numerical computing environment and programming language. You have to pay to use Matlab, although its plotting object structure is very similar to matplotlib's. |
If you want to interact with your plots, matplotlib gives you more options than Matlab’s figure editor. | As Matlab is an interpreted language, it can be slow, and poor programming practices can make it very slow. |
Matplotlib is used in many companies, like ADEXT, Virta Health, Quezx.com, King’s Digital Lab, and Opportunity Network. | Matlab is used in Empatica, RideCell, Leap Motion, 10X Genomics, Brandyourself. |