This project visualizes Arsenal’s expected goals (xG) across the different formations the team used during the 2022/2023 season. The data was obtained from Understat.com, and the chart was created with the Matplotlib library in Python.
Getting Started
Please grab a cup of coffee and enjoy reading this article ☕
(1) Imports Pandas, Matplotlib, and Pyplot
This code imports two Python packages, Pandas and Matplotlib, together with Pyplot, which is a module inside Matplotlib rather than a separate package.
- Pandas is a popular data manipulation library that provides data structures for efficiently storing and analyzing large datasets. It is commonly used for data preprocessing, data cleaning, and data analysis tasks.
- Matplotlib is a comprehensive data visualization library that provides a variety of plotting functions to create high-quality statistical graphics. It is widely used for creating line plots, scatter plots, bar plots, histograms, and many other types of visualizations.
- Pyplot is a submodule of Matplotlib (matplotlib.pyplot) that provides a collection of functions for creating plots. Its interface is similar to MATLAB's, which makes it easy to pick up for those already familiar with MATLAB.
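As a minimal sketch of what such an import block usually looks like: the aliases pd and plt are the common community convention and are assumed here, not taken from the original code.

```python
# Pandas for loading and manipulating tabular data
import pandas as pd

# The top-level Matplotlib package
import matplotlib

# Pyplot is a module inside Matplotlib, not a separate package
import matplotlib.pyplot as plt

# The alias plt and the attribute matplotlib.pyplot are the same module object
print(plt is matplotlib.pyplot)
```

Importing matplotlib.pyplot also makes the parent matplotlib package available, so in practice the Pandas and Pyplot imports alone are often enough.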
(2) Reads a CSV file
- This code reads a CSV file named ‘Arsenal xG Vs Formations.csv’ using the read_csv() function from the Pandas library and stores the data into a Pandas DataFrame called df.
- CSV stands for Comma Separated Values, which is a simple file format used to store and exchange tabular data. The read_csv() function is a built-in function in Pandas that reads CSV files and creates a DataFrame object that can be easily manipulated and analyzed.
- The function takes the file path as an argument and returns a DataFrame object that contains the data from the CSV file. In this case, the CSV file is assumed to be in the same directory as the Python script, so only the file name is provided as an argument to the function.
- After executing this code, the df variable contains the data from the CSV file, which can be further processed and analyzed using Pandas functions.
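The loading step can be sketched as follows. Because the real CSV is not reproduced in this article, the sketch reads from an in-memory stand-in: the column names Formation and xG and the two rows are invented placeholders, not Understat figures.

```python
import io

import pandas as pd

# Placeholder content standing in for the real file. The column names and
# numbers are made up for illustration and are NOT Understat data.
sample_csv = io.StringIO(
    "Formation,xG\n"
    "4-3-3,1.0\n"
    "4-2-3-1,2.0\n"
)

# With the real file next to the script, the call would instead be:
# df = pd.read_csv('Arsenal xG Vs Formations.csv')
df = pd.read_csv(sample_csv)

# Quick checks that the data loaded as expected
print(df.shape)
print(df.head())
```

Checking df.shape and df.head() right after loading is a cheap way to catch a wrong path or an unexpected delimiter before any plotting happens.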
(3) Full Code Arsenal xG Vs Formations [2022/2023]
- In the first three lines, the code imports the necessary Matplotlib modules: pyplot, font_manager, and image.
- The line plt.rcParams['figure.facecolor'] = 'white' sets the background color of all figures to white by default.
- The next two lines define the sample data for the plot: x is a list of strings representing different football formations, and y is a list of expected goals (xG) values.
- The line fig = plt.figure(figsize=(10, 10), facecolor='white') creates a new figure object with a size of 10×10 inches and a white background.
- The line fig, ax = plt.subplots() creates an Axes object that will hold the plot. Note that it also creates a second figure and rebinds fig to it, so the 10×10 figure from the previous line stays empty and the plot is drawn at Matplotlib's default figure size. This new fig is the one used by the later fig.text() calls.
- The line ax.scatter(x, y) creates a scatter plot using the scatter() function of the Axes object. The x-axis values are the different formations in x, and the y-axis values are the corresponding xG values in y.
- The next few lines set the plot title and axis labels using the set_title(), set_xlabel(), and set_ylabel() functions. The font and size of the text are customized using the fontdict parameter.
- The line ax.set_xticks(range(len(x))) sets the x-axis tick locations to evenly spaced integers from 0 to the length of x minus 1.
- The line ax.set_xticklabels(x) sets the x-axis tick labels to the strings in x.
- The line ax.set_ylim([0, 60]) sets the y-axis limits to run from 0 to 60.
- The line ax.set_yticks(range(0, 61, 10)) sets the y-axis tick locations to evenly spaced integers from 0 to 60 in increments of 10.
- The line ax.set_yticklabels(range(0, 61, 10)) sets the y-axis tick labels to match the tick locations.
- The for loop that follows adds a label to each point of the scatter plot using the text() function. The labels are the corresponding xG values in y, and they are positioned slightly above each point.
- The line ax.plot(x, y, color='black') connects the points with black lines using the plot() function.
- The line ax.grid(which='major', color='gray', linestyle='--', linewidth=0.4) adds a grid of gray dashed lines with a line width of 0.4.
- The fig.text() function is used twice to add credits to the plot. The first call adds a text with the source of the data and the creator’s Twitter handle, and the second call adds the creator’s website URL.
- The line plt.savefig('ARSxG.png', dpi=300) saves the plot as a PNG image with a resolution of 300 dots per inch in the current working directory.
- The line plt.show() displays the plot on the screen.
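The steps listed above can be put together roughly as follows. This is a sketch under stated assumptions, not the original script: the formations and xG numbers are invented placeholders, the axis label texts, font sizes, and text positions are guesses, and the website credit is left as a placeholder string. It also creates the figure and axes in a single plt.subplots() call so that the 10×10 size applies to the figure that is actually drawn on.

```python
import matplotlib.pyplot as plt

# White background for all figures by default
plt.rcParams['figure.facecolor'] = 'white'

# Placeholder data: these values are invented for illustration and are
# NOT the real Understat figures used in the original chart.
x = ['4-3-3', '4-2-3-1', '4-4-2']
y = [10.0, 20.0, 30.0]

# One call creates both the 10x10 inch figure and its axes
fig, ax = plt.subplots(figsize=(10, 10), facecolor='white')

# Points for each formation, joined by a black line
ax.scatter(x, y)
ax.plot(x, y, color='black')

# Title and axis labels (label texts and font settings are assumptions)
ax.set_title('Arsenal xG Vs Formations [2022/2023]',
             fontdict={'fontsize': 16, 'fontweight': 'bold'})
ax.set_xlabel('Formation', fontdict={'fontsize': 12})
ax.set_ylabel('xG', fontdict={'fontsize': 12})

# One tick per formation, labelled with the formation string
ax.set_xticks(list(range(len(x))))
ax.set_xticklabels(x)

# y-axis from 0 to 60 with a tick every 10
y_ticks = list(range(0, 61, 10))
ax.set_ylim([0, 60])
ax.set_yticks(y_ticks)
ax.set_yticklabels([str(t) for t in y_ticks])

# Value label slightly above each point; string categories sit at
# integer positions 0, 1, 2, ... so the index i is the x coordinate
for i, value in enumerate(y):
    ax.text(i, value + 1.5, str(value), ha='center')

# Light dashed grid
ax.grid(which='major', color='gray', linestyle='--', linewidth=0.4)

# Credits in figure coordinates (positions are assumptions)
fig.text(0.01, 0.03, 'Data: Understat.com | @SeifAjax04', fontsize=9)
fig.text(0.01, 0.01, '<website URL placeholder>', fontsize=9)
```

To finish as in the article, call plt.savefig('ARSxG.png', dpi=300) followed by plt.show(). Saving before showing matters, because some backends clear the figure once the window is closed.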
This project was created by @SeifAjax04. The data used in this project was obtained from Understat.com.