Arsenal xG Vs Formations [2022/2023]

This project is a data visualization of Arsenal’s expected goals (xG) Vs different formations during the 2022/2023 season. The data was obtained from Understat.com and the chart was created using the Matplotlib library in Python.

Getting Started

Please grab a cup of coffee and enjoy reading this article.

(1) Imports Pandas, Matplotlib, and Pyplot

This code imports two Python packages, Pandas and Matplotlib, along with Pyplot, which is a module inside Matplotlib rather than a separate package.

  • Pandas is a popular data manipulation library that provides data structures for efficiently storing and analyzing large datasets. It is commonly used for data preprocessing, data cleaning, and data analysis tasks.
  • Matplotlib is a comprehensive data visualization library that provides a variety of plotting functions to create high-quality statistical graphics. It is widely used for creating line plots, scatter plots, bar plots, histograms, and many other types of visualizations.
  • Pyplot is a module within Matplotlib (matplotlib.pyplot) that provides a collection of functions for creating plots. Its interface is modeled on MATLAB's, which makes it easy to pick up for those familiar with MATLAB.
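As a minimal sketch, the imports described above could look like the following. The aliases pd and plt are the common community conventions and are an assumption here, since the article does not show the import lines themselves.

```python
import pandas as pd                # data loading and manipulation
import matplotlib                  # the top-level plotting package
import matplotlib.pyplot as plt    # pyplot is a module inside Matplotlib

# Printing the versions is a quick way to confirm both packages are installed.
print(pd.__version__)
print(matplotlib.__version__)
```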

(2) Reads a CSV file

  • This code reads a CSV file named ‘Arsenal xG Vs Formations.csv’ using the read_csv() function from the Pandas library and stores the data into a Pandas DataFrame called df.
  • CSV stands for Comma Separated Values, which is a simple file format used to store and exchange tabular data. The read_csv() function is a built-in function in Pandas that reads CSV files and creates a DataFrame object that can be easily manipulated and analyzed.
  • The function takes the file path as an argument and returns a DataFrame object that contains the data from the CSV file. In this case only the file name is provided, so Pandas looks for the file relative to the current working directory. This works when the script is run from the folder that contains the CSV file.
  • After executing this code, the df variable contains the data from the CSV file, which can be further processed and analyzed using Pandas functions.
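The reading step can be sketched as below. Because the real CSV file is not reproduced in this article, the example reads a small in-memory stand-in instead; the column names Formation and xG and the two rows are placeholders chosen for illustration, not the actual Understat figures.

```python
import io
import pandas as pd

# Placeholder rows standing in for 'Arsenal xG Vs Formations.csv'.
# Column names and values are illustrative assumptions only.
sample_csv = io.StringIO(
    "Formation,xG\n"
    "4-3-3,1.5\n"
    "4-2-3-1,2.0\n"
)

# With the real file this would be:
#   df = pd.read_csv('Arsenal xG Vs Formations.csv')
df = pd.read_csv(sample_csv)

# Inspect what was loaded before doing any further analysis.
print(df.shape)
print(df.dtypes)
```

Checking the shape and column types right after loading is a cheap way to catch a wrong delimiter or a missing header row.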

(3) Full Code Arsenal xG Vs Formations [2022/2023]

  • In the first three lines, the code imports the necessary Matplotlib modules: pyplot, font_manager, and image.
  • The line plt.rcParams['figure.facecolor'] = 'white' sets the background color of all figures to white by default.
  • The next two lines define the sample data for the plot: x is a list of strings representing different football formations, and y is a list of expected goals (xG) values.
  • The line fig = plt.figure(figsize=(10, 10), facecolor='white') creates a new figure object with a size of 10×10 inches and a white background.
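  • The line fig, ax = plt.subplots() creates a second, default-sized figure together with an axes object that will hold the plot, and rebinds fig to that new figure. The 10×10 figure from the previous line is therefore left empty; passing figsize and facecolor to plt.subplots() directly would apply those settings to the figure that is actually drawn. The returned fig is used later to add the credit text.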
  • The line ax.scatter(x, y) creates a scatter plot using the scatter() function of the axis object. The x-axis values are the different formations in x, and the y-axis values are the corresponding xG values in y.
  • The next few lines set the plot title and axis labels using the set_title(), set_xlabel(), and set_ylabel() functions. The font and size of the text are customized using the fontdict parameter.
  • The line ax.set_xticks(range(len(x))) sets the x-axis tick locations to be evenly spaced integers from 0 to the length of x minus 1.
  • The line ax.set_xticklabels(x) sets the x-axis tick labels to be the strings in x.
  • The line ax.set_ylim([0, 60]) sets the y-axis limits to be from 0 to 60.
  • The line ax.set_yticks(range(0, 61, 10)) sets the y-axis tick locations to be evenly spaced integers from 0 to 60 in increments of 10.
  • The line ax.set_yticklabels(range(0, 61, 10)) sets the y-axis tick labels to be the same as the tick locations.
  • The for loop that follows adds labels to each point of the scatter plot using the text() function. The labels are the corresponding xG values in y, and they are positioned slightly above each point.
  • The line ax.plot(x, y, color='black') adds lines between the points using the plot() function.
  • The line ax.grid(which='major', color='gray', linestyle='--', linewidth=0.4) adds a grid to the plot with gray color, dashed lines, and a linewidth of 0.4.
  • The fig.text() function is used twice to add credit to the plot. The first call adds a text with the source of the data and the creator’s Twitter handle, and the second call adds the creator’s website URL.
  • The line plt.savefig('ARSxG.png', dpi=300) saves the plot as a PNG image with a resolution of 300 dots per inch in the current working directory.
  • The line plt.show() displays the plot on the screen.
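The steps above can be pulled together in a short sketch. It is not the author's exact script: the formations and xG values are placeholders, the axis label text, font sizes, label offset, and credit position are assumptions, and the font_manager and image imports are left out because their use is not described. The sketch also creates the figure and axes in a single plt.subplots() call, and it includes only the first credit line because the website URL is not given here.

```python
from pathlib import Path
import matplotlib.pyplot as plt

plt.rcParams['figure.facecolor'] = 'white'

# Placeholder data: formations and xG values are illustrative only.
x = ['4-3-3', '4-2-3-1', '4-4-2']
y = [40, 25, 10]

# One call creates both the figure and the axes that hold the plot.
fig, ax = plt.subplots(figsize=(10, 10), facecolor='white')

ax.scatter(x, y)              # one point per formation
ax.plot(x, y, color='black')  # line connecting the points

# Title and axis labels (label text and font sizes are assumptions).
ax.set_title('Arsenal xG Vs Formations [2022/2023]', fontdict={'fontsize': 14})
ax.set_xlabel('Formation', fontdict={'fontsize': 12})
ax.set_ylabel('xG', fontdict={'fontsize': 12})

# String categories are placed at integer positions 0, 1, 2, ...
ax.set_xticks(list(range(len(x))))
ax.set_xticklabels(x)

ax.set_ylim([0, 60])
ax.set_yticks(range(0, 61, 10))
ax.set_yticklabels([str(v) for v in range(0, 61, 10)])

# Label each point with its value, slightly above the marker.
for i, value in enumerate(y):
    ax.text(i, value + 1, str(value), ha='center')

ax.grid(which='major', color='gray', linestyle='--', linewidth=0.4)

# Credit line; the position on the figure is an assumption.
fig.text(0.1, 0.02, 'Data: Understat.com | @SeifAjax04', fontsize=8)

out_path = Path('ARSxG.png')
fig.savefig(out_path, dpi=300)
# plt.show() would display the figure in an interactive session.
```

Calling fig.savefig() on the figure object, rather than plt.savefig(), makes it explicit which figure is written to disk when more than one figure exists.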

This project was created by @SeifAjax04. The data used in this project was obtained from Understat.com.

Code on GitHub
