Data Visualisation in Python
Introduction to Matplotlib
Data visualisation is an important skill to possess for anyone trying to extract and communicate insights from data. Great business narratives and presentations often stem from brilliant visualisations that convey the key ideas in a concise and aesthetic manner. In the field of machine learning, visualisation plays a key role throughout the entire process of analysis - to obtain relationships, observe trends and portray the final results as well. Therefore, it is imperative that you learn and master this tool which will aid you throughout this program.
This session covers how to visualise data, such as arrays, in Python using the Matplotlib library.
Data visualisation is a crucial step in the process of data analysis. The session covers the following topics:
Creating and plotting graphs
Different chart types
Modification of charts for better understanding and presentation
“There are three kinds of lies: lies, damned lies, and statistics.” - Mark Twain
The Necessity of Data Visualisation
Facts and Dimensions
Graphics and visuals, when used intelligently and innovatively, can convey a lot more than raw data alone. Matplotlib provides multiple functions to build graphs from the data stored in your lists, arrays, etc. So, let’s start with the first lecture on Matplotlib.
Before we start discussing different types of plots, you need to learn about the elements that help us create charts and plots effectively. There are two types of data, which are as follows:
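Facts: numeric measures, such as the sales amount
Dimensions: the categories or attributes that describe the facts, such as the product category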
import matplotlib.pyplot as plt
To recap, Matplotlib allows you to use a simple and intuitive workflow to create plots. The important Matplotlib commands used in the video above are as follows:
plt.bar(x_component, y_component): Used to draw a bar graph
plt.show(): Explicit command required to display the plot object
A bar graph is helpful when you need to visualise a numeric feature (fact) across multiple categories. In the example covered in the video, you plotted the sales amount (numeric feature) under three different product categories.
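The bar graph workflow described above can be sketched as follows. The category names and sales figures are hypothetical placeholders, not the data used in the video.

```python
import matplotlib.pyplot as plt

# Hypothetical data: one dimension (product category) and one fact (sales amount)
product_category = ["Furniture", "Technology", "Office Supplies"]
sales = [120, 95, 150]

# plt.bar draws one bar per category, with its height set by the sales value
bars = plt.bar(product_category, sales)
plt.title("Sales by product category")
plt.xlabel("Product category")
plt.ylabel("Sales amount")

# Explicit command to display the plot
plt.show()
```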
A scatter plot, as the name suggests, displays how the data points are spread across the range considered. It can be used to identify a relationship or pattern between two quantitative variables and to spot outliers among them.
plt.scatter(x_axis, y_axis)
plt.scatter(x_axis, y_axis, c = color, label = labels)
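As a minimal sketch of the scatter plot commands above, the example below plots two randomly generated quantitative variables and adds one deliberate outlier. The variable names, the data and the colour are assumptions made for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical data: profit loosely proportional to sales, plus random noise
rng = np.random.default_rng(seed=0)
sales = rng.uniform(100, 1000, size=50)
profit = 0.2 * sales + rng.normal(0, 20, size=50)

# Append one deliberate outlier so that it stands apart from the main pattern
sales = np.append(sales, 900.0)
profit = np.append(profit, -150.0)

# c sets the marker colour; label is the text shown in the legend
points = plt.scatter(sales, profit, c="tab:blue", label="Orders")
plt.xlabel("Sales")
plt.ylabel("Profit")
plt.legend()
plt.show()
```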
A line graph is used to present continuous time-dependent data. It accurately depicts the trend of a variable over a specified time period. Let’s watch the next video to learn how to plot a line chart using the Matplotlib library.
plt.plot(x_axis, y_axis)
import numpy as np
y = np.random.randint(1, 100, 50)
plt.plot(y, 'ro') # 'ro' sets the colour (r = red) and the marker (o = circle)
plt.yticks(rotation = number)
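The line graph commands above can be combined into a short sketch like the one below. The monthly sales values are hypothetical, and the 45-degree rotation is an arbitrary choice.

```python
import matplotlib.pyplot as plt

# Hypothetical time-dependent data: sales for each month of one year
months = list(range(1, 13))
sales = [110, 125, 118, 140, 152, 149, 160, 171, 165, 180, 195, 210]

# plt.plot joins consecutive points with a line; marker="o" also marks each point
lines = plt.plot(months, sales, marker="o")
plt.xticks(months)         # one tick per month
plt.yticks(rotation=45)    # rotate the y-axis tick labels by 45 degrees
plt.xlabel("Month")
plt.ylabel("Sales amount")
plt.show()
```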
plt.hist(profit, bins = 100, edgecolor = 'orange', color = 'cyan')
plt.show()
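The histogram call above needs a profit array to run. The sketch below assumes randomly generated profit values as a stand-in for the real data set.

```python
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical data: 1,000 profit values drawn from a normal distribution
rng = np.random.default_rng(seed=1)
profit = rng.normal(loc=50, scale=15, size=1000)

# plt.hist splits the value range into 100 bins and counts the values in each bin
counts, bin_edges, patches = plt.hist(
    profit, bins=100, edgecolor="orange", color="cyan"
)
plt.xlabel("Profit")
plt.ylabel("Number of records")
plt.show()
```

plt.hist returns the count in each bin and the bin edges, which is useful when you want the numbers behind the chart.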
Box Plot
Box plots are quite effective in summarising the spread of a large data set into a visual representation. They use percentiles to divide the data range.
The percentile value gives the proportion of data points that fall below a chosen data point when all the data points are arranged in ascending order.
Box plots divide the data range into three important categories, which are as follows:
Median value: This is the value that divides the sorted data into two equal halves, i.e., the 50th percentile.
Interquartile range (IQR): This is the range between the 25th and 75th percentile values, which contains the middle 50% of the data points.
Outliers: These are data points that differ significantly from other observations and lie beyond the whiskers.
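The three quantities above can be computed with NumPy and then drawn with plt.boxplot, as in the sketch below. The data is randomly generated for illustration. The 1.5 × IQR limit is Matplotlib's default whisker setting and is not stated in these notes.

```python
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical data: 200 profit values plus two deliberate outliers
rng = np.random.default_rng(seed=42)
profit = np.append(rng.normal(50, 10, size=200), [150.0, -60.0])

# 25th, 50th (median) and 75th percentiles
q1, median, q3 = np.percentile(profit, [25, 50, 75])
iqr = q3 - q1

# By default, Matplotlib's whiskers reach at most 1.5 * IQR beyond the box;
# points outside these limits are drawn individually as outliers
lower_limit = q1 - 1.5 * iqr
upper_limit = q3 + 1.5 * iqr
outliers = profit[(profit < lower_limit) | (profit > upper_limit)]

print("Median:", median, "IQR:", iqr, "Outliers:", len(outliers))

plt.boxplot(profit)
plt.ylabel("Profit")
plt.show()
```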
fig, ax = plt.subplots(): It creates a figure and a set of axes, which can be used to combine multiple graphs in a single figure.
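A minimal sketch of plt.subplots() with two graphs side by side is shown below. The data, the one-row, two-column layout and the figure size are assumptions chosen for illustration.

```python
import matplotlib.pyplot as plt

# Hypothetical data: sales and profit for three product categories
categories = ["Furniture", "Technology", "Office Supplies"]
sales = [120, 95, 150]
profit = [20, 35, 15]

# One row and two columns: ax is an array holding two Axes objects
fig, ax = plt.subplots(nrows=1, ncols=2, figsize=(10, 4))

# Each Axes object is drawn on independently
ax[0].bar(categories, sales)
ax[0].set_title("Sales")

ax[1].bar(categories, profit, color="orange")
ax[1].set_title("Profit")

fig.tight_layout()  # avoid overlapping labels between the two graphs
plt.show()
```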