how to plot a scatter plot using the histogram output in matplotlib?

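Essentially, plt.hist() returns the count in each bin (n), the edges of the bins (bins) and, as Nordev pointed out, a container of patches. There is one more edge than there are bins, so the scatter plot below uses the midpoint of each pair of adjacent edges as the x-coordinate of a bin.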

import matplotlib.pyplot as plt
import numpy as np

# Create some example data
y = np.random.normal(5, size = 1000)

# Usual histogram plot
fig = plt.figure()
ax1 = fig.add_subplot(121)
n, bins, patches = ax1.hist(y, bins = 50) # counts, bin edges, and the bar patches

# Scatter plot
# Now we find the center of each bin from the bin edges
bins_mean = [0.5 * (bins[i] + bins[i + 1]) for i in range(len(n))]
ax2 = fig.add_subplot(122)
ax2.scatter(bins_mean, n)
plt.show()
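
If the bars are not needed at all, the same counts and edges can be computed without drawing a histogram. The following is a minimal sketch, assuming that NumPy's np.histogram is an acceptable stand-in for ax.hist (it returns only the counts and the edges, with no patches). The names counts, edges and centers are illustrative and not part of the original answer.

```python
import numpy as np

rng = np.random.default_rng(0)            # seeded only so the sketch is repeatable
y = rng.normal(5, size=1000)              # same kind of example data as above

counts, edges = np.histogram(y, bins=50)  # one count per bin, one more edge than bins
centers = 0.5 * (edges[:-1] + edges[1:])  # midpoint of each pair of adjacent edges

# centers and counts have the same length, so they can be passed
# straight to ax.scatter(centers, counts)
print(len(counts), len(edges), len(centers))
```

The vectorized midpoint expression does the same job as the list comprehension over bins[i] and bins[i + 1] above.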

Suggestion : 2

Show the marginal distributions of a scatter as histograms at the sides of the plot. For a nice alignment of the main axes with the marginals, two options are shown below.

Let us first define a function that takes x and y data as input, as well as three axes: the main axes for the scatter and two marginal axes. It will then create the scatter and histograms inside the provided axes. To define the axes positions, Figure.add_axes is provided with a rectangle [left, bottom, width, height] in figure coordinates. The marginal axes share one dimension with the main axes.

import numpy as np
import matplotlib.pyplot as plt

# Fixing random state for reproducibility
np.random.seed(19680801)

# some random data
x = np.random.randn(1000)
y = np.random.randn(1000)

def scatter_hist(x, y, ax, ax_histx, ax_histy):
    # no labels
    ax_histx.tick_params(axis = "x", labelbottom = False)
    ax_histy.tick_params(axis = "y", labelleft = False)

    # the scatter plot:
    ax.scatter(x, y)

    # now determine nice limits by hand:
    binwidth = 0.25
    xymax = max(np.max(np.abs(x)), np.max(np.abs(y)))
    lim = (int(xymax / binwidth) + 1) * binwidth

    bins = np.arange(-lim, lim + binwidth, binwidth)
    ax_histx.hist(x, bins = bins)
    ax_histy.hist(y, bins = bins, orientation = 'horizontal')

# definitions for the axes
left, width = 0.1, 0.65
bottom, height = 0.1, 0.65
spacing = 0.005

rect_scatter = [left, bottom, width, height]
rect_histx = [left, bottom + height + spacing, width, 0.2]
rect_histy = [left + width + spacing, bottom, 0.2, height]

# start with a square Figure
fig = plt.figure(figsize = (8, 8))

ax = fig.add_axes(rect_scatter)
ax_histx = fig.add_axes(rect_histx, sharex = ax)
ax_histy = fig.add_axes(rect_histy, sharey = ax)

# use the previously defined function
scatter_hist(x, y, ax, ax_histx, ax_histy)

plt.show()
# start with a square Figure
fig = plt.figure(figsize = (8, 8))

# Add a gridspec with two rows and two columns and a ratio of 2 to 7 between
# the size of the marginal axes and the main axes in both directions.
# Also adjust the subplot parameters for a square plot.
gs = fig.add_gridspec(2, 2, width_ratios = (7, 2), height_ratios = (2, 7),
   left = 0.1, right = 0.9, bottom = 0.1, top = 0.9,
   wspace = 0.05, hspace = 0.05)

ax = fig.add_subplot(gs[1, 0])
ax_histx = fig.add_subplot(gs[0, 0], sharex = ax)
ax_histy = fig.add_subplot(gs[1, 1], sharey = ax)

# use the previously defined function
scatter_hist(x, y, ax, ax_histx, ax_histy)

plt.show()
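
The "nice limits" computed inside scatter_hist round the largest absolute value up to the next multiple of the bin width, so that both histograms share the same symmetric bin edges. Here is a small worked sketch of that arithmetic, assuming a hypothetical maximum of 3.1 in place of real data:

```python
import numpy as np

binwidth = 0.25
xymax = 3.1  # hypothetical largest |x| or |y|, chosen only for illustration

# int() truncates 12.4 to 12; adding 1 rounds up to the next whole bin
lim = (int(xymax / binwidth) + 1) * binwidth

# edges run from -lim to +lim inclusive, in steps of binwidth
bins = np.arange(-lim, lim + binwidth, binwidth)

print(lim, bins[0], bins[-1], len(bins) - 1)
```

Because of the + 1, a maximum that already sits exactly on a bin boundary still gets one extra bin of headroom.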

Suggestion : 3

Matplotlib is a Python library for data visualization, built on NumPy. It was developed by John Hunter in 2002. It makes 2D plots of arrays, can create simple plots with just a few commands, and has limited 3D graphics support. It can produce quality figures in interactive environments across platforms, and it can also be used for animations.

We are going to make three Python lists that hold information about sales and two advertisement media (TV and radio), and use these lists for making line plots. First we import matplotlib.pyplot under the short name plt. Once matplotlib is loaded and the data is defined, we can start with some simple plotting code.

import matplotlib.pyplot as plt
# names chosen to match the plotting calls below (tv and radio on x, sales on y)
sales = [1, 4, 9, 16, 25, 36, 49, 64]
tv = [1, 16, 30, 42, 55, 68, 77, 88]
radio = [1, 6, 12, 18, 28, 40, 52, 65]
# show() displays the figure
plt.plot(tv, sales)
plt.show()
plt.plot(tv, sales)
plt.xlabel('TV')
plt.ylabel('Sales')
plt.title('Advertisement effect on sales')
plt.show()
plt.plot(tv, sales, marker = 'o', linestyle = '--', color = 'r', label = 'tv')
plt.plot(radio, sales, marker = '*', linestyle = '-', color = 'g', label = 'radio')
plt.xlabel('Advertisement medium')
plt.ylabel('Sales')
plt.title('Advertisement effect on sales')
plt.legend(loc = 'lower right')
plt.show()
# create random numbers
import numpy as np
x = np.random.randn(50)
y = np.random.randn(50)
plt.scatter(x, y, color = 'red', s = 30) # s is the size of each point in the scatter plot
plt.xlabel('X axis')
plt.ylabel('Y axis')
plt.title('Scatter Plot')
plt.show()
fig = plt.figure()
# in add_subplot(121), 1 is the number of rows, 2 the number of columns and 1 the index of the subplot
# left plot
img1 = fig.add_subplot(121)
N = 50
x = np.random.randn(N)
y = np.random.randn(N)
colors = np.random.rand(N)
size = (10 * np.random.rand(N)) ** 2
img1.scatter(x, y, s = size, c = colors, alpha = 0.5)

# right plot
img2 = fig.add_subplot(122)
N = 100
x1 = np.random.randn(N)
y1 = np.random.randn(N)
area = (15 * np.random.rand(N)) ** 2
# c needs one color per point, so repeat the four names until there are N of them
colors = ['red', 'blue', 'green', 'yellow'] * (N // 4)
img2.scatter(x1, y1, s = area, c = colors, alpha = 0.2)
img2.grid(True) # show grid in plot
plt.show()
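
The TV and radio line plot from this suggestion can also be written against an explicit Axes object instead of the implicit plt state, which is the same style the subplot example uses. The following is a sketch only: it assumes the three example lists map to sales, tv and radio as in the plotting calls, selects the non-interactive Agg backend so that it runs without a display, and writes to a hypothetical file name.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook or GUI session
import matplotlib.pyplot as plt

# assumed mapping of the example lists: sales on y, TV and radio on x
sales = [1, 4, 9, 16, 25, 36, 49, 64]
tv = [1, 16, 30, 42, 55, 68, 77, 88]
radio = [1, 6, 12, 18, 28, 40, 52, 65]

fig, ax = plt.subplots()
ax.plot(tv, sales, marker='o', linestyle='--', color='r', label='tv')
ax.plot(radio, sales, marker='*', linestyle='-', color='g', label='radio')
ax.set_xlabel('Advertisement medium')
ax.set_ylabel('Sales')
ax.set_title('Advertisement effect on sales')
ax.legend(loc='lower right')
fig.savefig('advertisement_sales.png')  # hypothetical output file name
```

With an interactive backend, plt.show() can be called in place of fig.savefig to display the figure.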

Suggestion : 4

April 21, 2020

First, I am going to import the libraries I will be using.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# Jupyter magic; remove the next line when running outside a notebook
%matplotlib inline
plt.rcParams.update({
   'figure.figsize': (10, 8),
   'figure.dpi': 100
})
# Simple Scatterplot
x = np.arange(50)
y = x + np.random.randint(0, 30, 50)
plt.scatter(x, y)
plt.rcParams.update({
   'figure.figsize': (10, 8),
   'figure.dpi': 100
})
plt.title('Simple Scatter plot')
plt.xlabel('X - value')
plt.ylabel('Y - value')
plt.show()

You can also color the points by another variable of the same size as x, passed through the c argument together with a colormap.

# Simple Scatterplot with colored points
x = np.arange(50)
y = x + np.random.randint(0, 30, 50)
plt.rcParams.update({
   'figure.figsize': (10, 8),
   'figure.dpi': 100
})
plt.scatter(x, y, c = y, cmap = 'Spectral')
plt.colorbar()
plt.title('Simple Scatter plot')
plt.xlabel('X - value')
plt.ylabel('Y - value')
plt.show()

If the value of y changes randomly, independent of x, then x and y are said to have zero correlation.

# Scatterplot and Correlations
# Data
x = np.random.randn(100)
y1 = x * 5 + 9
y2 = -5 * x
y3 = np.random.randn(100)

# Plot
plt.rcParams.update({
   'figure.figsize': (10, 8),
   'figure.dpi': 100
})
plt.scatter(x, y1, label = f'y1 Correlation = {np.round(np.corrcoef(x,y1)[0,1], 2)}')
plt.scatter(x, y2, label = f'y2 Correlation = {np.round(np.corrcoef(x,y2)[0,1], 2)}')
plt.scatter(x, y3, label = f'y3 Correlation = {np.round(np.corrcoef(x,y3)[0,1], 2)}')

# Decorate
plt.title('Scatterplot and Correlations')
plt.legend()
plt.show()
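
The labels above index np.corrcoef with [0, 1] because the function returns a 2x2 correlation matrix, not a single number. Below is a minimal sketch of just that step, assuming a seeded generator in place of np.random.randn so that the run is repeatable; the names r1, r2 and r3 are illustrative.

```python
import numpy as np

rng = np.random.default_rng(42)      # seeded only so the sketch is repeatable
x = rng.standard_normal(100)
y1 = x * 5 + 9                       # exact increasing linear function of x
y2 = -5 * x                          # exact decreasing linear function of x
y3 = rng.standard_normal(100)        # drawn independently of x

matrix = np.corrcoef(x, y1)          # 2x2 symmetric matrix with ones on the diagonal
r1 = matrix[0, 1]                    # off-diagonal entry: correlation between x and y1
r2 = np.corrcoef(x, y2)[0, 1]
r3 = np.corrcoef(x, y3)[0, 1]

print(round(float(r1), 2), round(float(r2), 2), round(float(r3), 2))
```

y1 and y2 give +1 and -1 (up to floating-point rounding) because they are exact linear functions of x, while y3 gives a value near zero that changes with the random draw.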

Use the color argument (for example color = 'blue') to set the color of the points in a scatter plot.

# Scatterplot - Color Change
x = np.random.randn(50)
y1 = np.random.randn(50)
y2 = np.random.randn(50)

# Plot
plt.scatter(x, y1, color = 'blue')
plt.scatter(x, y2, color = 'red')
plt.rcParams.update({
   'figure.figsize': (10, 8),
   'figure.dpi': 100
})

# Decorate
plt.title('Color Change')
plt.xlabel('X - value')
plt.ylabel('Y - value')
plt.show()