Smoothing my heatmap in Python

The main problem is that your X_Blue_stars and Y_Blue_stars are tabulated values, while convolution is something that should be applied to signals (i.e. images). For illustration, suppose you have 10 tabulated x and y coordinates:

import numpy as np

x = np.array([3, 3, 4, 4, 4, 5, 5, 5, 9, 9])
y = np.array([9, 0, 0, 4, 7, 5, 5, 9, 0, 2])

If you apply a Gaussian filter to them, the coordinates of different stars get convolved with each other:

from astropy.convolution import convolve
from astropy.convolution.kernels import Gaussian1DKernel

convolve(x, Gaussian1DKernel(stddev = 2))
# array([2.0351543, 2.7680258, 3.40347329, 3.92589723, 4.39194033,
#        4.86262055, 5.31327857, 5.56563858, 5.34183035, 4.48909886])
convolve(y, Gaussian1DKernel(stddev = 2))
# array([2.30207128, 2.72042232, 3.17841789, 3.78905438, 4.42883559,
#        4.81542569, 4.71720663, 4.0875217, 3.08970732, 2.01679469])

This is almost certainly NOT what you want. You probably want to convolve your heatmap instead (this time I chose a larger sample to get some nice plots):

x = np.random.randint(0, 100, 10000)
y = np.random.randint(0, 100, 10000)

heatmap, xedges, yedges = np.histogram2d(x, y, bins = 100)

and plot the convolved heatmap:

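import matplotlib.pyplot as plt
from astropy.convolution.kernels import Gaussian2DKernel

# raw histogram on the left, smoothed heatmap on the right
fig, (ax1, ax2) = plt.subplots(1, 2)
ax1.imshow(heatmap, interpolation = 'none')
ax2.imshow(convolve(heatmap, Gaussian2DKernel(stddev = 2)), interpolation = 'none')
plt.show()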
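
To make explicit what convolving the heatmap does, here is a minimal NumPy-only sketch of the same idea. It is not astropy's implementation: `gaussian_kernel_1d` and `smooth_heatmap` are hypothetical helper names, the kernel is cut off at 4 standard deviations, and the edges are zero-padded, all of which are assumptions made for illustration.

```python
import numpy as np

def gaussian_kernel_1d(stddev, truncate=4.0):
    """Normalized 1D Gaussian kernel, cut off at `truncate` standard deviations."""
    radius = int(truncate * stddev + 0.5)
    offsets = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (offsets / stddev) ** 2)
    return kernel / kernel.sum()

def smooth_heatmap(heatmap, stddev):
    """Smooth a 2D array with a separable Gaussian (rows first, then columns).

    Assumes the kernel is shorter than both array dimensions; edges are
    zero-padded because np.convolve is used with mode="same".
    """
    kernel = gaussian_kernel_1d(stddev)
    rows = np.apply_along_axis(np.convolve, 1, heatmap, kernel, mode="same")
    return np.apply_along_axis(np.convolve, 0, rows, kernel, mode="same")

# Same kind of random sample as above, with a fixed seed for reproducibility
rng = np.random.default_rng(0)
x = rng.integers(0, 100, 10000)
y = rng.integers(0, 100, 10000)
heatmap, xedges, yedges = np.histogram2d(x, y, bins=100)

smoothed = smooth_heatmap(heatmap, stddev=2)
```

The smoothing acts on the binned counts, not on the coordinates, so the positions of the stars are left alone and only the noise between neighbouring bins is averaged out.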

Suggestion : 2

The KDE approach also fails for discrete data or when data are naturally continuous but specific values are over-represented. The important thing to keep in mind is that the KDE will always show you a smooth curve, even when the data themselves are not smooth. The distribution of diamond weights is one example.

In many cases, the layered KDE is easier to interpret than the layered histogram, so it is often a good choice for the task of comparison. Many of the same options for resolving multiple distributions apply to the KDE as well.

A less-obtrusive way to show marginal distributions uses a “rug” plot, which adds a small tick on the edge of the plot to represent each individual observation. This is built into displot().

By default, however, the normalization is applied to the entire distribution, so this simply rescales the height of the bars. By setting common_norm=False, each subset will be normalized independently.

import seaborn as sns

# continuous data: default bins, fixed bin width, fixed number of bins
penguins = sns.load_dataset("penguins")
sns.displot(penguins, x = "flipper_length_mm")
sns.displot(penguins, x = "flipper_length_mm", binwidth = 3)
sns.displot(penguins, x = "flipper_length_mm", bins = 20)

# discrete data: default bins, explicit bin edges, then discrete = True
tips = sns.load_dataset("tips")
sns.displot(tips, x = "size")
sns.displot(tips, x = "size", bins = [1, 2, 3, 4, 5, 6, 7])
sns.displot(tips, x = "size", discrete = True)
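
The claim that a KDE always produces a smooth curve, even for discrete data, can be checked with a small hand-written Gaussian KDE. This is only a sketch: `gaussian_kde_1d` is a hypothetical helper, the sample values are made up, and the bandwidth of 0.5 is an arbitrary assumption rather than what seaborn would pick.

```python
import numpy as np

def gaussian_kde_1d(samples, grid, bandwidth):
    """Average of Gaussian bumps, one centred on each sample."""
    samples = np.asarray(samples, dtype=float)
    z = (grid[:, None] - samples[None, :]) / bandwidth
    bumps = np.exp(-0.5 * z ** 2) / (bandwidth * np.sqrt(2 * np.pi))
    return bumps.mean(axis=1)

# Made-up discrete data (integers only, no observation equal to 5)
sizes = np.array([1, 2, 2, 2, 3, 3, 4, 4, 6])
grid = np.linspace(0, 7, 141)          # step of 0.05
density = gaussian_kde_1d(sizes, grid, bandwidth=0.5)

# Estimated density at values that never occur in the data
density_at_5 = density[np.argmin(np.abs(grid - 5.0))]
density_at_2_5 = density[np.argmin(np.abs(grid - 2.5))]
```

Both `density_at_5` and `density_at_2_5` are positive although neither value appears in `sizes`, which is why a histogram with `discrete = True` is usually the more honest picture for data like this.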

Suggestion : 3

The previous post explains how to make a heatmap from 3 different input formats. This post aims to describe customizations you can make to a heatmap.

The examples below customize the heatmap plot through the following parameters: annot and annot_kws, linewidths and linecolor, xticklabels and yticklabels, and cbar.

The heatmap can show the exact value behind the color. To add a label to each cell, set the annot parameter of the heatmap() function to True.

You can remove the color bar from a heatmap plot by passing False to the cbar parameter.

# libraries
import seaborn as sns
import pandas as pd
import numpy as np

# Create a dataset
df = pd.DataFrame(np.random.random((10, 10)), columns = ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j"])

# plot a heatmap with annotation
sns.heatmap(df, annot = True, annot_kws = {
   "size": 7
})
# libraries
import seaborn as sns
import pandas as pd
import numpy as np

# Create a dataset
df = pd.DataFrame(np.random.random((10, 10)), columns = ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j"])

# plot a heatmap with custom grid lines
sns.heatmap(df, linewidths = 2, linecolor = 'yellow')

# libraries
import seaborn as sns
import pandas as pd
import numpy as np

# Create a dataset
df = pd.DataFrame(np.random.random((10, 10)), columns = ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j"])

# plot a heatmap
sns.heatmap(df, yticklabels = False)

# libraries
import seaborn as sns
import pandas as pd
import numpy as np

# Create a dataset
df = pd.DataFrame(np.random.random((10, 10)), columns = ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j"])

# plot a heatmap
sns.heatmap(df, cbar = False)

# libraries
import seaborn as sns
import pandas as pd
import numpy as np

# Create a dataset
df = pd.DataFrame(np.random.random((10, 10)), columns = ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j"])

# plot a heatmap
sns.heatmap(df, xticklabels = 4)

Suggestion : 4

Here's a solution that uses interpolation to smooth the discrete values. If this is not appropriate for your purposes, another idea is trying the heatmap function in the seaborn library. By the way, this considers the ends of your bars as having a unit length equal to 1.

# Some needed packages
import numpy as np
import matplotlib.pyplot as plt
from scipy import sparse
from scipy.ndimage import gaussian_filter
np.random.seed(42)

# init an array with a lot of nans to imitate OP data
non_zero_entries = sparse.random(50, 60)
sparse_matrix = non_zero_entries.toarray()
sparse_matrix[sparse_matrix == 0] = np.nan

# set nans to 0
sparse_matrix[np.isnan(sparse_matrix)] = 0

# smooth the matrix
smoothed_matrix = gaussian_filter(sparse_matrix, sigma = 5)

# Set 0s to NaN, as NaNs are ignored when plotting
# smoothed_matrix[smoothed_matrix == 0] = np.nan
sparse_matrix[sparse_matrix == 0] = np.nan

# Plot the data
fig, (ax1, ax2) = plt.subplots(nrows = 1, ncols = 2,
   sharex = False, sharey = True,
   figsize = (9, 4))
ax1.matshow(sparse_matrix)
ax1.set_title("Original matrix")
ax2.matshow(smoothed_matrix)
ax2.set_title("Smoothed matrix")
plt.tight_layout()
plt.show()
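
Setting the NaNs to 0 before filtering pulls the smoothed values near gaps toward zero. One alternative, which is not what the code above does, is normalized convolution: smooth the zero-filled data and a validity mask separately, then divide. The sketch below uses NumPy only; `gaussian_smooth` and `nan_aware_smooth` are hypothetical helpers, and the zero-padded, truncated kernel is an assumption.

```python
import numpy as np

def gaussian_smooth(image, sigma, truncate=4.0):
    """Separable Gaussian smoothing with zero-padded edges.

    Assumes the kernel (2 * radius + 1 samples) is shorter than both axes.
    """
    radius = int(truncate * sigma + 0.5)
    offsets = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (offsets / sigma) ** 2)
    kernel /= kernel.sum()
    out = np.apply_along_axis(np.convolve, 1, image, kernel, mode="same")
    return np.apply_along_axis(np.convolve, 0, out, kernel, mode="same")

def nan_aware_smooth(data, sigma):
    """Smooth `data`, treating NaNs as missing rather than as zeros."""
    valid = ~np.isnan(data)
    filled = np.where(valid, data, 0.0)
    weighted_sum = gaussian_smooth(filled, sigma)
    weights = gaussian_smooth(valid.astype(float), sigma)
    with np.errstate(invalid="ignore", divide="ignore"):
        # Cells with no valid neighbour inside the kernel stay NaN
        return np.where(weights > 1e-12, weighted_sum / weights, np.nan)

# Constant field of 7.0 with roughly half of the entries missing
rng = np.random.default_rng(42)
data = np.full((20, 30), 7.0)
data[rng.random(data.shape) < 0.5] = np.nan

smoothed = nan_aware_smooth(data, sigma=2)
naive = gaussian_smooth(np.nan_to_num(data), sigma=2)  # NaNs treated as 0
```

On this constant test field the NaN-aware result stays at 7.0 wherever it is defined, while the zero-filled version is dragged below 7.0 everywhere.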

from mpl_toolkits.axes_grid1 import make_axes_locatable
import matplotlib.pyplot as plt
from matplotlib import cm
import seaborn as sns  # optional
import numpy as np
import pandas as pd

bars = [
   [0, 100.0, 100.0, 100.0, 0],
   [0, 75.0, 100.0, 75.0, 0],
   [0, 62.5, 87.5, 62.5, 0],
   [0, 53.125, 75.0, 53.125, 0]
]

# draw one single-row heatmap per bar
for bar in bars:
    df = pd.DataFrame(np.array(bar))
    ax = plt.subplot()
    im = ax.imshow(df.transpose(), cmap = cm.jet, interpolation = 'nearest', vmin = 0, vmax = 100)
    divider = make_axes_locatable(ax)
    cax = divider.append_axes("right", size = "5%", pad = 0.05)
    cb = plt.colorbar(im, cax = cax)
    cb.set_label('Temp deg C')
    ax.xaxis.set_major_formatter(plt.NullFormatter())
    ax.yaxis.set_major_formatter(plt.NullFormatter())
    plt.show()

library(rgdal)
library(raster)
library(gstat)

# read in a base map
m <- getData("GADM", country = "United States", level = 1)
m <- m[!m$NAME_1 %in% c("Alaska", "Hawaii"), ]

# specify the tiger file to download
tiger <- "ftp://ftp2.census.gov/geo/tiger/TIGER2010/CBSA/2010/tl_2010_us_cbsa10.zip"

# create a temporary file and a temporary directory
tf <- tempfile()
td <- tempdir()

# download the tiger file to the local disk
download.file(tiger, tf, mode = 'wb')

# unzip the tiger file into the temporary directory
z <- unzip(tf, exdir = td)

# isolate the file that ends with ".shp"
shapefile <- z[grep('shp$', z)]

# read the shapefile into working memory
cbsa.map <- readOGR(shapefile, layer = "tl_2010_us_cbsa10")

# remove CBSAs ending with alaska, hawaii, and puerto rico
cbsa.map <- cbsa.map[!grepl("AK$|HI$|PR$", cbsa.map$NAME10), ]

# cbsa.map$NAME10 now has a length of 933
length(cbsa.map$NAME10)

# extract centroid for each CBSA
cbsa.centroids <- data.frame(coordinates(cbsa.map), cbsa.map$GEOID10)
names(cbsa.centroids) <- c("lon", "lat", "GEOID10")

# add lat lon to population data
nrow(x)
x <- merge(x, cbsa.centroids, by = "GEOID10")
nrow(x) # centroids could not be assigned to all records for some reason

# create a raster object
r <- raster(nrow = 500, ncol = 500,
   xmn = bbox(m)["x", "min"], xmx = bbox(m)["x", "max"],
   ymn = bbox(m)["y", "min"], ymx = bbox(m)["y", "max"],
   crs = proj4string(m))

# run inverse distance weighted model - modified code from ?interpolate...needs more research
model <- gstat(id = "trinary", formula = trinary ~ 1, weights = "weight", locations = ~lon + lat, data = x,
   nmax = 7, set = list(idp = 0.5))
r <- interpolate(r, model, xyNames = c("lon", "lat"))
r <- mask(r, m) # discard interpolated values outside the states

# project map for plotting (optional)
# North America Lambert Conformal Conic
nalcc <- CRS("+proj=lcc +lat_1=20 +lat_2=60 +lat_0=40 +lon_0=-96 +x_0=0 +y_0=0 +ellps=GRS80 +datum=NAD83 +units=m +no_defs")
m <- spTransform(m, nalcc)
r <- projectRaster(r, crs = nalcc)

# plot map
par(mar = c(0, 0, 0, 0), bty = "n")
cols <- c(rgb(0.9, 0.8, 0.8), rgb(0.9, 0.4, 0.3),
   rgb(0.8, 0.8, 0.9), rgb(0.4, 0.6, 0.9),
   rgb(0.8, 0.9, 0.8), rgb(0.4, 0.9, 0.6))
col.ramp <- colorRampPalette(cols) # custom colour ramp
plot(r, axes = FALSE, legend = FALSE, col = col.ramp(100))
plot(m, add = TRUE) # overlay base map
legend("right", pch = 22, pt.bg = cols[c(2, 4, 6)], legend = c(0, 1, 2), bty = "n")
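
The R code above fits an inverse distance weighted (IDW) model with gstat. For readers working in Python, the core of IDW can be sketched in a few lines of NumPy. This is an illustration, not a port of gstat: `idw_interpolate` is a hypothetical helper, its `power` and `nmax` defaults merely mirror the `idp = 0.5` and `nmax = 7` settings in the R call, and the sample points are made up.

```python
import numpy as np

def idw_interpolate(points, values, queries, power=0.5, nmax=7):
    """Inverse distance weighted estimate at each query location."""
    points = np.asarray(points, dtype=float)
    values = np.asarray(values, dtype=float)
    queries = np.asarray(queries, dtype=float)

    # Distance from every query location to every observation
    dist = np.linalg.norm(queries[:, None, :] - points[None, :, :], axis=2)

    # Keep only the nmax closest observations for each query
    nearest = np.argsort(dist, axis=1)[:, :nmax]
    near_dist = np.take_along_axis(dist, nearest, axis=1)
    near_vals = values[nearest]

    # Weights fall off with distance; the floor avoids division by zero
    weights = 1.0 / np.maximum(near_dist, 1e-12) ** power
    result = (weights * near_vals).sum(axis=1) / weights.sum(axis=1)

    # A query that coincides with an observation takes its value exactly
    exact = near_dist[:, 0] == 0
    result[exact] = near_vals[exact, 0]
    return result

# Made-up observations on the corners of a unit square
points = [(0, 0), (1, 0), (0, 1), (1, 1)]
values = [0.0, 1.0, 1.0, 2.0]
queries = [(0, 0), (0.5, 0.5)]
estimates = idw_interpolate(points, values, queries)
```

Evaluating `idw_interpolate` on a regular grid of query points gives an array that can be passed straight to `imshow` as a smoothed heatmap.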

import matplotlib.pyplot as plt
from matplotlib.colors import LinearSegmentedColormap
import numpy as np
from scipy.ndimage import gaussian_filter


def plot(data, title, save_path):
    # custom colormap running from blue through green and yellow to red
    colors = [(0, 0, 1), (0, 1, 1), (0, 1, 0.75), (0, 1, 0), (0.75, 1, 0),
              (1, 1, 0), (1, 0.8, 0), (1, 0.7, 0), (1, 0, 0)]

    cm = LinearSegmentedColormap.from_list('sample', colors)

    plt.imshow(data, cmap = cm)
    plt.colorbar()
    plt.title(title)
    plt.savefig(save_path)
    plt.close()


if __name__ == "__main__":
    w = 640
    h = 480

    data = np.zeros((h, w))

    # Create a sharp square peak, just for example
    for x in range(300, 340):
        for y in range(300, 340):
            data[x, y] = 100

    # Smooth it to create a "blobby" look
    data = gaussian_filter(data, sigma = 15)

    plot(data, 'Sample plot', 'sample.jpg')

library(ggplot2)
library(magrittr) # provides %>%

gradient_list <- list()
for (lambda_x in seq(0, 1, by = 0.01)) {
   for (lambda_y in seq(0, 1, by = 0.01)) {
      x_value <- lambda_x * 0 + (1 - lambda_x) * 1
      y_value <- lambda_y * 0 + (1 - lambda_y) * 1

      inside_polygon <- sp::point.in.polygon(x_value, y_value, triangle_lines$X, triangle_lines$Y) %>% as.logical()

      if (inside_polygon) {
         point <- c(x_value, y_value)
         distances <- sqrt(2) - sqrt((df$x - point[1]) ^ 2 + (df$y - point[2]) ^ 2)
         weighted_distances <- distances / sum(distances)
         amount <- sum(weighted_distances * df$z)
         gradient_list <- append(gradient_list, list(c(point, amount)))
      }
   }
}

gradient_df <- do.call(rbind, gradient_list) %>% as.data.frame()
colnames(gradient_df) <- c("x", "y", "amount")

ggplot(gradient_df, aes(x = x, y = y)) +
   geom_point(aes(colour = amount), size = 2) +
   theme_void() +
   geom_line(data = triangle_lines, aes(X, Y, group = grp), size = 3, colour = "white", lineend = "round")

Suggestion : 5

To add a heatmap layer to your map, you will need to configure a few properties. Understanding what these properties mean is key to creating a heatmap that accurately represents your data and strikes the right balance between too much detail and being a single, generalized blob.

Data. In this tutorial, you’ll be using a GeoJSON file of street trees in the city of Pittsburgh from the Western Pennsylvania Regional Data Center.

heatmap-weight: Measures how much each individual point contributes to the appearance of your heatmap. Heatmap layers have a weight of one by default, which means that all points are weighted equally. Increasing the heatmap-weight property to five has the same effect as placing five points in the same location. You can use a stop function to set the weight of your points based on a specified property.

For heatmap-weight, specify a range that reflects your data (the dbh property ranges from 1-62 in the GeoJSON source). Because larger trees have a high dbh, give them more weight in your heatmap by creating a stop function that increases heatmap-weight as dbh increases.
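
The two statements about heatmap-weight can be illustrated outside Mapbox with NumPy. This is an analogy, not Mapbox's renderer: `weight_from_dbh` is a hypothetical helper that ramps linearly between the stops (1, 0) and (62, 1), which is an assumed simplification of the stop function, and the binned counts stand in for the heatmap density.

```python
import numpy as np

def weight_from_dbh(dbh, stops=((1, 0.0), (62, 1.0))):
    """Linear ramp between the stops; values outside the range are clamped."""
    xs, ys = zip(*stops)
    return np.interp(dbh, xs, ys)

# A 3x3 grid of unit cells
edges = np.arange(0, 4)

# One point with weight 5 ...
weighted, _, _ = np.histogram2d([1.5], [1.5], bins=[edges, edges], weights=[5.0])

# ... contributes the same as five unweighted points at the same location
repeated, _, _ = np.histogram2d([1.5] * 5, [1.5] * 5, bins=[edges, edges])

# Weights for a small, a mid-sized and a large tree
small, middle, large = weight_from_dbh([1, 31.5, 62])
```

Here `weighted` and `repeated` are identical arrays, and the three example trees get weights of 0, 0.5 and 1.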

<!DOCTYPE html>
<html lang='en'>
<head>
    <meta charset='utf-8' />
    <title>Make a heatmap with Mapbox GL JS</title>
    <meta name='viewport' content='width=device-width, initial-scale=1' />
    <script src='https://api.tiles.mapbox.com/mapbox-gl-js/v2.9.2/mapbox-gl.js'></script>
    <link href='https://api.tiles.mapbox.com/mapbox-gl-js/v2.9.2/mapbox-gl.css' rel='stylesheet' />
    <style>
      body {
        margin: 0;
        padding: 0;
      }

      #map {
        position: absolute;
        top: 0;
        bottom: 0;
        width: 100%;
      }
    </style>
</head>
<body>
  <div id='map'></div>
  <script>
    mapboxgl.accessToken = 'YOUR_MAPBOX_ACCESS_TOKEN';
    const map = new mapboxgl.Map({
      container: 'map',
      style: 'mapbox://styles/mapbox/dark-v10',
      center: [-79.999732, 40.4374],
      zoom: 11
    });

   // we will add more code here in the next steps

  </script>
</body>
</html>

You will first need to add the GeoJSON you downloaded at the beginning of this guide as the source for your heatmap. You can do this by using the addSource method. This source will be used to create not only a heatmap layer but also a circle layer. The heatmap layer will fade out while the circle layer fades in to show individual data points at higher zoom levels. Add the following code after the map you initialized in the previous step.

map.on('load', () => {
   map.addSource('trees', {
      type: 'geojson',
      data: './trees.geojson'
   });
   // add heatmap layer here
   // add circle layer here
});

Finish configuring your heatmap layer by setting values for heatmap-radius and heatmap-opacity. heatmap-radius should increase with zoom level to preserve the smoothness of the heatmap as the points become more dispersed. heatmap-opacity should be decreased from 1 to 0 between zoom levels 14 and 15 to provide a smooth transition as your circle layer fades in to replace the heatmap layer. Add the following code within the 'load' event handler after the addSource method.

map.addLayer({
      id: 'trees-heat',
      type: 'heatmap',
      source: 'trees',
      maxzoom: 15,
      paint: {
         // increase weight as diameter breast height increases
         'heatmap-weight': {
            property: 'dbh',
            type: 'exponential',
            stops: [
               [1, 0],
               [62, 1]
            ]
         },
         // increase intensity as zoom level increases
         'heatmap-intensity': {
            stops: [
               [11, 1],
               [15, 3]
            ]
         },
         // assign color values to be applied to points depending on their density
         'heatmap-color': [
            'interpolate',
            ['linear'],
            ['heatmap-density'],
            0,
            'rgba(236,222,239,0)',
            0.2,
            'rgb(208,209,230)',
            0.4,
            'rgb(166,189,219)',
            0.6,
            'rgb(103,169,207)',
            0.8,
            'rgb(28,144,153)'
         ],
         // increase radius as zoom increases
         'heatmap-radius': {
            stops: [
               [11, 15],
               [15, 20]
            ]
         },
         // decrease opacity to transition into the circle layer
         'heatmap-opacity': {
            default: 1,
            stops: [
               [14, 1],
               [15, 0]
            ]
         }
      }
   },
   'waterway-label'
);

The following snippet demonstrates how to add a popup to the circle layer of a heatmap:

map.on('click', 'trees-point', (event) => {
  new mapboxgl.Popup()
    .setLngLat(event.features[0].geometry.coordinates)
    .setHTML(`<strong>DBH:</strong> ${event.features[0].properties.dbh}`)
    .addTo(map);
});