How can I compare weights of different Keras models?

Visual example:

import numpy as np
import matplotlib.pyplot as plt
from tensorflow.keras.layers import Input, Conv2D, Dense, Flatten
from tensorflow.keras.models import Model

ipt = Input(shape=(16, 16, 16))
x = Conv2D(12, 8, 1)(ipt)
x = Flatten()(x)
out = Dense(16)(x)

model = Model(ipt, out)
model.compile('adam', 'mse')

X = np.random.randn(10, 16, 16, 16)  # toy data
Y = np.random.randn(10, 16)          # toy labels
for _ in range(10):
    model.train_on_batch(X, Y)

def get_weights_print_stats(layer):
    # Fetch the layer's weight arrays and print how many there are and their shapes
    W = layer.get_weights()
    print(len(W))
    for w in W:
        print(w.shape)
    return W

def hist_weights(weights, bins=500):
    # Histogram the flattened values of each weight array
    for weight in weights:
        plt.hist(np.ndarray.flatten(weight), bins=bins)

W = get_weights_print_stats(model.layers[1])
# 2
# (8, 8, 16, 12)
# (12,)

hist_weights(W)
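The example above visualizes one model's weights. To compare the weights of two different models directly (e.g. the same architecture trained on different data or with different seeds), the per-layer arrays returned by get_weights() can be compared numerically. A minimal sketch, assuming two models model_a and model_b with identical architectures (the names and the compare_models helper are illustrative, not from the original answer):

import numpy as np

def compare_models(model_a, model_b):
    # Walk matching layers of two architecturally identical models and report
    # whether each weight array is numerically identical, plus a rough distance.
    for i, (la, lb) in enumerate(zip(model_a.layers, model_b.layers)):
        for j, (a, b) in enumerate(zip(la.get_weights(), lb.get_weights())):
            print(f"layer {i} ({la.name}), weight {j}: "
                  f"identical={np.allclose(a, b)}, mean |diff|={np.abs(a - b).mean():.6f}")

Overlaying hist_weights(...) for the corresponding layers of both models on the same figure is another quick visual comparison.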

Suggestion : 2

1. Retrieve weights of the layer of interest. Ex: model.layers[1].get_weights()
2. Understand weight roles and dimensionality. Ex: LSTMs have three sets of weights: kernel, recurrent, and bias, each serving a different purpose. Within each weight matrix are the gate weights: Input, Cell, Forget, Output. For Conv layers, the distinction is between filters (dim0), kernels, and strides. (A small LSTM sketch is shown below.)
3. Organize the weight matrices for visualization in a meaningful manner per (2). Ex: for Conv, unlike for LSTM, feature-specific treatment isn't really necessary, so we can simply flatten the kernel weights and bias weights and visualize them in a histogram.

Stability: if weights are changing greatly and quickly, or if there are many high-valued weights, it may indicate impaired gradient performance, remedied by e.g. gradient clipping or weight constraints.

Visual example: see the Conv2D code block at the top of this page, which walks through steps (1)-(3).
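Point (2) mentions LSTM weights, but the visual example only covers a Conv2D layer. Below is a minimal sketch of inspecting an LSTM layer's kernel, recurrent, and bias arrays; the gate-slicing detail is an assumption based on the stock Keras LSTM implementation, where each matrix stacks the four gates along the last axis in the order input, forget, cell, output:

import numpy as np
from tensorflow.keras.layers import Input, LSTM
from tensorflow.keras.models import Model

units = 64
ipt = Input(shape=(140, 256))
out = LSTM(units)(ipt)
lstm_model = Model(ipt, out)

kernel, recurrent, bias = lstm_model.layers[1].get_weights()
print(kernel.shape, recurrent.shape, bias.shape)
# (256, 256) (64, 256) (256,)  -- the last axis is 4 * units, one block per gate

gate_names = ["input", "forget", "cell", "output"]  # assumed stock Keras gate order
for g, name in enumerate(gate_names):
    block = kernel[:, g * units:(g + 1) * units]
    print(f"{name} gate kernel: shape={block.shape}, mean |w|={np.abs(block).mean():.4f}")

Each per-gate block (or the full arrays) can be passed to hist_weights() from the example above, just like the Conv2D weights.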

Suggestion : 3

Saving a Keras model:

model = ...  # Get model (Sequential, Functional Model, or Model subclass)
model.save('path/to/location')

Loading the model back:

from tensorflow import keras
model = keras.models.load_model('path/to/location')
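A quick way to check (or, more generally, compare) the weights of the original model and the reloaded copy is to zip their get_weights() lists; this sketch assumes the original object is still in memory as model and was saved to 'path/to/location':

import numpy as np
from tensorflow import keras

reloaded = keras.models.load_model('path/to/location')
# Weight arrays come back in the same order, so a pairwise comparison is enough.
for w_old, w_new in zip(model.get_weights(), reloaded.get_weights()):
    assert np.allclose(w_old, w_new), "weights differ after reload"
print("all weight arrays match")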

Setup

import numpy as np
import tensorflow as tf
from tensorflow import keras

Calling model.save('my_model') creates a folder named my_model, containing the following:

ls my_model
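The listing itself is not reproduced here; as a rough guide, a TF 2.x SavedModel folder contains assets/ and variables/ subdirectories plus saved_model.pb (and, on newer versions, keras_metadata.pb), though the exact contents vary by TensorFlow version. The same check from Python:

import os
# Expected entries for the TF 2.x SavedModel format (exact set varies by version):
# assets/, variables/, saved_model.pb, possibly keras_metadata.pb
print(sorted(os.listdir('my_model')))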

Suggestion : 4

This behavior only applies to BatchNormalization. For every other layer, weight trainability and "inference vs. training mode" remain independent. In other words, "inference mode vs. training mode" and "layer weight trainability" are two very different concepts. A typical transfer-learning workflow is to instantiate a base model, load pre-trained weights into it, and then freeze or fine-tune parts of it.
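To see how trainability shows up in the weight containers, a layer's variables move between trainable_weights and non_trainable_weights when layer.trainable is toggled; a small sketch (the Dense layer here is just an illustration, not from the original text):

import tensorflow as tf

layer = tf.keras.layers.Dense(4)
layer.build((None, 8))  # create the kernel and bias variables

print(len(layer.trainable_weights), len(layer.non_trainable_weights))  # 2 0
layer.trainable = False  # freeze the layer
print(len(layer.trainable_weights), len(layer.non_trainable_weights))  # 0 2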

strategy = tf.distribute.MirroredStrategy()
with strategy.scope():
    # This could be any kind of model -- Functional, subclass...
    model = tf.keras.Sequential([
        tf.keras.layers.Conv2D(32, 3, activation='relu', input_shape=(28, 28, 1)),
        tf.keras.layers.GlobalMaxPooling2D(),
        tf.keras.layers.Dense(10)
    ])
    model.compile(loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
                  optimizer=tf.keras.optimizers.Adam(),
                  metrics=[tf.keras.metrics.SparseCategoricalAccuracy()])

model.fit(train_dataset, epochs=12, callbacks=callbacks)

# Model where a shared LSTM is used to encode two different sequences in parallel
input_a = keras.Input(shape=(140, 256))
input_b = keras.Input(shape=(140, 256))

shared_lstm = keras.layers.LSTM(64)

# Process the first sequence on one GPU
with tf.device('/gpu:0'):
    encoded_a = shared_lstm(input_a)
# Process the next sequence on another GPU
with tf.device('/gpu:1'):
    encoded_b = shared_lstm(input_b)

# Concatenate results on CPU
with tf.device('/cpu:0'):
    merged_vector = keras.layers.concatenate(
        [encoded_a, encoded_b], axis=-1)

cluster_resolver = ...
if cluster_resolver.task_type in ("worker", "ps"):
    # Start a `tf.distribute.Server` (https://www.tensorflow.org/api_docs/python/tf/distribute/Server) and wait.
    ...
elif cluster_resolver.task_type == "evaluator":
    # Run an (optional) side-car evaluation
    ...

# Otherwise, this is the coordinator that controls the training w/ the strategy.
strategy = tf.distribute.experimental.ParameterServerStrategy(
    cluster_resolver=...)
train_dataset = ...

with strategy.scope():
    model = tf.keras.Sequential([
        layers.Conv2D(32, 3, activation='relu', input_shape=(28, 28, 1)),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(64, activation='relu'),
        layers.Dense(10, activation='softmax')
    ])
    model.compile(
        loss='sparse_categorical_crossentropy',
        optimizer=tf.keras.optimizers.SGD(learning_rate=0.001),
        metrics=['accuracy'],
        steps_per_execution=10)

model.fit(x=train_dataset, epochs=3, steps_per_epoch=100)

# By default `MultiWorkerMirroredStrategy` uses cluster information
# from `TF_CONFIG`, and "AUTO" collective op communication.
strategy = tf.distribute.experimental.MultiWorkerMirroredStrategy()
train_dataset = get_training_dataset()
with strategy.scope():
    # Define and compile the model in the scope of the strategy. Doing so
    # ensures the variables created are distributed and initialized properly
    # according to the strategy.
    model = tf.keras.Sequential([
        layers.Conv2D(32, 3, activation='relu', input_shape=(28, 28, 1)),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(64, activation='relu'),
        layers.Dense(10, activation='softmax')
    ])
    model.compile(
        loss='sparse_categorical_crossentropy',
        optimizer=tf.keras.optimizers.SGD(learning_rate=0.001),
        metrics=['accuracy'])

model.fit(x=train_dataset, epochs=3, steps_per_epoch=100)
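Whichever distribution strategy a model was trained under, model.get_weights() still returns plain NumPy arrays, so the comparison approach from the top of this page applies unchanged; a sketch reusing the compare_models helper defined earlier (model names here are illustrative):

# Assumes model_mirrored and model_multiworker are two trained models
# with the same architecture.
compare_models(model_mirrored, model_multiworker)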