To inspect and visualize layer weights:

1. Retrieve the weights of the layer of interest. Ex: model.layers[1].get_weights()
2. Understand weight roles and dimensionality. Ex: LSTMs have three sets of weights: kernel, recurrent, and bias, each serving a different purpose. Within each weight matrix are the gate weights: input, cell, forget, output (see the LSTM sketch after the visual example below). For Conv layers, the distinction is between filters (dim0), kernels, and strides.
3. Organize the weight matrices for visualization in a meaningful manner per (2). Ex: for Conv, unlike for LSTM, feature-specific treatment isn't really necessary, so we can simply flatten the kernel weights and bias weights and visualize them in a histogram.

When inspecting the weights, one thing to watch for is stability: if weights are changing greatly and quickly, or if there are many high-valued weights, it may indicate impaired gradient performance, remedied by e.g. gradient clipping or weight constraints.
Visual example:
import numpy as np
import matplotlib.pyplot as plt
from tensorflow.keras.layers import Input, Conv2D, Dense, Flatten
from tensorflow.keras.models import Model

ipt = Input(shape=(16, 16, 16))
x = Conv2D(12, 8, 1)(ipt)
x = Flatten()(x)
out = Dense(16)(x)
model = Model(ipt, out)
model.compile('adam', 'mse')

X = np.random.randn(10, 16, 16, 16)  # toy data
Y = np.random.randn(10, 16)          # toy labels
for _ in range(10):
    model.train_on_batch(X, Y)

def get_weights_print_stats(layer):
    W = layer.get_weights()
    print(len(W))
    for w in W:
        print(w.shape)
    return W

def hist_weights(weights, bins=500):
    for weight in weights:
        plt.hist(np.ndarray.flatten(weight), bins=bins)

W = get_weights_print_stats(model.layers[1])
# 2
# (8, 8, 16, 12)
# (12,)

hist_weights(W)
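The same approach extends to recurrent layers. A minimal sketch for step (2) with LSTMs, assuming a toy single-LSTM model (the shapes and layer index below are illustrative, not from the original example):

import numpy as np
from tensorflow.keras.layers import Input, LSTM
from tensorflow.keras.models import Model

ipt = Input(shape=(20, 4))
out = LSTM(8)(ipt)
lstm_model = Model(ipt, out)

kernel, recurrent, bias = lstm_model.layers[1].get_weights()
print(kernel.shape, recurrent.shape, bias.shape)  # (4, 32) (8, 32) (32,)

# Keras stores the per-gate weights concatenated along the last axis,
# in the order: input, forget, cell, output.
gates = ('input', 'forget', 'cell', 'output')
for name, k, r, b in zip(gates,
                         np.split(kernel, 4, axis=-1),
                         np.split(recurrent, 4, axis=-1),
                         np.split(bias, 4, axis=-1)):
    print(name, k.shape, r.shape, b.shape)

From here, each gate's weights can be flattened and histogrammed just like the Conv weights above.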
Saving a Keras model:
model = ...  # Get model (Sequential, Functional Model, or Model subclass)
model.save('path/to/location')
Loading the model back:
from tensorflow import keras
model = keras.models.load_model('path/to/location')
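A quick sanity check (a sketch, assuming model is the model that was just saved and x is toy input matching its input shape) is that the reloaded model reproduces the original predictions:

import numpy as np

x = ...  # toy input matching the model's input shape
reconstructed = keras.models.load_model('path/to/location')
# The save/load round trip should preserve predictions (up to numerical tolerance).
np.testing.assert_allclose(model.predict(x), reconstructed.predict(x), atol=1e-6)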
Setup
import numpy as np
import tensorflow as tf
from tensorflow import keras
Saving in the TF SavedModel format logs a line such as:

INFO:tensorflow:Assets written to: my_model/assets
Calling model.save('my_model') creates a folder named my_model, containing the following:
ls my_model
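On a recent TF 2.x install the listing typically shows something like this (exact contents vary by TensorFlow version; keras_metadata.pb only appears in newer releases):

assets  keras_metadata.pb  saved_model.pb  variables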
As you can see, "inference mode vs training mode" and "layer weight trainability" are two very different concepts. Note that this behavior only applies to BatchNormalization; for every other layer, weight trainability and "inference vs training mode" remain independent. A typical transfer-learning workflow is to instantiate a base model, load pre-trained weights into it, and freeze those weights before training new layers on top.
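A minimal sketch of that workflow (the Xception base, input shape, and single-unit head are illustrative choices, not from the original text), showing where training=False matters for BatchNormalization layers:

from tensorflow import keras

# Instantiate a base model and load pre-trained weights.
base_model = keras.applications.Xception(
    weights='imagenet', input_shape=(150, 150, 3), include_top=False)

# Freeze the base model's weights (trainability).
base_model.trainable = False

inputs = keras.Input(shape=(150, 150, 3))
# training=False keeps layers such as BatchNormalization in inference mode,
# which is independent of the trainable flag set above.
x = base_model(inputs, training=False)
x = keras.layers.GlobalAveragePooling2D()(x)
outputs = keras.layers.Dense(1)(x)
model = keras.Model(inputs, outputs)
model.compile(optimizer='adam',
              loss=keras.losses.BinaryCrossentropy(from_logits=True))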
strategy = tf.distribute.MirroredStrategy()

with strategy.scope():
    # This could be any kind of model -- Functional, subclass...
    model = tf.keras.Sequential([
        tf.keras.layers.Conv2D(32, 3, activation='relu', input_shape=(28, 28, 1)),
        tf.keras.layers.GlobalMaxPooling2D(),
        tf.keras.layers.Dense(10)
    ])
    model.compile(loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
                  optimizer=tf.keras.optimizers.Adam(),
                  metrics=[tf.keras.metrics.SparseCategoricalAccuracy()])

model.fit(train_dataset, epochs=12, callbacks=callbacks)
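The snippet above assumes train_dataset and callbacks are defined elsewhere; a toy sketch of definitions matching the (28, 28, 1) input shape (the random data, batch size, and callback choice are illustrative):

import numpy as np
import tensorflow as tf

# Random toy data shaped like the model input above.
x = np.random.random((1000, 28, 28, 1)).astype('float32')
y = np.random.randint(0, 10, size=(1000,)).astype('int64')
train_dataset = tf.data.Dataset.from_tensor_slices((x, y)).shuffle(1000).batch(64)

# Illustrative callback: write logs for TensorBoard.
callbacks = [tf.keras.callbacks.TensorBoard(log_dir='./logs')]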
# Model where a shared LSTM is used to encode two different sequences in parallel
input_a = keras.Input(shape=(140, 256))
input_b = keras.Input(shape=(140, 256))

shared_lstm = keras.layers.LSTM(64)

# Process the first sequence on one GPU
with tf.device('/gpu:0'):
    encoded_a = shared_lstm(input_a)

# Process the next sequence on another GPU
with tf.device('/gpu:1'):
    encoded_b = shared_lstm(input_b)

# Concatenate results on CPU
with tf.device('/cpu:0'):
    merged_vector = keras.layers.concatenate([encoded_a, encoded_b], axis=-1)
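To turn this into a trainable model, the merged tensor still needs a head and a Model wrapper; a hedged continuation (the sigmoid head and compile settings are illustrative assumptions, not part of the original snippet):

# Hypothetical head on top of the merged encoding.
predictions = keras.layers.Dense(1, activation='sigmoid')(merged_vector)
model = keras.Model([input_a, input_b], predictions)
model.compile(optimizer='adam', loss='binary_crossentropy')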
cluster_resolver = ...
if cluster_resolver.task_type in ("worker", "ps"):
    # Start a tf.distribute.Server
    # (https://www.tensorflow.org/api_docs/python/tf/distribute/Server) and wait.
    ...
elif cluster_resolver.task_type == "evaluator":
    # Run an (optional) side-car evaluation
    ...

# Otherwise, this is the coordinator that controls the training w/ the strategy.
strategy = tf.distribute.experimental.ParameterServerStrategy(
    cluster_resolver=...)
train_dataset = ...

with strategy.scope():
    model = tf.keras.Sequential([
        layers.Conv2D(32, 3, activation='relu', input_shape=(28, 28, 1)),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(64, activation='relu'),
        layers.Dense(10, activation='softmax')
    ])
    model.compile(
        loss='sparse_categorical_crossentropy',
        optimizer=tf.keras.optimizers.SGD(learning_rate=0.001),
        metrics=['accuracy'],
        steps_per_execution=10)

model.fit(x=train_dataset, epochs=3, steps_per_epoch=100)
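The cluster_resolver is left elided above; one common way (an assumption, not part of the original snippet) to obtain it is from the TF_CONFIG environment variable:

# Hypothetical: derive the cluster spec from TF_CONFIG.
cluster_resolver = tf.distribute.cluster_resolver.TFConfigClusterResolver()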
# By default `MultiWorkerMirroredStrategy` uses cluster information
# from `TF_CONFIG`, and "AUTO" collective op communication.
strategy = tf.distribute.experimental.MultiWorkerMirroredStrategy()
train_dataset = get_training_dataset()

with strategy.scope():
    # Define and compile the model in the scope of the strategy. Doing so
    # ensures the variables created are distributed and initialized properly
    # according to the strategy.
    model = tf.keras.Sequential([
        layers.Conv2D(32, 3, activation='relu', input_shape=(28, 28, 1)),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(64, activation='relu'),
        layers.Dense(10, activation='softmax')
    ])
    model.compile(
        loss='sparse_categorical_crossentropy',
        optimizer=tf.keras.optimizers.SGD(learning_rate=0.001),
        metrics=['accuracy'])

model.fit(x=train_dataset, epochs=3, steps_per_epoch=100)
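TF_CONFIG is expected to be set on each worker before the program starts; a hypothetical two-worker example (host names, ports, and task index are placeholders):

import json
import os

# Hypothetical cluster: two workers, this process being worker 0.
os.environ['TF_CONFIG'] = json.dumps({
    'cluster': {'worker': ['host1:12345', 'host2:23456']},
    'task': {'type': 'worker', 'index': 0}
})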