Utilities

minibatch

numpy_ml.neural_nets.utils.minibatch(X, batchsize=256, shuffle=True)[source]

Compute the minibatch indices for a training dataset.

Parameters:
  • X (ndarray of shape (N, *)) – The dataset to divide into minibatches. Assumes the first dimension represents the number of training examples.
  • batchsize (int) – The desired size of each minibatch. Note, however, that if X.shape[0] % batchsize > 0 then the final batch will contain fewer than batchsize entries. Default is 256.
  • shuffle (bool) – Whether to shuffle the entries in the dataset before dividing into minibatches. Default is True.
Returns:

  • mb_generator (generator) – A generator which yields the indices into X for each batch
  • n_batches (int) – The number of batches
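
A minimal usage sketch based on the signature above (the dataset shape is illustrative):

    import numpy as np
    from numpy_ml.neural_nets.utils import minibatch

    X = np.random.rand(1000, 32)             # 1000 training examples (illustrative)
    mb_generator, n_batches = minibatch(X, batchsize=256, shuffle=True)

    for batch_idx in mb_generator:           # each yield is an array of indices into X
        X_batch = X[batch_idx]               # the final batch holds 1000 % 256 = 232 examples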

calc_pad_dims_2D

numpy_ml.neural_nets.utils.calc_pad_dims_2D(X_shape, out_dim, kernel_shape, stride, dilation=0)[source]

Compute the padding necessary to ensure that convolving X with a 2D kernel of shape kernel_shape and stride stride produces outputs with dimension out_dim.

Parameters:
  • X_shape (tuple of (n_ex, in_rows, in_cols, in_ch)) – Dimensions of the input volume. Padding is applied to in_rows and in_cols.
  • out_dim (tuple of (out_rows, out_cols)) – The desired dimension of an output example after applying the convolution.
  • kernel_shape (2-tuple) – The dimension of the 2D convolution kernel.
  • stride (int) – The stride for the convolution kernel.
  • dilation (int) – Number of pixels inserted between kernel elements. Default is 0.
Returns:

padding_dims (4-tuple) – Padding dims for X. Organized as (left, right, up, down)
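
A hedged sketch of the call, asking for the padding needed to keep a 28x28 input at 28x28 under a 3x3 kernel with stride 1 (shapes are illustrative):

    from numpy_ml.neural_nets.utils import calc_pad_dims_2D

    X_shape = (1, 28, 28, 3)                         # (n_ex, in_rows, in_cols, in_ch)
    padding = calc_pad_dims_2D(X_shape, (28, 28), kernel_shape=(3, 3), stride=1)
    # padding is a 4-tuple of (left, right, up, down); this configuration needs 2 rows
    # and 2 columns of padding in total, so each entry should be 1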

calc_pad_dims_1D

numpy_ml.neural_nets.utils.calc_pad_dims_1D(X_shape, l_out, kernel_width, stride, dilation=0, causal=False)[source]

Compute the padding necessary to ensure that convolving X with a 1D kernel of width kernel_width and stride stride produces outputs of length l_out.

Parameters:
  • X_shape (tuple of (n_ex, l_in, in_ch)) – Dimensions of the input volume. Padding is applied on either side of l_in.
  • l_out (int) – The desired length of an output example after applying the convolution.
  • kernel_width (int) – The width of the 1D convolution kernel.
  • stride (int) – The stride for the convolution kernel.
  • dilation (int) – Number of pixels inserted between kernel elements. Default is 0.
  • causal (bool) – Whether to compute the padding dims for a regular or causal convolution. If causal, padding is added only to the left side of the sequence. Default is False.
Returns:

padding_dims (2-tuple) – Padding dims for X. Organized as (left, right)
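
A hedged sketch for a length-preserving 1D convolution (shapes are illustrative):

    from numpy_ml.neural_nets.utils import calc_pad_dims_1D

    X_shape = (1, 100, 8)                            # (n_ex, l_in, in_ch)
    left, right = calc_pad_dims_1D(X_shape, l_out=100, kernel_width=5, stride=1)
    # a width-5 kernel at stride 1 needs 4 columns of padding in total to preserve the
    # length; with causal=False it should split as (2, 2), with causal=True as (4, 0)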

pad1D

numpy_ml.neural_nets.utils.pad1D(X, pad, kernel_width=None, stride=None, dilation=0)[source]

Zero-pad a 3D input volume X along the second dimension.

Parameters:
  • X (ndarray of shape (n_ex, l_in, in_ch)) – Input volume. Padding is applied to l_in.
  • pad (tuple, int, or {'same', 'causal'}) – The padding amount. If ‘same’, add padding to ensure that the output length of a 1D convolution with a kernel of width kernel_width and stride stride matches the input length. If ‘causal’, compute padding such that the output both has the same length as the input AND output[t] does not depend on input[t + 1:]. If 2-tuple, specifies the number of padding columns to add on each side of the sequence.
  • kernel_width (int) – The width of the 1D convolution kernel. Only relevant if pad is ‘same’ or ‘causal’. Default is None.
  • stride (int) – The stride for the convolution kernel. Only relevant if pad is ‘same’ or ‘causal’. Default is None.
  • dilation (int) – The dilation of the convolution kernel. Only relevant if pad is ‘same’ or ‘causal’. Default is 0.
Returns:

  • X_pad (ndarray of shape (n_ex, padded_seq, in_channels)) – The padded output volume
  • p (2-tuple) – The number of 0-padded columns added to the (left, right) of the sequences in X.
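
A hedged sketch of causal padding (array contents are illustrative):

    import numpy as np
    from numpy_ml.neural_nets.utils import pad1D

    X = np.random.rand(2, 100, 8)                    # (n_ex, l_in, in_ch)
    X_pad, p = pad1D(X, "causal", kernel_width=5, stride=1)
    # all padding goes on the left so that output[t] never depends on input[t + 1:];
    # here p should be (4, 0) and X_pad.shape should be (2, 104, 8)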

pad2D

numpy_ml.neural_nets.utils.pad2D(X, pad, kernel_shape=None, stride=None, dilation=0)[source]

Zero-pad a 4D input volume X along the second and third dimensions.

Parameters:
  • X (ndarray of shape (n_ex, in_rows, in_cols, in_ch)) – Input volume. Padding is applied to in_rows and in_cols.
  • pad (tuple, int, or 'same') – The padding amount. If ‘same’, add padding to ensure that the output of a 2D convolution with a kernel of kernel_shape and stride stride has the same dimensions as the input. If 2-tuple, specifies the number of padding rows and columns to add on both sides of the rows/columns in X. If 4-tuple, specifies the number of rows/columns to add to the top, bottom, left, and right of the input volume.
  • kernel_shape (2-tuple) – The dimension of the 2D convolution kernel. Only relevant if pad is ‘same’. Default is None.
  • stride (int) – The stride for the convolution kernel. Only relevant if pad is ‘same’. Default is None.
  • dilation (int) – The dilation of the convolution kernel. Only relevant if pad is ‘same’. Default is 0.
Returns:

  • X_pad (ndarray of shape (n_ex, padded_in_rows, padded_in_cols, in_channels)) – The padded output volume.
  • p (4-tuple) – The number of 0-padded rows added to the (top, bottom, left, right) of X.
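
A hedged sketch of ‘same’ padding for a 3x3 kernel at stride 1 (array contents are illustrative):

    import numpy as np
    from numpy_ml.neural_nets.utils import pad2D

    X = np.random.rand(2, 28, 28, 3)                 # (n_ex, in_rows, in_cols, in_ch)
    X_pad, p = pad2D(X, "same", kernel_shape=(3, 3), stride=1)
    # p gives the (top, bottom, left, right) amounts, here (1, 1, 1, 1), so
    # X_pad.shape should be (2, 30, 30, 3)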

dilate

numpy_ml.neural_nets.utils.dilate(X, d)[source]

Dilate the 4D volume X by d.

Notes

For a visual depiction of a dilated convolution, see [1].

References

[1]Dumoulin & Visin (2016). “A guide to convolution arithmetic for deep learning.” https://arxiv.org/pdf/1603.07285v1.pdf
Parameters:
  • X (ndarray of shape (n_ex, in_rows, in_cols, in_ch)) – Input volume.
  • d (int) – The number of rows/columns of zeros to insert between each adjacent row and column in X.
Returns:

Xd (ndarray of shape (n_ex, out_rows, out_cols, out_ch)) – The dilated array where

\[\begin{split}\text{out_rows} &= \text{in_rows} + d(\text{in_rows} - 1) \\ \text{out_cols} &= \text{in_cols} + d (\text{in_cols} - 1)\end{split}\]
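
A minimal sketch (the input values are illustrative):

    import numpy as np
    from numpy_ml.neural_nets.utils import dilate

    X = np.arange(2 * 3 * 3 * 1).reshape(2, 3, 3, 1)  # (n_ex, in_rows, in_cols, in_ch)
    Xd = dilate(X, d=1)
    # one row/column of zeros is inserted between adjacent rows/columns, so
    # out_rows = out_cols = 3 + 1 * (3 - 1) = 5 and Xd.shape should be (2, 5, 5, 1)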

calc_fan

numpy_ml.neural_nets.utils.calc_fan(weight_shape)[source]

Compute the fan-in and fan-out for a weight matrix/volume.

Parameters:weight_shape (tuple) – The dimensions of the weight matrix/volume. The final 2 entries must be in_ch, out_ch.
Returns:
  • fan_in (int) – The number of input units in the weight tensor
  • fan_out (int) – The number of output units in the weight tensor
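
A hedged sketch for a dense weight matrix and a convolution kernel (shapes are illustrative; the exact treatment of the receptive field is an assumption):

    from numpy_ml.neural_nets.utils import calc_fan

    fan_in, fan_out = calc_fan((512, 256))           # dense weights: (in_ch, out_ch)
    # fan_in = 512 input units, fan_out = 256 output units

    fan_in, fan_out = calc_fan((3, 3, 64, 128))      # conv kernel: (fr, fc, in_ch, out_ch)
    # for kernels, the 3 x 3 receptive field is typically folded into both fan values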

calc_conv_out_dims

numpy_ml.neural_nets.utils.calc_conv_out_dims(X_shape, W_shape, stride=1, pad=0, dilation=0)[source]

Compute the dimensions of the output volume for the specified convolution.

Parameters:
  • X_shape (3-tuple or 4-tuple) – The dimensions of the input volume to the convolution. If 3-tuple, entries are expected to be (n_ex, in_length, in_ch). If 4-tuple, entries are expected to be (n_ex, in_rows, in_cols, in_ch).
  • W_shape (3-tuple or 4-tuple) – The dimensions of the weight volume for the convolution. If 3-tuple, entries are expected to be (f_len, in_ch, out_ch). If 4-tuple, entries are expected to be (fr, fc, in_ch, out_ch).
  • pad (tuple, int, or {'same', 'causal'}) – The padding amount. If ‘same’, add padding to ensure that the output of the convolution has the same spatial dimensions as the input. If ‘causal’, compute padding such that the output both has the same length as the input AND output[t] does not depend on input[t + 1:] (1D convolutions only). If a tuple, specifies the number of padding rows/columns to add on each side of the spatial dimensions. Default is 0.
  • stride (int) – The stride for the convolution kernel. Default is 1.
  • dilation (int) – The dilation of the convolution kernel. Default is 0.
Returns:

out_dims (3-tuple or 4-tuple) – The dimensions of the output volume. If 3-tuple, entries are (n_ex, out_length, out_ch). If 4-tuple, entries are (n_ex, out_rows, out_cols, out_ch).
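
A hedged sketch, assuming an integer pad is applied symmetrically to each side (shapes are illustrative):

    from numpy_ml.neural_nets.utils import calc_conv_out_dims

    X_shape = (8, 32, 32, 3)                         # (n_ex, in_rows, in_cols, in_ch)
    W_shape = (3, 3, 3, 16)                          # (fr, fc, in_ch, out_ch)
    out_dims = calc_conv_out_dims(X_shape, W_shape, stride=1, pad=1)
    # (32 + 2 * 1 - 3) / 1 + 1 = 32, so out_dims should be (8, 32, 32, 16)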

im2col

numpy_ml.neural_nets.utils.im2col(X, W_shape, pad, stride, dilation=0)[source]

Pad and rearrange overlapping windows of the input volume into column vectors, returning the concatenated, padded vectors in the matrix X_col.

Notes

A NumPy reimagining of MATLAB’s im2col ‘sliding’ function.

Code extended from Andrej Karpathy’s im2col.py.

Parameters:
  • X (ndarray of shape (n_ex, in_rows, in_cols, in_ch)) – Input volume (not padded).
  • W_shape (4-tuple containing (kernel_rows, kernel_cols, in_ch, out_ch)) – The dimensions of the weights/kernels in the present convolutional layer.
  • pad (tuple, int, or 'same') – The padding amount. If ‘same’, add padding to ensure that a 2D convolution with the given kernel and stride stride produces an output with the same dimensions as the input. If 2-tuple, specifies the number of padding rows and columns to add on both sides of the rows/columns in X. If 4-tuple, specifies the number of rows/columns to add to the top, bottom, left, and right of the input volume.
  • stride (int) – The stride of each convolution kernel
  • dilation (int) – Number of pixels inserted between kernel elements. Default is 0.
Returns:

X_col (ndarray of shape (Q, Z)) – The reshaped input volume, where:

\[\begin{split}Q &= \text{kernel_rows} \times \text{kernel_cols} \times \text{n_in} \\ Z &= \text{n_ex} \times \text{out_rows} \times \text{out_cols}\end{split}\]
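
A hedged sketch, assuming the single documented return value (shapes are illustrative):

    import numpy as np
    from numpy_ml.neural_nets.utils import im2col

    X = np.random.rand(2, 8, 8, 3)                   # (n_ex, in_rows, in_cols, in_ch)
    W_shape = (3, 3, 3, 16)                          # (kernel_rows, kernel_cols, in_ch, out_ch)
    X_col = im2col(X, W_shape, pad=(1, 1, 1, 1), stride=1)
    # Q = 3 * 3 * 3 = 27 and Z = 2 * 8 * 8 = 128, so X_col should have shape (27, 128)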

col2im

numpy_ml.neural_nets.utils.col2im(X_col, X_shape, W_shape, pad, stride, dilation=0)[source]

Take columns of a 2D matrix and rearrange them into the blocks/windows of a 4D image volume.

Notes

A NumPy reimagining of MATLAB’s col2im ‘sliding’ function.

Code extended from Andrej Karpathy’s im2col.py.

Parameters:
  • X_col (ndarray of shape (Q, Z)) – The columnized version of X (assumed to include padding)
  • X_shape (4-tuple containing (n_ex, in_rows, in_cols, in_ch)) – The original dimensions of X (not including padding)
  • W_shape (4-tuple containing (kernel_rows, kernel_cols, in_ch, out_ch)) – The dimensions of the weights in the present convolutional layer
  • pad (4-tuple of (left, right, up, down)) – Number of zero-padding rows/cols to add to X
  • stride (int) – The stride of each convolution kernel
  • dilation (int) – Number of pixels inserted between kernel elements. Default is 0.
Returns:

img (ndarray of shape (n_ex, in_rows, in_cols, in_ch)) – The reshaped X_col input matrix
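
A hedged round-trip sketch, again assuming im2col returns only the column matrix as documented above (shapes are illustrative):

    import numpy as np
    from numpy_ml.neural_nets.utils import im2col, col2im

    X = np.random.rand(2, 8, 8, 3)                   # (n_ex, in_rows, in_cols, in_ch)
    W_shape = (3, 3, 3, 16)                          # (kernel_rows, kernel_cols, in_ch, out_ch)
    pad, stride = (0, 0, 0, 0), 1

    X_col = im2col(X, W_shape, pad, stride)              # columnize the windows ...
    img = col2im(X_col, X.shape, W_shape, pad, stride)   # ... and scatter them back
    # img should have shape (2, 8, 8, 3); contributions from overlapping windows are
    # accumulated, so img is not in general identical to X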

conv2D

numpy_ml.neural_nets.utils.conv2D(X, W, stride, pad, dilation=0)[source]

A faster (but more memory intensive) implementation of the 2D “convolution” (technically, cross-correlation) of input X with a collection of kernels in W.

Notes

Relies on the im2col() function to perform the convolution as a single matrix multiplication.

For a helpful diagram, see Pete Warden’s 2015 blogpost [1].

References

[1]Warden (2015). “Why GEMM is at the heart of deep learning,” https://petewarden.com/2015/04/20/why-gemm-is-at-the-heart-of-deep-learning/
Parameters:
  • X (ndarray of shape (n_ex, in_rows, in_cols, in_ch)) – Input volume (unpadded).
  • W (ndarray of shape (kernel_rows, kernel_cols, in_ch, out_ch)) – A volume of convolution weights/kernels for a given layer.
  • stride (int) – The stride of each convolution kernel.
  • pad (tuple, int, or 'same') – The padding amount. If ‘same’, add padding to ensure that a 2D convolution with the given kernel and stride stride produces an output volume with the same dimensions as the input. If 2-tuple, specifies the number of padding rows and columns to add on both sides of the rows/columns in X. If 4-tuple, specifies the number of rows/columns to add to the top, bottom, left, and right of the input volume.
  • dilation (int) – Number of pixels inserted between kernel elements. Default is 0.
Returns:

Z (ndarray of shape (n_ex, out_rows, out_cols, out_ch)) – The convolution of X with W.
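
A minimal sketch based on the signature above (array contents are illustrative):

    import numpy as np
    from numpy_ml.neural_nets.utils import conv2D

    X = np.random.rand(4, 28, 28, 3)                 # (n_ex, in_rows, in_cols, in_ch)
    W = np.random.rand(3, 3, 3, 8)                   # (kernel_rows, kernel_cols, in_ch, out_ch)
    Z = conv2D(X, W, stride=1, pad="same")
    # with 'same' padding, Z.shape should be (4, 28, 28, 8)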

conv1D

numpy_ml.neural_nets.utils.conv1D(X, W, stride, pad, dilation=0)[source]

A faster (but more memory intensive) implementation of a 1D “convolution” (technically, cross-correlation) of input X with a collection of kernels in W.

Notes

Relies on the im2col() function to perform the convolution as a single matrix multiplication.

For a helpful diagram, see Pete Warden’s 2015 blogpost [1].

References

[1]Warden (2015). “Why GEMM is at the heart of deep learning,” https://petewarden.com/2015/04/20/why-gemm-is-at-the-heart-of-deep-learning/
Parameters:
  • X (ndarray of shape (n_ex, l_in, in_ch)) – Input volume (unpadded)
  • W (ndarray of shape (kernel_width, in_ch, out_ch)) – A volume of convolution weights/kernels for a given layer
  • stride (int) – The stride of each convolution kernel
  • pad (tuple, int, or 'same') – The padding amount. If ‘same’, add padding to ensure that a 1D convolution with the given kernel and stride stride produces an output with the same length as the input. If 2-tuple, specifies the number of padding columns to add on both sides of the columns in X.
  • dilation (int) – Number of pixels inserted between kernel elements. Default is 0.
Returns:

Z (ndarray of shape (n_ex, l_out, out_ch)) – The convolution of X with W.
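
A minimal sketch based on the signature above (array contents are illustrative):

    import numpy as np
    from numpy_ml.neural_nets.utils import conv1D

    X = np.random.rand(4, 100, 8)                    # (n_ex, l_in, in_ch)
    W = np.random.rand(5, 8, 16)                     # (kernel_width, in_ch, out_ch)
    Z = conv1D(X, W, stride=1, pad="same")
    # with 'same' padding, Z.shape should be (4, 100, 16)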

deconv2D_naive

numpy_ml.neural_nets.utils.deconv2D_naive(X, W, stride, pad, dilation=0)[source]

Perform a “deconvolution” (more accurately, a transposed convolution) of an input volume X with a weight kernel W, incorporating stride, pad, and dilation.

Notes

Rather than using the transpose of the convolution matrix, this approach uses a direct convolution with zero padding, which, while conceptually straightforward, is computationally inefficient.

For further explanation, see [1].

References

[1]Dumoulin & Visin (2016). “A guide to convolution arithmetic for deep learning.” https://arxiv.org/pdf/1603.07285v1.pdf
Parameters:
  • X (ndarray of shape (n_ex, in_rows, in_cols, in_ch)) – Input volume (not padded)
  • W (ndarray of shape (kernel_rows, kernel_cols, in_ch, out_ch)) – A volume of convolution weights/kernels for a given layer
  • stride (int) – The stride of each convolution kernel
  • pad (tuple, int, or 'same') – The padding amount. If ‘same’, add padding to ensure that a 2D convolution with the given kernel and stride stride produces an output volume with the same dimensions as the input. If 2-tuple, specifies the number of padding rows and columns to add on both sides of the rows/columns in X. If 4-tuple, specifies the number of rows/columns to add to the top, bottom, left, and right of the input volume.
  • dilation (int) – Number of pixels inserted between kernel elements. Default is 0.
Returns:

Y (ndarray of shape (n_ex, out_rows, out_cols, n_out)) – The deconvolution of the (padded) input volume X with W using stride s and dilation d.

conv2D_naive

numpy_ml.neural_nets.utils.conv2D_naive(X, W, stride, pad, dilation=0)[source]

A slow but more straightforward implementation of a 2D “convolution” (technically, cross-correlation) of input X with a collection of kernels W.

Notes

This implementation uses for loops and direct indexing to perform the convolution. As a result, it is slower than the vectorized conv2D() function that relies on the col2im() and im2col() transformations.

Parameters:
  • X (ndarray of shape (n_ex, in_rows, in_cols, in_ch)) – Input volume.
  • W (ndarray of shape (kernel_rows, kernel_cols, in_ch, out_ch)) – The volume of convolution weights/kernels.
  • stride (int) – The stride of each convolution kernel.
  • pad (tuple, int, or 'same') – The padding amount. If ‘same’, add padding to ensure that a 2D convolution with the given kernel and stride stride produces an output volume with the same dimensions as the input. If 2-tuple, specifies the number of padding rows and columns to add on both sides of the rows/columns in X. If 4-tuple, specifies the number of rows/columns to add to the top, bottom, left, and right of the input volume.
  • dilation (int) – Number of pixels inserted between kernel elements. Default is 0.
Returns:

Z (ndarray of shape (n_ex, out_rows, out_cols, out_ch)) – The convolution of X with W.
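
A hedged cross-check against the vectorized implementation (array contents are illustrative):

    import numpy as np
    from numpy_ml.neural_nets.utils import conv2D, conv2D_naive

    X = np.random.rand(2, 10, 10, 3)                 # (n_ex, in_rows, in_cols, in_ch)
    W = np.random.rand(3, 3, 3, 4)                   # (kernel_rows, kernel_cols, in_ch, out_ch)

    Z_fast = conv2D(X, W, stride=1, pad="same")
    Z_slow = conv2D_naive(X, W, stride=1, pad="same")
    # the two routines compute the same quantity and should agree up to
    # floating-point error
    assert np.allclose(Z_fast, Z_slow)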

he_uniform

numpy_ml.neural_nets.utils.he_uniform(weight_shape)[source]

Initialize network weights W using the He uniform initialization strategy.

Notes

The He uniform initialization strategy initializes the weights in W using draws from Uniform(-b, b), where

\[b = \sqrt{\frac{6}{\text{fan_in}}}\]

Developed for deep networks with ReLU nonlinearities.

Parameters:weight_shape (tuple) – The dimensions of the weight matrix/volume.
Returns:W (ndarray of shape weight_shape) – The initialized weights.
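
A minimal sketch based on the signature above (shapes are illustrative):

    from numpy_ml.neural_nets.utils import he_uniform

    W_fc = he_uniform((512, 256))                    # dense weights: (in_ch, out_ch)
    W_conv = he_uniform((3, 3, 64, 128))             # conv kernel: (fr, fc, in_ch, out_ch)
    # entries are drawn from Uniform(-b, b) with b = sqrt(6 / fan_in)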

he_normal

numpy_ml.neural_nets.utils.he_normal(weight_shape)[source]

Initialize network weights W using the He normal initialization strategy.

Notes

The He normal initialization strategy initializes the weights in W using draws from TruncatedNormal(0, b) where the variance b is

\[b = \frac{2}{\text{fan_in}}\]

He normal initialization was originally developed for deep networks with ReLU nonlinearities.

Parameters:weight_shape (tuple) – The dimensions of the weight matrix/volume.
Returns:W (ndarray of shape weight_shape) – The initialized weights.

glorot_uniform

numpy_ml.neural_nets.utils.glorot_uniform(weight_shape, gain=1.0)[source]

Initialize network weights W using the Glorot uniform initialization strategy.

Notes

The Glorot uniform initialization strategy initializes weights using draws from Uniform(-b, b) where:

\[b = \text{gain} \sqrt{\frac{6}{\text{fan_in} + \text{fan_out}}}\]

The motivation for Glorot uniform initialization is to choose weights so that the variance of the layer outputs is approximately equal to the variance of its inputs.

This initialization strategy was primarily developed for deep networks with tanh and logistic sigmoid nonlinearities.

Parameters:
  • weight_shape (tuple) – The dimensions of the weight matrix/volume.
  • gain (float) – A multiplicative scale factor applied to the bound b. Default is 1.0.
Returns:W (ndarray of shape weight_shape) – The initialized weights.
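
A minimal sketch based on the signature above (the shape is illustrative):

    from numpy_ml.neural_nets.utils import glorot_uniform

    W = glorot_uniform((512, 256), gain=1.0)         # dense weights: (in_ch, out_ch)
    # entries are drawn from Uniform(-b, b) with b = gain * sqrt(6 / (fan_in + fan_out));
    # here fan_in = 512 and fan_out = 256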

glorot_normal

numpy_ml.neural_nets.utils.glorot_normal(weight_shape, gain=1.0)[source]

Initialize network weights W using the Glorot normal initialization strategy.

Notes

The Glorot normal initialization strategy initializes the weights in W using draws from TruncatedNormal(0, b), where the variance b is

\[b = \frac{2 \text{gain}^2}{\text{fan_in} + \text{fan_out}}\]

The motivation for Glorot normal initialization is to choose weights so that the variance of the layer outputs is approximately equal to the variance of its inputs.

This initialization strategy was primarily developed for deep networks with tanh and logistic sigmoid nonlinearities.

Parameters:
  • weight_shape (tuple) – The dimensions of the weight matrix/volume.
  • gain (float) – A multiplicative scale factor applied to the variance b. Default is 1.0.
Returns:W (ndarray of shape weight_shape) – The initialized weights.

truncated_normal

numpy_ml.neural_nets.utils.truncated_normal(mean, std, out_shape)[source]

Generate draws from a truncated normal distribution via rejection sampling.

Notes

The rejection sampling procedure draws samples from a normal distribution with mean mean and standard deviation std, and resamples any values lying more than two standard deviations from mean.

Parameters:
  • mean (float or array_like of floats) – The mean/center of the distribution
  • std (float or array_like of floats) – Standard deviation (spread or “width”) of the distribution.
  • out_shape (int or tuple of ints) – Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn.
Returns:

samples (ndarray of shape out_shape) – Samples from the truncated normal distribution parameterized by mean and std.
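
A minimal sketch based on the signature above (the output shape is illustrative):

    import numpy as np
    from numpy_ml.neural_nets.utils import truncated_normal

    samples = truncated_normal(mean=0.0, std=1.0, out_shape=(1000,))
    # per the Notes above, values farther than two standard deviations from the
    # mean are resampled, so every draw should satisfy |x| <= 2
    assert np.all(np.abs(samples) <= 2.0)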