VanillaALS

class numpy_ml.factorization.VanillaALS(K, alpha=1, max_iter=200, tol=0.0001)[source]

Approximately factor a real-valued matrix using regularized alternating least-squares (ALS).

Notes

The regularized ALS minimization problem is

\[\min_{\mathbf{W}, \mathbf{H}} ||\mathbf{X} - \mathbf{WH}||^2 + \alpha \left( ||\mathbf{W}||^2 + ||\mathbf{H}||^2 \right)\]

where \(||\cdot||\) denotes the Frobenius norm, X is the \(N \times M\) data matrix, \(\mathbf{W}\) and \(\mathbf{H}\) are learned factor matrices with dimensions \(N \times K\) and \(K \times M\), respectively, and \(\alpha\) is a user-defined regularization weight.

ALS proceeds by alternating between two steps: fixing W and solving for H, then fixing H and solving for W. Vanilla ALS has no convergence guarantees, and the objective function is prone to oscillation across updates, particularly for dense input matrices [1].
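With one factor fixed, each subproblem of the regularized objective is a ridge regression with an exact solution. A minimal numpy sketch of such an alternating scheme (illustrative only, not the library's internal implementation; the `als_factor` name and its arguments are assumptions):

```python
import numpy as np

def als_factor(X, K, alpha=1.0, max_iter=200, tol=1e-4, seed=0):
    """Regularized ALS sketch: alternate exact ridge solves for H and W."""
    rng = np.random.default_rng(seed)
    N, M = X.shape
    W = rng.standard_normal((N, K))
    H = rng.standard_normal((K, M))
    I = np.eye(K)
    prev_loss = np.inf
    for _ in range(max_iter):
        # Fix W, solve min_H ||X - WH||^2 + alpha * ||H||^2
        H = np.linalg.solve(W.T @ W + alpha * I, W.T @ X)
        # Fix H, solve min_W ||X - WH||^2 + alpha * ||W||^2
        W = np.linalg.solve(H @ H.T + alpha * I, H @ X.T).T
        loss = np.linalg.norm(X - W @ H) ** 2
        if abs(prev_loss - loss) < tol:  # stop when the loss stabilizes
            break
        prev_loss = loss
    return W, H
```

Because the objective can oscillate, the loss difference used in the stopping check above is taken in absolute value rather than assumed to be monotone.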

References

[1] Gillis, N. (2014). The why and how of nonnegative matrix factorization. Regularization, optimization, kernels, and support vector machines, 12(257), 257-291.
Parameters:
  • K (int) – The number of latent factors to include in the factor matrices W and H.
  • alpha (float) – The L2 regularization weight on the factor matrices. Larger values result in more aggressive regularization. Default is 1.
  • max_iter (int) – The maximum number of iterations to run before stopping. Default is 200.
  • tol (float) – The tolerance for the stopping condition. Default is 1e-4.
parameters[source]

Return a dictionary of the current model parameters

hyperparameters[source]

Return a dictionary of the model hyperparameters

fit(X, W=None, H=None, n_initializations=10, verbose=False)[source]

Factor a data matrix into two low rank factors via ALS.

Parameters:
  • X (numpy array of shape (N, M)) – The data matrix to factor.
  • W (numpy array of shape (N, K) or None) – An initial value for the W factor matrix. If None, initialize W randomly. Default is None.
  • H (numpy array of shape (K, M) or None) – An initial value for the H factor matrix. If None, initialize H randomly. Default is None.
  • n_initializations (int) – Number of re-initializations of the algorithm to perform before taking the answer with the lowest reconstruction error. This value is ignored and set to 1 if both W and H are not None. Default is 10.
  • verbose (bool) – Whether to print the loss at each iteration. Default is False.

NMF

class numpy_ml.factorization.NMF(K, max_iter=200, tol=0.0001)[source]

Nonnegative matrix factorization (NMF) performed using fast hierarchical alternating least squares (HALS) [*].

Notes

The NMF minimization problem is

\[\min_{\mathbf{W}, \mathbf{H}} ||\mathbf{X} - \mathbf{WH}||^2 \ \ \ \ \text{subject to } \mathbf{W}, \mathbf{H} \geq 0\]

where \(||\cdot||\) denotes the Frobenius norm, and the notation \(\mathbf{A} \geq 0\) indicates that each element of A is greater than or equal to 0. In the above equation, X is the \(N \times M\) data matrix, \(\mathbf{W}\) and \(\mathbf{H}\) are learned factor matrices with dimensions \(N \times K\) and \(K \times M\), respectively.

As with other ALS-based approaches, there is no guarantee that NMF will converge to a stationary point, let alone a global minimum. As a result, it is generally good practice to run the algorithm multiple times with different initializations, keeping the outcome that achieves the lowest reconstruction error.
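That restart practice can be sketched as a simple loop. Here `factorize` stands in for any single run of the algorithm (a hypothetical callable, not part of the library's API):

```python
import numpy as np

def best_of_n(X, factorize, n_initializations=10):
    """Run `factorize(X, seed)` several times; keep the lowest-error factors."""
    best, best_err = None, np.inf
    for seed in range(n_initializations):
        W, H = factorize(X, seed)
        err = np.linalg.norm(X - W @ H)  # Frobenius reconstruction error
        if err < best_err:
            best, best_err = (W, H), err
    return best
```

The `fit` method below performs the same selection internally via its `n_initializations` parameter.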

References

[*] Cichocki, A., & Phan, A. (2009). Fast local algorithms for large scale nonnegative matrix and tensor factorizations. IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, 92(3), 708-721.
Parameters:
  • K (int) – The number of latent factors to include in the factor matrices W and H.
  • max_iter (int) – The maximum number of iterations to run before stopping. Default is 200.
  • tol (float) – The tolerance for the stopping condition. Default is 1e-4.
parameters[source]

Return a dictionary of the current model parameters

hyperparameters[source]

Return a dictionary of the model hyperparameters

fit(X, W=None, H=None, n_initializations=10, verbose=False)[source]

Factor a data matrix into two nonnegative low rank factor matrices via fast HALS.

Notes

This method implements Algorithm 2 from [†]. In contrast to vanilla ALS, HALS proceeds by minimizing a set of local cost functions with the same global minima. Each cost function is defined on a “residue” of the factor matrices W and H:

\[\mathbf{X}^{(j)} := \mathbf{X} - \mathbf{WH} + \mathbf{w}_j \mathbf{h}_j^\top\]

where \(\mathbf{X}^{(j)}\) is the \(j^{th}\) residue, X is the input data matrix, \(\mathbf{w}_j\) is the \(j^{th}\) column of the current factor matrix W, and \(\mathbf{h}_j^\top\) is the \(j^{th}\) row of the current factor matrix H. HALS proceeds by minimizing the cost for each residue, first with respect to \(\mathbf{w}_j\), and then with respect to \(\mathbf{h}_j\). In either case, the cost for residue j, \(\mathcal{L}^{(j)}\), is simply:

\[\mathcal{L}^{(j)} := || \mathbf{X}^{(j)} - \mathbf{w}_j \mathbf{h}_j^\top ||\]

where \(||\cdot||\) denotes the Frobenius norm. For NMF, minimization is performed under the constraint that all elements of both W and H are nonnegative.

References

[†] Cichocki, A., & Phan, A. (2009). Fast local algorithms for large scale nonnegative matrix and tensor factorizations. IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, 92(3), 708-721.
Parameters:
  • X (numpy array of shape (N, M)) – The data matrix to factor.
  • W (numpy array of shape (N, K) or None) – An initial value for the W factor matrix. If None, initialize W using vanilla ALS. Default is None.
  • H (numpy array of shape (K, M) or None) – An initial value for the H factor matrix. If None, initialize H using vanilla ALS. Default is None.
  • n_initializations (int) – Number of re-initializations of the algorithm to perform before taking the answer with the lowest reconstruction error. This value is ignored and set to 1 if both W and H are not None. Default is 10.
  • verbose (bool) – Whether to print the loss at each iteration. Default is False.