darfix.decomposition.ipca.IPCA
- class darfix.decomposition.ipca.IPCA(data, chunksize, num_components=None, whiten=False, indices=None, rowvar=True)
Bases:
Base
Computes PCA in chunks, using the incremental principal component analysis implementation in scikit-learn. To compute W, the model is partially fitted on the rows one chunk at a time (a reduced number of images per fit). To compute H, dimensionality reduction is then applied to every chunk, and the projections are stacked horizontally into H.
- Parameters:
data (array_like) – array of shape (n_samples, n_features). See rowvar.
chunksize (int) – Size of each group of samples that PCA is applied to. PCA is fit with arrays of shape (chunksize, n_features), where n_features is the number of features per sample. Depending on rowvar, the chunks are taken from the rows or from the columns.
num_components (Union[None,int], optional) – Number of components to keep, defaults to None.
whiten (bool, optional) – If True, whitening is applied to the components, defaults to False.
indices (Union[None,array_like], optional) – The indices of the samples to use, defaults to None. If rowvar is False, corresponds to the indices of the features to use.
rowvar (bool, optional) – If rowvar is True (default), then each row represents a sample, with features in the columns. Otherwise, the relationship is transposed: each column represents a sample, while the rows contain features.
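The two-pass scheme described above (partial fits over chunks, then a per-chunk projection) can be sketched directly with scikit-learn's IncrementalPCA. This is a minimal illustration, not the darfix implementation: the helper name chunked_pca is hypothetical, the data is random, and the projections are stacked by rows here, which may differ from the orientation darfix uses for H.

```python
import numpy as np
from sklearn.decomposition import IncrementalPCA


def chunked_pca(data, chunksize, num_components):
    """Hypothetical helper: fit IncrementalPCA chunk by chunk, then project every chunk."""
    ipca = IncrementalPCA(n_components=num_components)
    starts = range(0, data.shape[0], chunksize)
    # First pass: update the components with one chunk of rows at a time.
    for start in starts:
        ipca.partial_fit(data[start:start + chunksize])
    # Second pass: reduce each chunk and stack the projections (by rows, an assumption).
    projections = [ipca.transform(data[start:start + chunksize]) for start in starts]
    return ipca.components_, np.vstack(projections)


rng = np.random.default_rng(0)
data = rng.normal(size=(100, 50))  # (n_samples, n_features), i.e. the rowvar=True layout
components, projected = chunked_pca(data, chunksize=20, num_components=3)
print(components.shape, projected.shape)  # (3, 50) (100, 3)
```

Each chunk passed to partial_fit must contain at least num_components samples, so chunksize should not be smaller than the number of components requested.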
- property data
- fit_transform(max_iter=1, error_step=None, W=None, H=None)
Fit to data, then transform it.
- Parameters:
max_iter (int, optional) – Maximum number of iterations, defaults to 1
error_step (Union[None,int], optional) – If None, the error is not computed; otherwise the error is computed every error_step iterations. Defaults to None.
compute_w (bool, optional) – When False, W is not computed, defaults to True
compute_h (bool, optional) – When False, H is not computed, defaults to True
- frobenius_norm(chunks=200)
Frobenius norm (||data - WH||) of the difference between a data matrix and its low-rank approximation WH. Minimizing the Frobenius norm is the most common optimization criterion for matrix factorization methods.
- Returns:
Frobenius norm: F = ||data - WH||
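As an illustration of how such a norm can be accumulated without holding the full residual in memory, the sketch below sums squared residuals block by block and takes the square root at the end. The function name, the interpretation of chunks as the number of rows per block, and the shapes of W and H are assumptions made for this example and are not taken from the darfix source.

```python
import numpy as np


def chunked_frobenius_norm(data, W, H, chunks=200):
    """Hypothetical helper: ||data - W @ H|| computed over blocks of rows."""
    squared = 0.0
    for start in range(0, data.shape[0], chunks):
        stop = start + chunks
        # Residual for this block of rows only.
        residual = data[start:stop] - W[start:stop] @ H
        squared += np.sum(residual ** 2)
    return np.sqrt(squared)


rng = np.random.default_rng(0)
W = rng.normal(size=(1000, 4))   # assumed shape (n_samples, n_components)
H = rng.normal(size=(4, 30))     # assumed shape (n_components, n_features)
data = W @ H + 0.01 * rng.normal(size=(1000, 30))
norm = chunked_frobenius_norm(data, W, H)
```

The result agrees with np.linalg.norm(data - W @ H), because the squared Frobenius norm is a plain sum over entries and can be split across row blocks.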
- property indices
- property num_components
- property num_features
- property num_samples
- property singular_values
The singular values corresponding to each of the selected components.
- Returns:
array, shape (n_components,)
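For orientation, the singular values of an exact PCA are the leading singular values of the mean-centered data matrix, and their squares divided by n_samples - 1 give the variance explained by each component. The NumPy sketch below shows that relationship on random data; an incremental fit only approximates these values, and the variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=(200, 10))  # (n_samples, n_features)
num_components = 3

# Center the features, then keep the leading singular values.
centered = data - data.mean(axis=0)
singular_values = np.linalg.svd(centered, compute_uv=False)[:num_components]

# Variance explained by each selected component.
explained_variance = singular_values ** 2 / (data.shape[0] - 1)
print(singular_values.shape)  # (3,)
```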
- squared_frobenius_norm(chunks=200)
Squared Frobenius norm (||data - WH||^2) of the difference between a data matrix and its low-rank approximation WH. Minimizing the Frobenius norm is the most common optimization criterion for matrix factorization methods.
- Returns:
Squared Frobenius norm: F^2 = ||data - WH||^2