HeavyEdge-Classify documentation
HeavyEdge-Classify is a Python package for probabilistic classification of coating edge profiles.
Note
This package provides only the model architecture and command line interfaces. It does not include any pre-trained model or training data.
Usage
HeavyEdge-Classify is designed to be used either as a command line program or as a Python module.
Command line
The command line interface provides predefined commands for training and prediction. Invoke them with:
heavyedge classify-train <args>
heavyedge classify-predict <args>
Refer to the help message of each command for its arguments.
Python module
The Python module heavyedge_classify provides functions and classes for use at Python runtime.
Refer to the Runtime API section for the high-level interface.
Module reference
This section is the reference for the heavyedge_classify Python module.
Runtime API
High-level Python runtime interface.
- heavyedge_classify.api.classify_train(profiles, labels, cv=5, calibration='sigmoid', normalize=True, n_jobs=None, random_state=0, logger=<function <lambda>>, n_splits=None)
Train a classification model.
- Parameters:
- profiles : heavyedge.ProfileData
Open h5 file of profiles.
- labels : np.ndarray
Label array. The order of labels should match the order of profiles.
- cv : int, iterable, or cross-validation generator, default=5
Cross-validation strategy. If an integer is passed, it is the number of folds for stratified k-fold CV.
- calibration : {"sigmoid", "isotonic", "temperature", "sigmoid_ovo", "isotonic_ovo"}, default="sigmoid"
Calibration method for the classifier.
- normalize : bool, default=True
Whether to normalize profiles by area under curve. Set this to False if profiles are already normalized.
- n_jobs : int, default=None
Number of jobs to run in parallel.
- random_state : int, default=0
Random seed for reproducibility.
- logger : callable, optional
Logger function which accepts a progress message string.
- n_splits : int, optional
Number of splits for cross-validation. If passed, overrides cv.
Deprecated since version 1.4.0: The n_splits parameter is deprecated and will be removed in a future version. Use cv instead.
- Returns:
- model
Trained model object.
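The normalize option above scales each profile by its area under the curve. The sketch below illustrates one way such a normalization could work, assuming uniformly spaced samples and the trapezoidal rule. The helper names area_under_curve and normalize_by_area are hypothetical and are not part of heavyedge_classify; the package's own implementation may differ. Plain Python lists are used only to keep the sketch self-contained.

```python
def area_under_curve(profile, dx=1.0):
    """Trapezoidal-rule area of a 1D profile sampled at uniform spacing dx."""
    return sum((a + b) * dx / 2.0 for a, b in zip(profile, profile[1:]))


def normalize_by_area(profile, dx=1.0):
    """Scale a profile so that its area under the curve equals 1.

    Hypothetical helper for illustration, not the package's implementation.
    """
    area = area_under_curve(profile, dx)
    if area == 0:
        raise ValueError("profile has zero area and cannot be normalized")
    return [value / area for value in profile]


profile = [0.0, 2.0, 4.0, 2.0, 0.0]  # made-up profile heights
normalized = normalize_by_area(profile)

print(area_under_curve(profile))     # 8.0
print(area_under_curve(normalized))  # 1.0
```

Profiles that already have unit area pass through such a step unchanged, which is consistent with the advice to set normalize=False when profiles are already normalized.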
Examples
>>> from heavyedge import ProfileData
>>> from heavyedge_classify.samples import get_sample_path
>>> from heavyedge_classify.api import classify_train
>>> import numpy as np
>>> profiles = ProfileData(get_sample_path("Profiles.h5"))
>>> labels = np.load(get_sample_path("labels.npy"))
>>> classify_train(profiles, labels)
CalibratedClassifierCV(...)
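Since the package ships no pre-trained model, a trained model usually needs to be stored for later prediction. The sketch below shows a round trip with the standard pickle module, assuming the object returned by classify_train is picklable (the classify_predict example, which loads a sample model.pkl with pickle, suggests so). A plain dictionary stands in for the trained model, and the file name is hypothetical.

```python
import pickle
import tempfile
from pathlib import Path

# Stand-in for the object returned by classify_train(); any picklable
# Python object is handled the same way.
trained_model = {"kind": "placeholder model", "classes": [0, 1, 2]}

with tempfile.TemporaryDirectory() as tmpdir:
    model_path = Path(tmpdir) / "model.pkl"  # hypothetical file name

    # Save the trained model to disk.
    with open(model_path, "wb") as f:
        pickle.dump(trained_model, f)

    # Load it back before prediction.
    with open(model_path, "rb") as f:
        restored_model = pickle.load(f)

print(restored_model == trained_model)  # True
```

Unpickling can execute arbitrary code, so only load model files from sources you trust.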
- heavyedge_classify.api.classify_predict(model, profiles, normalize=True, batch_size=None, logger=<function <lambda>>)
Predict probabilistic labels of profiles using a trained model.
- Parameters:
- model
Trained model object.
- profiles : heavyedge.ProfileData
Open h5 file of profiles.
- normalize : bool, default=True
Whether to normalize profiles by area under curve. Set this to False if profiles are already normalized.
- batch_size : int, optional
Batch size to load data. If not passed, all data are loaded at once.
- logger : callable, optional
Logger function which accepts a progress message string.
- Yields:
- predicted_labels : np.ndarray
Predicted probabilistic label array.
Examples
>>> import pickle
>>> from heavyedge import ProfileData
>>> from heavyedge_classify.samples import get_sample_path
>>> from heavyedge_classify.api import classify_predict
>>> with open(get_sample_path("model.pkl"), "rb") as f:
...     model = pickle.load(f)
>>> profiles = ProfileData(get_sample_path("Profiles.h5"))
>>> [lab.shape for lab in classify_predict(model, profiles, batch_size=50)]
[(50, 3), (25, 3)]
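In the example above, batch_size=50 yields arrays of 50 and 25 rows, which suggests that the 75 sample profiles are read in consecutive chunks. The sketch below mimics only that batching and logging pattern. The generator iter_batches is a hypothetical stand-in that performs no classification, and the progress message format is an assumption, not the package's actual output.

```python
def iter_batches(rows, batch_size=None, logger=lambda msg: None):
    """Yield consecutive slices of rows, reporting progress through logger.

    Hypothetical stand-in for the batching behaviour described above.
    """
    total = len(rows)
    if batch_size is None:
        batch_size = total or 1  # no batch size: everything in one batch
    for start in range(0, total, batch_size):
        batch = rows[start:start + batch_size]
        logger(f"{start + len(batch)}/{total}")  # assumed message format
        yield batch


messages = []
rows = list(range(75))  # placeholder for 75 profiles

batch_lengths = [
    len(batch)
    for batch in iter_batches(rows, batch_size=50, logger=messages.append)
]

print(batch_lengths)  # [50, 25]
print(messages)       # ['50/75', '75/75']
```

Any callable that accepts a single string works as the logger here, for example print or the append method of a list. Because results arrive batch by batch, memory use stays bounded by the batch size as long as the caller does not collect every batch at once.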
Low-level API
MiniRocket-based probabilistic classifier of 1D signals.
- heavyedge_classify.model.minirocket_classifier(cv=5, calibration='sigmoid', n_jobs=None, verbose=False, random_state=0, n_splits=None)
MiniRocket-based probabilistic classifier of 1D signals.
- Parameters:
- cv : int, iterable, or cross-validation generator, default=5
Cross-validation strategy. If an integer is passed, it is the number of folds for stratified k-fold CV.
- calibration : {"sigmoid", "isotonic", "temperature", "sigmoid_ovo", "isotonic_ovo"}, default="sigmoid"
Calibration method for the classifier.
- n_jobs : int, default=None
Number of jobs to run in parallel.
- verbose : bool, default=False
Whether to print pipeline steps.
- random_state : int, default=0
Random seed for reproducibility.
- n_splits : int, optional
Number of splits for cross-validation. If passed, overrides cv.
Deprecated since version 1.4.0: The n_splits parameter is deprecated and will be removed in a future version. Use cv instead.
- Returns:
- model
MiniRocket-based probabilistic classifier.
Examples
>>> from heavyedge import ProfileData
>>> from heavyedge_classify.samples import get_sample_path
>>> from heavyedge_classify.model import minirocket_classifier
>>> import numpy as np
>>> model = minirocket_classifier(cv=5, random_state=42)
>>> X, _, _ = ProfileData(get_sample_path("Profiles.h5"))[:]
>>> y = np.load(get_sample_path("labels.npy"))
>>> model.fit(X[:5], y[:5])
CalibratedClassifierCV(...)
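The deprecated n_splits parameter is documented as overriding cv when passed. The sketch below shows how such an override with a deprecation warning is commonly written using the standard warnings module. The helper resolve_cv is hypothetical and is not taken from the package source; the warning category is likewise an assumption.

```python
import warnings


def resolve_cv(cv=5, n_splits=None):
    """Return the effective cross-validation setting.

    Hypothetical helper: n_splits, when given, overrides cv and
    triggers a DeprecationWarning.
    """
    if n_splits is not None:
        warnings.warn(
            "n_splits is deprecated; use cv instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return n_splits
    return cv


# Record warnings so the deprecated call can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    effective = resolve_cv(cv=5, n_splits=3)

print(effective)                    # 3
print(caught[0].category.__name__)  # DeprecationWarning
```

Under this reading, code that passes n_splits=3 can migrate by passing cv=3 instead, assuming n_splits maps directly onto an integer cv.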