sample.utils.learn module
Machine learning utilities
- class sample.utils.learn.OptionalStorage(save: bool = False)
Bases: BaseEstimator
Storage that can be deactivated. Its main use is to optionally save intermediate values in machine learning processes
- Parameters:
save (bool) – Determines if the storage is actually operational
Example
>>> # Let's define our own regressor
>>> from sample.utils.learn import *
>>> from sklearn.base import BaseEstimator
>>> from sklearn.utils import check_X_y
>>> import numpy as np
>>> class CoolRegression(BaseEstimator):
...     def __init__(self, storage=None, **kwargs):
...         # This object will memorize some intermediate values
...         self.storage = storage
...         self.set_params(**kwargs)
...     @default_property
...     def storage(self):
...         return OptionalStorage()
...     def fit(self, X: np.ndarray, y: np.ndarray):
...         # Delete state from any previous runs
...         self.storage.reset()
...         check_X_y(X, y)
...         xcorr = X.T @ X
...         # Save the X.T @ X square matrix
...         self.storage.append("xcorr", xcorr)
...         pinv = np.linalg.pinv(xcorr)
...         # Save the pseudo-inverse matrix
...         self.storage.append("pinv", pinv)
...         self.coeffs_ = pinv @ X.T @ y
...         return self
>>> # Let's simulate some data points for testing
>>> n = 32
>>> x = np.random.randn(n, 2) * 24
>>> y = x[:, 0] * 4 - x[:, 1] * 0.5 + np.random.randn(n) * 0.5
>>> # By default, the storage is inactive
>>> cr = CoolRegression().fit(x, y)
>>> try:
...     cr.storage.cache_.keys()
... except AttributeError as e:
...     print(e)
'OptionalStorage' object has no attribute 'cache_'
>>> # If save==True, then the storage is active
>>> cr = CoolRegression(storage__save=True).fit(x, y)
>>> list(cr.storage.cache_.keys())
['xcorr', 'pinv']
>>> # And the matrices have the expected shape
>>> cr.storage["xcorr"][0].shape
(2, 2)
>>> cr.storage["pinv"][0].shape
(2, 2)
- append(key: str, value: T, index: Optional[int] = None) → T
Append variable in cache if self.save is True
- Parameters:
key (str) – Data name
value – Data
index (int) – Optional. The value will be added to a list, at the specified index
- Returns:
The input value
- Return type:
object
- get_state(deepcopy: bool = True)
Retrieve the storage state
- Parameters:
deepcopy (bool) – If True, make a deep copy of the state
- Returns:
The state
- Return type:
dict
- reset()
Reset memory
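The behaviour described above (values are kept only when save is True, and append always returns its input) can be sketched with a small stand-in class. This is not the library's implementation: TinyStorage is a hypothetical name, and three details are assumptions made for illustration only: the cache_ attribute is created lazily on the first stored value, index is handled with list.insert, and the state returned by get_state is simply the cache dictionary.

```python
import copy
from typing import Any, Dict, List, Optional


class TinyStorage:
    """Hypothetical stand-in for an optional storage (not the library class)."""

    def __init__(self, save: bool = False):
        self.save = save

    def append(self, key: str, value: Any, index: Optional[int] = None) -> Any:
        # Store only when active; always hand the value back to the caller
        if self.save:
            if not hasattr(self, "cache_"):
                self.cache_ = {}  # assumption: created on first stored value
            values = self.cache_.setdefault(key, [])
            if index is None:
                values.append(value)
            else:
                values.insert(index, value)  # assumption: insert semantics
        return value

    def __getitem__(self, key: str) -> List[Any]:
        return self.cache_[key]

    def get_state(self, deepcopy: bool = True) -> Dict[str, List[Any]]:
        # Assumption: the state is just the cache dictionary
        state = getattr(self, "cache_", {})
        return copy.deepcopy(state) if deepcopy else state

    def reset(self) -> None:
        # Forget everything stored so far
        if hasattr(self, "cache_"):
            del self.cache_


inactive = TinyStorage()
active = TinyStorage(save=True)
for storage in (inactive, active):
    # append returns its input, so it can wrap an expression inline
    doubled = storage.append("x", 21) * 2
print(hasattr(inactive, "cache_"), active["x"])  # False [21]
```

Because append returns its argument either way, the calling code does not need to branch on whether the storage is active.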
- sample.utils.learn.default_property(**kwargs) → Callable[[Callable[[Any], Any]], _DefaultProperty]
- sample.utils.learn.default_property(default_fn: Callable[[Any], Any], **kwargs) → _DefaultProperty
Default property attribute. Can be used as a decorator. It’s meant to be used to avoid unexpected sharing of instances between different objects when using default argument values. Unexpected sharing of instances can be a problem when those instances not only encapsulate initialization arguments but also store some state, such as the parameters learned during training.
Example
>>> # Let's define a class for member objects
>>> class Member:
...     def __init__(self, field=0):
...         self.field = field
>>> # And a class that correctly uses the default_property decorator
>>> from sample.utils.learn import default_property
>>> class Good:
...     def __init__(self, member=None, field=0):
...         self.member = member
...         self.field = field
...     @default_property
...     def member(self):
...         return Member()
>>> # Instances created with default arguments do not share
>>> # the same instance of the member object
>>> Good().member is not None
True
>>> Good().member is not Good().member
True
>>> # Unless we explicitly provide a value
>>> x = Member()
>>> Good(x).member is Good(x).member
True
>>> # A class with default values will share
>>> # the same instance of the member object
>>> class Bad:
...     def __init__(self, member=Member(), field=0):
...         self.member = member
...         self.field = field
>>> Bad().member is Bad().member
True
>>> # Note that the member object is created on set, and not on get
>>> g = Good()
>>> g.member is g.member
True
>>> # You can also change the condition that triggers the default
>>> # (by default, it is triggered when the input is None)
>>> # E.g.: this function sets the default when the input value is not an
>>> # instance of the Member class
>>> Good.member.defaulter(lambda _, v: not isinstance(v, Member))
>>> isinstance(Good(g).member, Member)
True
>>> Good(g).member is not g
True
>>> # While allowing assignments to get through for valid inputs
>>> Good(x).member is x
True
>>> # Default values can be deactivated using the constant defaulter
>>> Good.member.defaulter()
>>> Good().member is None
True
>>> # Default properties can be defined dynamically
>>> # Some callables such as lambdas and partials will
>>> # require an additional 'name' argument
>>> try:
...     Good.member = default_property(lambda _: {})
... except ValueError as e:
...     print(e)
default_property(): name '<lambda>' is forbidden because of possible conflicts with other lambda functions. Please, specify a different 'name'
>>> Good.member = default_property(name="member")(lambda _: {})
>>> isinstance(Good().member, dict) and not Good().member
True
>>> Good().member is not Good().member
True
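For readers curious how a "default on set" property could work, the following is a minimal sketch built on Python's descriptor protocol. It is not the code behind default_property: lazy_default and Model are hypothetical names, and storing the value under a private attribute named after the decorated function is an assumption made for illustration.

```python
class lazy_default:
    """Hypothetical data descriptor: replaces None with a fresh default on set."""

    def __init__(self, default_fn):
        self.default_fn = default_fn
        # Assumption: keep the value in a private attribute on the instance
        self.attr = "_" + default_fn.__name__

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self  # accessed on the class, return the descriptor itself
        return getattr(obj, self.attr)

    def __set__(self, obj, value):
        # The default is built here, on set, once per assignment
        if value is None:
            value = self.default_fn(obj)
        setattr(obj, self.attr, value)


class Model:
    def __init__(self, weights=None):
        self.weights = weights  # goes through lazy_default.__set__

    @lazy_default
    def weights(self):
        return []


print(Model().weights is Model().weights)  # False
```

Since the default is created inside `__set__`, every instance built with `weights=None` gets its own list, while an explicitly passed object is stored unchanged.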