sample.utils.learn module

Machine learning utilities

class sample.utils.learn.OptionalStorage(save: bool = False)

Bases: BaseEstimator

Storage that can be deactivated. Its main use is to optionally save intermediate values in machine learning processes.

Parameters:

save (bool) – Determines if the storage is actually operational

Example

>>> # Let's define our own regressor
>>> from sample.utils.learn import *
>>> from sklearn.base import BaseEstimator
>>> from sklearn.utils import check_X_y
>>> import numpy as np
>>> class CoolRegression(BaseEstimator):
...   def __init__(self, storage=None, **kwargs):
...     # This object will memorize some intermediate values
...     self.storage = storage
...     self.set_params(**kwargs)
...   @default_property
...   def storage(self):
...     return OptionalStorage()
...   def fit(self, X: np.ndarray, y: np.ndarray):
...     # Delete state from any previous runs
...     self.storage.reset()
...     check_X_y(X, y)
...     xcorr = X.T @ X
...     # Save the X.T @ X square matrix
...     self.storage.append("xcorr", xcorr)
...     pinv = np.linalg.pinv(xcorr)
...     # Save the pseudo-inverse matrix
...     self.storage.append("pinv", pinv)
...     self.coeffs_ = pinv @ X.T @ y
...     return self
>>> # Let's simulate some data points for testing
>>> n = 32
>>> x = np.random.randn(n, 2) * 24
>>> y = x[:, 0] * 4 - x[:, 1] * 0.5 + np.random.randn(n) * 0.5
>>> # By default, the storage is inactive
>>> cr = CoolRegression().fit(x, y)
>>> try:
...   cr.storage.cache_.keys()
... except AttributeError as e:
...   print(e)
'OptionalStorage' object has no attribute 'cache_'
>>> # If save==True, then the storage is active
>>> cr = CoolRegression(storage__save=True).fit(x, y)
>>> list(cr.storage.cache_.keys())
['xcorr', 'pinv']
>>> # And the matrices have the expected shape
>>> cr.storage["xcorr"][0].shape
(2, 2)
>>> cr.storage["pinv"][0].shape
(2, 2)
append(key: str, value: T, index: Optional[int] = None) → T

Append a value to the cache if self.save is True

Parameters:
  • key (str) – Data name

  • value – Data

  • index (int) – Optional. The value will be added to a list, at the specified index

Returns:

The input value

Return type:

object

get_state(deepcopy: bool = True)

Retrieve the storage state

Parameters:

deepcopy (bool) – If True, make a deep copy of the state

Returns:

The state

Return type:

dict

reset()

Reset memory
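
The behaviour documented for append, get_state and reset can be pictured with a small stand-in class. This is only a sketch and not the library's implementation: the name TinyStorage is hypothetical, and the assumptions that index means list insertion and that reset deletes the cache_ attribute are ours, inferred from the descriptions and the example above.

```python
import copy
from typing import Any, Dict, List, Optional


class TinyStorage:
    """Hypothetical stand-in that mimics the documented behaviour."""

    def __init__(self, save: bool = False):
        self.save = save

    def append(self, key: str, value: Any, index: Optional[int] = None) -> Any:
        # Store only when the storage is active
        if self.save:
            if not hasattr(self, "cache_"):
                self.cache_ = {}
            values = self.cache_.setdefault(key, [])
            if index is None:
                values.append(value)
            else:
                # Assumption: 'index' is a position for list insertion
                values.insert(index, value)
        # Always hand the value back, so the call can wrap an expression
        return value

    def get_state(self, deepcopy: bool = True) -> Dict[str, List[Any]]:
        state = getattr(self, "cache_", {})
        return copy.deepcopy(state) if deepcopy else state

    def reset(self) -> None:
        # Assumption: resetting removes the cache altogether
        if hasattr(self, "cache_"):
            del self.cache_


# An inactive storage passes values through without keeping them
inactive = TinyStorage()
total = inactive.append("total", 1 + 2)

# An active storage keeps one list of values per key
active = TinyStorage(save=True)
active.append("x", [1, 2])
snapshot = active.get_state()
snapshot["x"][0].append(3)  # the deep copy leaves the stored list untouched
```

Returning the input value from append lets a caller wrap an intermediate expression in place, so the surrounding code reads the same whether or not the storage is active.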

sample.utils.learn.default_property(**kwargs) → Callable[[Callable[[Any], Any]], _DefaultProperty]
sample.utils.learn.default_property(default_fn: Callable[[Any], Any], **kwargs) → _DefaultProperty

Default property attribute. Can be used as a decorator. It’s meant to be used to avoid unexpected sharing of instances between different objects when using default argument values. Unexpected sharing of instances can be a problem when those instances not only encapsulate initialization arguments but also store some state, such as the parameters learned during training.

Example

>>> # Let's define a class for member objects
>>> class Member:
...   def __init__(self, field=0):
...     self.field = field
>>> # And a class that correctly uses the default_property decorator
>>> from sample.utils.learn import default_property
>>> class Good:
...   def __init__(self, member=None, field=0):
...     self.member = member
...     self.field = field
...   @default_property
...   def member(self):
...     return Member()
>>> # Instances created with default arguments do not share
>>> # the same instance of the member object
>>> Good().member is not None
True
>>> Good().member is not Good().member
True
>>> # Unless we explicitly provide a value
>>> x = Member()
>>> Good(x).member is Good(x).member
True
>>> # A class with default values will share
>>> # the same instance of the member object
>>> class Bad:
...   def __init__(self, member=Member(), field=0):
...     self.member = member
...     self.field = field
>>> Bad().member is Bad().member
True
>>> # Note that the member object is created on set, and not on get
>>> g = Good()
>>> g.member is g.member
True
>>> # You can also change the condition that triggers the default
>>> # (by default, it is triggered when the input is None)
>>> # E.g.: this function sets the default when the input value is not an
>>> # instance of the Member class
>>> Good.member.defaulter(lambda _, v: not isinstance(v, Member))
>>> isinstance(Good(g).member, Member)
True
>>> Good(g).member is not g
True
>>> # While allowing assignments to go through for valid inputs
>>> Good(x).member is x
True
>>> # Default values can be deactivated using the constant defaulter
>>> Good.member.defaulter()
>>> Good().member is None
True
>>> # Default properties can be defined dynamically
>>> # Some callables such as lambdas and partials will
>>> # require an additional 'name' argument
>>> try:
...   Good.member = default_property(lambda _: {})
... except ValueError as e:
...   print(e)
default_property(): name '<lambda>' is forbidden because of possible conflicts with other lambda functions. Please, specify a different 'name'
>>> Good.member = default_property(name="member")(lambda _: {})
>>> isinstance(Good().member, dict) and not Good().member
True
>>> Good().member is not Good().member
True