Sensor validation

Sensor data quality plays a vital role in Internet of Things (IoT) applications: models built on poor-quality sensor data are rendered useless.

Analysis elements

entropy(s)

Approximate entropy

The approximate entropy quantifies the amount of regularity and the unpredictability of fluctuations over time-series data.

Parameters:

| Name | Type | Description | Default |
| ---- | ---- | ----------- | ------- |
| s | ndarray | A single feature | required |

Returns:

| Type | Description |
| ---- | ----------- |
| float | Approximate entropy of feature s |

Source code in ceruleo/dataset/analysis/numerical_features.py
def entropy(s: np.ndarray) -> float:
    """
    Approximate entropy

    The approximate entropy quantifies the amount of regularity and the unpredictability of fluctuations over time-series data.

    Parameters:
        s: A single feature

    Returns:
        Approximate entropy of feature s
    """
    return ant.app_entropy(s)
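The excerpt delegates to `ant.app_entropy`, presumably the `app_entropy` function of the antropy package. As an illustration of what the statistic measures, here is a plain-NumPy sketch of approximate entropy with the common defaults m = 2 and r = 0.2·std (an assumption; antropy's exact defaults may differ):

```python
import numpy as np
from typing import Optional


def approx_entropy(s: np.ndarray, m: int = 2, r: Optional[float] = None) -> float:
    """Approximate entropy (ApEn) of a 1-D signal. Illustrative sketch."""
    if r is None:
        r = 0.2 * np.std(s)  # common tolerance choice

    def phi(m: int) -> float:
        n = len(s) - m + 1
        # All overlapping windows of length m
        windows = np.array([s[i : i + m] for i in range(n)])
        # For each window, the fraction of windows within Chebyshev distance r
        counts = [np.mean(np.max(np.abs(windows - w), axis=1) <= r) for w in windows]
        return np.mean(np.log(counts))

    return phi(m) - phi(m + 1)


# A constant signal is perfectly regular, so its ApEn is 0
print(approx_entropy(np.zeros(50)))  # 0.0

# White noise is far less predictable than a sine wave
rng = np.random.default_rng(0)
print(approx_entropy(rng.standard_normal(200)) > approx_entropy(np.sin(np.linspace(0, 4 * np.pi, 200))))  # True
```

Low ApEn indicates a regular, predictable signal; high ApEn indicates erratic fluctuations, which is why it is useful for flagging noisy or faulty sensors.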

n_unique(s)

Number of unique values in the array

Parameters:

| Name | Type | Description | Default |
| ---- | ---- | ----------- | ------- |
| s | ndarray | A single feature | required |

Returns:

| Type | Description |
| ---- | ----------- |
| int | Number of unique values |

Source code in ceruleo/dataset/analysis/numerical_features.py
def n_unique(s: np.ndarray) -> int:
    """
    Number of unique values in the array

    Parameters:
        s: A single feature

    Returns:
        Number of unique values
    """
    return len(np.unique(s))
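A quick usage sketch: the unique-value count helps spot quantized, low-resolution, or stuck sensors (the sample readings below are made up for illustration):

```python
import numpy as np


def n_unique(s: np.ndarray) -> int:
    """Number of unique values in the array (same one-liner as the source)."""
    return len(np.unique(s))


# A coarsely quantized temperature channel
readings = np.array([20.0, 20.5, 20.5, 21.0, 20.0])
print(n_unique(readings))  # 3

# A stuck sensor reports a single value for its whole life
print(n_unique(np.full(100, 7.3)))  # 1
```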

monotonicity(s)

Monotonicity of a feature: the score is 0 when the feature has as many increases as decreases (a constant feature included) and 1 when it is strictly monotonic.

Parameters:

| Name | Type | Description | Default |
| ---- | ---- | ----------- | ------- |
| s | ndarray | A single feature | required |

Returns:

| Type | Description |
| ---- | ----------- |
| float | Monotonicity of the feature |

Source code in ceruleo/dataset/analysis/numerical_features.py
def monotonicity(s: np.ndarray) -> float:
    """
    Monotonicity of a feature: the score is 0 when the feature has as many increases as decreases (a constant feature included) and 1 when it is strictly monotonic.

    Parameters:
        s: A single feature

    Returns:
        Monotonicity of the feature
    """
    N = s.shape[0]
    diff = np.diff(s)
    return 1 / (N - 1) * np.abs(np.sum(diff > 0) - np.sum(diff < 0))
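The formula counts positive and negative successive differences: a strictly monotonic signal has all of its N − 1 differences of the same sign, while an oscillating or constant signal nets out to zero. A self-contained demo, reusing the function body from the source above:

```python
import numpy as np


def monotonicity(s: np.ndarray) -> float:
    """|#increases - #decreases| / (N - 1), as in the source above."""
    N = s.shape[0]
    diff = np.diff(s)
    return 1 / (N - 1) * np.abs(np.sum(diff > 0) - np.sum(diff < 0))


print(monotonicity(np.arange(10.0)))                  # 1.0: strictly increasing
print(monotonicity(np.full(10, 3.0)))                 # 0.0: constant
print(monotonicity(np.array([0.0, 1.0, 0.0, 1.0, 0.0])))  # 0.0: as many ups as downs
```

High monotonicity is a desirable property for a degradation indicator, since wear is typically cumulative.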

autocorrelation(s)

Autocorrelation score of a feature, computed as the mean of squared successive differences: the score is low for smooth, strongly autocorrelated signals and high for noisy ones.

Parameters:

| Name | Type | Description | Default |
| ---- | ---- | ----------- | ------- |
| s | ndarray | A single feature | required |

Returns:

| Type | Description |
| ---- | ----------- |
| float | Autocorrelation of the feature |

Source code in ceruleo/dataset/analysis/numerical_features.py
def autocorrelation(s: np.ndarray) -> float:
    """
    Autocorrelation of a feature

    Parameters:
        s: A single feature

    Returns:
        Autocorrelation of the feature
    """
    diff = np.diff(s)
    return np.sum(diff**2) / s.shape[0]
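A small demo of the statistic, copying the one-liner from the source. Note that the value depends on the signal's scale as well as its smoothness, so it is most meaningful when comparing features on a common scale:

```python
import numpy as np


def autocorrelation(s: np.ndarray) -> float:
    """Mean of squared successive differences, as in the source above."""
    diff = np.diff(s)
    return np.sum(diff**2) / s.shape[0]


# Exact value for a short series: diffs are [1, 1], so (1 + 1) / 3
print(autocorrelation(np.array([0.0, 1.0, 2.0])))  # ~0.667

# A smooth signal scores far lower than white noise
t = np.linspace(0, 4 * np.pi, 500)
rng = np.random.default_rng(42)
smooth, noisy = np.sin(t), rng.standard_normal(500)
print(autocorrelation(smooth) < autocorrelation(noisy))  # True
```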

correlation(s, y=None)

Spearman correlation of the feature with the time index, which serves as a monotone proxy for the RUL target (the y parameter is currently unused).

Parameters:

| Name | Type | Description | Default |
| ---- | ---- | ----------- | ------- |
| s | ndarray | A single feature | required |
| y | Optional[ndarray] | The RUL target (unused) | None |

Returns:

| Type | Description |
| ---- | ----------- |
| float | Spearman correlation between the feature and the time index |

Source code in ceruleo/dataset/analysis/numerical_features.py
def correlation(s: np.ndarray, y: Optional[np.ndarray] = None) -> float:
    """
    Spearman correlation of the feature with the time index

    The time index serves as a monotone proxy for the RUL target;
    the y parameter is currently unused.

    Parameters:
        s: A single feature
        y: The RUL target (unused)

    Returns:
        Spearman correlation between the feature and the time index
    """
    N = s.shape[0]
    if (s[0] == s).all():
        # Rank correlation is undefined for a constant feature
        return np.nan
    return spearmanr(s, np.arange(N), nan_policy="omit").correlation
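Because Spearman correlation is rank-based, any strictly monotone trend scores exactly ±1 regardless of its shape. The sketch below reproduces the behaviour with `scipy.stats.spearmanr`; the function name `time_trend_correlation` is illustrative, not part of the library:

```python
import numpy as np
from scipy.stats import spearmanr


def time_trend_correlation(s: np.ndarray) -> float:
    """Spearman correlation of a feature with its time index (illustrative)."""
    if (s[0] == s).all():
        return np.nan  # rank correlation is undefined for a constant signal
    return spearmanr(s, np.arange(s.shape[0]), nan_policy="omit").correlation


print(time_trend_correlation(np.exp(np.linspace(0.0, 1.0, 20))))   # 1.0: strictly increasing
print(time_trend_correlation(np.array([1.0, 3.0, 2.0, 4.0, 5.0])))  # 0.9: mostly increasing
print(time_trend_correlation(np.full(10, 5.0)))                     # nan: constant
```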

mutual_information(x, y)

Mutual information between a feature and the target

Reference: *Remaining Useful Life Prediction Using Ranking Mutual Information Based Monotonic Health Indicator*

Parameters:

| Name | Type | Description | Default |
| ---- | ---- | ----------- | ------- |
| x | ndarray | A single feature | required |
| y | ndarray | RUL Target | required |

Returns:

| Type | Description |
| ---- | ----------- |
| float | Mutual information between x and y |

Source code in ceruleo/dataset/analysis/numerical_features.py
def mutual_information(x: np.ndarray, y: np.ndarray) -> float:
    """Mutual information between a feature and the target

    Reference: Remaining Useful Life Prediction Using Ranking Mutual
    Information Based Monotonic Health Indicator

    Parameters:
        x: A single feature
        y: RUL Target

    Returns:
        Mutual information between x and y

    """
    x = x.reshape(-1, 1)
    x = np.nan_to_num(x)
    # mutual_info_regression returns an array with one entry per feature
    return float(mutual_info_regression(x, y)[0])
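`mutual_info_regression` (presumably scikit-learn's, from `sklearn.feature_selection`) uses a nearest-neighbour estimator, so results vary slightly between runs unless `random_state` is fixed. A toy sketch with made-up data, comparing an informative sensor against pure noise:

```python
import numpy as np
from sklearn.feature_selection import mutual_info_regression

rng = np.random.default_rng(0)
rul = np.linspace(100.0, 0.0, 300)        # toy RUL target: linear decay
informative = rul + rng.normal(0, 1, 300)  # sensor that tracks the target
noise = rng.standard_normal(300)           # sensor unrelated to the target

mi_informative = mutual_info_regression(informative.reshape(-1, 1), rul, random_state=0)[0]
mi_noise = mutual_info_regression(noise.reshape(-1, 1), rul, random_state=0)[0]
print(mi_informative > mi_noise)  # True: the informative sensor carries more MI
```

Unlike Spearman correlation, mutual information also captures non-monotone dependencies, which makes it a useful complement when ranking candidate health indicators.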