Sensor validation

Sensor data quality plays a vital role in Internet of Things (IoT) applications: models built on poor-quality sensor data are rendered useless.

Analysis elements

entropy(s)

Approximate entropy

The approximate entropy quantifies the amount of regularity and the unpredictability of fluctuations over time-series data.

Parameters:

| Name | Type | Description | Default |
| ---- | ---- | ----------- | ------- |
| s | ndarray | A single feature | required |

Returns:

| Type | Description |
| ---- | ----------- |
| float | Approximate entropy of feature s |

Source code in ceruleo/dataset/analysis/numerical_features.py
def entropy(s: np.ndarray) -> float:
    """
    Approximate entropy

    The approximate entropy quantifies the amount of regularity and the unpredictability of fluctuations over time-series data.

    Parameters:
        s: A single feature

    Returns:
        Approximate entropy of feature s
    """
    return ant.app_entropy(s)
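The excerpt delegates to `ant.app_entropy`, presumably the `app_entropy` function of the antropy package. As an illustration of what the statistic measures, here is a plain-NumPy sketch of approximate entropy with the common defaults m = 2 and r = 0.2·std (an assumption; antropy's exact defaults may differ):

```python
import numpy as np
from typing import Optional


def approx_entropy(s: np.ndarray, m: int = 2, r: Optional[float] = None) -> float:
    """Approximate entropy (ApEn) of a 1-D signal. Illustrative sketch."""
    if r is None:
        r = 0.2 * np.std(s)  # common tolerance choice

    def phi(m: int) -> float:
        n = len(s) - m + 1
        # All overlapping windows of length m
        windows = np.array([s[i : i + m] for i in range(n)])
        # For each window, the fraction of windows within Chebyshev distance r
        counts = [np.mean(np.max(np.abs(windows - w), axis=1) <= r) for w in windows]
        return np.mean(np.log(counts))

    return phi(m) - phi(m + 1)


# A constant signal is perfectly regular, so its ApEn is 0
print(approx_entropy(np.zeros(50)))  # 0.0

# White noise is far less predictable than a sine wave
rng = np.random.default_rng(0)
print(approx_entropy(rng.standard_normal(200)) > approx_entropy(np.sin(np.linspace(0, 4 * np.pi, 200))))  # True
```

Low ApEn indicates a regular, predictable signal; high ApEn indicates erratic fluctuations, which is why it is useful for flagging noisy or faulty sensors.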

n_unique(s)

Number of unique values in the array

Parameters:

| Name | Type | Description | Default |
| ---- | ---- | ----------- | ------- |
| s | ndarray | A single feature | required |

Returns:

| Type | Description |
| ---- | ----------- |
| int | Number of unique values |

Source code in ceruleo/dataset/analysis/numerical_features.py
def n_unique(s: np.ndarray) -> int:
    """
    Number of unique values in the array

    Parameters:
        s: A single feature

    Returns:
        Number of unique values
    """
    return len(np.unique(s))
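A quick usage sketch: the unique-value count helps spot quantized, low-resolution, or stuck sensors (the sample readings below are made up for illustration):

```python
import numpy as np


def n_unique(s: np.ndarray) -> int:
    """Number of unique values in the array (same one-liner as the source)."""
    return len(np.unique(s))


# A coarsely quantized temperature channel
readings = np.array([20.0, 20.5, 20.5, 21.0, 20.0])
print(n_unique(readings))  # 3

# A stuck sensor reports a single value for its whole life
print(n_unique(np.full(100, 7.3)))  # 1
```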

monotonicity(s)

Monotonicity of a feature: the score is 0 when the feature has as many increases as decreases (a constant feature included) and 1 when it is strictly monotonic.

Parameters:

| Name | Type | Description | Default |
| ---- | ---- | ----------- | ------- |
| s | ndarray | A single feature | required |

Returns:

| Type | Description |
| ---- | ----------- |
| float | Monotonicity of the feature |

Source code in ceruleo/dataset/analysis/numerical_features.py
def monotonicity(s: np.ndarray) -> float:
    """
    Monotonicity of a feature: the score is 0 when the feature has as many increases as decreases (a constant feature included) and 1 when it is strictly monotonic.

    Parameters:
        s: A single feature

    Returns:
        Monotonicity of the feature
    """
    N = s.shape[0]
    diff = np.diff(s)
    return 1 / (N - 1) * np.abs(np.sum(diff > 0) - np.sum(diff < 0))
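The formula counts positive and negative successive differences: a strictly monotonic signal has all of its N − 1 differences of the same sign, while an oscillating or constant signal nets out to zero. A self-contained demo, reusing the function body from the source above:

```python
import numpy as np


def monotonicity(s: np.ndarray) -> float:
    """|#increases - #decreases| / (N - 1), as in the source above."""
    N = s.shape[0]
    diff = np.diff(s)
    return 1 / (N - 1) * np.abs(np.sum(diff > 0) - np.sum(diff < 0))


print(monotonicity(np.arange(10.0)))                  # 1.0: strictly increasing
print(monotonicity(np.full(10, 3.0)))                 # 0.0: constant
print(monotonicity(np.array([0.0, 1.0, 0.0, 1.0, 0.0])))  # 0.0: as many ups as downs
```

High monotonicity is a desirable property for a degradation indicator, since wear is typically cumulative.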

autocorrelation(s)

Autocorrelation score of a feature, computed as the mean of squared successive differences: the score is low for smooth, strongly autocorrelated signals and high for noisy ones.

Parameters:

| Name | Type | Description | Default |
| ---- | ---- | ----------- | ------- |
| s | ndarray | A single feature | required |

Returns:

| Type | Description |
| ---- | ----------- |
| float | Autocorrelation of the feature |

Source code in ceruleo/dataset/analysis/numerical_features.py
def autocorrelation(s: np.ndarray) -> float:
    """
    Autocorrelation of a feature

    Parameters:
        s: A single feature

    Returns:
        Autocorrelation of the feature
    """
    diff = np.diff(s)
    return np.sum(diff**2) / s.shape[0]
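A small demo of the statistic, copying the one-liner from the source. Note that the value depends on the signal's scale as well as its smoothness, so it is most meaningful when comparing features on a common scale:

```python
import numpy as np


def autocorrelation(s: np.ndarray) -> float:
    """Mean of squared successive differences, as in the source above."""
    diff = np.diff(s)
    return np.sum(diff**2) / s.shape[0]


# Exact value for a short series: diffs are [1, 1], so (1 + 1) / 3
print(autocorrelation(np.array([0.0, 1.0, 2.0])))  # ~0.667

# A smooth signal scores far lower than white noise
t = np.linspace(0, 4 * np.pi, 500)
rng = np.random.default_rng(42)
smooth, noisy = np.sin(t), rng.standard_normal(500)
print(autocorrelation(smooth) < autocorrelation(noisy))  # True
```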

correlation(s, y=None)

Spearman correlation of the feature with the time index, which serves as a monotone proxy for the RUL target (the y parameter is currently unused).

Parameters:

| Name | Type | Description | Default |
| ---- | ---- | ----------- | ------- |
| s | ndarray | A single feature | required |
| y | Optional[ndarray] | The RUL target (unused) | None |

Returns:

| Type | Description |
| ---- | ----------- |
| float | Spearman correlation between the feature and the time index |

Source code in ceruleo/dataset/analysis/numerical_features.py
def correlation(s: np.ndarray, y: Optional[np.ndarray] = None) -> float:
    """
    Spearman correlation of the feature with the time index

    The time index serves as a monotone proxy for the RUL target;
    the y parameter is currently unused.

    Parameters:
        s: A single feature
        y: The RUL target (unused)

    Returns:
        Spearman correlation between the feature and the time index
    """
    N = s.shape[0]
    if (s[0] == s).all():
        # Rank correlation is undefined for a constant feature
        return np.nan
    return spearmanr(s, np.arange(N), nan_policy="omit").correlation
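Because Spearman correlation is rank-based, any strictly monotone trend scores exactly ±1 regardless of its shape. The sketch below reproduces the behaviour with `scipy.stats.spearmanr`; the function name `time_trend_correlation` is illustrative, not part of the library:

```python
import numpy as np
from scipy.stats import spearmanr


def time_trend_correlation(s: np.ndarray) -> float:
    """Spearman correlation of a feature with its time index (illustrative)."""
    if (s[0] == s).all():
        return np.nan  # rank correlation is undefined for a constant signal
    return spearmanr(s, np.arange(s.shape[0]), nan_policy="omit").correlation


print(time_trend_correlation(np.exp(np.linspace(0.0, 1.0, 20))))   # 1.0: strictly increasing
print(time_trend_correlation(np.array([1.0, 3.0, 2.0, 4.0, 5.0])))  # 0.9: mostly increasing
print(time_trend_correlation(np.full(10, 5.0)))                     # nan: constant
```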

mutual_information(x, y)

Mutual information between a feature and the target

Reference: *Remaining Useful Life Prediction Using Ranking Mutual Information Based Monotonic Health Indicator*

Parameters:

| Name | Type | Description | Default |
| ---- | ---- | ----------- | ------- |
| x | ndarray | A single feature | required |
| y | ndarray | RUL Target | required |

Returns:

| Type | Description |
| ---- | ----------- |
| float | Mutual information between x and y |

Source code in ceruleo/dataset/analysis/numerical_features.py
def mutual_information(x: np.ndarray, y: np.ndarray) -> float:
    """Mutual information between a feature and the target

    Reference: Remaining Useful Life Prediction Using Ranking Mutual
    Information Based Monotonic Health Indicator

    Parameters:
        x: A single feature
        y: RUL Target

    Returns:
        Mutual information between x and y

    """
    x = x.reshape(-1, 1)
    x = np.nan_to_num(x)
    # mutual_info_regression returns an array with one entry per feature
    return float(mutual_info_regression(x, y)[0])
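`mutual_info_regression` (presumably scikit-learn's, from `sklearn.feature_selection`) uses a nearest-neighbour estimator, so results vary slightly between runs unless `random_state` is fixed. A toy sketch with made-up data, comparing an informative sensor against pure noise:

```python
import numpy as np
from sklearn.feature_selection import mutual_info_regression

rng = np.random.default_rng(0)
rul = np.linspace(100.0, 0.0, 300)        # toy RUL target: linear decay
informative = rul + rng.normal(0, 1, 300)  # sensor that tracks the target
noise = rng.standard_normal(300)           # sensor unrelated to the target

mi_informative = mutual_info_regression(informative.reshape(-1, 1), rul, random_state=0)[0]
mi_noise = mutual_info_regression(noise.reshape(-1, 1), rul, random_state=0)[0]
print(mi_informative > mi_noise)  # True: the informative sensor carries more MI
```

Unlike Spearman correlation, mutual information also captures non-monotone dependencies, which makes it a useful complement when ranking candidate health indicators.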