Module: secbench.processing

The secbench.processing module contains Secbench’s processing tools to support side-channel analysis. It is divided into several “thematic” submodules (secbench.processing.SUBMODULE):

  • helpers: contains general-purpose helpers and low-level functions useful for side-channel analysis.

  • metrics: contains side-channel leakage metrics, as well as functions implementing common leakage models. The metrics are fully compatible with the sklearn API (they can be used as score functions).

  • profiled: contains tools for performing profiled attacks (template attacks, neural networks).

  • signal: contains tools for trace filtering and synchronization.

  • crypto: simulation models for cryptographic primitives. The AES model is very useful.

Helpers

The submodule secbench.processing.helpers contains general-purpose helpers useful for side-channel analysis.

qplot(x, y=None, n=20, percentile_min=5, percentile_max=95, color='r', plot_mean=False, plot_median=False, line_color='k', ax=None, **kwargs)

Generate a quantile plot for a given dataset.

The qplot function visualizes the distribution of data by plotting percentiles and optionally includes the mean and median lines.

Parameters:
  • x – If y is not given, should be a 2-dimensional Numpy array of shape (n_traces, n_features) which represents the data to be plotted; an X axis will be generated. If y is provided, then this input is a 1-D Numpy array which contains the X range of the plot.

  • y – a 2-dimensional Numpy array of shape (n_traces, n_features) which represents the data to be plotted. The X axis is given by the argument x.

  • n – The number of percentile groups to calculate. The higher this value the smoother the plot, but the slower it is to render.

  • percentile_min – The minimum percentile to start the calculation.

  • percentile_max – The maximum percentile to end the calculation.

  • color – The color of the filled percentile areas.

  • plot_mean – Whether to plot the mean line.

  • plot_median – Whether to plot the median line.

  • line_color – The color of the mean and median lines.

  • ax – The axis on which to plot. If not provided, the current active axis is used.

  • kwargs – Additional keyword arguments passed to matplotlib’s fill_between.
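
A minimal usage sketch on synthetic traces (the import path follows the secbench.processing.helpers submodule documented here):

import numpy as np
import matplotlib.pyplot as plt

from secbench.processing.helpers import qplot

# 500 synthetic traces of 2000 samples each.
traces = np.random.randn(500, 2000)

fig, ax = plt.subplots()
qplot(traces, n=20, color="C0", plot_median=True, ax=ax)
plt.show()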

rank_of(scores, k, randomize=True)

Compute the rank of a given hypothesis in an array of scores.

Note that this function is randomized: if multiple hypotheses have the same score, the rank returned does not depend on the sorting algorithm. This property turns out to be important for evaluating guessing entropy.

If you do not want this behavior, pass randomize=False to this function.

Parameters:
  • scores – A 1-D array of scores (integers or floats).

  • k – the index of the key on which to compute the rank.

  • randomize – randomize indices that have the same rank.
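
For illustration, here is a small sketch that ranks one key-byte hypothesis among 256 scores (random scores stand in for the output of a real attack):

import numpy as np

from secbench.processing.helpers import rank_of

# One score per key-byte hypothesis (e.g., produced by key_scores()).
scores = np.random.rand(256)
rank = rank_of(scores, 0x2B)  # rank of hypothesis 0x2B among the 256 scores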

key_scores(pred_lg, target_variable_fn, secret_values, *args, **kwargs)

Compute the log maximum likelihood of each key hypothesis.

This function is typically used at the end of side-channel attacks to score key hypotheses.

Note

This function operates on logarithms to avoid numerical stability issues (when working with probabilities).

Parameters:
  • pred_lg – logarithm of predictions obtained on some traces (the expected format for this data is the same as Scikit-learn’s model.predict_proba() method). This array has shape (n_traces, n_classes).

  • target_variable_fn – how to compute target variables under a key hypothesis. The first argument is a value picked in secret_values array, the args and kwargs are then forwarded.

  • args – arbitrary arguments forwarded to target_variable_fn.

  • kwargs – arbitrary keyword arguments forwarded to target_variable_fn.

  • secret_values – Any iterator that returns the secret hypotheses to be tested.

Returns:

the score of each key. This array has shape (n_classes,), where n_classes is the size of the secret_values iterator.
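
Here is a hedged sketch of a typical use. The random log-probabilities stand in for the output of a trained model's predict_proba(), and the target function assumes a first-round AES S-box attack (aes_sbox is documented in the crypto submodule; import paths are indicative):

import numpy as np

from secbench.processing.helpers import key_scores
from secbench.processing.crypto import aes_sbox

rng = np.random.default_rng(0)
n_traces = 1000
attack_pts = rng.integers(0, 256, size=(n_traces, 16), dtype=np.uint8)

# In a real attack, pred_lg comes from model.predict_proba() on the attack
# traces; random log-probabilities are used here (256 classes = S-box outputs).
probas = rng.random((n_traces, 256))
pred_lg = np.log(probas / probas.sum(axis=1, keepdims=True))

def target_fn(k, pts):
    # Intermediate variable predicted by the model, under key hypothesis k.
    return aes_sbox(k ^ pts[:, 0])

scores = key_scores(pred_lg, target_fn, np.arange(256, dtype=np.uint8), attack_pts)
best_guess = int(np.argmax(scores))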

encode_labels(ys, dtype=<class 'numpy.uint16'>, indep=True)

Encode labels in the range [0; n_classes)

This function is a wrapper around the LabelEncoder from sklearn.

Parameters:
  • dtype – type for output labels.

  • indep – whether each column of ys should be encoded with its own label support. This option is only valid when the input has two dimensions. When set to True (the default), all columns of ys are encoded independently.

chunks(w, *args)

Iterate several arrays with a fixed slice size on axis 0.

This is very helpful for decomposing computations into smaller blocks (e.g., when the dataset cannot fit in memory). It is worth mentioning that this iterator is “tqdm” friendly!

Examples:

>>> xs = np.arange(10)
>>> ys = xs * xs
>>> for x, y in chunks(3, xs, ys):
...     print(x, y)
[0 1 2] [0 1 4]
[3 4 5] [ 9 16 25]
[6 7 8] [36 49 64]
[9] [81]

add_remove(xs, ratio=0.1)

Implement the “add-remove” algorithm to simulate jitter in traces.

Parameters:

xs – input array of shape (n_samples, n_features) to be modified.

Returns:

an array with the same shape and type as xs, but with the add-remove transformation applied.
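
A short sketch on synthetic traces (import path indicative):

import numpy as np

from secbench.processing.helpers import add_remove

traces = np.random.randn(100, 5000).astype(np.float32)
jittered = add_remove(traces, ratio=0.05)  # same shape and dtype as the input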

Leakage models

We provide numpy-optimized versions of the Hamming weight and Hamming distance functions.

hamming_weight(x)

Vectorized Hamming Weight

Returns the number of bits set to 1 (aka. bit count) in each element of the given array.

Note

This function simply dispatches to scalar-specific Hamming weight functions (hamming_weight_N). You should consider calling them directly if the datatypes are known beforehand.

Examples:

>>> hamming_weight(np.arange(16, dtype=np.uint8))
array([0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4], dtype=uint8)

hamming_distance(x, y)

Vectorized Hamming Distance

Returns the number of bits that differ between x and y.

Examples:

>>> x = np.array([42, 3, 12], dtype=np.uint8)
>>> y = np.array([11, 3, 59], dtype=np.uint8)
>>> hamming_distance(x, y)
array([2, 0, 5], dtype=uint8)

The following functions perform various bit decompositions:

unpackbits(y, count=0)

Unpack an array into its bit representation.

This function can be viewed as a more general version of Numpy’s unpackbits function. It supports arbitrary integer types (uint16, uint32, etc.).

Note

Any other type (e.g., np.float32) will be decomposed according to their byte representation.

Parameters:
  • y – a numpy array of integer elements. This array can be of any shape.

  • count – number of bits to keep in the final decomposition. Must be in the range [0, n_bits); otherwise this parameter has no effect.

Returns:

If y has a shape (n_0, …, n_k), the output will be an array of type np.int8, with shape (n_0, …, n_k, N) where N is the number of bits needed to encode the integer elements. The bits are returned in little endian order (least significant first).
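
For illustration (the import path is indicative; unpackbits is documented in this module):

import numpy as np

from secbench.processing import unpackbits

y = np.array([1, 2, 255], dtype=np.uint8)
bits = unpackbits(y)
# bits has shape (3, 8) and dtype np.int8; bits[:, 0] holds the least
# significant bit of each element (little endian order).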

lra_unpackbits(y, center=False, with_intercept=True, count=0)

Decompose a target vector into bits, suited for LRA use.

Parameters:
  • y – a numpy array of integers to be decomposed. This array can be of any shape.

  • with_intercept – add a column of ones to the output to capture the intercept when the result is passed to a least squares solver.

  • count – number of bits to keep in the final decomposition. Must be in the range [0, n_bits); otherwise this parameter has no effect.

  • center – if set, center the bit decomposition such that the mean is 0 for random inputs.

Returns:

If y has a shape (n_0, ..., n_k), the output will be an array of type np.int8, with shape (n_0, ..., n_k, N + I) where N is the number of bits needed to encode the integer elements, and I is 1 when with_intercept=True. The bits are returned in little endian order (least significant first). The intercept is added in the last column.

See also

The same rules apply on the input as for unpackbits().

lra_unpackbits_2nd_order(y, with_intercept=True, count=0, center=False)

Bit decomposition of algebraic degree 2, suited for LRA use.

Usage is the same as lra_unpackbits().

The order of bits in the decomposition is:

  • First, the bits of y decomposition

  • Then, the products y_i * y_j for 0 <= i < j < N

  • If applicable the intercept in the last column

Returns:

If y has a shape (n_0, ..., n_k), the output will be an array of type np.int8, with shape (n_0, ..., n_k, W + I) where W = N * (N + 1) / 2, N being the number of bits needed to encode the integer elements of y, and I is 1 when with_intercept=True. The bits are returned in little endian order (least significant first). The intercept is added in the last column.

lra_unpackbits_shd(y, y_prev, with_intercept=True)

Generate signed Hamming distance leakage between two variables, where the 0 -> 1 and 1 -> 0 transitions are encoded as separate variables.

Parameters:
  • y – 1D array containing target variable.

  • y_prev – 1D array containing the previous (reference) variable used to compute the distance.

Leakage Metrics

The namespace secbench.processing.metrics contains common leakage metrics used in side-channel analysis.

All these metrics are compliant with the sklearn scoring API. This allows you to do things like:

from sklearn.feature_selection import SelectKBest
import secbench.processing.metrics as metrics

# Keep 20 highest SNR points
selector = SelectKBest(metrics.snr, k=20)
selector.fit(data, ciphertexts)
print(selector.get_support())
print(selector.scores_)

Furthermore, you can also use these metrics as part of a sklearn pipeline. Here is a 4-line “template attack” pipeline (parameters may be tuned).

from sklearn.pipeline import make_pipeline
from sklearn.feature_selection import SelectKBest
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis

import secbench.processing.metrics as metrics

pipeline = make_pipeline(
    SelectKBest(metrics.nicv, k=100),
    PCA(n_components=5),
    QuadraticDiscriminantAnalysis())
pipeline.fit(X_train, y_train)
print(pipeline.score(X_valid, y_valid))

Currently, the following metrics are supported. All of them are univariate:

snr(X, y, encode_labels=False, num_classes=None, **kwargs)

Compute a signal-to-noise ratio.

\[snr = \frac{Var(E(X | Y))}{E(Var(X | Y))}\]
Parameters:
  • X – training data. An array of shape (n_samples, n_features).

  • y – Target values. An array of shape (n_samples,) or (n_samples, n_targets).

  • encode_labels – whether labels should be re-encoded

  • num_classes – number of classes (otherwise inferred from the maximum value of labels).

Returns:

an array of shape (n_features,) or (n_targets, n_features) of scores.

Changed in version 2.6.0: SNR implementation is based on the cond_mean_var() helper.
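
As an illustration, here is a hedged sketch on simulated traces where a single sample leaks the S-box output (aes_sbox is documented in the crypto submodule; import paths are indicative):

import numpy as np

from secbench.processing.metrics import snr
from secbench.processing.crypto import aes_sbox

rng = np.random.default_rng(0)
key_byte = np.uint8(0x2B)
pts = rng.integers(0, 256, size=5000, dtype=np.uint8)

# Simulated traces: Gaussian noise plus one sample that leaks the S-box output.
labels = aes_sbox(key_byte ^ pts)
traces = rng.normal(size=(5000, 200)).astype(np.float32)
traces[:, 100] += 0.05 * labels

scores = snr(traces, labels, num_classes=256)
print(int(scores.argmax()))  # expected to be 100 (the leaking sample)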

welch_t_test(X, y, **kwargs)

Compute a Welch T-Test.

\[t = \frac{E(X | Y = y_1) - E(X | Y = y_2)}{\sqrt{Var(X | Y = y_1)/N_1 + Var(X | Y = y_2)/N_2}}\]
Parameters:
  • X – training data. An array of shape (n_samples, n_features).

  • y – Target values. An array of shape (n_samples,) or (n_samples, n_targets). For this test, target values must be in the set {0, 1}.

Returns:

an array of shape (n_features,) or (n_targets, n_features) of scores.

Changed in version 2.6.0: The implementation is based on the cond_mean_var() helper.

nicv(X, y, num_classes=None, encode_labels=False, **kwargs)

Compute a Normalized Interclass Variance (Nicv).

\[nicv = \frac{Var(E(X | Y))}{Var(X)}\]

Note

This metric is usually very similar to the SNR. Only the denominator of the formula differs.

Parameters:
  • X – training data. An array of shape (n_samples, n_features).

  • y – Target values. An array of shape (n_samples,) or (n_samples, n_targets).

  • encode_labels – whether labels should be re-encoded

  • num_classes – number of classes (otherwise inferred from the maximum value of labels).

Returns:

an array of shape (n_features,) or (n_targets, n_features) of scores.

Changed in version 2.6.0: The implementation is based on the cond_mean_var() helper.

sost(X, y, num_classes=None, encode_labels=False, **kwargs)

Compute the Sum of Square T differences.

\[\sum_{i, j, i < j}{ \frac{(E(X | y_i) - E(X | y_j))^2}{Var(X | y_i) + Var(X | y_j)}}\]
Parameters:
  • X – training data. An array of shape (n_samples, n_features).

  • y – Target values. An array of shape (n_samples,) or (n_samples, n_targets).

  • encode_labels – whether labels should be re-encoded

  • num_classes – number of classes (otherwise inferred from the maximum value of labels).

Returns:

an array of shape (n_features,) or (n_targets, n_features) of scores.

Changed in version 2.6.0: The implementation is based on the cond_mean_var() helper.

pearson(X, y)

Compute Pearson’s correlation coefficient.

Parameters:
  • X – training data. An array of shape (n_samples, n_features).

  • y – Target values. An array of shape (n_samples,) or (n_samples, n_targets).

Returns:

an array of shape (n_features,) or (n_targets, n_features) of scores.

The LRA class allows various kinds of linear regression analysis.

class LRA

An LRA metric with a specific decomposition model.

This metric returns the R^2 score of the least square solution of:

\[X = \beta \cdot y + \beta_0\]
Example:

from secbench.processing import lra_unpackbits
from secbench.processing.metrics import LRA

metric = LRA(model=lra_unpackbits)
score = metric(samples, targets)

Instances of this class are callable and have the same prototype as other leakage metrics (i.e., f(X, y)).

__init__(model=<function lra_unpackbits>, use_numpy=False, use_cond_mean=False, n_classes=None, **kwargs)

Create an LRA metric.

Parameters:
  • model – target variable decomposition model.

  • use_numpy – select Numpy implementation of least square method.

  • use_cond_mean – compute average of samples per class before doing the LRA. The resulting LRA is much faster.

  • n_classes – number of classes (only needed when use_cond_mean=True).

  • kwargs – The remaining keyword arguments are forwarded to the model as keyword arguments.

scores_and_coeffs(X, y, y_prev=None)

Return both R^2 scores and coefficients from the LRA.

Parameters:
  • X – training data. An array of shape (n_samples, n_features).

  • y – Target values. An array of shape (n_samples,) or (n_samples, n_targets).

Returns:

a tuple of arrays (scores, coeffs), where scores is an array of shape (n_features,) or (n_targets, n_features), and coeffs is an array of shape (n_features, n_bits) or (n_targets, n_features, n_bits).

Under the hood, most univariate metrics seen above are computed using a so-called “conditional mean and variance”, which is simply the mean and variance of traces grouped per label value.

For leakage assessment, we highly recommend the following workflow:

  • compute a conditional mean and variance once,

  • (if your dataset is huge) save the result somewhere for later re-use,

  • finally, freeze the metrics you actually need.

The easiest way to compute conditional mean and variance is through the cond_mean_var() function. This function returns a CondMeanVar.

cond_mean_var(X, y, num_classes, chunk_size=0, initial_state=None, preprocess_block=<function identity_fn>, parallel_samples=None)

Compute a conditional mean and variance.

Parameters:
  • X – An array of shape (n_samples, n_features).

  • y – Target values. An array of shape (n_samples,) or (n_samples, n_targets).

  • chunk_size – Process data per block. When HDF5 arrays are passed as input for X and y, this allows processing the data in small parts (that fit in RAM).

  • initial_state – An initial accumulator (i.e., a CondMeanVar instance) or a path from which CondMeanVar.from_file() will be called. Accumulation starts from this state rather than from an empty accumulator.

  • preprocess_block – a function applied on each data block before it is accumulated. This function has the signature fn(X, y) -> X_new. A typical example is to do an FFT of the data. Data passed to this callback is guaranteed to be in RAM as a numpy array.

  • parallel_samples – each thread will process a fixed number of samples determined by this value. The number of threads is configured by the RAYON_NUM_THREADS environment variable.

Returns:

A CondMeanVar instance.

Note

The conditional mean and variance is implemented as an accumulator. This means you can feed new data into an existing instance.

A CondMeanVar instance supports many operations. The implementation is written in Rust and is heavily optimized and multi-threaded.
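
A hedged sketch of this workflow on synthetic data (import path indicative), using the accumulator methods documented below:

import numpy as np

from secbench.processing.metrics import cond_mean_var

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 500)).astype(np.float32)
y = rng.integers(0, 256, size=1000, dtype=np.uint8)

acc = cond_mean_var(X, y, num_classes=256)
acc.save("cmv_snapshot.h5")    # optional: reload later with CondMeanVar.from_file()
mean, var = acc.freeze()       # per-class mean and variance
snr_scores = acc.freeze_snr()  # several metrics from the same accumulator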

class CondMeanVar

Optimized implementation of conditional mean and variance.

__init__(targets, samples, num_classes)

Create an empty accumulator.

Parameters:
  • targets – Number of target variables

  • samples – Number of samples per trace.

  • num_classes – Number of classes for target variables.

classmethod from_file(path, prefix='')

Load a CondMeanVar instance from a HDF5 snapshot.

save(path, prefix='')

Create a snapshot of the current accumulator in a HDF5 file.

This snapshot can be reloaded with CondMeanVar.from_file().

freeze()

Return the current mean and variance per class.

Returns:

a tuple of arrays (mean, variance), both arrays have shape (n_targets, n_classes, n_features).

freeze_global_mean_var()

Return the mean and variance of the data accumulated so far.

Returns:

a tuple (mean, var, samples), where mean and var are 1-D arrays with the same number of samples as the input data.

split(chunk_size)

Turn the object into a parallel instance (CondMeanVarP).

The latter can be converted back to a normal accumulator with CondMeanVarP.merge().

Parameters:

chunk_size – The number of samples processed by each thread. Smaller chunks lead to higher parallelism, but might decrease performance. As a rule of thumb, use something between 2 and 8 cache lines (e.g., chunk_size = 256 for 8-bit data).

Returns:

a CondMeanVarP instance.

freeze_dom()

Compute the difference of means.

The accumulator must have two classes.

freeze_nicv()

Compute the normalized interclass variance for the current accumulator.

freeze_snr()

Compute a signal-to-noise ratio for the current accumulator.

freeze_sost()

Compute the sum of square T differences.

freeze_welch_t_test()

Compute Welch’s T-Test for the current accumulator.

The accumulator must have two classes.

process_block(X, y)

Add new data to the accumulator.

Parameters:
  • X – an array of shape (n_samples, n_features) containing data.

  • y – an array of shape (n_samples, n_targets) containing the labels.

Perceived Information

The perceived information is a tool to evaluate the quality of a model. It should converge towards the mutual information, but is guaranteed to be lower (assuming the model is not overfitting).

perceived_information(model, X, y_true, entropy)

Compute the perceived information of a given model.

Parameters:
  • model – sklearn-like model, which must have a predict_proba method.

  • X – A numpy array of shape (n_samples, n_features) that represents inputs data.

  • y_true – A numpy array of shape (n_samples,) that contains correct labels associated with data.

  • entropy – entropy of the labels.
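
A hedged sketch with a sklearn classifier on synthetic two-class data (balanced binary labels have an entropy of 1 bit; import path indicative):

import numpy as np
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis

from secbench.processing.metrics import perceived_information

rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=2000)
X = rng.normal(size=(2000, 10)) + y[:, None]  # weakly informative features

model = QuadraticDiscriminantAnalysis().fit(X[:1000], y[:1000])
pi = perceived_information(model, X[1000:], y[1000:], entropy=1.0)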

Profiled Attacks

The attack variants implemented in secbench.processing.profiled subclass the ProfiledAttack interface. You will typically implement the following workflow:

  1. Create a ProfiledAttack instance (depending on the model you want).

  2. Train it using fit().

  3. Run it on attack data, using either key_scores() (blackbox attack) or guessing_entropy() (for evaluation). A minimal end-to-end sketch is shown after the list of models below.

We currently provide two main models:

  1. SklearnModel, which can wrap any Estimator from Sklearn.

  2. ScaNetwork, which is designed to wrap a Tensorflow network. You can pass an arbitrary Tensorflow network. For simple cases, we provide an easy interface to create a network: GenericNetworkBuilder.
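
Here is a minimal end-to-end sketch using SklearnModel on simulated traces. It is only illustrative: the import paths, the simulated leakage and the choice of classifier are assumptions, not prescriptions.

import numpy as np
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis

from secbench.processing.crypto import aes_sbox
from secbench.processing.profiled import SklearnModel

rng = np.random.default_rng(0)
key_byte = np.uint8(0x2B)

def make_traces(n):
    # Simulated traces: Gaussian noise with one sample leaking the S-box output.
    pts = rng.integers(0, 256, size=(n, 16), dtype=np.uint8)
    leak = aes_sbox(key_byte ^ pts[:, 0]).astype(np.float32)
    X = rng.normal(size=(n, 20)).astype(np.float32)
    X[:, 10] += 0.1 * leak
    return X, pts

def sbox_output(secret, pts):
    # Target variable: first-round S-box output of byte 0.
    return aes_sbox(secret ^ pts[:, 0])

attack = SklearnModel(QuadraticDiscriminantAnalysis(), target_variable_fn=sbox_output)

X_prof, pts_prof = make_traces(20_000)
attack.fit(X_prof, pts_prof, secret=key_byte)     # steps 1-2: create and train

X_att, pts_att = make_traces(500)
scores = attack.key_scores(X_att, np.arange(256, dtype=np.uint8), pts_att)
print(hex(int(np.argmax(scores))))                # step 3: score key hypotheses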

Abstract interface

class ProfiledAttack

Generic abstraction of a profiled side-channel attack.

__init__(target_variable_fn=None)

Create a new instance.

A function that computes target variables can be provided (the target_variable_fn argument). This function takes as first argument secret data (e.g., a secret key) and as second argument public data (e.g., plaintexts).

abstractmethod fit(X, y, secret=None, **kwargs)

Fit the model using training data.

If you defined a target variable function (target_variable_fn argument) in the constructor, you should also pass secret data here. The model will then be trained to predict z = self.target(secret, y) instead of y.

abstractmethod predict_proba(X)

Return the probability distribution of the intermediate variable for the given observations.

Returns:

an array of shape (n_traces, n_classes)

target(secret, *args, **kwargs)

Compute the target variable that we predict from public and private information.

predict_proba_log(X)

Same as ProfiledAttack.predict_proba, but in logarithm domain.

key_scores(X, secret_values, *args, **kwargs)

Compute the score of each key hypothesis.

This function internally calls secbench.processing.helpers.key_scores(); refer to its docstring for more information.

guessing_entropy(X, y, expected_secret, traces_selector, num_classes)

Compute a guessing entropy by computing key rank for different number of traces.

Parameters:
  • X – attack data

  • y – public labels associated with traces

  • expected_secret – secret data used on the attack traces

  • traces_selector – return a subset of indices to select traces. This is used to see the evolution of the rank with the number of traces.

  • num_classes – number of classes being predicted.

Sklearn wrapper

class SklearnModel

Wrap any Scikit-learn model.

__init__(model, target_variable_fn=None)

Create a new instance.

A function that computes target variables can be provided (the target_variable_fn argument). This function takes as first argument secret data (e.g., a secret key) and as second argument public data (e.g., plaintexts).

fit(X, y, secret=None, **kwargs)

Fit the model using training data.

If you defined a target variable function (target_variable_fn argument) in the constructor, you should also pass secret data here. The model will then be trained to predict z = self.target(secret, y) instead of y.

predict_proba(X)

Return the probability distribution of the intermediate variable for the given observations.

Returns:

an array of shape (n_traces, n_classes)

Tensorflow neural network wrapper

class ScaNetwork
__init__(network, target_variable_fn=None, add_softmax=True)

Create a new instance.

A function that computes target variables can be provided (the target_variable_fn argument). This function takes as first argument secret data (e.g., a secret key) and as second argument public data (e.g., plaintexts).

network()

Return the raw neural network used by this model.

load_weights(path)

Reload weights saved during a training.

compute_gradients(X)
fit_raw(*args, **kwargs)
fit(X, y, secret=None, **kwargs)

Fit the model using training data.

If you defined a target variable function (target_variable_fn argument) in the constructor, you should also pass secret data here. The model will then be trained to predict z = self.target(secret, y) instead of y.

predict_proba(traces)

Return the probability distribution of the intermediate variable for the given observations.

Returns:

an array of shape (n_traces, n_classes)

class GenericNetworkBuilder

Generic serializable specification of a neural network.

Parameters:
  • conv_layers – Specification of convolutional layers (can be empty for MLP architectures).

  • batch_normalization – whether batch normalization is applied at the input of the network.

  • dense_layers – Specification of dense layers applied after convolutional layers.

batch_normalization
conv_layers
dense_layers
make_network(num_samples, num_classes=None)
compile(num_samples, num_classes=None, **kwargs)
build(num_samples, num_classes=None, target_variable_fn=None, **kwargs)

Create a ScaNetwork instance using this neural network.

Parameters:
  • num_samples – How many samples are passed as input to the neural network.

  • num_classes – Specify the number of classes to predict. If specified, the last dense layer of the neural network will be “patched” to have the correct number of neurons.

  • target_variable_fn – optional function for calculating intermediate variable.

  • kwargs – keyword arguments are forwarded to Tensorflow’s compile method. If unset, we apply “sane” defaults.

__init__(batch_normalization, conv_layers, dense_layers)

Other tools

class ClassPCA

This transformer performs a class PCA.

First, it computes the mean of the training data per class, resulting in an array of shape (n_classes, n_features). Then, it fits a PCA on this data.

The resulting PCA is applied to any input data of shape (n_samples, n_features).

__init__(n_components=None, copy=True, whiten=False, svd_solver='auto', tol=0.0, iterated_power='auto', random_state=None)
fit(X, y)
transform(X)
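
A small usage sketch on synthetic data (import path indicative):

import numpy as np

from secbench.processing import ClassPCA

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 200)).astype(np.float32)
y = rng.integers(0, 16, size=1000)

cpca = ClassPCA(n_components=5)
cpca.fit(X, y)             # PCA fitted on the per-class means
X_red = cpca.transform(X)  # shape (1000, 5)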

Cryptographic Models

The secbench.processing.crypto submodule provides simulation models for cryptographic algorithms. These models can be used to compute intermediate variables for side-channel analysis.

PCG32 Random Generator Model

class Pcg32
classmethod __new__(*args, **kwargs)
fill(dst)
generate()

AES

AES is the most frequently targeted algorithm for side-channel analysis.

AES Common Constants

aes_nist_key()

The example key used by the NIST description of AES.

This key is so common in SCA that it deserves to be available in any SCA framework!

Added in version 1.3.0.

aes_sbox(x)

Vectorized AES substitution operation model (aka., SBOX)

Examples:

>>> aes_sbox(np.arange(10))
array([ 99, 124, 119, 123, 242, 107, 111, 197,  48,   1], dtype=uint8)

aes_sbox_leakage(k, p, noise=0.1)

Ideal leakage model of the first SubBytes operation of the AES.

This function is a vectorized version of hamming_weight(aes_sbox(k ^ p)), with some optional noise.

Parameters:

noise – Noise magnitude (gaussian noise is added)

aes_inv_sbox(x)

Vectorized AES inverse substitution (aka., SBOX^{-1})

Models of T-Tables (e.g., used by OpenSSL implementations):

aes_t_table(i, x)

Vectorized AES T-Table model

>>> aes_t_table(0, np.arange(3))
array([3328402341, 4168907908, 4000806809], dtype=uint32)

aes_t_indices(i)

Return the T-table index used by the i-th key byte.

Key Expansion

  • Forward: aes.aes_expand_key() (all rounds), aes.aes_expand_key_step() (single step)

  • Inverse: aes.aes_inv_expand_key() (all rounds), aes.aes_inv_expand_key_step() (single step)

aes_expand_key(key)

Compute AES intermediate round keys.

Parameters:

key – The key to expand (a numpy array of 16 bytes).

Returns:

A (11, 16) array representing the AES round keys.

aes_expand_key_step(round_key, aes_round)

Compute one iteration of the key expansion.

aes_inv_expand_key(round_key)

Compute AES intermediate round keys.

Parameters:

round_key – The round key to expand backwards (a numpy array of 16 bytes).

Returns:

A (11, 16) array representing the AES round keys. Index 10 is the key given as input and index 0 is the 0th round key. Thus, the array returned can be used for an AES encryption “as is”.

aes_inv_expand_key_step(round_key, aes_round)

Compute one iteration of the reverse key expansion.

Functional Model

Functional models of the core AES operations are available in the aes.AesOps class.

For side-channel analysis and the computation of intermediate values, we provide a flexible implementation of the AES. The class aes.AES allows stopping execution at any round. Execution can also be resumed from any round. Here is an example usage:

import numpy as np

from secbench.processing.crypto.aes import AesOps, AES

key = np.array(
    [0x2b, 0x7e, 0x15, 0x16, 0x28, 0xae, 0xd2, 0xa6, 0xab, 0xf7, 0x15, 0x88, 0x09, 0xcf, 0x4f, 0x3c],
    dtype=np.uint8)
pts = np.random.randint(0, 256, size=(10, 16), dtype=np.uint8)

cipher = AES(key)

# Normal Encryption
ciphertexts = cipher.encrypt(pts)

# Stop after the sub_bytes of 3rd round
intermediates = cipher.encrypt(pts, stop_round=3, stop_after=AesOps.sub_bytes)
# Finish encryption
finished = cipher.encrypt(intermediates, start_round=3, start_after=AesOps.sub_bytes)
# Encrypt until the round 5
intermediates = cipher.encrypt(pts, stop_round=5)
# Finish encryption
cipher.encrypt(intermediates, start_round=6)

The same thing can be done with the AES decryption (decrypt method).

class AesOps

Set of basic operations on an AES state (i.e., a 4x4 byte matrix).

The operations are all static methods; this class is used only for namespacing purposes.

Added in version 1.3.0.

static add_round_key(state, rk)
static sub_bytes(state)
static inv_sub_bytes(state)
static shift_rows(states)
static inv_shift_rows(states)
static mix_columns(states)
static inv_mix_columns(states)
class AES

Flexible execution model of the AES cipher.

Added in version 1.3.0.

__init__(key)

Initialize an AES cipher with a given key.

classmethod from_round_key(key, round)

Initialize an AES cipher from a specific round key.

encrypt(states, start_after='start', stop_after=<function AesOps.add_round_key>, start_round=0, stop_round=10)

Perform an AES encryption of the given initial states.

Specific start and stop rounds can be passed to this function.

If start_round is 0 and stop_round is 10, a normal encryption is performed.

Parameters:
  • states – input states (a numpy array (N, 16) of np.uint8)

  • start_round – Encrypt from this round.

  • stop_round – Encrypt until this round.

  • start_after – A method of secbench.processing.aes.AesOps indicating the start point within start_round (default: ‘start’).

  • stop_after – Last operation computed in the stop round (a method of secbench.processing.aes.AesOps).

Returns:

A (N, 16) array corresponding to the final (or intermediates) states.

decrypt(states, start_after='start', stop_after=<function AesOps.inv_sub_bytes>, start_round=10, stop_round=0)

Perform an AES decryption of the given initial states.

This method works exactly as secbench.processing.aes.AES.encrypt(). Please refer to its documentation.

Input Generators

We implement several helpers to generate inputs for side-channel leakage assessments.

generate_plaintexts(n_traces, random_rows=None, random_cols=None, random_bytes=None, seed=None, fixed_value=0)

Flexible generation of AES plaintexts.

This function allows to easily select which part of the plaintext are randomized.

It is important to understand that a flat plaintext is transformed into a column representation. Namely, the plaintext [P0 P1 P2 P3 P4 P5 P6 P7 P8 P9 P10 P11 P12 P13 P14 P15] is manipulated by the AES as:

     cols ->
rows P0  P4  P8  P12
  |  P1  P5  P9  P13
  v  P2  P6  P10 P14
     P3  P7  P11 P15
Parameters:
  • n_traces – number of generated plaintexts

  • random_rows – indices in the range [0, 4) of rows to randomize

  • random_cols – indices in the range [0, 4) of columns to randomize

  • random_bytes – specific byte indices to randomize (given in the state layout above).

  • seed – seed for deterministic generation of inputs.

  • fixed_value – which value is written in fixed part of the plaintext.

Returns:

a Numpy array of shape (n_traces, 16) and dtype np.uint8.
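
For instance (import path indicative):

from secbench.processing.crypto import generate_plaintexts

# 1000 plaintexts where only the first state column (bytes P0..P3) is random;
# every other byte keeps the default fixed_value of 0.
pts = generate_plaintexts(1000, random_cols=[0], seed=42)
assert pts.shape == (1000, 16)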

biased_state_plaintexts(n_traces, key, target_round, target_op, random_bytes=None, seed=None)

Generate plaintexts that induce a biased state in a selected round of an AES execution.

The state at the target round (and target operation) is set to either [0x00, ..., 0x00] or [0xFF, ..., 0xFF]. Then, the selected bytes (random_bytes) are randomized. Finally, the state is inverted to compute the input plaintext.

Parameters:
  • key – AES key (128 bits, 16 bytes).

  • target_round – AES round of the targeted aes operation

  • target_op – targeted aes operation.

  • n_traces – number of generated plaintexts

  • random_bytes – bytes of the biased state left unbiased (i.e., randomized).

  • seed – seed for deterministic generation of inputs.

Returns:

A tuple of Numpy arrays (labels, plaintexts). labels has shape (n_traces,) and contains 0 or 1, indicating whether the state was filled with zeros or ones. The second member, plaintexts, has shape (n_traces, 16).

generate_round_states(key, pts, target_op=<function AesOps.add_round_key>, model='hw')

Compute intermediate states of AES computation for a given operation.

Distances for the first round are computed against the plaintexts.

Parameters:
  • key – AES key (128 bits)

  • pts – AES plaintexts

  • target_op – AES operation that outputs the labels.

  • model – model applied to the states. Three models are supported: “hw” (identity), “hd” (xor) and “shd” (concatenation).

biased_hd_plaintexts(n_traces, key, target_round, biased_row=0, sign='unsigned', seed=None)

Generate plaintexts that induce biased distance between two successive add round key outputs.

The generated bias (0 or 0xFF) is available in the returned labels. One state row (4 bytes) is biased.

Parameters:
  • key – aes key (128 bits)

  • target_round – AES round of the second add round key.

  • n_traces – number of generated plaintexts

  • biased_row – index of the biased state row; if 12, the first two rows are biased.

  • sign – sign of the distance: ‘unsigned’, ‘positive’ or ‘negative’ (default: ‘unsigned’).

Returns:

A tuple of Numpy arrays (labels, plaintexts). labels has shape (n_traces,) and contains 0 or 1, indicating whether the state was filled with zeros or ones. The second member, plaintexts, has shape (n_traces, 16).

CRC8

crc8(array, crc=0)

Lookup-table based CRC8 implementation

Implement a standard CRC with parameters: width=8 poly=0x4d init=0xff refin=true refout=true xorout=0xff check=0xd8 name=”CRC-8/KOOP”

Here is an equivalent C implementation (computed bit by bit, without the lookup table):

uint8_t crc8(uint8_t crc, uint8_t* data, size_t size)
{
    if (data == NULL)
        return 0;
    crc = ~crc & 0xff;
    for (size_t i = 0; i < size; i++) {
        crc ^= data[i];
        for (unsigned k = 0; k < 8; k++) {
            crc = crc & 1 ? (crc >> 1) ^ 0xb2 : crc >> 1;
        }
    }
    return crc ^ 0xff;
}
Example:

>>> crc8(b"DEADBEEF")
61
>>> crc8(b"Hello world")
19

Signal Processing

Fourier Transforms and Filtering

rfft_mag(X, *, output=None, parallel=False, chunk_size=None, dtype=<class 'numpy.float32'>)

Magnitude of the real Fourier transform of the signal.

Parameters:
  • X – a numpy array of shape (n_samples, n_features) or (n_features,).

  • output – if given, compute the result in this array. Otherwise, an output array will be allocated.

  • parallel – if True, processes groups of chunk_size rows of X in parallel. The number of threads is defined by environment variable RAYON_NUM_THREADS. Otherwise, processing is done sequentially.

  • chunk_size – number of rows of X processed in parallel.

  • dtype – output type (only np.float32 is exposed currently).

fft_filter(X, kernel, *, output=None, parallel=False, chunk_size=None, dtype=<class 'numpy.float32'>, two_pass=False)

Filter a given signal using FFT method.

Similar functionality is provided by scipy.signal.lfilter or scipy.signal.filtfilt. However, in comparison, this method has a low memory usage, allows working in-place and supports fine control over parallelism.

Parameters:
  • X – a numpy array of shape (n_samples, n_features) or (n_features,).

  • kernel – a numpy array of shape (n_coeffs,) and dtype np.float32. The kernel must be smaller than the number of features in the input.

  • two_pass – if True, performs left filtering pass then right filtering pass. This provides functionality similar to scipy.signal.filtfilt.

  • output – if given, compute the result in this array. Otherwise, an output array will be allocated.

  • parallel – if True, processes groups of chunk_size rows of X in parallel. The number of threads is defined by environment variable RAYON_NUM_THREADS. Otherwise, processing is done sequentially.

  • chunk_size – number of rows of X processed in parallel.

  • dtype – output type (only np.float32 is exposed currently).

generate_lp_firls(f_low, f_high, fs, numtaps=201)

Build FIR filter coefficients using the least-square approach.

You need to specify f_low and f_high, which respectively define the low and high frequencies between which the frequency response drops (the transition band).
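
A hedged sketch combining generate_lp_firls() with fft_filter() to low-pass filter a batch of traces (import paths indicative; the float32 cast follows the kernel dtype requirement stated above):

import numpy as np

from secbench.processing.signal import fft_filter, generate_lp_firls

fs = 1e6                                # sampling frequency
taps = generate_lp_firls(1e3, 5e3, fs)  # response drops between 1 kHz and 5 kHz

traces = np.random.randn(100, 10_000).astype(np.float32)
filtered = fft_filter(traces, taps.astype(np.float32), two_pass=True,
                      parallel=True, chunk_size=16)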

plot_filter_response(axs, taps, fs, sos=False)

Plot the frequency and phase response of a filter.

Example:

import matplotlib.pyplot as plt

fs = 1e6
taps = generate_lp_firls(1e3, 5e3, fs)

fig, axs = plt.subplots(1, 2)
fig.tight_layout()
plot_filter_response(axs, taps, fs)

plot_fft(ax, x, fs)

Plot the discrete Fourier transform of a signal.

import numpy as np
import matplotlib.pyplot as plt

fs = 1e6
xs = np.linspace(0, 1, int(fs))
ref = np.sin(2 * np.pi * xs * 1000 + 0.5)

fig, ax = plt.subplots()
plot_fft(ax, ref, fs)

spectrogram(ax, x, fs, nperseg=1024, noverlap=None, vmin=None)

Sliding window fourier transform.

Example:

fs = 1e9
_fig, _ax = plt.subplots()
spectrogram(_ax, np.random.random(size=5000), fs, nperseg=512)
plt.show()

Synchronization

We recommend that you take a look at the notebook tutorial (Pattern Matching and Alignment) that shows how to use those functions.

match_correlation(X, kernel, *, output=None, parallel=False, chunk_size=None, dtype=<class 'numpy.float32'>)

Match a given kernel using normalized cross-correlation.

Best match is found at the maximum.

Parameters:
  • X – a numpy array of shape (n_samples, n_features) or (n_features,).

  • kernel – kernel to be matched in the traces.

  • output – if given, compute the result in this array. Otherwise, an output array will be allocated.

  • parallel – if True, processes groups of chunk_size rows of X in parallel. The number of threads is defined by environment variable RAYON_NUM_THREADS. Otherwise, processing is done sequentially.

  • chunk_size – number of rows of X processed in parallel.

  • dtype – output type (only np.float32 is exposed currently).
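
A hedged alignment sketch: locate a reference pattern in every trace and realign with np.roll (synthetic data; import path indicative; the exact width of the returned score array may differ):

import numpy as np

from secbench.processing.signal import match_correlation

rng = np.random.default_rng(0)
traces = rng.normal(size=(50, 5000)).astype(np.float32)

# A distinctive pattern cut from a reference trace serves as the kernel.
kernel = traces[0, 1000:1200].copy()

scores = match_correlation(traces, kernel)
offsets = scores.argmax(axis=1)  # best-match position in each trace
aligned = np.stack([np.roll(t, 1000 - o) for t, o in zip(traces, offsets)])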

match_euclidean(X, kernel, *, output=None, parallel=False, chunk_size=None, dtype=<class 'numpy.float32'>)

Match a given kernel using Euclidean distance.

Best match is found at the minimum.

Note

This function returns the square of the Euclidean distance with the pattern, since taking the square root is a waste of time for SCA alignment. You can manually apply np.sqrt to the result if needed.

Parameters:
  • X – a numpy array of shape (n_samples, n_features) or (n_features,).

  • kernel – kernel to be matched in the traces.

  • output – if given, compute the result in this array. Otherwise, an output array will be allocated.

  • parallel – if True, processes groups of chunk_size rows of X in parallel. The number of threads is defined by environment variable RAYON_NUM_THREADS. Otherwise, processing is done sequentially.

  • chunk_size – number of rows of X processed in parallel.

  • dtype – output type (only np.float32 is exposed currently).

phase_correlation(X, kernel, *, output=None, parallel=False, chunk_size=None, dtype=<class 'numpy.float32'>)

Compute phase correlation between an input signal and a kernel.

Parameters:
  • X – a numpy array of shape (n_samples, n_features) or (n_features,).

  • kernel – a numpy array of shape (n_coeffs,) and dtype np.float32. The kernel must be smaller than the number of features in the input.

  • output – if given, compute the result in this array. Otherwise, an output array will be allocated.

  • parallel – if True, processes groups of chunk_size rows of X in parallel. The number of threads is defined by environment variable RAYON_NUM_THREADS. Otherwise, processing is done sequentially.

  • chunk_size – number of rows of X processed in parallel.

  • dtype – output type (only np.float32 is exposed currently).

Misc

downsample(X, samples_out=None, factor=None)

Downsample data using the Largest-Triangle-Three-Buckets algorithm.

It is possible to downsample an array either to samples_out samples, or by a factor.

Parameters:
  • X – A numpy array of shape (n_samples, n_features) or (n_features, )

  • samples_out – Maximum length of the output trace.

  • factor – Decimation factor (i.e., keep one sample out of every factor samples).

Return (x_s, y_s):

A tuple containing the x and y coordinates of the downsampled trace(s)

sliding_mean(X, *, window_size, padding_value=None, output=None, parallel=False, chunk_size=None, dtype=None)

Compute a sliding mean of the input array.

Parameters:
  • X – a numpy array of shape (n_samples, n_features) or (n_features,).

  • window_size – size of the window (i.e., number of samples added).

  • padding_value – value used for padding initial samples (0 if None).

  • output – if given, compute the result in this array. Otherwise, an output array will be allocated.

  • parallel – if True, processes groups of chunk_size rows of X in parallel. The number of threads is defined by environment variable RAYON_NUM_THREADS. Otherwise, processing is done sequentially.

  • chunk_size – number of rows of X processed in parallel.

  • dtype – output type (only np.float32 is exposed currently).

sliding_var(X, *, window_size, padding_value=None, output=None, parallel=False, chunk_size=None, dtype=None)

Compute a sliding variance of the input array.

Parameters:
  • X – a numpy array of shape (n_samples, n_features) or (n_features,).

  • window_size – size of the window (i.e., number of samples added).

  • padding_value – value used for padding initial samples (0 if None).

  • output – if given, compute the result in this array. Otherwise, an output array will be allocated.

  • parallel – if True, processes groups of chunk_size rows of X in parallel. The number of threads is defined by environment variable RAYON_NUM_THREADS. Otherwise, processing is done sequentially.

  • chunk_size – number of rows of X processed in parallel.

  • dtype – output type (only np.float32 is exposed currently).

sliding_std(X, *, window_size, padding_value=None, output=None, parallel=False, chunk_size=None, dtype=None)

Compute a sliding standard deviation of the input array.

Parameters:
  • X – a numpy array of shape (n_samples, n_features) or (n_features,).

  • window_size – size of the window (i.e., number of samples added).

  • padding_value – value used for padding initial samples (0 if None).

  • output – if given, compute the result in this array. Otherwise, an output array will be allocated.

  • parallel – if True, processes groups of chunk_size rows of X in parallel. The number of threads is defined by environment variable RAYON_NUM_THREADS. Otherwise, processing is done sequentially.

  • chunk_size – number of rows of X processed in parallel.

  • dtype – output type (only np.float32 is exposed currently).

sliding_skew(X, *, window_size, padding_value=None, output=None, parallel=False, chunk_size=None, dtype=None)

Compute a sliding skewness of the input array.

Parameters:
  • X – a numpy array of shape (n_samples, n_features) or (n_features,).

  • window_size – size of the window (i.e., number of samples added).

  • padding_value – value used for padding initial samples (0 if None).

  • output – if given, compute the result in this array. Otherwise, an output array will be allocated.

  • parallel – if True, processes groups of chunk_size rows of X in parallel. The number of threads is defined by environment variable RAYON_NUM_THREADS. Otherwise, processing is done sequentially.

  • chunk_size – number of rows of X processed in parallel.

  • dtype – output type (only np.float32 is exposed currently).

sliding_kurt(X, *, window_size, padding_value=None, output=None, parallel=False, chunk_size=None, dtype=None)

Compute a sliding kurtosis of the input array.

Parameters:
  • X – a numpy array of shape (n_samples, n_features) or (n_features,).

  • window_size – size of the window (i.e., number of samples added).

  • padding_value – value used for padding initial samples (0 if None).

  • output – if given, compute the result in this array. Otherwise, an output array will be allocated.

  • parallel – if True, processes groups of chunk_size rows of X in parallel. The number of threads is defined by environment variable RAYON_NUM_THREADS. Otherwise, processing is done sequentially.

  • chunk_size – number of rows of X processed in parallel.

  • dtype – output type (only np.float32 is exposed currently).