Partition utils

Functions that are not fuzzy-specific but are useful for some computations. Most of them compute quantiles for fuzzy partitions.

ex_fuzzy.utils.assign_time(a: array, observations: list[array]) int[source]

Assigns a temporal moment to a set of observations.

Parameters:
  • a – array of boolean values.

  • observations – list of boolean arrays with the corresponding timestamps.

Returns:

the index of the corresponding time moment for the a-th observation.

Raises:

ValueError if a is not timestamped in any of the observation arrays.
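The lookup described above can be pictured with a minimal sketch. It is not the library's implementation: it assumes the observation is identified by an integer position and that each entry of the list is a boolean mask over all samples. The name assign_time_sketch is hypothetical.

```python
import numpy as np

def assign_time_sketch(index, observations):
    """Return the position of the first mask that is True at `index`.

    Hypothetical helper: `index` is assumed to be an integer sample position
    and `observations` a list of boolean masks, one per time moment.
    """
    for moment, mask in enumerate(observations):
        if mask[index]:
            return moment
    raise ValueError(f"Observation {index} is not timestamped in any mask.")

# Two time moments over four samples; sample 2 belongs to neither.
morning = np.array([True, False, False, True])
evening = np.array([False, True, False, False])
moment_of_obs_1 = assign_time_sketch(1, [morning, evening])
```

Raising ValueError for an observation that no mask covers mirrors the behaviour documented in the Raises entry.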

ex_fuzzy.utils.classify_temp(dates: DataFrame, cutpoints: tuple[str, str], time: str) array[source]

Classifies a set of dates according to the temporal cutpoints, using {time} as the time resolution. Returns a boolean array that is True for the values that fall between the two cutpoints.

Parameters:
  • dates – data observations to cut.

  • cutpoints – points to check.

  • time – time field to use as the criteria.

Returns:

boolean array. True values are those contained between the cutpoints.

ex_fuzzy.utils.compute_quantiles(x, n_partitions)[source]

Computes the quantiles needed for n-partition fuzzy membership.

Parameters:
  • x – numpy array, vector of shape (samples, ).

  • n_partitions – int, number of partitions.

Returns:

numpy array, quantiles for partitioning.

ex_fuzzy.utils.construct_conditional_frequencies(X: array, discrete_time_labels: list[int], initial_ffss: list[FS])[source]

Computes the conditional temporal function for a set of fuzzy sets according to their variation in time.

Parameters:
  • X – numpy array, shape samples x features.

  • discrete_time_labels – discrete time labels.

  • initial_ffss – initial fuzzy set list.

Returns:

conditional frequencies. Array shape (time steps, initial fuzzy sets)

ex_fuzzy.utils.construct_crisp_categorical_partition(x: array, name: str, fz_type_studied: FUZZY_SETS) fuzzyVariable[source]

Creates a fuzzy variable for a categorical feature.

Parameters:
  • x – array with values of the categorical variable.

  • name – name of the fuzzy variable.

  • fz_type_studied – fuzzy set type studied.

Returns:

a fuzzy variable that works as a categorical crisp variable (each fuzzy set is exactly 1 on its own class value and 0 on the rest).
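The crisp behaviour described for the returned variable (membership 1 on one class value, 0 on the rest) is equivalent to a one-hot encoding of the categories. The following sketch illustrates that idea with plain NumPy; crisp_memberships is a hypothetical helper and does not build a fuzzyVariable object.

```python
import numpy as np

def crisp_memberships(x):
    """Return the class values and a (samples, classes) 0/1 membership matrix.

    Hypothetical helper for illustration only.
    """
    classes = np.unique(x)  # sorted unique class values
    # Compare every sample against every class value via broadcasting.
    memberships = (x[:, None] == classes[None, :]).astype(float)
    return classes, memberships

x = np.array(["red", "blue", "red", "green"])
classes, memberships = crisp_memberships(x)
```

Each row of memberships contains a single 1, so every sample belongs fully to exactly one category.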

ex_fuzzy.utils.construct_partitions(X: array, fz_type_studied: FUZZY_SETS = FUZZY_SETS.t1, categorical_mask: array = None, n_partitions=3) list[fuzzyVariable][source]

Returns a list of linguistic variables according to the kind of fuzzy set specified.

Parameters:
  • X – numpy array|pandas dataframe, shape samples x features.

  • fz_type_studied – fuzzy set type studied.

  • categorical_mask – boolean mask vector that indicates, for each variable, whether it is categorical.

  • n_partitions – number of partitions to use in the fuzzy set.

ex_fuzzy.utils.create_multi_tempVariables(X_train: array, time_moments: array, fuzzy_type: FUZZY_SETS) list[list[temporalFS]][source]

Creates a list of lists of temporal fuzzy variables. Each inner list corresponds to a fuzzy partition at a different moment in time. (So, instead of having one linguistic variable for all time moments, you have a different one for each time moment, all representing the same idea.)

Parameters:
  • X_train – numpy array, shape samples x features.

  • time_moments – time moments. Array shape (samples,). Each value is an integer denoting the n-th time moment of that observation.

  • fuzzy_type – fuzzy set type to use.

Returns:

list of lists of temporal fuzzy variables.

ex_fuzzy.utils.create_tempVariables(X_train: array, time_moments: array, precomputed_partitions: list[fuzzyVariable]) list[temporalFS][source]

Creates a list of temporal fuzzy variables.

Parameters:
  • X_train – numpy array, shape samples x features.

  • time_moments – time moments. Array shape (samples,). Each value is an integer denoting the n-th time moment of that observation.

  • precomputed_partitions – precomputed partitions for each feature.

Returns:

list of temporal fuzzy variables.

ex_fuzzy.utils.extend_fuzzy_sets_enum(new_fuzzy_sets_enum: FUZZY_SETS) list[FUZZY_SETS][source]

Extends the fuzzy sets enum with additional types.

Parameters:

new_fuzzy_sets_enum – fuzzy sets enum with the additional types.

Returns:

extended fuzzy sets enum.

ex_fuzzy.utils.fixed_quantile_compute(x: array) list[float][source]

Computes a series of quantiles for each feature in a numpy array. Quantiles: [0, 0.20, 0.30, 0.45, 0.55, 0.7, 0.8, 1]

Parameters:

x – array samples x features

Returns:

list of quantiles for each feature
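As an illustration of the quantile levels listed above, the sketch below computes them per feature with np.quantile. The function name fixed_quantiles_sketch is hypothetical, and the (levels, features) orientation of the result is an assumption, not a statement about what the library returns.

```python
import numpy as np

# Quantile levels taken from the description of fixed_quantile_compute.
QUANTILE_LEVELS = [0, 0.20, 0.30, 0.45, 0.55, 0.7, 0.8, 1]

def fixed_quantiles_sketch(x):
    """Compute the fixed quantile levels for each column of `x`.

    Hypothetical helper: returns an array of shape (levels, features).
    """
    # axis=0 reduces over samples, so each feature is handled independently.
    return np.quantile(x, QUANTILE_LEVELS, axis=0)

rng = np.random.default_rng(0)
x = rng.normal(size=(200, 3))          # 200 samples, 3 features
quantiles = fixed_quantiles_sketch(x)  # shape (8, 3)
```

The first and last rows are the per-feature minimum and maximum, since levels 0 and 1 are included.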

ex_fuzzy.utils.gt2_fuzzy_partitions_dataset(x0: array, resolution_exp: int = 2, n_partition=3) list[fuzzyVariable][source]

Partitions the dataset features into different fuzzy variables using gt2 fuzzy sets. The parameters are fixed in advance. Use it for simple testing and as an initial solution.

Parameters:
  • x0 – numpy array|pandas dataframe, shape samples x features.

  • resolution_exp – exponent of the resolution of the partition. Default is 2, which means 0.01. (Number of significant decimals)

  • n_partition – number of partitions to use in the fuzzy variables.

Returns:

list of fuzzy variables.

ex_fuzzy.utils.mcc_loss(ruleBase: RuleBase, X: array, y: array, tolerance: float, alpha: float = 0.99, beta: float = 0.0125, gamma: float = 0.0125, precomputed_truth=None) float[source]

Fitness function for the optimization problem. Uses only the MCC, ignores the size penalization terms.

Parameters:
  • ruleBase – RuleBase object

  • X – array of train samples. X shape = (n_samples, n_features)

  • y – array of train labels. y shape = (n_samples,)

  • tolerance – float. Tolerance for the size evaluation.

  • alpha – ignored.

  • beta – ignored.

  • gamma – ignored.

Returns:

float. Fitness value.
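For reference, the Matthews correlation coefficient (MCC) that this fitness is based on can be computed from the confusion counts. The sketch below covers only the binary case and works on label arrays directly; it does not use a RuleBase, and how the library turns the MCC into a fitness value is not specified here. binary_mcc is a hypothetical name.

```python
import numpy as np

def binary_mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (hypothetical helper)."""
    y_true = np.asarray(y_true).astype(bool)
    y_pred = np.asarray(y_pred).astype(bool)
    tp = float(np.sum(y_true & y_pred))    # true positives
    tn = float(np.sum(~y_true & ~y_pred))  # true negatives
    fp = float(np.sum(~y_true & y_pred))   # false positives
    fn = float(np.sum(y_true & ~y_pred))   # false negatives
    denom = np.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumed convention: return 0 when a marginal is empty and MCC is undefined.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

y_true = [1, 1, 0, 0]
perfect = binary_mcc(y_true, [1, 1, 0, 0])   # all predictions correct
inverted = binary_mcc(y_true, [0, 0, 1, 1])  # all predictions wrong
constant = binary_mcc(y_true, [1, 1, 1, 1])  # a single class predicted
```

The MCC ranges from -1 (total disagreement) to 1 (perfect prediction), with 0 meaning no better than chance.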

ex_fuzzy.utils.partition3_quantile_compute(x: array) list[float][source]

Computes a series of quantiles partitioning the variable in 3 cases.

Quantiles: [0.00, 0.20, 0.50, 0.80, 1.00]

Parameters:

x – array samples x features

Returns:

list of quantiles for each feature

ex_fuzzy.utils.quartile_compute(x: array) list[float][source]

Computes the quartiles for each feature.

Parameters:

x – array samples x features

Returns:

list of quartiles for each feature

ex_fuzzy.utils.t1_fuzzy_partitions_dataset(x0: array, n_partition=3) list[fuzzyVariable][source]

Partitions the dataset features into different fuzzy variables. The parameters are fixed in advance. Use it for simple testing and as an initial solution.

Parameters:
  • x0 – numpy array|pandas dataframe, shape samples x features.

  • n_partition – number of partitions to use in the fuzzy variables.

Returns:

list of fuzzy variables.

ex_fuzzy.utils.t1_n_partition_parameters(x, n_partitions)[source]

Partitions the fuzzy variable in n trapezoidal memberships.

Parameters:
  • x – numpy array, matrix of shape (samples, variables).

  • n_partitions – int, number of partitions.

Returns:

numpy array, tensor of shape (variables, n_partitions, 4) containing the trapezoidal parameters.
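To make the (variables, n_partitions, 4) layout concrete, here is a minimal sketch that builds overlapping trapezoids from evenly spaced quantile knots. The knot placement is an assumption chosen for illustration; the levels the library actually uses are not documented in this section. trapezoid_parameters_sketch is a hypothetical name.

```python
import numpy as np

def trapezoid_parameters_sketch(x, n_partitions):
    """Build trapezoid parameters (a, b, c, d) per variable and partition.

    Hypothetical scheme: 2 * n_partitions evenly spaced quantile knots, with
    neighbouring trapezoids sharing their sloped sides.
    """
    n_knots = 2 * n_partitions
    levels = np.linspace(0.0, 1.0, n_knots)
    knots = np.quantile(x, levels, axis=0).T  # shape (variables, n_knots)
    params = np.empty((x.shape[1], n_partitions, 4))
    for i in range(n_partitions):
        # Clamp the outer feet so the first and last sets become shoulders.
        idx = [max(2 * i - 1, 0), 2 * i, 2 * i + 1, min(2 * i + 2, n_knots - 1)]
        params[:, i, :] = knots[:, idx]
    return params

rng = np.random.default_rng(0)
x = rng.uniform(size=(100, 2))              # 100 samples, 2 variables
params = trapezoid_parameters_sketch(x, 3)  # shape (2, 3, 4)
```

In this scheme the falling side of one trapezoid spans the same interval as the rising side of the next, so neighbouring memberships cross inside that interval.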

ex_fuzzy.utils.t1_simple_partition(x: array) array[source]

Partitions the fuzzy variable in four trapezoidal memberships.

Parameters:

x – numpy array, vector of shape (samples, ).

Returns:

numpy array, tensor of shape (variables, 4, 4).

ex_fuzzy.utils.t1_simple_triangular_partition(x: array, n_partitions: int = 3) list[array][source]

Partitions the dataset features into different fuzzy variables. The parameters are fixed in advance. Use it for simple testing and as an initial solution.

Parameters:

x – numpy array|pandas dataframe, shape samples x features.

Returns:

list of fuzzy variables.

ex_fuzzy.utils.t1_simple_triangular_partition_parameters(x: array) array[source]

Partitions the fuzzy variable in three triangular memberships.

Parameters:

x – numpy array, vector of shape (samples, ).

Returns:

numpy array, tensor of shape (variables, 3, 3).
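A (variables, 3, 3) array of triangular parameters can be read as three (a, b, c) triples per variable. The sketch below builds such an array from the minimum, median and maximum of each variable and evaluates the memberships of one value. The choice of those three anchors is an assumption made for illustration, and both function names are hypothetical.

```python
import numpy as np

def three_triangles_sketch(x):
    """Return (variables, 3, 3) triangular parameters from min, median and max."""
    lo, mid, hi = np.quantile(x, [0.0, 0.5, 1.0], axis=0)  # each (variables,)
    low = np.stack([lo, lo, mid], axis=-1)
    medium = np.stack([lo, mid, hi], axis=-1)
    high = np.stack([mid, hi, hi], axis=-1)
    return np.stack([low, medium, high], axis=1)

def triangular_membership(values, a, b, c):
    """Evaluate a triangle (a, b, c); a degenerate side acts as an open shoulder."""
    values = np.asarray(values, dtype=float)
    rising = (values - a) / (b - a) if b > a else np.ones_like(values)
    falling = (c - values) / (c - b) if c > b else np.ones_like(values)
    return np.clip(np.minimum(rising, falling), 0.0, 1.0)

x = np.array([[0.0], [5.0], [10.0]])  # 3 samples, 1 variable
params = three_triangles_sketch(x)    # shape (1, 3, 3)
memberships = [float(triangular_membership(2.5, *params[0, k])) for k in range(3)]
```

With these anchors the three memberships of any value inside the observed range sum to 1.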

ex_fuzzy.utils.t1_three_partition(x: array) array[source]

Partitions the fuzzy variable in three trapezoidal memberships.

Parameters:

x – numpy array, vector of shape (samples, ).

Returns:

numpy array, tensor of shape (variables, 3, 4).

ex_fuzzy.utils.t2_fuzzy_partitions_dataset(x0: array, n_partition=3) list[fuzzyVariable][source]

Partitions the dataset features into different fuzzy variables using interval-valued (IV) fuzzy sets. The parameters are fixed in advance. Use it for simple testing and as an initial solution.

Parameters:
  • x0 – numpy array|pandas dataframe, shape samples x features.

  • n_partition – number of partitions to use in the fuzzy variables.

Returns:

list of fuzzy variables.

ex_fuzzy.utils.t2_n_partition_parameters(x, n_partitions)[source]

Partitions the fuzzy variable in n trapezoidal memberships.

Parameters:
  • x – numpy array, matrix of shape (samples, variables).

  • n_partitions – int, number of partitions.

Returns:

numpy array, tensor of shape (variables, n_partitions, 4, 2) containing the trapezoidal parameters.

ex_fuzzy.utils.temporal_assemble(X: array, y: array, temporal_moments: list[array])[source]

Assembles the data in the temporal moments in order to have partitions with balanced time moments in each one.

Parameters:
  • X – data observations.

  • y – labels.

  • temporal_moments – list of boolean arrays. True values are those contained between the cutpoints in each moment.

Returns:

tuple of lists of data and labels for each temporal moment. The first tuple is (X_train, X_test, y_train, y_test); the second tuple is (train temporal moments, test temporal moments).

ex_fuzzy.utils.temporal_cuts(X: DataFrame, cutpoints: list[tuple[str, str]], time_resolution: str = 'hour') list[array][source]

Returns a list of boolean indexes for each temporal moment. Performs the cuts between time steps using the cutpoints list.

Parameters:
  • X – data observations to cut into temporal moments.

  • temporal_moments – list of temporal moments to cut.

  • cutpoints – list of tuples with the cutpoints for each temporal moment.

  • time_resolution – time field to use as the criteria.

Returns:

list of boolean arrays. True values are those contained between the cutpoints in each moment.
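The idea of one boolean mask per temporal moment can be sketched without the library. The example below makes several simplifying assumptions that differ from the documented signature: it takes a plain list of datetime objects instead of a DataFrame, uses integer hours as cutpoints instead of strings, and treats each interval as closed on the left and open on the right. temporal_cuts_sketch is a hypothetical name.

```python
from datetime import datetime

import numpy as np

def temporal_cuts_sketch(timestamps, cutpoints):
    """Return one boolean mask per (start_hour, end_hour) pair.

    Hypothetical helper: intervals are assumed to be [start, end) in hours.
    """
    hours = np.array([t.hour for t in timestamps])
    return [(hours >= start) & (hours < end) for start, end in cutpoints]

timestamps = [datetime(2024, 1, 1, h) for h in (1, 9, 14, 22)]
cutpoints = [(0, 8), (8, 16), (16, 24)]  # night, day, evening
masks = temporal_cuts_sketch(timestamps, cutpoints)
```

Because the assumed intervals do not overlap and cover the whole day, every observation is True in exactly one mask.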