Evolutionary Algorithms to Fit the rules

Evolutionary Optimization for Fuzzy Rule Base Learning

This module implements genetic algorithm-based optimization for learning fuzzy rule bases. It provides automatic rule discovery, parameter tuning, and structure optimization for fuzzy inference systems using evolutionary computation techniques.

Main Components:
  • FitRuleBase: Core optimization problem class for genetic algorithms

  • Fitness functions: Multiple objective functions for rule quality assessment

  • Genetic operators: Specialized crossover, mutation, and selection for fuzzy rules

  • Multi-objective optimization: Support for accuracy vs. complexity trade-offs

  • Parallel evaluation: Efficient fitness evaluation using multiple threads

  • Integration with Pymoo: Leverages the Pymoo optimization framework

The module supports automatic learning of:
  • Rule antecedents (which variables and linguistic terms to use)

  • Rule consequents (output class assignments)

  • Rule structure (number of rules, complexity constraints)

  • Membership function parameters (when combined with other modules)

Key Features:
  • Stratified cross-validation for robust fitness evaluation

  • Multiple fitness metrics (accuracy, MCC, F1-score, etc.)

  • Support for Type-1, Type-2, and General Type-2 fuzzy systems

  • Automatic handling of imbalanced datasets

  • Configurable complexity penalties to avoid overfitting

class ex_fuzzy.evolutionary_fit.BaseFuzzyRulesClassifier(nRules=30, nAnts=4, fuzzy_type=FUZZY_SETS.t1, tolerance=0.0, class_names=None, n_linguistic_variables=3, verbose=False, linguistic_variables=None, categorical_mask=None, domain=None, n_class=None, precomputed_rules=None, runner=1, ds_mode=0, fuzzy_modifiers=False, allow_unknown=False)

Bases: ClassifierMixin

Classifier for a fuzzy rule-based system. Supports both precomputed linguistic variables and optimizing them during fitting.

__init__(nRules=30, nAnts=4, fuzzy_type=FUZZY_SETS.t1, tolerance=0.0, class_names=None, n_linguistic_variables=3, verbose=False, linguistic_variables=None, categorical_mask=None, domain=None, n_class=None, precomputed_rules=None, runner=1, ds_mode=0, fuzzy_modifiers=False, allow_unknown=False)

Initializes the optimizer with the corresponding parameters.

Parameters:
  • nRules (int) – number of rules to optimize.

  • nAnts (int) – max number of antecedents to use.

  • fuzzy_type (FUZZY_SETS) – FUZZY_SETS enum value from the fuzzy_sets module. The kind of fuzzy set used.

  • tolerance (float) – tolerance for the dominance score of the rules.

  • n_linguistic_variables (int) – number of linguistic variables per antecedent.

  • verbose – if True, prints the progress of the optimization.

  • linguistic_variables (list[fuzzyVariable]) – list of fuzzyVariable objects. If None (default) the optimization process will initialize and optimize them.

  • domain (list[float]) – list of the limits for each variable. If None (default) the classifier will compute them empirically.

  • n_class (int) – number of classes in the problem. If None (default) the classifier will compute it empirically.

  • precomputed_rules (MasterRuleBase) – MasterRuleBase object. If not None, the classifier will use the rules in the object and ignore the conflicting parameters.

  • runner (int) – number of threads to use. Defaults to 1 (a single thread).

  • ds_mode (int) – mode for the dominance score. 0: normal dominance score, 1: rules without weights, 2: weights optimized for each rule based on the data.

  • fuzzy_modifiers (bool) – if True, the classifier will use the modifiers in the optimization process.

  • allow_unknown (bool) – if True, the classifier will allow the unknown class in the classification process (encoded as the value -1).

customized_loss(loss_function)

Function to customize the loss function used for the optimization.

Parameters:

loss_function – function that takes as input the true labels and the predicted labels and returns a float.

Returns:

None
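As an illustration of the signature described for customized_loss (true labels and predicted labels in, one float out), here is a minimal sketch using balanced accuracy. The metric choice and the name `clf` are assumptions made for this example. This section does not state whether the optimizer treats larger values as better, so check that before relying on it.

```python
from typing import Sequence


def balanced_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Mean per-class recall: true labels and predicted labels in, one float out."""
    recalls = []
    for cls in sorted(set(y_true)):
        # Positions whose true label is this class.
        idx = [i for i, t in enumerate(y_true) if t == cls]
        hits = sum(1 for i in idx if y_pred[i] == cls)
        recalls.append(hits / len(idx))
    return sum(recalls) / len(recalls)


# Hypothetical registration, assuming `clf` is a BaseFuzzyRulesClassifier:
# clf.customized_loss(balanced_accuracy)
```

Any callable with the same two-argument shape could be passed in the same way.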

fit(X, y, n_gen=70, pop_size=30, checkpoints=0, candidate_rules=None, initial_rules=None, random_state=33, var_prob=0.3, sbx_eta=3.0, mutation_eta=7.0, tournament_size=3, bootstrap_size=1000, checkpoint_path='', p_value_compute=False, checkpoint_callback=None)

Fits a fuzzy rule based classifier using a genetic algorithm to the given data.

Parameters:
  • X (array) – numpy array samples x features

  • y (array) – labels. integer array samples (x 1)

  • n_gen (int) – integer. Number of generations to run the genetic algorithm.

  • pop_size (int) – integer. Population size for each generation.

  • checkpoints (int) – integer. Number of checkpoints to save the best rulebase found so far.

  • candidate_rules (MasterRuleBase) – if these rules exist, the optimization process will choose the best rules from this set. If None (default) the rules will be generated from scratch.

  • initial_rules (MasterRuleBase) – if these rules exist, the optimization process will start from this set. If None (default) the rules will be generated from scratch.

  • random_state (int) – integer. Random seed for the optimization process.

  • var_prob (float) – float. Probability of crossover for the genetic algorithm.

  • sbx_eta (float) – float. Eta parameter for the SBX crossover.

  • checkpoint_path (str) – string. Path to save the checkpoints. If empty (default) the checkpoints will be saved in the current directory.

  • mutation_eta (float) – float. Eta parameter for the polynomial mutation.

  • tournament_size (int) – integer. Size of the tournament for the genetic algorithm.

  • checkpoint_callback (Callable[[int, MasterRuleBase], None]) – function. Callback function that is executed at each checkpoint (‘checkpoints’ must be greater than 0). Its arguments are the generation number and the rule base of the checkpoint.

Returns:

None. The classifier is fitted to the data.

Return type:

None
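The following sketch shows one way to combine fit with a checkpoint_callback. The helper names `make_checkpoint_recorder` and `fit_with_checkpoints` are hypothetical, and `checkpoints=5` is an illustrative value. The constructor and fit arguments are taken from the signatures documented here. The ex_fuzzy import is done lazily inside the function, so the callback can be exercised without the library installed.

```python
from typing import Any, Callable, List, Tuple


def make_checkpoint_recorder(
    history: List[Tuple[int, Any]],
) -> Callable[[int, Any], None]:
    """Build a callback matching the documented (generation, rule_base) arguments."""

    def on_checkpoint(generation: int, rule_base: Any) -> None:
        # Keep the generation number and the rule base handed over at that checkpoint.
        history.append((generation, rule_base))

    return on_checkpoint


def fit_with_checkpoints(X, y, history: List[Tuple[int, Any]]):
    """Fit a classifier with the documented defaults plus a few checkpoints."""
    # Imported lazily so the recorder above works without ex_fuzzy installed.
    from ex_fuzzy.evolutionary_fit import BaseFuzzyRulesClassifier

    clf = BaseFuzzyRulesClassifier(nRules=30, nAnts=4, n_linguistic_variables=3)
    clf.fit(
        X,
        y,
        n_gen=70,
        pop_size=30,
        checkpoints=5,  # illustrative value; the callback needs checkpoints > 0
        checkpoint_callback=make_checkpoint_recorder(history),
    )
    return clf
```

After a run, `history` would hold one `(generation, rule_base)` pair per checkpoint, which can be used to inspect how the rule base evolved.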

print_rule_bootstrap_results()

Prints the bootstrap results for each rule.

p_value_validation(bootstrap_size=100)

Computes the permutation and bootstrapping p-values for the classifier and its rules.

Parameters:

bootstrap_size (int) – integer. Number of bootstrap samples to use.

load_master_rule_base(rule_base)

Loads a master rule base to be used in the prediction process.

Parameters:

rule_base (MasterRuleBase) – ruleBase object.

Returns:

None

Return type:

None

explainable_predict(X, out_class_names=False)

Returns the predicted class for each sample.

forward(X, out_class_names=False)

Returns the predicted class for each sample.

Parameters:
  • X (array) – np array samples x features.

  • out_class_names – if True, the output will be the class names instead of the class index.

Returns:

np array samples (x 1) with the predicted class.

Return type:

array

predict(X, out_class_names=False)

Returns the predicted class for each sample.

Parameters:
  • X (array) – np array samples x features.

  • out_class_names – if True, the output will be the class names instead of the class index.

Returns:

np array samples (x 1) with the predicted class.

Return type:

array

predict_proba_rules(X, truth_degrees=True)

Returns the predicted class probabilities for each sample.

Parameters:
  • X (array) – np array samples x features.

  • truth_degrees (bool) – if True, the output will be the truth degrees of the rules. If False, the output will be the association degrees, i.e. the truth degrees multiplied by the weights/dominance of the rules (depending on the inference mode chosen).

Returns:

np array samples x classes with the predicted class probabilities.

Return type:

array
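The difference between truth degrees and association degrees in predict_proba_rules can be sketched as an element-wise product. This is only an illustration of the arithmetic described above. The layout (one row per sample, one column per rule) and the weight values are assumptions, not output from the library.

```python
from typing import List


def association_degrees(
    truth: List[List[float]], weights: List[float]
) -> List[List[float]]:
    """Scale each rule's truth degree by that rule's weight, sample by sample."""
    if any(len(row) != len(weights) for row in truth):
        raise ValueError("each row needs one truth degree per rule")
    return [[t * w for t, w in zip(row, weights)] for row in truth]


# Two samples, three rules: rows are samples, columns are rules (assumed layout).
truth = [[0.8, 0.1, 0.0], [0.2, 0.9, 0.5]]
weights = [0.5, 1.0, 0.25]  # hypothetical rule weights (dominance scores)
assoc = association_degrees(truth, weights)
```

A rule with a high truth degree but a low weight ends up with a small association degree, which is why the two outputs can rank rules differently.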

predict_membership_class(X)

Returns the predicted class memberships for each sample.

Parameters:

X (array) – np array samples x features.

Returns:

np array samples x classes with the predicted class probabilities.

Return type:

array

predict_proba(X)

Returns the predicted class probabilities for each sample.

Parameters:

X (array) – np array samples x features.

Returns:

np array samples x classes with the predicted class probabilities.

Return type:

array

print_rules(return_rules=False, bootstrap_results=False)

Print the rules contained in the fitted rulebase.

plot_fuzzy_variables()

Plot the fuzzy partitions in each fuzzy variable.

rename_fuzzy_variables()

Renames the linguistic labels so that high, low and so on are consistent. This is usually done after an optimization process.

Returns:

None. Names are sorted according to the central point of the fuzzy memberships.

Return type:

None
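The renaming idea in rename_fuzzy_variables (labels follow the central point of each membership function) can be sketched as below. This is not the library's implementation. The trapezoid tuples, the helper name `relabel_by_center`, and the definition of the central point as the mean of the four trapezoid parameters are all assumptions made for the example.

```python
from typing import Dict, List, Sequence

# Label set for three partitions, as listed in FitRuleBase.vl_names.
NAMES_3 = ["Low", "Medium", "High"]


def relabel_by_center(
    partitions: Dict[str, Sequence[float]], names: List[str]
) -> Dict[str, Sequence[float]]:
    """Reassign names so they increase with each set's central point."""
    if len(partitions) != len(names):
        raise ValueError("need exactly one name per fuzzy set")
    # Assumption: the central point is the mean of the trapezoid parameters.
    ordered = sorted(partitions.values(), key=lambda p: sum(p) / len(p))
    return dict(zip(names, ordered))


# After optimization the labels may no longer match the positions of the sets.
drifted = {
    "Low": (4.0, 5.0, 6.0, 7.0),
    "Medium": (0.0, 0.0, 1.0, 2.0),
    "High": (1.5, 2.5, 3.5, 4.5),
}
renamed = relabel_by_center(drifted, NAMES_3)
```

In this example the set that ended up furthest to the left is relabeled "Low", regardless of the name it carried before.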

get_rulebase()

Get the rulebase obtained after fitting the classifier to the data.

Returns:

a matrix format for the rulebase.

Return type:

list[array]

reparametrice_loss(alpha, beta)

Changes the parameters in the loss function.

Note:

Does not check for convexity preservation. Users can set these parameters as they wish.

Parameters:
  • alpha (float) – controls the MCC term.

  • beta (float) – controls the average rule size loss.
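To make the roles of alpha and beta more concrete, the sketch below computes the Matthews correlation coefficient (MCC) for binary labels and then combines it with a rule-size score. The MCC formula is the standard one. The weighted sum in `weighted_score` and the `rule_size_score` input are hypothetical: this section does not give the exact form of the library's loss, so treat the combination purely as an illustration of how two weights can trade off the two terms.

```python
import math
from typing import Sequence


def mcc_binary(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Matthews correlation coefficient for labels in {0, 1}."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention the MCC is taken as 0 when the denominator vanishes.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def weighted_score(
    mcc: float, rule_size_score: float, alpha: float, beta: float
) -> float:
    """Hypothetical weighted sum of the two terms; NOT the library's formula."""
    return alpha * mcc + beta * rule_size_score
```

With this form, raising beta relative to alpha would favor rule bases with a better rule-size score at the expense of MCC.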

__call__(X)

Returns the predicted class for each sample.

Parameters:

X (array) – np array samples x features.

Returns:

np array samples (x 1) with the predicted class.

Return type:

array

class ex_fuzzy.evolutionary_fit.ExploreRuleBases(X, y, nRules, n_classes, candidate_rules, thread_runner=None, tolerance=0.01)

Bases: Problem

Models, as a pymoo problem, the fitting of a rule base to a dataset for a classification problem, given a set of candidate rules and using evolutionary strategies. Supports type-1 and type-2 fuzzy sets.

__init__(X, y, nRules, n_classes, candidate_rules, thread_runner=None, tolerance=0.01)

Constructor method. Initializes the classifier with the number of antecedents, linguistic variables and the kind of fuzzy set desired.

Parameters:
  • X (array) – np array or pandas dataframe samples x features.

  • y (array) – np vector of length samples containing the target classes.

  • n_classes – number of classes in the problem. If None, it will be computed from the data.

  • candidate_rules – MasterRuleBase object. If not None, the classifier will use the rules in the object and ignore the conflicting parameters.

fitness_func(ruleBase, X, y, tolerance, alpha=0.0, beta=0.0, precomputed_truth=None)

Fitness function for the optimization problem.

Parameters:
  • ruleBase – RuleBase object.

  • X – array of train samples. X shape = (n_samples, n_features).

  • y – array of train labels. y shape = (n_samples,).

  • tolerance – float. Tolerance for the size evaluation.

Returns:

float. Fitness value.

class ex_fuzzy.evolutionary_fit.FitRuleBase(X, y, nRules, nAnts, n_classes, thread_runner=None, linguistic_variables=None, n_linguistic_variables=3, fuzzy_type=FUZZY_SETS.t1, domain=None, categorical_mask=None, tolerance=0.01, alpha=0.0, beta=0.0, ds_mode=0, encode_mods=False, allow_unknown=False)

Bases: Problem

Models, as a pymoo problem, the fitting of a rule base for a classification problem using evolutionary strategies. Supports type-1 and interval-valued fuzzy sets (interval type-2).

vl_names = [[], [], ['Low', 'High'], ['Low', 'Medium', 'High'], ['Low', 'Medium', 'High', 'Very High'], ['Very Low', 'Low', 'Medium', 'High', 'Very High']]

__init__(X, y, nRules, nAnts, n_classes, thread_runner=None, linguistic_variables=None, n_linguistic_variables=3, fuzzy_type=FUZZY_SETS.t1, domain=None, categorical_mask=None, tolerance=0.01, alpha=0.0, beta=0.0, ds_mode=0, encode_mods=False, allow_unknown=False)

Constructor method. Initializes the classifier with the number of antecedents, linguistic variables and the kind of fuzzy set desired.

Parameters:
  • X (array) – np array or pandas dataframe samples x features.

  • y (array) – np vector of length samples containing the target classes.

  • nRules (int) – number of rules to optimize.

  • nAnts (int) – max number of antecedents to use.

  • n_classes – number of classes in the problem. If None, it will be computed from the data.

  • linguistic_variables (list[fuzzyVariable]) – list of linguistic variables precomputed. If given, the rest of conflicting arguments are ignored.

  • n_linguistic_variables (int) – number of linguistic variables per antecedent.

  • fuzzy_type – Define the fuzzy set or fuzzy set extension used as linguistic variable.

  • domain (list) – list with the upper and lower domains of each input variable. If None (default) the empirical min/max of each variable is used as its limits.

  • tolerance (float) – float. Tolerance for the size evaluation.

  • alpha (float) – float. Weight for the rulebase size term in the fitness function. (Penalizes number of rules)

  • beta (float) – float. Weight for the average rule size term in the fitness function.

  • ds_mode (int) – int. Mode for the dominance score. 0: normal dominance score, 1: rules without weights, 2: weights optimized for each rule based on the data.

  • encode_mods (bool) – bool. If True, the optimization process will include the modifiers for the membership functions.

  • allow_unknown (bool) – if True, the classifier will allow the unknown class in the classification process (encoded as the value -1).
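When domain is None, the limits are described as the empirical min/max of each input variable. A minimal sketch of that computation on a plain list-of-lists dataset follows. The helper name `empirical_domain` and the `(min, max)` pair layout are assumptions; the exact format the `domain` argument expects is not specified in this section.

```python
from typing import List, Sequence, Tuple


def empirical_domain(X: Sequence[Sequence[float]]) -> List[Tuple[float, float]]:
    """Return a (min, max) pair per feature for a samples x features table."""
    columns = zip(*X)  # transpose: one tuple of values per feature
    return [(min(col), max(col)) for col in columns]


# Three samples, two features.
X = [[0.1, 10.0], [0.7, 12.5], [0.4, 9.0]]
domain = empirical_domain(X)
```

Passing explicit limits instead can be useful when the training sample does not cover the full range the variable can take.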

encode_rulebase(rule_base, optimize_lv, encode_mods=False)

Given a rule base, constructs the corresponding gene associated with that rule base.

GENE STRUCTURE

  • First: antecedents chosen by each rule (index of the antecedent). Size: nAnts * nRules

  • Second: linguistic variables used. Size: nAnts * nRules

  • Third: parameters for the fuzzy partitions of the chosen variables. Size: nAnts * self.n_linguistic_variables * 8|4 (8 for t2, which uses 2 trapezoidal memberships; 4 otherwise)

  • Fourth: consequent classes. Size: nRules

Parameters:
  • rule_base (MasterRuleBase) – rule base object.

  • optimize_lv (bool) – if True, the gene is prepared to optimize the membership functions.

  • encode_mods (bool) – if True, the gene is prepared to encode the modifiers for the membership functions.

Returns:

np array of size self.single_gen_size.

Return type:

array
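The segment sizes listed under GENE STRUCTURE for encode_rulebase can be tallied with a few lines of arithmetic. The sketch below follows those formulas literally. The helper name `gene_segment_sizes` is hypothetical, and the real `self.single_gen_size` may differ, for example when modifiers are encoded or when the linguistic variables are not being optimized.

```python
from typing import Dict


def gene_segment_sizes(
    n_rules: int, n_ants: int, n_linguistic_variables: int, type2: bool = False
) -> Dict[str, int]:
    """Segment lengths of the gene, following the sizes stated in the docs."""
    # 4 parameters per trapezoid; t2 sets use two trapezoids, hence 8.
    params_per_set = 8 if type2 else 4
    return {
        "antecedents": n_ants * n_rules,
        "linguistic_variables": n_ants * n_rules,
        "partition_params": n_ants * n_linguistic_variables * params_per_set,
        "consequents": n_rules,
    }


# Sizes for the documented classifier defaults: 30 rules, 4 antecedents, 3 terms.
sizes = gene_segment_sizes(n_rules=30, n_ants=4, n_linguistic_variables=3)
total = sum(sizes.values())
```

For these values the four segments have lengths 120, 120, 48 and 30, for a total of 318 genes under the stated formulas.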

fitness_func(ruleBase, X, y, tolerance, alpha=0.0, beta=0.0, precomputed_truth=None)

Fitness function for the optimization problem.

Parameters:
  • ruleBase – RuleBase object.

  • X – array of train samples. X shape = (n_samples, n_features).

  • y – array of train labels. y shape = (n_samples,).

  • tolerance – float. Tolerance for the size evaluation.

  • alpha – float. Weight for the accuracy term.

  • beta – float. Weight for the average rule size term.

  • precomputed_truth – np array. If given, it will be used as the truth values for the evaluation.

Returns:

float. Fitness value.