Evolutionary Fit Module#

The ex_fuzzy.evolutionary_fit module implements genetic algorithm-based optimization for learning fuzzy rule bases.

Overview#

This module provides automatic rule discovery, parameter tuning, and structure optimization for fuzzy inference systems using evolutionary computation techniques.

Core Capabilities:

  • Automatic Rule Learning: Discover optimal rule antecedents and consequents

  • Multi-objective Optimization: Balance accuracy vs. interpretability

  • Parallel Evaluation: Efficient fitness computation using threading

  • Cross-validation: Robust fitness evaluation with stratified CV

  • Pymoo Integration: Leverages the powerful Pymoo optimization framework

Optimization Targets:

  • Rule antecedents (variable and linguistic term selection)

  • Rule consequents (output class assignments)

  • Rule structure (number of rules, complexity constraints)

  • Membership function parameters (integration with other modules)
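Each of these targets maps onto constructor arguments of the FitRuleBase problem documented below. A minimal sketch of that mapping (the values and the X_train/y_train data are illustrative):

import ex_fuzzy.evolutionary_fit as evf

# Illustrative mapping from optimization targets to arguments:
#   rule structure            -> nRules (rule count), nAnts (antecedents per rule)
#   antecedent/term selection -> searched over n_linguistic_variables terms
#   membership parameters     -> encoded in the gene (see encode_rulebase)
#   complexity pressure       -> tolerance, alpha (rule count), beta (rule size)
problem = evf.FitRuleBase(
    X_train, y_train,
    nRules=10, nAnts=3, n_classes=3,
    n_linguistic_variables=3,
    tolerance=0.01, alpha=0.05, beta=0.05
)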

Classes#

FitRuleBase(X, y, nRules, nAnts, n_classes)

Models the fitting of a rule base for a classification problem as a pymoo Problem that can be solved with evolutionary strategies.

Optimization Problem#

FitRuleBase#

class ex_fuzzy.evolutionary_fit.FitRuleBase(X, y, nRules, nAnts, n_classes, thread_runner=None, linguistic_variables=None, n_linguistic_variables=3, fuzzy_type=FUZZY_SETS.t1, domain=None, categorical_mask=None, tolerance=0.01, alpha=0.0, beta=0.0, ds_mode=0, encode_mods=False, allow_unknown=False)[source]#

Bases: Problem

Models the fitting of a rule base for a classification problem as a pymoo Problem that can be solved with evolutionary strategies. Supports type-1 fuzzy sets and interval-valued fuzzy sets (IV type-2).

Core Methods

_evaluate

encode_rulebase

Given a rule base, constructs the corresponding gene.

vl_names = [[], [], ['Low', 'High'], ['Low', 'Medium', 'High'], ['Low', 'Medium', 'High', 'Very High'], ['Very Low', 'Low', 'Medium', 'High', 'Very High']]#
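This class attribute holds the default linguistic term names: entry i of the list gives the names used when i linguistic variables per antecedent are requested (e.g. n_linguistic_variables=3 yields 'Low', 'Medium', 'High').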
__init__(X, y, nRules, nAnts, n_classes, thread_runner=None, linguistic_variables=None, n_linguistic_variables=3, fuzzy_type=FUZZY_SETS.t1, domain=None, categorical_mask=None, tolerance=0.01, alpha=0.0, beta=0.0, ds_mode=0, encode_mods=False, allow_unknown=False)[source]#

Constructor method. Initializes the problem with the number of antecedents, the linguistic variables and the kind of fuzzy set desired.

Parameters:
  • X (array) – numpy array or pandas DataFrame of shape (samples, features).

  • y (array) – numpy vector containing the target class of each sample.

  • nRules (int) – number of rules to optimize.

  • nAnts (int) – maximum number of antecedents per rule.

  • n_classes (int) – number of classes in the problem. If None (the default), it is computed from the data.

  • linguistic_variables (list[fuzzyVariable]) – list of precomputed linguistic variables. If given, the conflicting arguments (n_linguistic_variables, fuzzy_type, domain) are ignored.

  • n_linguistic_variables (int) – number of linguistic variables per antecedent.

  • fuzzy_type – fuzzy set or fuzzy set extension used as linguistic variable.

  • domain (list) – list with the upper and lower domains of each input variable. If None (the default), the empirical min/max are used as the limits.

  • tolerance (float) – tolerance for the size evaluation.

  • alpha (float) – weight for the rule-base size term in the fitness function (penalizes the number of rules).

  • beta (float) – weight for the average rule size term in the fitness function.

  • ds_mode (int) – mode for the dominance score. 0: normal dominance score, 1: rules without weights, 2: weights optimized for each rule based on the data.

  • encode_mods (bool) – if True, the optimization process also encodes the modifiers for the membership functions.

  • allow_unknown (bool) – if True, the classifier allows the unknown class (encoded as -1) in the classification process.
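For instance, a problem over precomputed partitions could be set up as follows (a sketch; construct_partitions is the usual utility for building the variables, and X_train/y_train are assumed to exist):

import ex_fuzzy.utils as utils
import ex_fuzzy.fuzzy_sets as fs
import ex_fuzzy.evolutionary_fit as evf

# With precomputed partitions, the conflicting arguments
# (n_linguistic_variables, fuzzy_type, domain) are ignored
lvs = utils.construct_partitions(X_train, fs.FUZZY_SETS.t1)

problem = evf.FitRuleBase(
    X_train, y_train, nRules=8, nAnts=3, n_classes=3,
    linguistic_variables=lvs,
    ds_mode=2,          # optimize a weight for each rule from the data
    allow_unknown=True  # predictions may fall back to the unknown class (-1)
)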

bounds()#
do(X, return_values_of, *args, **kwargs)#
evaluate(X, *args, return_values_of=None, return_as_dictionary=False, **kwargs)#
has_bounds()#
has_constraints()#
ideal_point(*args, use_cache=True, set_cache=True, **kwargs)#
property n_constr#
nadir_point(*args, use_cache=True, set_cache=True, **kwargs)#
name()#
pareto_front(*args, use_cache=True, set_cache=True, **kwargs)#
pareto_set(*args, use_cache=True, set_cache=True, **kwargs)#
encode_rulebase(rule_base, optimize_lv, encode_mods=False)[source]#

Given a rule base, constructs the corresponding gene.

GENE STRUCTURE

  • First: antecedents chosen by each rule. Size: nAnts * nRules (index of the chosen antecedent).

  • Second: linguistic variables used for each antecedent. Size: nAnts * nRules.

  • Third: parameters for the fuzzy partitions of the chosen variables. Size: nAnts * self.n_linguistic_variables * 4 for t1, or * 8 for t2 (two trapezoidal memberships).

  • Fourth: consequent classes. Size: nRules.

Parameters:
  • rule_base (MasterRuleBase) – rule base object.

  • optimize_lv (bool) – if True, the gene is prepared to optimize the membership functions.

  • encode_mods (bool) – if True, the gene is prepared to encode the modifiers for the membership functions.

Returns:

np array of size self.single_gen_size.

Return type:

array
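Put together, the gene length follows directly from this structure. A quick arithmetic sketch for t1 sets (the third block is only present when the membership functions are being optimized):

# Gene length for nRules=10, nAnts=4, n_linguistic_variables=3, t1 sets
nRules, nAnts, n_lv = 10, 4, 3
mf_params = 4  # trapezoidal membership: 4 points (8 if t2: two trapezoids)

gene_size = (
    nAnts * nRules              # first block: chosen antecedent indices
    + nAnts * nRules            # second block: chosen linguistic terms
    + nAnts * n_lv * mf_params  # third block: fuzzy partition parameters
    + nRules                    # fourth block: consequent classes
)
print(gene_size)  # 40 + 40 + 48 + 10 = 138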

fitness_func(ruleBase, X, y, tolerance, alpha=0.0, beta=0.0, precomputed_truth=None)[source]#

Fitness function for the optimization problem.

Parameters:
  • ruleBase – RuleBase object.

  • X – array of train samples. X shape = (n_samples, n_features).

  • y – array of train labels. y shape = (n_samples,).

  • tolerance (float) – tolerance for the size evaluation.

  • alpha (float) – weight for the rule-base size term.

  • beta (float) – weight for the average rule size term.

  • precomputed_truth – np array. If given, it is used as the truth values for the evaluation.

Returns:

float. Fitness value.
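Conceptually, the terms are combined into a single scalar. A sketch of the weighting implied by the parameter descriptions (the exact penalty definitions live in the implementation):

def scalarized_fitness(accuracy, n_rules_penalty, rule_size_penalty,
                       alpha=0.0, beta=0.0):
    # Larger alpha favors fewer rules; larger beta favors shorter rules.
    # With the defaults (alpha = beta = 0.0) the fitness is pure accuracy.
    return accuracy - alpha * n_rules_penalty - beta * rule_size_penalty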


Examples#

Basic Rule Base Optimization#

import ex_fuzzy.evolutionary_fit as evf
import ex_fuzzy.fuzzy_sets as fs
import ex_fuzzy.utils as utils
from pymoo.optimize import minimize
from pymoo.algorithms.soo.nonconvex.ga import GA
from pymoo.operators.sampling.rnd import IntegerRandomSampling
from sklearn.datasets import load_iris

# Load data
X, y = load_iris(return_X_y=True)

# Precompute the linguistic variables (three terms per feature)
antecedents = utils.construct_partitions(X, fs.FUZZY_SETS.t1)

# Set up the optimization problem (documented constructor arguments)
problem = evf.FitRuleBase(
    X=X,
    y=y,
    nRules=10,
    nAnts=4,
    n_classes=3,
    linguistic_variables=antecedents
)

# FitRuleBase is a pymoo Problem, so it can be minimized directly
# with pymoo's own entry point
result = minimize(
    problem,
    GA(pop_size=100, sampling=IntegerRandomSampling()),
    ('n_gen', 50),
    verbose=True
)

print(f"Best fitness: {result.F}")

Multi-objective Optimization#

# Balance accuracy against rule-base complexity through the alpha
# (number of rules) and beta (average rule size) penalty weights
problem = evf.FitRuleBase(
    X=X,
    y=y,
    nRules=15,
    nAnts=4,
    n_classes=3,
    linguistic_variables=antecedents,
    alpha=0.2,  # penalty weight on the number of rules
    beta=0.1    # penalty weight on the average rule size
)

# NSGA-II can be swapped in for the plain GA; it only yields a true
# Pareto front when the problem exposes more than one objective
from pymoo.algorithms.moo.nsga2 import NSGA2

algorithm = NSGA2(
    pop_size=100,
    sampling=IntegerRandomSampling(),
    eliminate_duplicates=True
)

result = minimize(problem, algorithm, ('n_gen', 100), verbose=True)

# Inspect the resulting solutions
solutions = result.X
fitness = result.F

Cross-validation Based Fitness#

# Fitness evaluation uses stratified cross-validation internally
# (see Overview), so the problem itself is set up as usual
problem = evf.FitRuleBase(
    X=X,
    y=y,
    nRules=12,
    nAnts=4,
    n_classes=3,
    linguistic_variables=antecedents
)

# Configure the genetic operators explicitly
from pymoo.algorithms.soo.nonconvex.ga import GA
from pymoo.operators.sampling.rnd import IntegerRandomSampling
from pymoo.operators.crossover.sbx import SBX
from pymoo.operators.mutation.pm import PolynomialMutation

algorithm = GA(
    pop_size=80,
    sampling=IntegerRandomSampling(),
    crossover=SBX(eta=15, prob=0.9),
    mutation=PolynomialMutation(eta=20, prob=0.1),
    eliminate_duplicates=True
)

result = minimize(problem, algorithm, ('n_gen', 75), seed=42)

Parallel Fitness Evaluation#

from multiprocessing import cpu_count
from multiprocessing.pool import ThreadPool
from pymoo.core.problem import StarmapParallelization

# Set up parallel fitness evaluation
n_threads = min(4, cpu_count())
pool = ThreadPool(n_threads)
runner = StarmapParallelization(pool.starmap)

problem = evf.FitRuleBase(
    X=X,
    y=y,
    nRules=20,
    nAnts=4,
    n_classes=3,
    linguistic_variables=antecedents,
    thread_runner=runner  # documented constructor argument
)

# Run with parallel evaluation
result = minimize(
    problem,
    GA(pop_size=120, sampling=IntegerRandomSampling()),
    ('n_gen', 60),
    verbose=True
)

pool.close()
pool.join()
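A thread pool (rather than a process pool) is typically sufficient here: the fitness evaluation is dominated by NumPy operations that release the GIL, and threads avoid pickling the problem object for every worker.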

Custom Fitness Functions#

from sklearn.metrics import accuracy_score

# The documented constructor takes no fitness_function argument, so a
# custom fitness can be plugged in by overriding fitness_func in a
# subclass (a sketch; winning_rule_predict is assumed to be the rule
# base's crisp prediction method)
class CustomFitProblem(evf.FitRuleBase):

    def fitness_func(self, ruleBase, X, y, tolerance, alpha=0.0, beta=0.0,
                     precomputed_truth=None):
        """Custom fitness combining accuracy and a rule-count penalty."""
        y_pred = ruleBase.winning_rule_predict(X)
        accuracy = accuracy_score(y, y_pred)
        complexity_penalty = len(ruleBase.get_rules()) * 0.01  # penalize big rule bases
        return accuracy - complexity_penalty

# Use the custom problem exactly like FitRuleBase
problem = CustomFitProblem(
    X=X,
    y=y,
    nRules=15,
    nAnts=4,
    n_classes=3,
    linguistic_variables=antecedents
)

Advanced Configuration#

# Advanced evolutionary parameters
from pymoo.operators.selection.tournament import TournamentSelection
from pymoo.algorithms.soo.nonconvex.ga import comp_by_cv_and_fitness

algorithm = GA(
    pop_size=150,
    sampling=IntegerRandomSampling(),
    selection=TournamentSelection(func_comp=comp_by_cv_and_fitness, pressure=2),
    crossover=SBX(eta=10, prob=0.85),
    mutation=PolynomialMutation(eta=25, prob=0.15),
    eliminate_duplicates=True
)

# Problem with complexity penalties
problem = evf.FitRuleBase(
    X=X,
    y=y,
    nRules=25,
    nAnts=4,
    n_classes=3,
    linguistic_variables=antecedents,
    tolerance=0.05,  # tolerance for the size evaluation
    alpha=0.1,       # penalty weight on the number of rules
    beta=0.05        # penalty weight on the average rule size
)

# Track evolution progress
from pymoo.core.callback import Callback

class ProgressCallback(Callback):
    def notify(self, algorithm):
        print(f"Generation {algorithm.n_gen}: Best = {algorithm.pop.get('F').min():.4f}")

result = minimize(
    problem,
    algorithm,
    ('n_gen', 100),
    callback=ProgressCallback(),
    verbose=False
)

Hyperparameter Tuning#

from sklearn.base import BaseEstimator, ClassifierMixin
from sklearn.model_selection import GridSearchCV

# Define parameter ranges for the evolutionary optimization
param_grid = {
    'n_rules': [10, 15, 20, 25],
    'pop_size': [50, 100, 150],
    'n_gen': [30, 50, 75],
}

# Custom estimator skeleton wrapping the evolutionary fit
class EvolutionaryClassifier(BaseEstimator, ClassifierMixin):
    def __init__(self, n_rules=15, pop_size=100, n_gen=50):
        self.n_rules = n_rules
        self.pop_size = pop_size
        self.n_gen = n_gen

    def fit(self, X, y):
        # Build a FitRuleBase problem here and minimize it
        return self

    def predict(self, X):
        # Predict with the optimized rule base here
        pass

# Grid search over the evolutionary parameters (requires the
# fit/predict stubs above to be implemented)
classifier = EvolutionaryClassifier()
grid_search = GridSearchCV(classifier, param_grid, cv=3)
grid_search.fit(X, y)

Fitness Functions#

Available Fitness Functions#

The module supports multiple built-in fitness functions:

  • 'accuracy' – Classification accuracy (default)

  • 'balanced_accuracy' – Balanced accuracy for imbalanced datasets

  • 'f1_macro' – Macro-averaged F1 score

  • 'f1_micro' – Micro-averaged F1 score

  • 'f1_weighted' – F1 score weighted by class support

  • 'mcc' – Matthews Correlation Coefficient

  • 'precision_macro' – Macro-averaged precision

  • 'recall_macro' – Macro-averaged recall

  • 'roc_auc' – Area under the ROC curve (binary problems only)

  • 'complexity' – Rule base complexity measure
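Most of these names correspond one-to-one to sklearn.metrics functions. A sketch of such a mapping (illustrative; not the module's internal dispatch table):

from sklearn import metrics

# Hypothetical name-to-metric mapping mirroring the table above
FITNESS_METRICS = {
    'accuracy': metrics.accuracy_score,
    'balanced_accuracy': metrics.balanced_accuracy_score,
    'f1_macro': lambda y, p: metrics.f1_score(y, p, average='macro'),
    'f1_micro': lambda y, p: metrics.f1_score(y, p, average='micro'),
    'f1_weighted': lambda y, p: metrics.f1_score(y, p, average='weighted'),
    'mcc': metrics.matthews_corrcoef,
    'precision_macro': lambda y, p: metrics.precision_score(y, p, average='macro'),
    'recall_macro': lambda y, p: metrics.recall_score(y, p, average='macro'),
    'roc_auc': metrics.roc_auc_score,  # binary problems only
}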
