Rule Mining Module#

The ex_fuzzy.rule_mining module provides fuzzy rule mining capabilities for extracting meaningful rules from datasets.

Overview#

This module implements algorithms for discovering frequent fuzzy patterns and generating fuzzy rules using support-based itemset mining.
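To make "support" concrete, here is a minimal pure-Python sketch of how the support of a fuzzy itemset could be computed. It assumes the product as the conjunction operator and a nested-list layout for the membership degrees. Both are choices made for this illustration, and `itemset_support` is a hypothetical helper, not part of ex_fuzzy.

```python
from math import prod


def itemset_support(memberships, itemset):
    """Mean firing strength of an itemset over all samples.

    memberships[s][v][l] is the membership degree of sample s in
    linguistic label l of variable v (layout assumed for this sketch).
    itemset is a sequence of (variable index, label index) pairs.
    """
    strengths = [prod(sample[v][l] for v, l in itemset) for sample in memberships]
    return sum(strengths) / len(strengths)


# Two variables with two labels each, three samples (made-up degrees).
memberships = [
    [[1.0, 0.0], [0.5, 0.5]],
    [[0.5, 0.5], [1.0, 0.0]],
    [[0.0, 1.0], [0.0, 1.0]],
]

# Support of "variable 0 is label 0 AND variable 1 is label 0".
support = itemset_support(memberships, [(0, 0), (1, 0)])
```

An itemset is kept as frequent when this value reaches the support threshold.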

Functions#

rule_search(data, fuzzy_variables[, ...])

Runs the Apriori algorithm on the given dataframe, keeping the itemsets whose support reaches the given threshold.

generate_rules_from_itemsets(itemsets, nAnts)

Given a list of itemsets, it creates the rules for each one and returns a list of rules containing them.

mine_rulebase_support(x, fuzzy_variables[, ...])

Searches the data for frequent associations, given a list with one fuzzy variable per antecedent.

prune_rules_confidence_lift(x, y, rules, ...)

Removes the rules from the rule base that do not meet a minimum value for confidence and lift measures.

simple_mine_rulebase(x[, fuzzy_type, ...])

Search the data for associations that are frequent.

multiclass_mine_rulebase(x, y, fuzzy_variables)

Searches the data for associations that are frequent and have good confidence/lift values, given a list with one fuzzy variable per antecedent.

simple_multiclass_mine_rulebase(x, y, fuzzy_type)

Searches the data for associations that are frequent and have good confidence/lift values, computing the fuzzy variables with a 3-label partition (low, medium, high).

Core Functions#

Rule Generation#

ex_fuzzy.rule_mining.generate_rules_from_itemsets(itemsets, nAnts)

Given a list of itemsets, it creates the rules for each one and returns a list of rules containing them.

Parameters:
  • itemsets (list) – list of tuples (antecedent, linguistic variable value).

  • nAnts (int) – number of possible antecedents.

Returns:

the rules for each itemset.

Return type:

list[RuleSimple]
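As an illustration of the conversion described above, the sketch below turns each itemset into a fixed-length antecedent vector. The plain-list representation and the use of -1 for an unused antecedent are assumptions of this sketch, and `itemsets_to_antecedents` is a hypothetical stand-in rather than the library function, which returns RuleSimple objects.

```python
def itemsets_to_antecedents(itemsets, n_ants):
    """Turn each itemset into a fixed-length antecedent vector.

    Each itemset is a tuple of (antecedent index, linguistic label index)
    pairs. -1 marks an antecedent the rule does not use (a convention
    assumed for this sketch).
    """
    rules = []
    for itemset in itemsets:
        antecedents = [-1] * n_ants
        for variable, label in itemset:
            antecedents[variable] = label
        rules.append(antecedents)
    return rules


itemsets = [((0, 2),), ((1, 0), (3, 1))]
antecedent_vectors = itemsets_to_antecedents(itemsets, n_ants=4)
# antecedent_vectors == [[2, -1, -1, -1], [-1, 0, -1, 1]]
```

Because every antecedent index in an itemset is used as a position in the vector, `n_ants` has to cover all input variables, not only those that appear in a given rule.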

Mining Functions#

ex_fuzzy.rule_mining.mine_rulebase_support(x, fuzzy_variables, support_threshold=0.05, max_depth=3)

Searches the data for frequent associations, given a list with one fuzzy variable per antecedent.

Parameters:
  • x (DataFrame) – the data to mine. Dims: samples x features.

  • fuzzy_variables (list[fuzzyVariable]) – list of the fuzzy variables for each of the input variables.

  • support_threshold (float) – minimum support a rule must reach to avoid being pruned.

  • max_depth (int) – maximum number of antecedents per rule.

Returns:

a RuleBase object with the rules that reached the support threshold.

Return type:

RuleBase
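The role of support_threshold and max_depth can be seen in a simplified level-wise search. The pure-Python sketch below is an illustration under stated assumptions, not the library's implementation: it uses the product as conjunction, a nested-list membership layout, and it builds candidates only from frequent single items instead of pruning on every sub-itemset as a full Apriori does. `frequent_itemsets` is a hypothetical name.

```python
from itertools import combinations
from math import prod


def frequent_itemsets(memberships, support_threshold=0.05, max_depth=3):
    """Level-wise search for frequent fuzzy itemsets (simplified sketch).

    memberships[s][v][l] is the membership degree of sample s in label l
    of variable v (layout assumed for this sketch).
    """
    n_vars = len(memberships[0])

    def support(itemset):
        total = sum(prod(s[v][l] for v, l in itemset) for s in memberships)
        return total / len(memberships)

    # Depth 1: keep the single (variable, label) items that are frequent.
    items = [(v, l) for v in range(n_vars) for l in range(len(memberships[0][v]))]
    frequent_items = [item for item in items if support((item,)) >= support_threshold]
    result = [(item,) for item in frequent_items]

    # Deeper levels: combine frequent items, one label per variable at most.
    for depth in range(2, max_depth + 1):
        for candidate in combinations(frequent_items, depth):
            if len({v for v, _ in candidate}) < depth:
                continue
            if support(candidate) >= support_threshold:
                result.append(candidate)
    return result


memberships = [
    [[1.0, 0.0], [0.5, 0.5]],
    [[0.5, 0.5], [1.0, 0.0]],
    [[0.0, 1.0], [0.0, 1.0]],
]
found = frequent_itemsets(memberships, support_threshold=0.3, max_depth=2)
```

With the product as conjunction, adding an item can never raise the support, which is why infrequent single items can be discarded before deeper levels are explored.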

ex_fuzzy.rule_mining.multiclass_mine_rulebase(x, y, fuzzy_variables, support_threshold=0.05, max_depth=3, confidence_threshold=0.05, lift_threshold=1.05)

Searches the data for associations that are frequent and have good confidence/lift values, given a list with one fuzzy variable per antecedent. Computes a different RuleBase for each class and then combines them into a MasterRuleBase.

Parameters:
  • x (DataFrame) – the data to mine. Dims: samples x features.

  • y (array) – class vector.

  • fuzzy_variables (list[fuzzyVariable]) – list of the fuzzy variables for each of the input variables.

  • support_threshold (float) – minimum support a rule must reach to avoid being pruned.

  • max_depth (int) – maximum number of antecedents per rule.

  • confidence_threshold (float) – minimum confidence value.

  • lift_threshold (float) – minimum lift value.

Returns:

a MasterRuleBase object with the rules that passed the support, confidence and lift thresholds.

Return type:

MasterRuleBase
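The "one rule base per class" idea can be sketched in a few lines of pure Python. The sketch assumes that each class is mined on its own subset of samples, and it takes the miner as a callback so that it stays self-contained. `mine_per_class` is a hypothetical helper, not an ex_fuzzy function.

```python
def mine_per_class(samples, y, mine):
    """Apply `mine` to the samples of each class and collect results by class."""
    groups = {}
    for sample, label in zip(samples, y):
        groups.setdefault(label, []).append(sample)
    return {label: mine(group) for label, group in sorted(groups.items())}


# Stand-in miner for the sketch: it only reports how many samples it received.
per_class = mine_per_class(["a", "b", "c", "d"], [0, 1, 0, 1], mine=len)
# per_class == {0: 2, 1: 2}
```

In the library, the per-class results are RuleBase objects, which are then gathered into a single MasterRuleBase.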

ex_fuzzy.rule_mining.simple_mine_rulebase(x, fuzzy_type=FUZZY_SETS.t1, support_threshold=0.05, max_depth=3)

Search the data for associations that are frequent. Computes the fuzzy variables using a 3 label partition (low, medium, high).

Parameters:
  • x (DataFrame) – the data to mine. Dims: samples x features.

  • fuzzy_type (FUZZY_SETS) – fuzzy type to use.

  • support_threshold (float) – minimum support a rule must reach to avoid being pruned.

  • max_depth (int) – maximum number of antecedents per rule.

Returns:

a RuleBase object with the rules that reached the support threshold.

Return type:

RuleBase
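For readers unfamiliar with a 3-label partition, the sketch below builds low, medium and high memberships over a value range in pure Python. The evenly spaced triangular shapes are an assumption of this illustration. The library may place and shape its fuzzy sets differently (for instance from the data distribution), and `three_label_memberships` is a hypothetical helper.

```python
def three_label_memberships(x, lo, hi):
    """Return (low, medium, high) membership degrees of x on [lo, hi].

    The three degrees always sum to 1 in this sketch.
    """
    mid = (lo + hi) / 2
    x = min(max(x, lo), hi)  # clamp values outside the range
    low = max(0.0, (mid - x) / (mid - lo))
    high = max(0.0, (x - mid) / (hi - mid))
    medium = 1.0 - low - high
    return low, medium, high


degrees = three_label_memberships(2.5, lo=0.0, hi=10.0)
# degrees == (0.5, 0.5, 0.0)
```

Each sample therefore belongs partially to at most two neighbouring labels, and those degrees are what the support computation aggregates.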

ex_fuzzy.rule_mining.simple_multiclass_mine_rulebase(x, y, fuzzy_type, support_threshold=0.05, max_depth=3, confidence_threshold=0.5, lift_threshold=1.1)

Searches the data for associations that are frequent and have good confidence/lift values. Computes a different RuleBase for each class and then combines them into a MasterRuleBase.

Computes the fuzzy variables using a 3 label partition (low, medium, high).

Parameters:
  • x (DataFrame) – the data to mine. Dims: samples x features.

  • y (array) – class vector.

  • fuzzy_type (FUZZY_SETS) – fuzzy type to use.

  • support_threshold (float) – minimum support a rule must reach to avoid being pruned.

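  • max_depth (int) – maximum number of antecedents per rule.

  • confidence_threshold (float) – minimum confidence value.

  • lift_threshold (float) – minimum lift value.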

Returns:

a MasterRuleBase object with the rules that passed the support, confidence and lift thresholds.

Return type:

MasterRuleBase

Rule Pruning#

ex_fuzzy.rule_mining.prune_rules_confidence_lift(x, y, rules, fuzzy_variables, confidence_threshold=0.5, lift_threshold=1.05)

Removes the rules from the rule base that do not meet a minimum value for confidence and lift measures.

Confidence is the ratio between the rules that have both a particular antecedent and consequent and those that have only the antecedent. Lift is the ratio between confidence and expected confidence, where expected confidence is the proportion of samples of the class in the original data.

Parameters:
  • x (DataFrame) – the data to mine. Dims: samples x features.

  • y (array) – class vector.

  • rules (MasterRuleBase) – MasterRuleBase object with the rules to prune.

  • fuzzy_variables (list[fuzzyVariable]) – a list of the fuzzy variables per antecedent.

  • confidence_threshold (float) – minimum confidence required of the rules.

  • lift_threshold (float) – minimum lift required of the rules.
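The two measures can be worked through on a toy case. The pure-Python sketch below assumes that the antecedent is measured by per-sample firing strengths, so that confidence becomes the share of firing mass that falls on the target class. This fuzzy reading and the helper name `confidence_and_lift` are assumptions of the illustration, not the library's code.

```python
def confidence_and_lift(firing_strengths, y, target_class):
    """Confidence and lift of one rule for one class (fuzzy sketch).

    firing_strengths[i] is how strongly the rule antecedent fires on
    sample i, and y[i] is the class of sample i.
    """
    antecedent_mass = sum(firing_strengths)
    joint_mass = sum(f for f, label in zip(firing_strengths, y) if label == target_class)
    confidence = joint_mass / antecedent_mass if antecedent_mass > 0 else 0.0

    # Expected confidence: proportion of samples of the class in the data.
    expected = sum(1 for label in y if label == target_class) / len(y)
    lift = confidence / expected if expected > 0 else 0.0
    return confidence, lift


firing = [0.9, 0.8, 0.1, 0.2]
y = [1, 1, 0, 0]
confidence, lift = confidence_and_lift(firing, y, target_class=1)

# The rule survives pruning only if both thresholds are met.
keep = confidence >= 0.5 and lift >= 1.05
```

Here the confidence is about 0.85 against an expected confidence of 0.5, giving a lift of about 1.7, so the rule would be kept under the default thresholds of this function.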

Examples#

Basic Rule Mining#

import ex_fuzzy.rule_mining as rm
import ex_fuzzy.fuzzy_sets as fs
import pandas as pd
import numpy as np

# Prepare data
X = np.random.rand(100, 4)
y = np.random.randint(0, 3, 100)
data = pd.DataFrame(X)

# Create fuzzy variables
fuzzy_vars = [fs.fuzzyVariable(f"var_{i}", X[:, i], 3, fs.FUZZY_SETS.t1)
              for i in range(X.shape[1])]

# Mine rules
itemsets = rm.rule_search(data, fuzzy_vars, support_threshold=0.1)
# nAnts is the number of possible antecedents, i.e. the number of features
rules = rm.generate_rules_from_itemsets(itemsets, nAnts=X.shape[1])

Multiclass Rule Mining#

# Mine rules for multiclass problem
rules = rm.multiclass_mine_rulebase(
    x=data,
    y=y,
    fuzzy_variables=fuzzy_vars,
    support_threshold=0.05,
    max_depth=3
)

See Also#