Rule Mining methods

Module for rule mining on a pandas DataFrame or NumPy array. The methods use the support of the different itemsets to find good rule candidates. A genetic optimizer from the evolutionary_fit module can then search for the best combination of these rules.

ex_fuzzy.rule_mining.generate_rules_from_itemsets(itemsets: list, nAnts: int) → list[RuleSimple]

Given a list of itemsets, creates the rules for each one and returns them as a single list.

Parameters:
  • itemsets – list of tuples (antecedent, linguistic variable value).

  • nAnts – number of possible antecedents.

Returns:

the rules for each itemset.
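To illustrate the idea, here is a minimal sketch of how itemsets could be turned into fixed-length antecedent vectors. It does not use the ex_fuzzy classes: the itemset layout (a tuple of (antecedent index, label index) pairs), the use of -1 for an unused antecedent, and the helper name itemsets_to_antecedent_vectors are all assumptions made for the example.

```python
from typing import List, Tuple

# Assumed layout: an itemset is a tuple of (antecedent index, label index) pairs.
Itemset = Tuple[Tuple[int, int], ...]


def itemsets_to_antecedent_vectors(itemsets: List[Itemset], n_ants: int) -> List[List[int]]:
    """Encode each itemset as a vector of length n_ants (hypothetical helper).

    Position i holds the label index used for antecedent i,
    or -1 when that antecedent does not appear in the itemset.
    """
    vectors = []
    for itemset in itemsets:
        vector = [-1] * n_ants
        for antecedent, label in itemset:
            vector[antecedent] = label
        vectors.append(vector)
    return vectors


itemsets = [((0, 2),), ((0, 0), (2, 1))]
vectors = itemsets_to_antecedent_vectors(itemsets, n_ants=3)
print(vectors)  # [[2, -1, -1], [0, -1, 1]]
```

Each vector has nAnts positions, so rules built from itemsets of different sizes stay comparable.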

ex_fuzzy.rule_mining.mine_rulebase_support(x: DataFrame, fuzzy_variables: list[fuzzyVariable], support_threshold: float = 0.05, max_depth: int = 3) → RuleBase

Searches the data for frequent associations, given a list with the fuzzy variable of each antecedent.

Parameters:
  • x – the data to mine. Dims: samples x features.

  • fuzzy_variables – list of the fuzzy variables for each of the input variables.

  • support_threshold – minimum support a rule must reach; rules below it are pruned.

  • max_depth – maximum number of antecedents per rule.

Returns:

a RuleBase object with the rules that reach the support threshold.
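As a rough illustration of the support criterion, the sketch below computes the fuzzy support of one candidate and compares it with the threshold. The definition used here (mean over samples of the minimum membership degree) is an assumption for the example; the library may use a different t-norm or normalization. The name itemset_support is hypothetical.

```python
import numpy as np


def itemset_support(memberships: np.ndarray) -> float:
    """Support of one candidate (hypothetical helper).

    memberships: array of shape samples x items with degrees in [0, 1].
    """
    # Firing strength per sample: minimum over the items (assumed t-norm).
    firing = memberships.min(axis=1)
    # Support: average firing strength over all samples.
    return float(firing.mean())


memberships = np.array([
    [0.9, 0.8],
    [0.1, 0.7],
    [0.6, 0.6],
    [0.0, 1.0],
])
support = itemset_support(memberships)
keep = support >= 0.05  # same default as support_threshold above
print(support, keep)
```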

ex_fuzzy.rule_mining.multiclass_mine_rulebase(x: DataFrame, y: array, fuzzy_variables: list[fuzzyVariable], support_threshold: float = 0.05, max_depth: int = 3, confidence_threshold: float = 0.05, lift_threshold: float = 1.05) → MasterRuleBase

Searches the data for associations that are frequent and have good confidence/lift values, given a list with the fuzzy variable of each antecedent. Computes a different RuleBase for each class and then combines them into a MasterRuleBase.

Parameters:
  • x – the data to mine. Dims: samples x features.

  • y – class vector.

  • fuzzy_variables – list of the fuzzy variables for each of the input variables.

  • support_threshold – minimum support a rule must reach; rules below it are pruned.

  • max_depth – maximum number of antecedents per rule.

  • confidence_threshold – minimum confidence value.

  • lift_threshold – minimum lift value.

Returns:

a MasterRuleBase object with the rules that reach the support, confidence and lift thresholds.
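The per-class strategy described above can be sketched as follows. This is only an outline under assumptions: mine_fn stands for any mining routine applied to the samples of one class, and the returned dict is a stand-in for the MasterRuleBase container, not the library's actual class.

```python
import numpy as np
import pandas as pd


def mine_per_class(x: pd.DataFrame, y, mine_fn):
    """Run mine_fn on the rows of each class (hypothetical helper).

    Returns a dict mapping each class label to the result of mine_fn.
    """
    y = np.asarray(y)
    results = {}
    for label in np.unique(y):
        # Boolean mask selects the samples that belong to this class.
        results[label] = mine_fn(x[y == label])
    return results


x = pd.DataFrame({"f0": [0.1, 0.2, 0.8, 0.9, 0.7]})
y = [0, 0, 1, 1, 1]
# len is used as a trivial stand-in for a real mining function.
per_class = mine_per_class(x, y, mine_fn=len)
print(per_class[0], per_class[1])  # 2 3
```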

ex_fuzzy.rule_mining.prune_rules_confidence_lift(x: DataFrame, y: array, rules: MasterRuleBase, fuzzy_variables: list[fuzzyVariable], confidence_threshold: float = 0.5, lift_threshold: float = 1.05)

Removes the rules from the rule base that do not meet a minimum value for confidence and lift measures.

Confidence is the ratio between the support of the antecedent together with the consequent and the support of the antecedent alone. Lift is the ratio between the confidence and the expected confidence, which is the proportion of samples of that class in the original data.

Parameters:
  • x – the data to mine. Dims: samples x features.

  • y – class vector.

  • rules – MasterRuleBase object with the rules to prune.

  • fuzzy_variables – a list of the fuzzy variables per antecedent.

  • confidence_threshold – minimum confidence required of the rules.

  • lift_threshold – minimum lift required of the rules.
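A small worked example may help with the confidence and lift definitions given above. It assumes that the antecedent of a rule yields a firing strength in [0, 1] for each sample; the helper name confidence_and_lift and the toy data are illustrative only and are not part of ex_fuzzy.

```python
import numpy as np


def confidence_and_lift(firing, y, target_class):
    """Confidence and lift of one rule (hypothetical helper).

    firing: firing strength of the antecedent for each sample.
    y: class label of each sample.
    """
    firing = np.asarray(firing, dtype=float)
    is_class = np.asarray(y) == target_class
    # Antecedent and consequent together, divided by antecedent alone.
    confidence = firing[is_class].sum() / firing.sum()
    # Expected confidence: proportion of samples of the target class.
    expected = is_class.mean()
    return confidence, confidence / expected


firing = [0.9, 0.8, 0.1, 0.2]
y = [1, 1, 0, 0]
conf, lift = confidence_and_lift(firing, y, target_class=1)
# The rule survives pruning only if it passes both thresholds.
keep = conf >= 0.5 and lift >= 1.05
print(conf, lift, keep)
```

With this data the confidence is roughly 0.85 and the lift roughly 1.7, so the rule passes the default thresholds of 0.5 and 1.05.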

Runs the Apriori algorithm on the given dataframe, keeping the itemsets whose support reaches the threshold.

Parameters:
  • data – Dataframe of shape: samples x features

  • variables (fuzzy) – dict that maps each feature name to a fuzzy variable.

  • support_threshold – minimum support for an itemset to be considered frequent.

Returns:

all the frequent itemsets as a list.
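The level-wise search behind Apriori can be sketched as below. This is a simplified illustration, not the library's implementation: it takes precomputed membership degrees keyed by (feature, label) instead of fuzzy variable objects, it assumes support is the mean of the minimum membership, and the max_depth argument and the function name frequent_itemsets are additions for the example.

```python
from itertools import combinations

import numpy as np


def frequent_itemsets(memberships, support_threshold=0.05, max_depth=3):
    """Level-wise frequent itemset search (hypothetical helper).

    memberships: dict mapping (feature, label) to an array of degrees per sample.
    """
    def support(itemset):
        # Assumed definition: mean over samples of the minimum membership.
        return float(np.min([memberships[item] for item in itemset], axis=0).mean())

    # Level 1: single items that reach the threshold.
    current = [(item,) for item in sorted(memberships)
               if support((item,)) >= support_threshold]
    frequent = list(current)
    depth = 1
    while current and depth < max_depth:
        current_set = set(current)
        items = sorted({item for itemset in current for item in itemset})
        candidates = []
        for combo in combinations(items, depth + 1):
            features = [feature for feature, _ in combo]
            if len(set(features)) < len(features):
                continue  # at most one label per feature
            # Apriori pruning: every smaller subset must already be frequent.
            if all(sub in current_set for sub in combinations(combo, depth)):
                candidates.append(combo)
        current = [c for c in candidates if support(c) >= support_threshold]
        frequent.extend(current)
        depth += 1
    return frequent


memberships = {
    ("a", "low"): np.array([1.0, 0.8, 0.0, 0.0]),
    ("a", "high"): np.array([0.0, 0.2, 1.0, 1.0]),
    ("b", "low"): np.array([0.9, 0.7, 0.1, 0.0]),
    ("b", "high"): np.array([0.1, 0.3, 0.9, 1.0]),
}
result = frequent_itemsets(memberships, support_threshold=0.3, max_depth=2)
print(len(result))  # 6: four single items and two pairs
```

The pruning step relies on the fact that an itemset cannot be frequent unless all of its subsets are, which keeps the number of candidates evaluated at each level small.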

ex_fuzzy.rule_mining.simple_mine_rulebase(x: DataFrame, fuzzy_type: FUZZY_SETS = FUZZY_SETS.t1, support_threshold: float = 0.05, max_depth: int = 3) → RuleBase

Searches the data for frequent associations. Computes the fuzzy variables using a three-label partition (low, medium, high).

Parameters:
  • x – the data to mine. Dims: samples x features.

  • fuzzy_type – fuzzy type to use.

  • support_threshold – minimum support a rule must reach; rules below it are pruned.

  • max_depth – maximum number of antecedents per rule.

Returns:

a RuleBase object with the rules that reach the support threshold.
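To give an idea of what a three-label partition looks like, here is a simplified sketch that assigns each value a membership degree for low, medium and high. It assumes triangular-style memberships anchored at the minimum, median and maximum of the feature, which may differ from the partition the library actually builds; three_label_memberships is a hypothetical name, and the sketch requires the three anchor points to be distinct.

```python
import numpy as np


def three_label_memberships(values):
    """Membership degrees for (low, medium, high), shape samples x 3.

    Hypothetical helper; assumes min < median < max for the given values.
    """
    values = np.asarray(values, dtype=float)
    lo, mid, hi = np.quantile(values, [0.0, 0.5, 1.0])
    # Low decreases linearly from the minimum to the median.
    low = np.clip((mid - values) / (mid - lo), 0.0, 1.0)
    # High increases linearly from the median to the maximum.
    high = np.clip((values - mid) / (hi - mid), 0.0, 1.0)
    # Medium takes the remaining degree, so each row sums to 1.
    medium = 1.0 - low - high
    return np.column_stack([low, medium, high])


degrees = three_label_memberships([0.0, 2.5, 5.0, 7.5, 10.0])
print(degrees)
```

For the value 2.5 in this example, the degrees are 0.5 for low, 0.5 for medium and 0 for high.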

ex_fuzzy.rule_mining.simple_multiclass_mine_rulebase(x: DataFrame, y: array, fuzzy_type: FUZZY_SETS, support_threshold: float = 0.05, max_depth: int = 3, confidence_threshold: float = 0.5, lift_threshold: float = 1.1) → MasterRuleBase

Searches the data for associations that are frequent and have good confidence/lift values. Computes a different RuleBase for each class and then combines them into a MasterRuleBase.

Computes the fuzzy variables using a three-label partition (low, medium, high).

Parameters:
  • x – the data to mine. Dims: samples x features.

  • y – class vector.

  • fuzzy_type – fuzzy type to use.

  • support_threshold – minimum support a rule must reach; rules below it are pruned.

  • max_depth – maximum number of antecedents per rule.

  • confidence_threshold – minimum confidence value.

  • lift_threshold – minimum lift value.

Returns:

a MasterRuleBase object with the rules that reach the support, confidence and lift thresholds.