Rule Mining methods
Module to perform rule mining on a pandas dataframe or numpy array. The methods use the support of the different itemsets to look for good rule candidates. The resulting rules can then be used by a genetic optimizer from the evolutionary_fit module to search for their optimal combination.
- ex_fuzzy.rule_mining.generate_rules_from_itemsets(itemsets: list, nAnts: int) list[RuleSimple] [source]
Given a list of itemsets, it creates the rules for each one and returns a list of rules containing them.
- Parameters:
itemsets – list of tuple (antecedent, linguistic variable value)
nAnts – number of possible antecedents.
- Returns:
the rules for each itemset.
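A minimal sketch of the itemset-to-rule conversion, not the library's code. It assumes each itemset is a tuple of (antecedent index, label index) pairs and that a rule can be encoded as a vector of length nAnts, with -1 marking an antecedent the rule does not use. The function name and the -1 convention are hypothetical choices made for illustration.

```python
def itemsets_to_antecedent_vectors(itemsets, n_ants, unused=-1):
    """Encode each itemset as a fixed-length antecedent vector.

    Each itemset is an iterable of (antecedent_index, label_index) pairs.
    Positions not mentioned in the itemset keep the `unused` marker.
    (Hypothetical encoding, chosen for illustration only.)
    """
    vectors = []
    for itemset in itemsets:
        vector = [unused] * n_ants
        for antecedent_index, label_index in itemset:
            vector[antecedent_index] = label_index
        vectors.append(vector)
    return vectors


# Two itemsets over four possible antecedents.
itemsets = [((0, 2),), ((1, 0), (3, 1))]
vectors = itemsets_to_antecedent_vectors(itemsets, n_ants=4)
print(vectors)  # [[2, -1, -1, -1], [-1, 0, -1, 1]]
```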
- ex_fuzzy.rule_mining.mine_rulebase_support(x: DataFrame, fuzzy_variables: list[fuzzyVariable], support_threshold: float = 0.05, max_depth: int = 3) RuleBase [source]
Search the data for associations that are frequent given a list of fuzzy variables for each antecedent.
- Parameters:
x – the data to mine. Dims: samples x features.
fuzzy_variables – list of the fuzzy variables for each of the input variables.
support_threshold – minimum support a rule must reach; rules below it are pruned.
max_depth – maximum number of antecedents per rule.
- Returns:
a rulebase object with the rules denoted as good.
- ex_fuzzy.rule_mining.multiclass_mine_rulebase(x: DataFrame, y: array, fuzzy_variables: list[fuzzyVariable], support_threshold: float = 0.05, max_depth: int = 3, confidence_threshold: float = 0.05, lift_threshold: float = 1.05) MasterRuleBase [source]
Search the data for associations that are frequent and have good confidence/lift values given a list of fuzzy variables for each antecedent. Computes a different ruleBase for each class and then uses them to form a MasterRuleBase.
- Parameters:
x – the data to mine. Dims: samples x features.
y – class vector.
fuzzy_variables – list of the fuzzy variables for each of the input variables.
support_threshold – minimum support a rule must reach; rules below it are pruned.
max_depth – maximum number of antecedents per rule.
confidence_threshold – minimum confidence value.
lift_threshold – minimum lift value.
- Returns:
a MasterRuleBase object with the rules denoted as good.
- ex_fuzzy.rule_mining.prune_rules_confidence_lift(x: DataFrame, y: array, rules: MasterRuleBase, fuzzy_variables: list[fuzzyVariable], confidence_threshold: float = 0.5, lift_threshold: float = 1.05)[source]
Removes the rules from the rule base that do not meet a minimum value for confidence and lift measures.
Confidence is the ratio between the number of cases that match both a particular antecedent and its consequent and the number of cases that match the antecedent. Lift is the ratio between confidence and expected confidence, where expected confidence is the percentage of samples of that class in the original data.
- Parameters:
x – data to mine. samples x features.
y – class vector.
rules – MasterRuleBase object with the rules to prune.
fuzzy_variables – a list of the fuzzy variables per antecedent.
confidence_threshold – minimum confidence required of each rule.
lift_threshold – minimum lift required of each rule.
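The two measures can be illustrated with a small self-contained sketch. It assumes that each sample contributes its firing strength (a membership degree in [0, 1]) to the counts, which is one common fuzzy generalisation and not necessarily what this function does internally. The helper name `confidence_and_lift` is hypothetical.

```python
import numpy as np


def confidence_and_lift(firing, y, target_class):
    """Return (confidence, lift) of a rule for one class.

    firing: firing strength of the rule antecedent for each sample.
    y: class label of each sample.
    """
    firing = np.asarray(firing, dtype=float)
    in_class = np.asarray(y) == target_class
    total = firing.sum()
    if total == 0:
        # The antecedent never fires, so both measures are undefined; report 0.
        return 0.0, 0.0
    # Share of the antecedent's firing mass that falls in the target class.
    confidence = firing[in_class].sum() / total
    # Expected confidence: proportion of target-class samples in the data.
    expected = in_class.mean()
    lift = confidence / expected if expected > 0 else 0.0
    return float(confidence), float(lift)


firing = np.array([1.0, 0.5, 0.0, 0.5])
y = np.array([1, 1, 0, 0])
conf, lift = confidence_and_lift(firing, y, target_class=1)
print(conf, lift)  # 0.75 1.5
```

With the default thresholds listed above (0.5 for confidence, 1.05 for lift), this example rule would be kept.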
- ex_fuzzy.rule_mining.rule_search(data: DataFrame, fuzzy_variables: dict[fuzzyVariable], support_threshold: float = 0.05, max_depth: int = None) list [source]
Runs the Apriori algorithm on the given dataframe and keeps the itemsets whose support reaches the threshold.
- Parameters:
data – Dataframe of shape: samples x features
fuzzy_variables – dict that maps each feature name to a fuzzy variable.
support_threshold – minimum support for an itemset to be considered frequent.
- Returns:
all the frequent itemsets as a list.
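As an illustration of the level-wise search, here is a minimal Apriori-style sketch over precomputed membership degrees. It is written under stated assumptions and is not the library's implementation: support is taken as the mean over samples of the product of the memberships, an itemset holds at most one label per feature, and the `memberships` input format is hypothetical.

```python
from itertools import combinations

import numpy as np


def frequent_itemsets(memberships, support_threshold=0.05, max_depth=None):
    """memberships maps (feature, label) -> membership degree per sample."""

    def support(itemset):
        # Product t-norm across the items, averaged over the samples.
        degrees = np.prod([memberships[item] for item in itemset], axis=0)
        return float(np.mean(degrees))

    # Level 1: single items that reach the threshold.
    current = [(item,) for item in sorted(memberships)
               if support((item,)) >= support_threshold]
    frequent = list(current)
    depth = 1
    while current and (max_depth is None or depth < max_depth):
        frequent_set = set(current)
        items = sorted({item for itemset in current for item in itemset})
        candidates = set()
        for itemset in current:
            used_features = {feature for feature, _ in itemset}
            for item in items:
                if item[0] in used_features:
                    continue  # at most one label per feature
                candidate = tuple(sorted(itemset + (item,)))
                # Apriori pruning: every smaller subset must be frequent.
                if all(sub in frequent_set
                       for sub in combinations(candidate, depth)):
                    candidates.add(candidate)
        current = sorted(c for c in candidates
                         if support(c) >= support_threshold)
        frequent.extend(current)
        depth += 1
    return frequent


memberships = {
    ("a", "low"): np.array([1.0, 1.0, 0.0, 0.0]),
    ("a", "high"): np.array([0.0, 0.0, 1.0, 1.0]),
    ("b", "low"): np.array([1.0, 0.0, 0.0, 0.0]),
    ("b", "high"): np.array([0.0, 1.0, 1.0, 1.0]),
}
result = frequent_itemsets(memberships, support_threshold=0.3)
```

In this toy data, `("b", "low")` has support 0.25 and is pruned at the first level, so no larger itemset containing it is ever evaluated.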
- ex_fuzzy.rule_mining.simple_mine_rulebase(x: DataFrame, fuzzy_type: FUZZY_SETS = FUZZY_SETS.t1, support_threshold: float = 0.05, max_depth: int = 3) RuleBase [source]
Search the data for associations that are frequent. Computes the fuzzy variables using a 3 label partition (low, medium, high).
- Parameters:
x – the data to mine. Dims: samples x features.
fuzzy_type – fuzzy type to use.
support_threshold – minimum support a rule must reach; rules below it are pruned.
max_depth – maximum number of antecedents per rule.
- Returns:
a rulebase object with the rules denoted as good.
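To give an idea of what a 3-label partition looks like, the sketch below builds low, medium and high memberships for a single feature. It assumes linear (triangular-style) membership functions anchored at the minimum, median and maximum of the data. How the library actually places its partitions is not stated here, so both the anchor points and the function name are hypothetical.

```python
import numpy as np


def three_label_partition(values):
    """Return membership degrees for 'low', 'medium' and 'high'."""
    values = np.asarray(values, dtype=float)
    lo, mid, hi = values.min(), np.median(values), values.max()
    if not lo < mid < hi:
        raise ValueError("need distinct minimum, median and maximum")
    # 'low' is 1 at the minimum and decreases linearly to 0 at the median.
    low = np.clip((mid - values) / (mid - lo), 0.0, 1.0)
    # 'high' is 0 up to the median and increases linearly to 1 at the maximum.
    high = np.clip((values - mid) / (hi - mid), 0.0, 1.0)
    # 'medium' takes the remainder, so the three degrees sum to 1.
    medium = 1.0 - low - high
    return {"low": low, "medium": medium, "high": high}


partition = three_label_partition([0.0, 2.5, 5.0, 7.5, 10.0])
print(partition["medium"])  # [0.  0.5 1.  0.5 0. ]
```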
- ex_fuzzy.rule_mining.simple_multiclass_mine_rulebase(x: DataFrame, y: array, fuzzy_type: FUZZY_SETS, support_threshold: float = 0.05, max_depth: int = 3, confidence_threshold: float = 0.5, lift_threshold: float = 1.1) MasterRuleBase [source]
Search the data for associations that are frequent and have good confidence/lift values. Computes a different RuleBase for each class and then uses them to form a MasterRuleBase.
The fuzzy variables are computed using a 3 label partition (low, medium, high).
- Parameters:
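x – the data to mine. Dims: samples x features.
y – class vector.
fuzzy_type – fuzzy type to use.
support_threshold – minimum support a rule must reach; rules below it are pruned.
max_depth – maximum number of antecedents per rule.
confidence_threshold – minimum confidence value.
lift_threshold – minimum lift value.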
- Returns:
a MasterRuleBase object with the rules denoted as good.