Rule-based Classifiers in data mining: greedy approach and rule evaluation

Slides from Università Di Pisa about Rule-based Classifiers. The Pdf introduces rule-based classifiers in data mining, explaining the greedy approach for rule extraction and evaluation metrics like Accuracy, Laplace, and M-estimate. This University Computer science material is well-structured for autonomous study.

See more

36 Pages

Rule-based Classifiers
Anna Monreale
Computer Science Department
Introduction to Data Mining, 2
nd
Edition
Chapter 5.1
Rule-based Classifier
Classify records by using a collection of “if…then…”
rules
Rule: (Condition) ® y
where
Condition is a conjunction of tests on attributes
y is the class label
Examples of classification rules:
(Blood Type=Warm) Ù (Lay Eggs=Yes) ® Birds
(Taxable Income < 50K) Ù (Refund=Yes) ® Evade=No

Unlock the full PDF for free

Sign up to get full access to the document and start transforming it with AI.

Preview

Rule-based Classifiers

Anna Monreale
Computer Science Department
Introduction to Data Mining, 2nd Edition
Chapter 5.1

UNIVERSITÀ DI PISA
IN SUPREN
GNITATIS .
.
1343Rule-based Classifier
. Classify records by using a collection of "if ... then ... "
rules
· Rule:
(Condition) -> y
- where
· Condition is a conjunction of tests on attributes
· y is the class label
- Examples of classification rules:
. (Blood Type=Warm) ^ (Lay Eggs=Yes) -> Birds
· (Taxable Income < 50K) ^ (Refund=Yes) -> Evade=No

UNIVERSITÀ DI PISA
REMÆ DIG
IN SUPRE
.
1343

Rule-based Classifier Example

R1: (Give Birth = no) ^ (Can Fly = yes) -> Birds
R2: (Give Birth = no) A (Live in Water = yes) -> Fishes
R3: (Give Birth = yes) ^ (Blood Type = warm) -> Mammals
R4: (Give Birth = no) A (Can Fly = no) -> Reptiles
R5: (Live in Water = sometimes) -> Amphibians

NameBlood TypeGive BirthCan FlyLive in WaterClass
humanwarmyesnonomammals
pythoncoldnononoreptiles
salmoncoldnonoyesfishes
whalewarmyesnoyesmammals
frogcoldnonosometimesamphibians
komodocoldnononoreptiles
batwarmyesyesnomammals
pigeonwarmnoyesnobirds
catwarmyesnonomammals
leopard sharkcoldyesnoyesfishes
turtlecoldnonosometimesreptiles
penguinwarmnonosometimesbirds
porcupinewarmyesnonomammals
eelcoldnonoyesfishes
salamandercoldnonosometimesamphibians
gila monstercoldnononoreptiles
platypuswarmnononomammals
owlwarmnoyesnobirds
dolphinwarmyesnoyesmammals
eaglewarmnoyesnobirds

UNIVERSITÀ DI PISA
IN SUPRE
EMÆ DIG
GNITATIS .
.
1343

Application of Rule-Based Classifier

  • A rule r covers an instance x if the attributes of the
    instance satisfy the condition of the rule

R1: (Give Birth = no) ^ (Can Fly = yes) -> Birds
R2: (Give Birth = no) A (Live in Water = yes) -> Fishes
R3: (Give Birth = yes) A (Blood Type = warm) -> Mammals
R4: (Give Birth = no) ^ (Can Fly = no) -> Reptiles
R5: (Live in Water = sometimes) -> Amphibians

NameBlood TypeGive BirthCan FlyLive in WaterClass
hawkwarmnoyesno?
grizzly bearwarmyesnono?

The rule R1 covers a hawk => Bird
The rule R3 covers the grizzly bear => Mammal

UNIVERSITÀ DI PISA
IN SUPRE
.
EMÆ DIG
GNITATIS .
1343

Rule Coverage and Accuracy

  • Coverage of a rule:
    • Fraction of records that
      satisfy the antecedent
      of a rule
  • Accuracy of a rule:
    • Fraction of records that
      satisfy the antecedent
      that also satisfy the
      consequent of a rule
TidRefundMarital
Status
Taxable
Income
Class
1YesSingle125KNo
2NoMarried100KNo
3NoSingle70KNo
4YesMarried120KNo
5NoDivorced95KYes
6NoMarried60KNo
7YesDivorced220KNo
8NoSingle85KYes
9NoMarried75KNo
10NoSingle90KYes

(Status=Single) -> No
Coverage = 40%, Accuracy = 50%

UNIVERSITÀ DI PISA
IN SUPREM
GNITATIS .
.
1343

How a Rule-based Classifier Works

R1: (Give Birth = no) A (Can Fly = yes) -> Birds
R2: (Give Birth = no) ^ (Live in Water = yes) -> Fishes
R3: (Give Birth = yes) A (Blood Type = warm) -> Mammals
R4: (Give Birth = no) A (Can Fly = no) -> Reptiles
R5: (Live in Water = sometimes) -> Amphibians

NameBlood TypeGive BirthCan FlyLive in WaterClass
lemurwarmyesnono?
turtlecoldnonosometimes?
dogfish sharkcoldyesnoyes?

A lemur triggers rule R3, so it is classified as a mammal
A turtle triggers both R4 and R5
A dogfish shark triggers none of the rules

EMÆ DIG
UNIVERSITÀ DI PISA
IN SUPRE
.
GNITATIS .
1343

Characteristics of Rule Sets: Strategy 1

  • Mutually exclusive rules
    • Classifier contains mutually exclusive rules if the rules are
      independent of each other
    • Every record is covered by at most one rule
  • Exhaustive rules
    • Classifier has exhaustive coverage if it accounts for every
      possible combination of attribute values
    • Each record is covered by at least one rule

UNIVERSITÀ DI PISA
EMÆ DIG
IN SUPRE
GNITATIS .
.
1343

Characteristics of Rule Sets: Strategy 2

  • Rules are not mutually exclusive
    • A record may trigger more than one rule
    • Solution?
  • Ordered rule set
  • Unordered rule set - use voting schemes
  • Rules are not exhaustive
    • A record may not trigger any rules
    • Solution?
  • Use a default class

UNIVERSITÀ DI PISA
EMÆ DIG
IN SUPRE
GNITATIS .
.
1343

Ordered Rule Set

  • Rules are rank ordered according to their priority
    • An ordered rule set is known as a decision list
  • When a test record is presented to the classifier
    • It is assigned to the class label of the highest ranked rule it has triggered
    • If none of the rules fired, it is assigned to the default class (typically
      majority class)

R1: (Give Birth = no) ^ (Can Fly = yes) -> Birds
R2: (Give Birth = no) ^ (Live in Water = yes) -> Fishes
R3: (Give Birth = yes) A (Blood Type = warm) -> Mammals
R4: (Give Birth = no) A (Can Fly = no) -> Reptiles
R5: (Live in Water = sometimes) -> Amphibians

NameBlood TypeGive BirthCan FlyLive in WaterClass
turtlecoldnonosometimes?

UNIVERSITÀ DI PISA
REMA DIG
IN SUPRE
GNITATIS .
.
1343

Rule Ordering Schemes

  • Rule-based ordering
    • Individual rules are ranked based on their quality
  • Class-based ordering
    • Rules that belong to the same class appear together
  • Unordered rules
    • vote schema

UNIVERSITÀ DI PISA
EMÆ DIG
IN SUPRE
GNITATIS .
.
1343

Rule-based Ordering Scheme

  • Individual rules are ranked based on their quality
    measured by accuracy, coverage, or size (number
    of attribute tests in the rule antecedent)
  • The rule set is known as a decision list
  • The record X is classified by the rule with the
    highest priority and any other rule that satisfies is
    ignored
  • Each rule in a decision list implies the negation of
    the rules that come before it in the list
  • Rules in a decision list are more difficult to interpret.

Rule-based Ordering Example

(Refund=Yes) == > No
(Refund=No, Marital Status={Single, Divorced},
Taxable Income<80K) == > No
(Refund=No, Marital Status={Single, Divorced},
Taxable Income>80K) == > Yes
(Refund=No, Marital Status={Married}) == > No

UNIVERSITÀ DI PISA
IN SUPRE
EMÆ DIG
GNITATIS .
.
1343

Class-based Ordering Scheme

  • Rules that belong to the same class
    appear together
  • The classes are sorted in order of
    decreasing "importance" such as by
    decreasing order of prevalence.
  • Alternatively, they may be sorted based
    on the misclassification cost per
    class.
  • Within each class, the rules are not
    ordered

Class-based Ordering Example

(Refund=Yes) == > No
(Refund=No, Marital Status={Single, Divorced},
Taxable Income<80K) == > No
(Refund=No, Marital Status={Married}) == > No
(Refund=No, Marital Status={Single, Divorced},
Taxable Income>80K) == > Yes

UNIVERSITÀ DI PISA
IN SUPRE
EMÆ DIG
GNITATIS .
.
1343

Building Classification Rules

  • Direct Method:
    • Extract rules directly from data
    • Examples: RIPPER, CN2, Holte's 1R
  • Indirect Method:
    • Extract rules from other classification models
      (e.g. decision trees, neural networks, etc).
    • Examples: C4.5rules

UNIVERSITÀ DI PISA
IN SUPRE
.
EMÆ DIG
GNITATIS .
1343

Direct Method: Sequential Covering

  1. Start from an empty rule
  2. For each class
    1. Grow a rule using the Learn-One-Rule function
    2. Remove training records covered by the rule
  3. Repeat Step until stopping criterion is met

UNIVERSITÀ DI PISA
IN SUPRE
.
EMÆ DIG
GNITATIS .
1343

Example of Sequential Covering

-
+
-
-
+ +
-
+ +
-
-
-
-
+
-
1
+
+
+
+
+
+
+
(i) Original Data
I
+ +
-
+
+ :
I
-
I
+
+
+
+
+
-
(ii) Step 1

UNIVERSITÀ DI PISA
EMÆ DIG
IN SUPRE
GNITATIS .
.
1343
+
+
+
+ +

Example of Sequential Covering (Continued)

R1
-
-
+
-
+
+
+
+
+
+
+
(iii) Step 2
-
R1
-
-
+
R2
++
+
(iv) Step 3

UNIVERSITÀ DI PISA
EMÆ DIG
IN SUPRE
.
1343
GNITATIS .

Learn-One-Rule Function

  • The goal is to extract a classification rule covering
    many positive records and none (few) negative ones
  • Finding optimal rule requires high computational
    time
  • Greedy strategy by refining an initial rule

UNIVERSITÀ DI PISA
EMÆ DIG
IN SUPRE
GNITATIS .
.
1343

Greedy Approach for Rule Extraction

  • The approach for growing rules is greedy
    • Based on some evaluation measure
  • Rules are extracted one class per time
  • The criterion for deciding the order of the
    class to consider depends on:
    • Class prevalence
    • Miss classification error for a given class

UNIVERSITÀ DI PISA
EMÆ DIG
IN SUPRE
GNITATIS .
.
1343

Rule Growing Strategies

Two common strategies:
The initial rule contains the same conjuncts
as the attribute values a selected record

Yes: 3
No: 4
Refund=No,
Status=Single,
Income=85K
(Class=Yes)
Refund=No,
Status=Single,
Income=90K
(Class=Yes)

Refund=
No
Status =
Single
Status =
Divorced
Status =
Married
.
Income
> 80K

Yes: 3
No: 4
Yes: 2
No: 1
Yes: 1
No: 0
Yes: 0
No: 3
Yes: 3
No: 1

(a) General-to-specific

UNIVERSITÀ DI PISA
EMÆ DIG
IN SUPRE
.
GNITATIS .
1343
Refund=No,
Status = Single
(Class = Yes)
(b) Specific-to-general

Rule Evaluation for Growing Rules based on Rule Coverage

n
c
n
- Accuracy =
n +1
- Laplace
=
n+ k
nc + kp
- M-estimate
n+k
n : Number of instances
covered by rule
n : Number of instances
covered by rule with class c
k : Number of classes
p : Prior probability

UNIVERSITÀ DI PISA
. IN SUPREN
GNITATIS .
1343

Rule Evaluation for Growing Rules based on Support Count

  • Foil's Information Gain
    • RO: {} => class (initial rule)
    • R1: {A} => class (rule after adding conjunct)


Gain(Ro, R1) = P1x[ log2
-
Po
P1 + n1
-) - log2
-
Po + no
)]
- Po: number of positive instances covered by RO
no: number of negative instances covered by RO
P1: number of positive instances covered by R1
n1: number of negative instances covered by R1
FOIL: First Order Inductive Learner - an early rule-based learning algorithm

UNIVERSITÀ DI PISA
IN SUPRE
.
EMÆ DIG
GNITATIS .
1343

Rule Evaluation for Growing Rules based on Statistical Test

  • Given a rule we can compute the Likelihood ratio statistic
    N. of classes
    k
    N. of records covered
    by the rule
    R=2
    fi log(fi/ei),
    i=1
    Expected frequency of
    a rule that makes
    random predictions.
  • A large R value suggests that the number of correct predictions made by the
    rule is significantly larger than that expected by random guessing

UNIVERSITÀ DI PISA
EMÆ DIG
IN SUPRE
GNITATIS .
.
1343

Statistical Test Example

k
R= 2
i=1
filog(fi/ei),
Dataset: 60 positive records and 100 negative records
Rule 1: covers 50 positive records and 5 negative examples,
Rule 2: covers 2 positive records and no negative examples.

Rule 1 Evaluation

  • Expected frequency for the positive class is e. = 55x60/160 =20.625
  • Expected frequency for the negative class is e_ = 55x100/160 =34.375
    R(r1) = 2 x [50 × log2(50/20.625) + 5 x log2(5/34.375)] = 99.9.

Rule 2 Evaluation

  • Expected frequency for the positive class is e. = 2 x 60/160 = 0.75
  • Expected frequency for the negative class is e. = 2x100/160 = 1.25
    R(r2) = 2 x [2× log2(2/0.75) + 0 x log2(0/1.25)] = 5.66.

UNIVERSITÀ DI PISA
EMÆ DIG
IN SUPRE
GNITATIS .
.
1343

Can’t find what you’re looking for?

Explore more topics in the Algor library or create your own materials with AI.