PMML Associative Rules

PMML 1.1 -- DTD of Association Rules Model

The Association Rule model represents rules where some set of items is associated with another set of items. For example, a rule can express that a certain product is often bought in combination with a certain set of other products.

The attribute definitions of the association rule model use the entity ELEMENT-ID to express the semantic constraint that a value must be unique among the elements of the same type contained in the same XML document.

			
	<!ENTITY % ELEMENT-ID "CDATA">
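Because ELEMENT-ID expands to CDATA, an XML parser does not enforce the uniqueness constraint by itself; a consuming application has to check it. The following Python sketch shows one possible check. The function name duplicate_ids and the sample id lists are illustrative assumptions and not part of PMML.

```python
from collections import Counter


def duplicate_ids(ids):
    """Return the id values that occur more than once, in sorted order."""
    counts = Counter(ids)
    return sorted(value for value, n in counts.items() if n > 1)


# Hypothetical id values collected per element type. The same value may
# appear in both lists, because uniqueness is only required among
# elements of the same type.
item_ids = ["1", "2", "3"]
itemset_ids = ["1", "2", "2"]

print(duplicate_ids(item_ids))     # no duplicates among the items
print(duplicate_ids(itemset_ids))  # "2" is used twice among the itemsets
```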

An Association Rule model consists of four major parts:

					
	<!ELEMENT AssociationModel (Extension*, AssocInputStats,  AssocItem+, 
	     AssocItemset+, AssocRule+)>

	<!ATTLIST AssociationModel
	     modelName     CDATA     #IMPLIED
	>

Basic information of the input data:

		
		<!ELEMENT AssocInputStats EMPTY>
		
		
		<!ATTLIST AssocInputStats
		     numberOfTransactions     %INT-NUMBER;     #REQUIRED
		     maxNumberOfItemsPerTA    %INT-NUMBER;     #IMPLIED
		     avgNumberOfItemsPerTA    %REAL-NUMBER;    #IMPLIED
		     minimumSupport           %PROB-NUMBER;    #REQUIRED
		     minimumConfidence        %PROB-NUMBER;    #REQUIRED
		     lengthLimit              %INT-NUMBER;     #IMPLIED
		     numberOfItems            %INT-NUMBER;     #REQUIRED
		     numberOfItemsets         %INT-NUMBER;     #REQUIRED
		     numberOfRules            %INT-NUMBER;     #REQUIRED
		>

Attribute description:

    numberOfTransactions : The number of transactions (baskets of items) contained in the input data.

    maxNumberOfItemsPerTA : The number of items contained in the largest transaction.

    avgNumberOfItemsPerTA : The average number of items contained in a transaction.

    minimumSupport : The minimum relative support value (#supporting transactions / #total transactions) satisfied by all rules.

    minimumConfidence : The minimum confidence value satisfied by all rules. Confidence is calculated as support(rule) / support(antecedent).

    lengthLimit : The maximum number of items allowed in a rule; this limit was used to restrict the number of rules.

    numberOfItems : The number of different items contained in the input data.

    numberOfItemsets : The number of itemsets contained in the model.

    numberOfRules : The number of rules contained in the model.
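
As an illustration of how the data-dependent attributes relate to the input, the following Python sketch derives them from a small list of baskets. The function name input_stats and the sample baskets are assumptions made for this sketch. The attributes minimumSupport, minimumConfidence and lengthLimit are left out because, as read here, they describe settings of the mining run rather than properties of the raw data.

```python
def input_stats(transactions):
    """Derive data-dependent AssocInputStats values.

    `transactions` is a non-empty list of sets of item values.
    """
    sizes = [len(t) for t in transactions]
    return {
        "numberOfTransactions": len(transactions),
        "maxNumberOfItemsPerTA": max(sizes),
        "avgNumberOfItemsPerTA": sum(sizes) / len(sizes),
        # number of different items over all transactions
        "numberOfItems": len(set().union(*transactions)),
    }


# Hypothetical input data: four baskets
baskets = [
    {"Cracker", "Coke", "Water"},
    {"Cracker", "Water"},
    {"Cracker", "Water"},
    {"Cracker", "Coke", "Water"},
]

print(input_stats(baskets))
```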


Items contained in itemsets


	<!ELEMENT AssocItem EMPTY>
		
	<!ATTLIST AssocItem
	     id           %ELEMENT-ID;       #REQUIRED
	     value        CDATA              #REQUIRED
	     mappedValue  CDATA              #IMPLIED
	     weight       %REAL-NUMBER;      #IMPLIED
	>

Attribute description:

    id : An identifier that uniquely identifies an item.

    value : The value of the item as in the input data.

    mappedValue : Optional, a value to which the original item value is mapped. For instance, this could be a product name if the original value is an EAN code.

    weight : The weight of the item. For example, the price or value of an item.
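
One plausible use of mappedValue is as a display label, with the original value as a fallback when no mapping is given. The Python sketch below assumes that an item has already been read into a dictionary of its attributes; the function name display_label, the EAN-like code and the product name are made up for illustration.

```python
def display_label(item):
    """Prefer the optional mappedValue; fall back to the required value."""
    return item.get("mappedValue") or item["value"]


# Hypothetical items: the first carries an EAN-like code plus a mapped name,
# the second has no mappedValue.
items = [
    {"id": "1", "value": "4006381333931", "mappedValue": "Pen"},
    {"id": "2", "value": "Coke"},
]

for item in items:
    print(item["id"], display_label(item))
```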


Itemsets which are contained in rules

					
	<!ELEMENT AssocItemset (Extension*, AssocItemRef+)>
	
	<!ATTLIST AssocItemset
	     id              %ELEMENT-ID;        #REQUIRED
	     support         %PROB-NUMBER;       #REQUIRED
	     numberOfItems   %INT-NUMBER;        #REQUIRED
	>
	

Attribute description:

    id : An identifier that uniquely identifies an itemset.

    support : The relative support of the itemset.

    numberOfItems : The number of items contained in this itemset.

    Subelements : Item references that point to AssocItem elements.

					
	<!ELEMENT AssocItemRef EMPTY>
					
	<!ATTLIST AssocItemRef
	     itemRef           %ELEMENT-ID;     #REQUIRED
	>
	

Attribute description:

    itemRef : The id value of an item element.
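
To make the support attribute concrete, the Python sketch below enumerates every itemset whose relative support reaches a given threshold. It is a brute-force illustration over all item combinations, not the algorithm of any particular mining tool, and it is only practical for very small inputs. The function name frequent_itemsets and the sample baskets are assumptions.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Map each frequent itemset (as a sorted tuple) to its relative support."""
    n = len(transactions)
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            # a transaction supports the itemset if it contains all its items
            hits = sum(1 for t in transactions if set(candidate) <= t)
            support = hits / n
            if support >= min_support:
                result[candidate] = support
    return result


# Hypothetical input data: four baskets
baskets = [
    {"Cracker", "Coke", "Water"},
    {"Cracker", "Water"},
    {"Cracker", "Water"},
    {"Cracker", "Coke", "Water"},
]

for itemset, support in frequent_itemsets(baskets, 0.6).items():
    # len(itemset) corresponds to the numberOfItems attribute
    print(itemset, support, len(itemset))
```

With a threshold of 0.6, any itemset containing Coke is dropped, because Coke occurs in only half of the baskets.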


Rules: Elements of the form <antecedent itemset> => <consequent itemset>

					
	<!ELEMENT AssocRule (Extension*)>

	<!ATTLIST AssocRule
	     support           %PROB-NUMBER;      #REQUIRED
	     confidence        %PROB-NUMBER;      #REQUIRED
	     antecedent        %ELEMENT-ID;       #REQUIRED
	     consequent        %ELEMENT-ID;       #REQUIRED
	>
	

Attribute description:

    support : The relative support of the rule.

    confidence : The confidence of the rule.

    antecedent : The id value of the itemset which is the antecedent of the rule.

    consequent : The id value of the itemset which is the consequent of the rule.
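
The two rule measures can be sketched in Python as follows. The sketch uses the usual reading that a transaction supports a rule when it contains all items of both the antecedent and the consequent, and the confidence formula support(rule) / support(antecedent). The function name rule_metrics and the sample baskets are assumptions; the antecedent is assumed to occur in at least one transaction.

```python
def rule_metrics(antecedent, consequent, transactions):
    """Return (support, confidence) of the rule antecedent => consequent."""
    n = len(transactions)
    both = sum(1 for t in transactions if (antecedent | consequent) <= t)
    ante = sum(1 for t in transactions if antecedent <= t)
    support = both / n
    # equals support(rule) / support(antecedent), since both share the divisor n
    confidence = both / ante
    return support, confidence


# Hypothetical input data: four baskets
baskets = [
    {"Cracker", "Coke", "Water"},
    {"Cracker", "Water"},
    {"Cracker", "Water"},
    {"Cracker", "Coke", "Water"},
]

print(rule_metrics({"Coke"}, {"Cracker"}, baskets))
print(rule_metrics({"Cracker"}, {"Coke"}, baskets))
```

Both rules have the same support, but swapping antecedent and consequent changes the confidence: every basket with Coke also contains Cracker, while only half of the baskets with Cracker contain Coke.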


Example:

Let's assume we have four transactions with the following data:

    t1: Cracker, Coke, Water

    t2: Cracker, Water

    t3: Cracker, Water

    t4: Cracker, Coke, Water

			
	<?xml version="1.0" ?>
	<PMML version="1.1">

	<Header copyright="www.dmg.org" 
	     description="example model for association rules"/>

	<DataDictionary numberOfFields="1">
	<DataField name="item" optype="categorical"/>
	</DataDictionary>
	
	<AssociationModel>
	
	<AssocInputStats numberOfTransactions="4"
	     numberOfItems="3" minimumSupport="0.6"
	     minimumConfidence="0.5" numberOfItemsets="3"
	     numberOfRules="2"/>
	
	<!-- We have three items in our input data -->
	
	<AssocItem id="1" value="Cracker"/>
	<AssocItem id="2" value="Coke"/>
	<AssocItem id="3" value="Water"/>
	
	<!-- and two frequent itemsets with a single item -->
	
	<AssocItemset id="1" support="1.0"
	     numberOfItems="1">
	<AssocItemRef itemRef="1"/>
	</AssocItemset>
	
	<AssocItemset id="2" support="1.0"
	     numberOfItems="1">
	<AssocItemRef itemRef="3"/>
	</AssocItemset>
	
	<!-- and one frequent itemset with two items. -->
	
	<AssocItemset id="3" support="1.0"
	     numberOfItems="2">
	<AssocItemRef itemRef="1"/>
	<AssocItemRef itemRef="3"/>
	</AssocItemset>

	<!-- Two rules satisfy the requirements -->
	
	<AssocRule support="1.0" confidence="1.0"
	     antecedent="1" consequent="2"/>
	<AssocRule support="1.0" confidence="1.0"
	     antecedent="2" consequent="1"/>

	</AssociationModel>
	</PMML>
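
A consumer of such a model has to follow the id references from rules to itemsets and from itemsets to items. The Python sketch below does this with the standard library module xml.etree.ElementTree and prints each rule in readable form. The embedded document is a trimmed, well-formed copy of the association model from the example (header and data dictionary omitted); the function name readable_rules is an assumption.

```python
import xml.etree.ElementTree as ET

PMML_TEXT = """<PMML version="1.1">
<AssociationModel>
<AssocInputStats numberOfTransactions="4" numberOfItems="3"
     minimumSupport="0.6" minimumConfidence="0.5"
     numberOfItemsets="3" numberOfRules="2"/>
<AssocItem id="1" value="Cracker"/>
<AssocItem id="2" value="Coke"/>
<AssocItem id="3" value="Water"/>
<AssocItemset id="1" support="1.0" numberOfItems="1">
<AssocItemRef itemRef="1"/>
</AssocItemset>
<AssocItemset id="2" support="1.0" numberOfItems="1">
<AssocItemRef itemRef="3"/>
</AssocItemset>
<AssocItemset id="3" support="1.0" numberOfItems="2">
<AssocItemRef itemRef="1"/>
<AssocItemRef itemRef="3"/>
</AssocItemset>
<AssocRule support="1.0" confidence="1.0" antecedent="1" consequent="2"/>
<AssocRule support="1.0" confidence="1.0" antecedent="2" consequent="1"/>
</AssociationModel>
</PMML>"""


def readable_rules(pmml_text):
    """Resolve item and itemset references and format every rule as text."""
    model = ET.fromstring(pmml_text).find("AssociationModel")
    # item id -> item value
    items = {i.get("id"): i.get("value") for i in model.findall("AssocItem")}
    # itemset id -> list of item values, resolved through itemRef
    itemsets = {
        s.get("id"): [items[r.get("itemRef")] for r in s.findall("AssocItemRef")]
        for s in model.findall("AssocItemset")
    }
    return [
        "{} => {} (support={}, confidence={})".format(
            ", ".join(itemsets[rule.get("antecedent")]),
            ", ".join(itemsets[rule.get("consequent")]),
            rule.get("support"),
            rule.get("confidence"),
        )
        for rule in model.findall("AssocRule")
    ]


for line in readable_rules(PMML_TEXT):
    print(line)
```

A dangling reference raises a KeyError in this sketch, which is one simple way to notice an inconsistent model.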