PMML 1.1 -- Mining Schema

Each model contains one mining schema, which lists the fields used in that model. These fields are a subset of the fields defined in the data dictionary. The mining schema holds information that is specific to a particular model, whereas the data dictionary holds data definitions that do not vary from model to model.
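The subset relationship between the mining schema and the data dictionary can be checked mechanically. The following is a minimal sketch, not part of the PMML specification: the function name `undefined_fields` and the field names are hypothetical and chosen only for illustration.

```python
def undefined_fields(mining_field_names, dictionary_field_names):
    """Return mining-schema field names that the data dictionary does not define.

    An empty result means the mining schema is a subset of the data dictionary.
    """
    known = set(dictionary_field_names)
    return [name for name in mining_field_names if name not in known]


# Hypothetical field names, invented for this example.
dictionary_fields = ["age", "income", "region", "churn"]
valid_schema = ["age", "income", "churn"]
broken_schema = ["age", "salary"]

missing_in_valid = undefined_fields(valid_schema, dictionary_fields)
missing_in_broken = undefined_fields(broken_schema, dictionary_fields)
```

Here `missing_in_valid` is empty, while `missing_in_broken` names the one field that has no definition in the data dictionary.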

 
<!ELEMENT MiningSchema (Extension*, MiningField+) >
<!ENTITY  % FIELD-USAGE-TYPE "(active | predicted | supplementary)" >
<!ENTITY  % OUTLIER-TREATMENT-METHOD "(asIs | asMissingValues | asExtremeValues)" >

usageType

    active: field used as input (independent field)

    predicted: field whose value is predicted by the model

    supplementary: field holding additional descriptive information

Supplementary fields are not required to apply a model; they are provided as additional information for explanatory purposes. When a field has gone through preprocessing transformations before a model is built, an additional supplementary field is typically used to describe the statistics of the original field values.
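As an illustration of the three usage types, the sketch below parses a small MiningSchema fragment with Python's standard `xml.etree.ElementTree` module and groups the field names by usageType. The fragment, its field names, and the helper `fields_by_usage` are hypothetical; the fallback to "active" reflects the default declared for the usageType attribute of MiningField.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment; the field names are invented for illustration.
FRAGMENT = """
<MiningSchema>
  <MiningField name="age"/>
  <MiningField name="income" usageType="active"/>
  <MiningField name="raw_income" usageType="supplementary"/>
  <MiningField name="churn" usageType="predicted"/>
</MiningSchema>
"""


def fields_by_usage(schema_xml):
    """Group MiningField names by usageType, defaulting to "active"."""
    groups = {"active": [], "predicted": [], "supplementary": []}
    root = ET.fromstring(schema_xml.strip())
    for field in root.findall("MiningField"):
        usage = field.get("usageType", "active")
        if usage not in groups:
            raise ValueError(f"unknown usageType: {usage!r}")
        groups[usage].append(field.get("name"))
    return groups


groups = fields_by_usage(FRAGMENT)
```

A consumer that only applies the model would need the fields in `groups["active"]` as input and could ignore `groups["supplementary"]`.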

outliers

    asIs: field values treated at face value

    asMissingValues: outlier values are treated as if they were missing

    asExtremeValues: outlier values are changed to a specific high or low value defined in MiningField

					
	<!ELEMENT MiningField (Extension*) > 
				 
	<!ATTLIST MiningField 
	     name                   %FIELD-NAME;                    #REQUIRED
	     usageType              %FIELD-USAGE-TYPE;              "active" 
	     outliers               %OUTLIER-TREATMENT-METHOD;      "asIs" 
	     lowValue               %NUMBER;                        #IMPLIED 
	     highValue              %NUMBER;                        #IMPLIED
	> 

    name: symbolic name of field, same as the name of some field in the data dictionary

    highValue and lowValue: used in conjunction with the %OUTLIER-TREATMENT-METHOD; value "asExtremeValues" as replacement values for records with outliers in this field: if x < lowValue then x = lowValue, and if x > highValue then x = highValue
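The three outlier treatments can be sketched as a single function applied to one numeric value. This is an illustrative sketch under stated assumptions, not a reference implementation. For "asExtremeValues", values below lowValue are raised to lowValue and, symmetrically, values above highValue are lowered to highValue. The text does not say how outliers are detected for "asMissingValues"; the sketch assumes the same lowValue and highValue bounds are used and represents a missing value as `None`. The function name `treat_outlier` and its parameters are hypothetical.

```python
def treat_outlier(x, outliers="asIs", low_value=None, high_value=None):
    """Apply a MiningField outlier treatment to a single numeric value.

    Assumption: a value is an outlier when it falls outside
    [low_value, high_value]; a bound left as None is not checked.
    """
    if outliers == "asIs":
        # Default treatment: the value is used at face value.
        return x

    below = low_value is not None and x < low_value
    above = high_value is not None and x > high_value

    if outliers == "asMissingValues":
        # Assumption: the same bounds decide what counts as an outlier.
        return None if (below or above) else x

    if outliers == "asExtremeValues":
        if below:
            return low_value
        if above:
            return high_value
        return x

    raise ValueError(f"unknown outlier treatment: {outliers!r}")


clamped_high = treat_outlier(150, "asExtremeValues", low_value=0, high_value=100)
clamped_low = treat_outlier(-5, "asExtremeValues", low_value=0, high_value=100)
untouched = treat_outlier(150)
as_missing = treat_outlier(150, "asMissingValues", low_value=0, high_value=100)
```

Keeping "asIs" as the default argument mirrors the default of the outliers attribute, so a field declared without that attribute passes through unchanged.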


Conformance

  • outlier treatment 'asIs', i.e. the default value of the attribute outliers in MiningField, is in core; other options are not in core.