jMAF requires Java SE 6 to be installed on your system to run
properly. Please note that you can have a more recent version of
Java installed as well. Moreover, the path to Java SE 6 does not need
to be present in
JAVA_HOME. The last version of the Java SE 6
Runtime Environment is 6u45. You can download this
version of the Java SE Runtime Environment from the Oracle site.
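To check which Java runtime a given command resolves to, the version banner can be parsed. The following is a minimal sketch in Python, not part of jMAF; the function names are hypothetical. It relies on the fact that Java releases up to 8 report themselves as "1.x" (so 6u45 prints "1.6.0_45"), while later releases report the major number first.

```python
import re
import subprocess

# Matches the quoted version in banners such as: java version "1.6.0_45"
_VERSION_RE = re.compile(r'version "(\d+)(?:\.(\d+))?')


def java_major_version(banner):
    """Extract the major Java version from the text printed by `java -version`."""
    match = _VERSION_RE.search(banner)
    if match is None:
        raise ValueError("no version string found in banner")
    first, second = match.group(1), match.group(2)
    # Legacy scheme: "1.6.0_45" means Java 6.
    if first == "1" and second is not None:
        return int(second)
    return int(first)


def installed_java_major(java_cmd="java"):
    """Run `<java_cmd> -version` and return its major version.

    `java -version` writes its banner to stderr, not stdout.
    """
    result = subprocess.run([java_cmd, "-version"], capture_output=True, text=True)
    return java_major_version(result.stderr)
```

Passing the full path of a specific `java` executable as `java_cmd` checks that installation rather than whichever one is first on the search path.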
Once the Java SE 6 Runtime Environment is installed, please run jMAF by
If you have changed the path to which the
Java SE 6 Runtime Environment is installed by default, please
update the vm parameter in
ISF data files
jMAF can handle classification problems described by regular
attributes and attributes with ordered domains. The best way to
analyze large data sets in jMAF is to transform them into a text file in
ISF format. An exemplary
ISF file looks like this:
**ATTRIBUTES
+ SEX: [F, M]
+ IC: [1, 2, 3]
+ OCSO: (integer)
+ OCOMO: [0, 1, 2, 3]
+ TAB: [1, 2, 3, 4]
+ SCP: [1, 2, 3]
+ FCC: (integer)
+ GPRS: [1, 2]
+ PROFILE: [A, B, C]
decision: PROFILE
**PREFERENCES
SEX: none
IC: gain
OCSO: gain
OCOMO: gain
TAB: gain
SCP: gain
FCC: cost
GPRS: gain
PROFILE: gain
**EXAMPLES
M 1 1 1 1 3 4 1 A
F 1 1 1 1 1 3 2 A
M 3 3 2 4 3 2 2 A
M 1 3 2 1 2 2 2 B
M 3 3 2 4 3 1 2 C
M 1 2 1 3 3 1 2 C
**END
An ISF file consists of three sections:
**ATTRIBUTES, **PREFERENCES, and **EXAMPLES, and is finished with
**END. The order of the sections is essential and must not be changed; otherwise jMAF will report a data error.
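The required layout can be checked before loading a file into jMAF. The sketch below, in Python, splits ISF text into its three sections and rejects a wrong section order or a missing **END. It is an illustration only: the function name is hypothetical, and skipping blank lines and rejecting content after **END are assumptions, not documented jMAF behaviour.

```python
SECTION_ORDER = ["**ATTRIBUTES", "**PREFERENCES", "**EXAMPLES"]
END_MARKER = "**END"


def split_isf_sections(text):
    """Return {section marker: [lines]}; raise ValueError on a bad layout."""
    sections = {}
    seen = []        # section markers in the order they appear
    current = None   # marker of the section being read
    ended = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # assumption: blank lines are ignored
        if ended:
            raise ValueError("content found after **END")
        if line == END_MARKER:
            ended = True
            current = None
        elif line in SECTION_ORDER:
            seen.append(line)
            current = line
            sections[current] = []
        elif current is None:
            raise ValueError(f"line outside any section: {line!r}")
        else:
            sections[current].append(line)
    if seen != SECTION_ORDER:
        raise ValueError(f"expected sections {SECTION_ORDER}, found {seen}")
    if not ended:
        raise ValueError("file is not finished with **END")
    return sections
```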
The **ATTRIBUTES section contains definitions of the attributes used in the problem. Each definition is written in a single line. It starts with "+" for active attributes (i.e., the ones that should be used during analysis) and "-" for inactive ones. Then comes the name of the attribute followed by ":", and finally the domain definition. The
ISF format supports numeric and symbolic domains.
The numeric domains are:
- continuous - for real-valued attributes
- integer - for integer-valued attributes
They are denoted as
(continuous) and (integer), respectively. Thus, the definitions of some numerical attributes may look like:
Price: (continuous)
Seats: (integer)
A symbolic domain is given as a list of possible values in square
brackets. In the above example all attributes except
OCSO and FCC have such domains.
The section is finished with the declaration of the decision attribute
in the form
"decision: name", where name is the name of the decision attribute.
NOTE: The decision attribute has to be active (i.e., there should
be "+" before its name) and it has to have a symbolic
domain. Otherwise, jMAF will show an error.
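The rules for attribute definitions and for the decision attribute can be expressed as a short parser. This is a sketch in Python under the syntax described above; the function names and the returned dictionary layout are hypothetical and do not reflect how jMAF reads the file internally.

```python
import re

# sign ("+" or "-"), attribute name, ":", domain definition
ATTR_RE = re.compile(r"^([+-])\s*([^:]+?)\s*:\s*(.+)$")


def parse_attribute(line):
    """Parse one definition line such as '+ SEX: [F, M]'."""
    match = ATTR_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not an attribute definition: {line!r}")
    sign, name, domain = match.groups()
    domain = domain.strip()
    if domain in ("(continuous)", "(integer)"):
        kind, values = domain[1:-1], None
    elif domain.startswith("[") and domain.endswith("]"):
        kind = "symbolic"
        values = [v.strip() for v in domain[1:-1].split(",")]
    else:
        raise ValueError(f"unknown domain: {domain!r}")
    return {"name": name, "active": sign == "+", "kind": kind, "values": values}


def check_decision(attributes, decision_name):
    """Check that the decision attribute exists, is active and is symbolic."""
    by_name = {a["name"]: a for a in attributes}
    if decision_name not in by_name:
        raise ValueError(f"unknown decision attribute: {decision_name}")
    attr = by_name[decision_name]
    if not attr["active"]:
        raise ValueError("the decision attribute has to be active")
    if attr["kind"] != "symbolic":
        raise ValueError("the decision attribute has to have a symbolic domain")
    return attr
```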
The **PREFERENCES section contains information on the direction of preferences for the defined attributes. It has to contain entries for all attributes defined in the
**ATTRIBUTES section, even the inactive ones. Each entry is written in a single line and has the following syntax:
name: direction
where name is the name of the attribute, and direction is one of the following:
- gain - the attribute is treated as gain ordered; for numerical domains greater values are preferred, and for symbolic domains, the later a value appears in the definition list, the better it is
- cost - the attribute is considered to be cost ordered; for numerical domains lower values are preferred, and for symbolic domains, the earlier a value appears on the list, the better it is
- none - the attribute is treated as a "regular" attribute, i.e., during the analysis it is only checked whether its values are equal or different, and no preference is taken into account.
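The requirement that every attribute has exactly one valid preference entry can be checked mechanically. The following Python sketch assumes entries of the form "name: direction", as in the example file; the function name is hypothetical.

```python
DIRECTIONS = {"gain", "cost", "none"}


def parse_preferences(lines, attribute_names):
    """Map attribute name -> direction; every attribute must have an entry."""
    prefs = {}
    for line in lines:
        name, sep, direction = line.partition(":")
        name, direction = name.strip(), direction.strip()
        if not sep or direction not in DIRECTIONS:
            raise ValueError(f"bad preference entry: {line!r}")
        prefs[name] = direction
    # Inactive attributes need an entry too, so check all names.
    missing = [n for n in attribute_names if n not in prefs]
    if missing:
        raise ValueError(f"no preference entry for: {missing}")
    return prefs
```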
In the above example
IC is gain, i.e., the value "3"
is better than "2" (the higher the value, the better), while
FCC is cost, i.e., the lower the value, the better.
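The three directions can be illustrated by a function that compares two values of a single attribute. This is only a sketch of the semantics described above, written in Python; the function name and its return convention are hypothetical, not jMAF's API.

```python
def prefer(a, b, direction, domain=None):
    """Compare two values of one attribute.

    Returns 1 if a is preferred to b, -1 if b is preferred to a and 0 if
    they are equal. For direction "none" different values are only known
    to differ, so None is returned. For symbolic attributes pass the
    definition list as `domain`; values are then ranked by position.
    """
    if direction not in ("gain", "cost", "none"):
        raise ValueError(f"unknown direction: {direction!r}")
    if a == b:
        return 0
    if direction == "none":
        return None
    if domain is not None:
        a, b = domain.index(a), domain.index(b)
    a_wins = (a > b) if direction == "gain" else (a < b)
    return 1 if a_wins else -1
```

With the example file, `prefer("3", "2", "gain", ["1", "2", "3"])` describes IC and `prefer(2, 4, "cost")` describes FCC; in both calls the first value is the preferred one.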
The **EXAMPLES section contains the values of the examples (objects) that should be analyzed. Each object is stored in a single line as a list of values separated with ",". The values should be given in exactly the same order as the attributes in the **ATTRIBUTES section.
NOTE 1: Each object should have a specified value of the decision attribute.
NOTE 2: Values can also be separated with the tab character, which
allows copying and pasting data from spreadsheets without the need to
convert them. Actually, this is how we usually work with data. We
prepare the **ATTRIBUTES and **PREFERENCES sections in a text editor, prepare the data in a spreadsheet (especially if it requires additional processing, like calculations or filtering), and finally copy the data into the **EXAMPLES section.
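A single example line can be read by splitting it on the separator and converting each value according to its attribute's domain. The Python sketch below accepts commas and tabs as described above; accepting runs of spaces as well is an assumption based on how the example file is displayed. The function name and the `(name, kind, values)` tuples are hypothetical.

```python
import re

# A comma with optional surrounding whitespace, or any run of whitespace
# (which covers tab characters pasted from a spreadsheet).
SEPARATOR_RE = re.compile(r"\s*,\s*|\s+")


def parse_example(line, attributes):
    """Turn one example line into a dict keyed by attribute name.

    `attributes` is a list of (name, kind, values) tuples in definition
    order, where kind is "integer", "continuous" or "symbolic".
    """
    fields = SEPARATOR_RE.split(line.strip())
    if len(fields) != len(attributes):
        raise ValueError(f"expected {len(attributes)} values, got {len(fields)}")
    obj = {}
    for raw, (name, kind, values) in zip(fields, attributes):
        if kind == "integer":
            obj[name] = int(raw)
        elif kind == "continuous":
            obj[name] = float(raw)
        elif raw in values:
            obj[name] = raw
        else:
            raise ValueError(f"{raw!r} is not in the domain of {name}")
    return obj
```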