Installation
jMAF requires the Java SE 8 Runtime Environment to be installed on your system (legacy versions for macOS and Linux require the Java SE 6 Runtime Environment instead). More precisely, a 32-bit version of the Java Runtime Environment (sometimes denoted as x86) is required. Note, however, that you may have a more recent version of the Java Runtime Environment installed on the system as well. Moreover, the path to the required version of the Java SE Runtime Environment does not need to be present in the PATH variable or set as JAVA_HOME. The latest version of the Java SE 8 Runtime Environment is available from the Oracle JRE 8 download site. You can also download the SE 6 version of the Java Runtime Environment from the Oracle JRE 6 download site. Open versions of the Java SE Runtime Environment are available from the OpenJDK JRE download site.
Once the Java SE Runtime Environment is installed, run jMAF by executing jMAF.bat.
If the Java SE Runtime Environment is installed in a location other than the default one, modify the vm parameter in jMAF.bat accordingly.
ISF data files
jMAF can handle classification problems described by regular attributes and attributes with ordered domains. The best way to analyze large data sets in jMAF is to transform them into a text file in ISF format. An example ISF file looks like this:
**ATTRIBUTES
+ SEX: [F, M]
+ IC: [1, 2, 3]
+ OCSO: (integer)
+ OCOMO: [0, 1, 2, 3]
+ TAB: [1, 2, 3, 4]
+ SCP: [1, 2, 3]
+ FCC: (integer)
+ GPRS: [1, 2]
+ PROFILE: [A, B, C]
decision: PROFILE
**PREFERENCES
SEX: none
IC: gain
OCSO: gain
OCOMO: gain
TAB: gain
SCP: gain
FCC: cost
GPRS: gain
PROFILE: gain
**EXAMPLES
M 1 1 1 1 3 4 1 A
F 1 1 1 1 1 3 2 A
M 3 3 2 4 3 2 2 A
M 1 3 2 1 2 2 2 B
M 3 3 2 4 3 1 2 C
M 1 2 1 3 3 1 2 C
**END
An ISF file consists of three sections, **ATTRIBUTES, **PREFERENCES and **EXAMPLES, and ends with **END. The order of the sections is essential and must not be changed; otherwise jMAF will report a data error.
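The layout rule can be checked before a file is handed to jMAF. The following is a minimal sketch in Python, not jMAF's own validation: the helper names section_markers and has_valid_layout are hypothetical, and the sketch assumes that each marker stands alone on its own line.

```python
# Hypothetical pre-check of the ISF section layout; not part of jMAF.
# Assumption: every marker occupies a line of its own.
SECTION_ORDER = ["**ATTRIBUTES", "**PREFERENCES", "**EXAMPLES", "**END"]


def section_markers(text):
    """Return the section markers found in the text, in order of appearance."""
    return [line.strip() for line in text.splitlines()
            if line.strip() in SECTION_ORDER]


def has_valid_layout(text):
    """True when each marker occurs exactly once and in the required order."""
    return section_markers(text) == SECTION_ORDER
```

A file whose **PREFERENCES section precedes **ATTRIBUTES, or which lacks the closing **END, makes has_valid_layout return False.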
**ATTRIBUTES
The **ATTRIBUTES section contains definitions of the attributes used in the problem. Each definition is written in a single line. It starts with "+" for active attributes (i.e., the ones that should be used during analysis) and "-" for inactive ones. Then comes the name of the attribute followed by ":", and finally the domain definition. The ISF format supports numeric and symbolic domains.
The numeric domains are:
- continuous - for real-valued attributes
- integer - for integer-valued attributes
They are denoted as (continuous) and (integer), respectively. Thus, the definitions of some numerical attributes may look like:
Price: (continuous)
Seats: (integer)
The symbolic domain is given as a list of possible values in square brackets. In the above example all attributes except for OCSO and FCC have such domains.
The section ends with the declaration of the decision attribute in the form "decision: name", where name is the name of the decision attribute.
NOTE: The decision attribute has to be active (i.e., there should be "+" before its name) and it has to have a symbolic domain. Otherwise, jMAF will show an error.
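The rules for attribute definitions and for the decision attribute can be illustrated with a small parser. This is a sketch in Python under stated assumptions, not the parser used by jMAF: the names ATTRIBUTE_LINE, parse_attribute and check_decision are hypothetical, and symbolic values are assumed to contain no commas.

```python
import re

# Hypothetical pattern: "+" or "-", attribute name, ":", domain definition.
ATTRIBUTE_LINE = re.compile(r"^([+-])\s*([^:]+?)\s*:\s*(.+)$")


def parse_attribute(line):
    """Parse one attribute definition line into a small dictionary."""
    match = ATTRIBUTE_LINE.match(line.strip())
    if match is None:
        raise ValueError(f"not an attribute definition: {line!r}")
    flag, name, domain = match.groups()
    domain = domain.strip()
    if domain in ("(continuous)", "(integer)"):
        # Numeric domain: no list of values.
        kind, values = domain.strip("()"), None
    elif domain.startswith("[") and domain.endswith("]"):
        # Symbolic domain: values listed in square brackets.
        kind = "symbolic"
        values = [v.strip() for v in domain[1:-1].split(",")]
    else:
        raise ValueError(f"unknown domain: {domain!r}")
    return {"name": name, "active": flag == "+", "kind": kind, "values": values}


def check_decision(attributes, decision_name):
    """Raise ValueError unless the decision attribute is active and symbolic."""
    by_name = {a["name"]: a for a in attributes}
    decision = by_name.get(decision_name)
    if decision is None:
        raise ValueError(f"decision attribute {decision_name!r} is not defined")
    if not decision["active"]:
        raise ValueError("decision attribute must be active ('+')")
    if decision["kind"] != "symbolic":
        raise ValueError("decision attribute must have a symbolic domain")
```

For instance, parse_attribute("+ PROFILE: [A, B, C]") yields an active symbolic attribute that passes check_decision, whereas an attribute declared with (integer) or prefixed with "-" does not.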
**PREFERENCES
The **PREFERENCES section contains information on the direction of preferences for the defined attributes. It has to contain entries for all attributes defined in the **ATTRIBUTES section, even the inactive ones. Each entry is written in a single line and has the following syntax:
name: direction
where name is the name of the attribute, and direction is one of the following:
- gain - the attribute is treated as gain-ordered; for numerical domains greater values are preferred, and for symbolic domains, the later a value appears in the definition list, the better it is
- cost - the attribute is treated as cost-ordered; for numerical domains lower values are preferred, and for symbolic domains, the earlier a value appears in the definition list, the better it is
- none - the attribute is treated as a "regular" attribute, i.e., during the analysis it is only checked whether its values are equal or different, and no preference is taken into account.
In the above example IC is gain, i.e., the value "3" is better than "2" (the higher the value, the better), and FCC is cost, i.e., the lower the value, the better.
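The three preference directions can be summarized as a comparison function. The sketch below is only an illustration in Python of the ordering described above, not code taken from jMAF: the function name is_better and its parameters are hypothetical, and a symbolic domain is passed as the list of values in the order given in the **ATTRIBUTES section.

```python
def is_better(a, b, direction, domain=None):
    """Return True when value a is strictly preferred to value b.

    domain is the definition list of a symbolic attribute, or None for
    numerical attributes (hypothetical convention used in this sketch).
    """
    if direction == "none":
        # Regular attribute: values are only equal or different.
        return False
    if domain is not None:
        # Symbolic domain: compare positions in the definition list.
        a, b = domain.index(a), domain.index(b)
    if direction == "gain":
        return a > b
    if direction == "cost":
        return a < b
    raise ValueError(f"unknown preference direction: {direction!r}")
```

With the example file, is_better(3, 2, "gain", domain=[1, 2, 3]) holds for IC, and is_better(1, 4, "cost") holds for FCC.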
**EXAMPLES
The **EXAMPLES section contains the values of the examples (objects) that should be analyzed. Each object is stored in a single line as a list of values separated with ",". The values should be given in exactly the same order as the attributes in the **ATTRIBUTES section.
NOTE 1: The objects should have specified values of the decision attribute.
NOTE 2: Values can also be separated with the tab character, which allows copying and pasting data from spreadsheets without converting them. This is how we usually work with data: we create the **ATTRIBUTES and **PREFERENCES sections in a text editor, prepare the data in a spreadsheet (especially if it requires additional processing, like calculations or filtering) and finally copy the data into the **EXAMPLES section.
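Reading one line of the **EXAMPLES section can be sketched as follows. This is a minimal Python illustration, not jMAF's reader: the function name parse_example is hypothetical, and the sketch assumes that values are separated only by "," or by the tab character, as described above, and that no value contains either separator.

```python
import re


def parse_example(line, attribute_count):
    """Split one example line on commas or tabs and check the value count."""
    values = [v.strip() for v in re.split(r"[,\t]", line.strip())]
    if len(values) != attribute_count:
        raise ValueError(
            f"expected {attribute_count} values, found {len(values)}: {line!r}"
        )
    return values
```

The count check reflects the requirement that every object provides a value for each attribute, including the decision attribute, in the order of the **ATTRIBUTES section.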