CompositionDatasetExperimental

See Javadoc for complete documentation of this class.

Usage: *No options*

Available Operations

These commands can be used to perform a variety of tasks, ranging from defining important settings about the object to actually using it.

<output> = clone [-empty] – Create a copy of this dataset
-empty: Do not copy entries from dataset into clone

<output> = split <number|fraction> – Randomly select and remove entries from dataset
number|fraction: Either the fraction or number of entries to be removed
output: New dataset containing randomly selected entries that were in this dataset

<output> = subset <number|fraction> – Generate a random subset from this dataset
number|fraction: Either the fraction or number of entries to select
output: New dataset containing random selection from this dataset
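
The difference between split and subset is easy to miss: split removes the selected entries from this dataset, while subset leaves it unchanged. The following Python sketch illustrates that difference on a plain list. It is not Magpie's implementation. The function names, and the rule that an int is a count and a float is a fraction, are assumptions made for the illustration.

```python
import random


def resolve_count(amount, size):
    """Turn a number-or-fraction argument into an entry count.

    Assumption for this sketch: a float is a fraction of the dataset,
    an int is an absolute number of entries.
    """
    if isinstance(amount, float):
        if not 0.0 <= amount <= 1.0:
            raise ValueError("fraction must be between 0 and 1")
        return int(round(amount * size))
    if not 0 <= amount <= size:
        raise ValueError("number must be between 0 and the dataset size")
    return amount


def split(entries, amount, rng):
    """Remove randomly chosen entries from `entries` and return them."""
    count = resolve_count(amount, len(entries))
    chosen = set(rng.sample(range(len(entries)), count))
    removed = [e for i, e in enumerate(entries) if i in chosen]
    # Modify the list in place so the caller sees the smaller dataset.
    entries[:] = [e for i, e in enumerate(entries) if i not in chosen]
    return removed


def subset(entries, amount, rng):
    """Return randomly chosen entries and leave `entries` untouched."""
    count = resolve_count(amount, len(entries))
    return rng.sample(entries, count)
```

With eight entries, split with a fraction of 0.25 would return two entries and leave six behind, whereas subset with a number of 3 would return three entries and leave the dataset at its original size.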

add $<dataset> [-force] – Add entries from another dataset
dataset: Dataset to be merged with this one
-force: Optional: Whether to force merge if attributes / classes / properties are different
If attributes, classes, or properties are different, attributes and class values in new entries (i.e., those from the other dataset) will be deleted and properties will be merged
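
The note on -force can be read as the following rule, sketched here in Python on plain dictionaries. This is only an illustration of the described behavior, not Magpie's code. The dictionary layout, the function name, and the choice to raise an error when the datasets differ and -force is absent are all assumptions.

```python
def add_dataset(target, other, force=False):
    """Append the entries of `other` to `target`, following the -force note.

    Both datasets are hypothetical dicts with the keys "attribute_names",
    "class_names", "property_names", and "entries".
    """
    keys = ("attribute_names", "class_names", "property_names")
    compatible = all(target[k] == other[k] for k in keys)
    if not compatible and not force:
        # Assumption: without -force, a mismatch is reported as an error.
        raise ValueError("datasets differ; use force=True to merge anyway")
    if not compatible:
        # Properties are merged: keep existing names, append unseen ones.
        for name in other["property_names"]:
            if name not in target["property_names"]:
                target["property_names"].append(name)
    for entry in other["entries"]:
        new_entry = dict(entry)
        new_entry["properties"] = dict(entry["properties"])
        if compatible:
            new_entry["attributes"] = list(entry["attributes"])
        else:
            # Attributes and class values of the new entries are deleted.
            new_entry["attributes"] = []
            new_entry["measured_class"] = None
        target["entries"].append(new_entry)
```

In this reading, a forced merge keeps the compositions and properties of the incoming entries, so attributes can be regenerated and a class variable set again afterwards.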

add <entries...> – Add entries to a dataset
entries...: Strings describing entries to be added

attributes composition <true|false> – Set whether to use composition as attributes
true|false: Whether to use composition as attributes
By default, this class does not use composition (by itself) as attributes

attributes expanders add <method> [<options...>] – Add an attribute expander to be run after generating attributes
method: How to expand attributes. Name of a BaseAttributeExpander ("?" to print available methods)
options...: Any options for the expansion method
These expanders are designed to create new attributes based on existing ones.

attributes expanders clear – Clear the current list of attribute expanders

attributes expanders run – Run the currently-defined list of attribute expanders

attributes generate – Generate attributes for each entry

attributes generators add <method> [<options...>] – Add an attribute generator to create additional attributes
method: New generation method. Name of a BaseAttributeGenerator ("?" to print available methods)
options...: Any options for the generator method
These generators are designed to create new attributes tailored for a specific application.

attributes generators clear – Clear the current list of attribute generators

attributes generators run – Run the currently-defined list of attribute generators

attributes properties <directory> – Specify directory that contains the elemental property lookup files
directory: Desired directory

attributes properties add <names...> – Add elemental properties to use when generating attributes
names...: Names of properties to add

attributes properties add set <name> – Add in all elemental properties from a pre-defined set
name: Name of the pre-defined set

attributes properties remove <names...> – Remove properties from list of those used when generating attributes
names...: Names of properties to remove

attributes properties – List which elemental properties are used to generate attributes

attributes rank <number> <method> [<options...>] – Rank attributes based on predictive power
number: Number of top attributes to print
method: Method used to rank attributes. Name of a BaseAttributeEvaluator ("?" to print available methods)
options...: Options for the evaluation method.

attributes – Print all attributes

combine $<dataset> – Add entries from another dataset
dataset: Dataset to combine with this dataset

filter <include|exclude> <method> [<options...>] – Run dataset through a filter
include|exclude: Whether to include/exclude only entries that pass the filter
method: Filtering method. Name of a BaseDatasetFilter ("?" to print available methods)
options...: Options for the filter
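
The include|exclude switch decides which side of the filter survives. A minimal Python sketch of that logic follows. The function name, the `passes` callback, and the entry layout are hypothetical stand-ins for a BaseDatasetFilter, not Magpie's API.

```python
def run_filter(entries, passes, mode):
    """Keep the entries that pass the filter (include) or fail it (exclude).

    `passes` is a hypothetical callback that returns True when an entry
    passes the filter.
    """
    if mode not in ("include", "exclude"):
        raise ValueError("mode must be 'include' or 'exclude'")
    keep_passing = mode == "include"
    return [e for e in entries if bool(passes(e)) == keep_passing]
```

Running the same filter once with include and once with exclude partitions the dataset into two disjoint groups.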

generate <method> [<options...>] – Generate new entries
method: Name of a BaseEntryGenerator. ("?" for options)
options...: Any options for the entry generator

import <filename> [<options...>] – Import data by reading a file
filename: Name of file to import data from
options...: Any options used when parsing this dataset (specific to type of Dataset)

modify <method> [<options>] – Modify the dataset
method: How to modify dataset. Name of a BaseDatasetModifier. ("?" to print available methods)
options: Any options for the modifier

rank <number> <maximum|minimum> <measured|predicted> <method> [<options...>] – Print the top-ranked entries based on some measure
number: Number of top entries to print
maximum|minimum: Whether to print entries with the largest or smallest objective function
measured|predicted: Whether to use the measured or predicted values when calculating the objective function
method: Objective function used to rank entries. Name of a BaseEntryRanker ("?" for available methods)
options...: Any options for the objective function
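
The four arguments of rank combine as follows: an objective function is evaluated on either the measured or the predicted class value of each entry, the entries are sorted by that score, and the top few are reported. The Python sketch below shows this under assumed names. The `objective` callback stands in for a BaseEntryRanker, and skipping entries that lack the requested value is an assumption of the sketch.

```python
def rank_entries(entries, number, direction, source, objective):
    """Return the `number` entries with the largest or smallest objective.

    `objective` is a hypothetical callback that maps a class value to a score.
    """
    if direction not in ("maximum", "minimum"):
        raise ValueError("direction must be 'maximum' or 'minimum'")
    if source not in ("measured", "predicted"):
        raise ValueError("source must be 'measured' or 'predicted'")
    key = source + "_class"
    # Assumption: entries without the requested value cannot be ranked.
    scored = [(objective(e[key]), e) for e in entries if e.get(key) is not None]
    scored.sort(key=lambda pair: pair[0], reverse=(direction == "maximum"))
    return [entry for _, entry in scored[:number]]
```

An objective such as the distance from a desired value, used with minimum, would list the entries closest to that value first.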

target <name> [-keep] – Set class variable to be a certain property
name: Name of property to use as class variable
-keep: Whether to keep entries without a measurement for this property
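
The effect of -keep can be pictured with a short Python sketch. It assumes, as the option description suggests, that entries without a measurement for the chosen property are dropped unless -keep is given. The function name, the entry layout, and the "bandgap" property used in the usage note are hypothetical.

```python
def set_target(entries, name, keep=False):
    """Use property `name` as the class variable of each entry.

    Entries lacking a measurement for `name` are dropped unless `keep` is True.
    """
    result = []
    for entry in entries:
        value = entry["properties"].get(name)
        if value is None and not keep:
            continue
        updated = dict(entry)
        updated["measured_class"] = value
        result.append(updated)
    return result
```

For example, setting the target to "bandgap" on a dataset where only some entries have a measured band gap would shrink the dataset by default and preserve its size with keep=True.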

Available Print Commands

These commands are run by calling "print <variable name> <command> [<options>]". Any output from that command will be printed to standard output.

details – Print details about this class

dist – Print distribution of entries between known classes

Available Save Formats

Variables of this type can be saved in the following formats:

arff – Weka's ARFF format.
Requires that a measured value is available for the class variable of each entry.

comp – All properties with composition written by element fraction
Very similar to the "prop" format

csv – Comma-separated value format.
The value of each attribute and the measured class variable, if defined.

prop – Print out the measured and predicted properties

stats – Writes predicted and measured class variables.
This is intended to allow an external program to evaluate model performance.