pysmac.utils package
pysmac.utils.java_helper module
pysmac.utils.java_helper.check_java_version(java_executable='java')
Small function to ensure that Java (version >= 7) was found.
As SMAC requires a Java Runtime Environment (JRE), pysmac checks that an adequate version (>= 7) has been found. It raises a RuntimeError exception if no JRE or an outdated version was found.
Parameters: java_executable (str) – callable Java binary. It is possible to pass additional options to the JRE via this argument, e.g. “java -Xmx128m” is a valid argument.
Raises: RuntimeError
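The version check described above can be illustrated with a minimal sketch. This is not pysmac's implementation: the helper names parse_java_major_version and is_adequate are hypothetical, and the sketch only assumes that the text printed by `java -version` contains a quoted version string such as "1.7.0_80" or "11.0.2".

```python
import re


def parse_java_major_version(version_output):
    """Return the major Java version found in `java -version` output, or None.

    Hypothetical helper for illustration; pysmac's own check may differ.
    """
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first = int(match.group(1))
    # Legacy scheme: "1.7.0_80" means Java 7. Newer scheme: "11.0.2" means Java 11.
    if first == 1 and match.group(2) is not None:
        return int(match.group(2))
    return first


def is_adequate(version_output, minimum=7):
    """True if a version could be parsed and it is at least `minimum`."""
    major = parse_java_major_version(version_output)
    return major is not None and major >= minimum
```

Note that `java -version` writes to stderr, so a caller using subprocess would have to capture that stream rather than stdout.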
pysmac.utils.multiprocessing_wrapper module
class pysmac.utils.multiprocessing_wrapper.MyPool(processes=None, initializer=None, initargs=(), maxtasksperchild=None)
Bases: multiprocessing.pool.Pool
Subclass to use NoDaemonProcess instances as workers in a Pool.
Process
alias of NoDaemonProcess
class pysmac.utils.multiprocessing_wrapper.NoDaemonProcess(group=None, target=None, name=None, args=(), kwargs={})
Bases: multiprocessing.process.Process
A subclass that prevents the subprocesses used for the individual SMAC runs from being daemons.
As it turns out, the Java processes running SMAC cannot be daemons. To change the default behavior of the multiprocessing module, one has to derive a subclass and override the _get_daemon and _set_daemon methods appropriately.
daemon
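The subclassing approach described for NoDaemonProcess can be sketched as follows. The class name NonDaemonicProcess is illustrative and the code is a sketch of the general pattern, not a copy of pysmac's source: the daemon attribute is replaced by a property that always reads False and ignores assignments.

```python
import multiprocessing


class NonDaemonicProcess(multiprocessing.Process):
    """Process whose daemon flag always reads False (illustrative name)."""

    def _get_daemon(self):
        return False

    def _set_daemon(self, value):
        # Silently ignore any attempt to mark this process as a daemon.
        pass

    # Override the inherited attribute with a property built from the two methods.
    daemon = property(_get_daemon, _set_daemon)
```

How such a class is plugged into a Pool depends on the Python version, so that part is left out of this sketch.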
pysmac.utils.smac_input_readers module
pysmac.utils.smac_input_readers.read_pcs(filename)
Function to read a SMAC pcs file (format according to version 2.08).
Parameters: filename (str) – name of the pcs file to be read
Returns: tuple – (parameters as a dict, conditionals as a list, forbiddens as a list)
pysmac.utils.smac_output_readers module
pysmac.utils.smac_output_readers.json_parse(fileobj, decoder=<json.decoder.JSONDecoder object>, buffersize=2048)
Small function to parse a file containing JSON objects separated by a new line. This format is used in the live-rundata-xx.json files produced by SMAC.
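Reading newline-separated JSON objects can be sketched with the standard library alone. The function below is a simplified, line-based stand-in and not json_parse itself (it has no decoder or buffersize argument); the name iter_json_lines and the keys in the sample data are made up for illustration.

```python
import io
import json


def iter_json_lines(fileobj):
    """Yield one decoded object per non-empty line of a text file object."""
    for line in fileobj:
        line = line.strip()
        if line:
            yield json.loads(line)


# Made-up sample standing in for a file with one JSON object per line.
sample = io.StringIO('{"run": 1, "cost": 0.5}\n\n{"run": 2, "cost": 0.25}\n')
records = list(iter_json_lines(sample))
```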
pysmac.utils.smac_output_readers.read_instance_features_file(fn)
Function to read an instance_feature file.
Returns: tuple – the first entry is a list of the feature names, the second is a dict with ‘instance name’ - ‘numpy array containing the features’ key-value pairs
pysmac.utils.smac_output_readers.read_instances_file(fn)
Reads the instance names from an instance file.
Parameters: fn (str) – name of file to read
Returns: list – each element is a list where the first element is the instance name followed by additional information for the specific instance.
pysmac.utils.smac_output_readers.read_paramstrings_file(fn)
Function to read a paramstring file. Every line in this file corresponds to a full configuration. Everything is stored as strings: without knowledge about the pcs, converting the values into any other type would involve guessing, which we shall not do here.
Parameters: fn (str) – the name of the paramstring file
Returns: dict – with key-value pairs ‘parameter name’-‘value as string’
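To make the "everything stays a string" behavior concrete, here is a minimal sketch that parses a single line. It assumes a line layout of the form <id>: name='value', name='value', which is an assumption about the file format and not taken from this page; parse_paramstring_line and the parameter names are hypothetical.

```python
def parse_paramstring_line(line):
    """Parse "<id>: name='value', ..." into a dict of strings (assumed layout)."""
    # Drop the leading run id, if any, and keep what follows the first colon.
    _, _, body = line.partition(':')
    config = {}
    for pair in body.split(','):
        name, _, value = pair.strip().partition('=')
        # Values are kept as strings; only the surrounding quotes are removed.
        config[name] = value.strip("'")
    return config


config = parse_paramstring_line("1: alpha='0.5', solver='lbfgs', restarts='3'")
```

Any conversion to int or float would need the type information from the pcs file, which is why it is not attempted here.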
pysmac.utils.smac_output_readers.read_runs_and_results_file(fn)
Converts a runs_and_results file into a numpy array.
Almost all entries in a runs_and_results file are numeric to begin with. Only the 14th column contains the status, which is encoded as an integer as follows:
- SAT: 2
- UNSAT: 1
- TIMEOUT: 0
- Others: -1
Returns: numpy_array(dtype = double) – the data
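A status encoding of this kind can be sketched as a small lookup. The sketch assumes the tabulated representation (SAT 2, UNSAT 1, TIMEOUT 0, everything else -1); the names STATUS_CODES and encode_status are hypothetical and the code is not pysmac's own.

```python
# Assumed encoding, following the tabulated values for this function.
STATUS_CODES = {'SAT': 2, 'UNSAT': 1, 'TIMEOUT': 0}


def encode_status(status):
    """Map a run status string to a float so it fits into a numeric array."""
    # Exact matching avoids confusing 'UNSAT' with 'SAT', which it contains.
    return float(STATUS_CODES.get(status.strip().upper(), -1))
```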
pysmac.utils.smac_output_readers.read_trajectory_file(fn)
Reads a trajectory file and returns a list of dicts with all the information.
Because SMAC stores every parameter’s value as a string, the configuration returned by this function also has every value stored as a string. All other values, such as “Estimated Training Performance”, are floats, though.
Parameters: fn (str) – name of file to read
Returns: list of dicts – every dict contains the keys: “CPU Time Used”, “Estimated Training Performance”, “Wallclock Time”, “Incumbent ID”, “Automatic Configurator (CPU) Time”, “Configuration”
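Since configuration values come back as strings, a caller who knows the parameter types has to convert them. The sketch below works on a hand-written list shaped like the documented return value; all numbers, parameter names, and the types mapping are invented for illustration, and it assumes that the “Configuration” entry is a dict of parameter name to string.

```python
# Invented data shaped like the documented return value (only some keys shown).
trajectory = [
    {"CPU Time Used": 1.0, "Estimated Training Performance": 0.9,
     "Configuration": {"alpha": "0.5", "restarts": "3", "solver": "lbfgs"}},
    {"CPU Time Used": 4.0, "Estimated Training Performance": 0.4,
     "Configuration": {"alpha": "0.1", "restarts": "5", "solver": "lbfgs"}},
]

# Hypothetical type knowledge supplied by the caller; unknown names stay strings.
types = {"alpha": float, "restarts": int}

final = trajectory[-1]  # last entry, taken here as the final incumbent
typed_config = {
    name: types.get(name, str)(value)
    for name, value in final["Configuration"].items()
}
```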
pysmac.utils.smac_output_readers.read_validationCallStrings_file(fn)
Reads a validationCallString file into a list of dictionaries.
Returns: list of dicts – each dictionary contains ‘parameter name’ and ‘parameter value as string’ key-value pairs
pysmac.utils.smac_output_readers.read_validationObjectiveMatrix_file(fn)
Reads the run data of a validation run performed by SMAC.
For cases with instances, not necessarily every instance is used during the configuration phase to estimate a configuration’s performance. If validation is enabled, SMAC reruns parameter settings (usually just the final incumbent) on the whole instance set or a designated test set. The data from those runs is stored in separate files. This function reads one of these files.
Parameters: fn (str) – the name of the validationObjectiveMatrix file
Returns: dict – configuration ids as keys, list of performances on each instance as values.
Todo: testing of validation runs where more than the final incumbent is validated
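A dict of this shape is easy to summarize. The sketch below uses invented numbers, and assumes that lower performance values are better, to compute the mean validation performance per configuration id.

```python
from statistics import mean

# Invented data: configuration id -> performance on each validation instance.
objective_matrix = {1: [0.5, 1.5, 1.0], 2: [2.0, 4.0, 3.0]}

mean_performance = {
    config_id: mean(values) for config_id, values in objective_matrix.items()
}

# Assumption: lower is better (e.g. runtime), so the minimum is the best id.
best_id = min(mean_performance, key=mean_performance.get)
```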
pysmac.utils.state_merge module
pysmac.utils.state_merge.find_largest_file(glob_pattern)
Function to find the largest file matching a glob pattern.
Old SMAC versions keep several versions of files as back-ups. This helper can be used to find the largest file (which should contain the final output). One could also go for the most recent file, but that might fail when the data is copied.
Parameters: glob_pattern (string) – a UNIX style pattern to apply
Returns: string – largest file matching the pattern
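The idea of picking the largest match can be sketched with glob and os.path.getsize. The function name largest_matching_file is hypothetical, and raising ValueError when nothing matches is a choice of this sketch, not documented behavior of find_largest_file.

```python
import glob
import os


def largest_matching_file(glob_pattern):
    """Return the path of the largest file matching a UNIX style pattern."""
    candidates = glob.glob(glob_pattern)
    if not candidates:
        # Behavior for the no-match case is an assumption of this sketch.
        raise ValueError("no file matches %r" % glob_pattern)
    return max(candidates, key=os.path.getsize)
```

Using the size rather than the modification time keeps the result stable when files are copied to another location.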
pysmac.utils.state_merge.read_sate_run_folder(directory, rar_fn='runs_and_results-it*.csv', inst_fn='instances.txt', feat_fn='instance-features.txt', ps_fn='paramstrings-it*.txt')
Helper function that reads all information from a state_run folder.
To get all information of a SMAC run, several different files have to be read. This function provides a short notation for gathering all data at once.
Parameters:
- directory (str) – the location of the state_run folder
- rar_fn (str) – pattern to find the runs_and_results file
- inst_fn (str) – name of the instance file
- feat_fn (str) – name of the instance feature file. If this file is not found, pysmac assumes no instance features.
- ps_fn (str) – name of the paramstrings file
Returns: tuple – (configurations returned by read_paramstrings_file,
instance names returned by read_instances_file,
instance features returned by read_instance_features_file,
actual run data returned by read_runs_and_results_file)
pysmac.utils.state_merge.state_merge(state_run_directory_list, destination, check_scenario_files=True, drop_duplicates=False, instance_subset=None)
Function to merge multiple state_run directories into a single run to be used in, e.g., the fANOVA.
To take advantage of the data gathered in multiple independent runs, the state_run folders have to be merged into a single directory that has the same structure. This allows easy application of the pyfANOVA on all runs_and_results files.
Parameters:
- state_run_directory_list (list of str) – list of state_run folders to be merged
- destination (str) – a directory to store the merged data. The folder is created if needed, and already existing data in that location is silently overwritten.
- check_scenario_files (bool) – whether to ensure that all scenario files in all state_run folders are identical. This helps to avoid merging runs with different settings. Note: Command-line options given to SMAC are not compared here!
- drop_duplicates (bool) – Defines how to handle runs with identical configurations. For deterministic algorithms the function’s response should be the same, so dropping duplicates is safe. Keep in mind that every duplicate effectively puts more weight on a configuration when estimating parameter importance.
- instance_subset (list) – Defines a list of instances that are used for the merge. All other instances are ignored. (Default: None, all instances are used)
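The effect of drop_duplicates can be illustrated on plain dicts. This is only a sketch of the idea (keep the first occurrence of each configuration), not the logic state_merge actually uses; the helper name and the example parameters are hypothetical.

```python
def drop_duplicate_configurations(configurations):
    """Keep the first occurrence of each configuration, preserving order."""
    seen = set()
    unique = []
    for config in configurations:
        # Dicts are not hashable, so use a sorted tuple of items as the key.
        key = tuple(sorted(config.items()))
        if key not in seen:
            seen.add(key)
            unique.append(config)
    return unique


# Invented configurations, all values as strings.
runs = [
    {"alpha": "0.5", "solver": "lbfgs"},
    {"solver": "lbfgs", "alpha": "0.5"},  # same configuration, different order
    {"alpha": "0.1", "solver": "lbfgs"},
]
unique_runs = drop_duplicate_configurations(runs)
```

Without the de-duplication step, the first configuration would count twice when estimating parameter importance.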
pysmac.utils.pcs_merge module
pysmac.utils.pcs_merge.merge_configuration_spaces(*args, **kwargs)
Convenience function to merge several algorithms with their respective config spaces into a single one.
Using pySMAC to optimize the parameters of a single function/algorithm is a very important use case, but so is finding the best algorithm and its configuration across multiple choices, for example when multiple different algorithms can be used to solve the same problem and it is unclear which one will be the best choice.
The arguments to this function are a number of tuples, each with the following content: (callable, pcs, conditionals, forbiddens). The callable is a valid Python function in the current namespace. The Parameter Configuration Space (pcs) is the definition of its parameters, while conditionals and forbiddens express dependencies and restrictions within that space.
Returns: the merged configuration space, a list of conditionals, a list of forbiddens, and a string that defines two functions (see Combining Parameter Configuration Spaces).
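The general idea behind merging configuration spaces can be sketched independently of pysmac: add one top-level categorical parameter that selects the algorithm, prefix every parameter with its algorithm's name, and make each prefixed parameter conditional on that selection. Everything below is hypothetical, including the function name, the dict-based representation, the "algorithm" parameter, and the conditional strings, which merely imitate a "child | parent in {value}" notation; it does not reproduce what merge_configuration_spaces returns.

```python
def merge_spaces_sketch(named_spaces):
    """Merge per-algorithm parameter dicts into one space plus conditionals.

    named_spaces maps an algorithm name to a dict of
    parameter name -> definition (kept as an opaque string here).
    """
    # Top-level choice between the algorithms (hypothetical representation).
    merged = {"algorithm": sorted(named_spaces)}
    conditionals = []
    for algo in sorted(named_spaces):
        for param in sorted(named_spaces[algo]):
            full_name = "%s_%s" % (algo, param)
            merged[full_name] = named_spaces[algo][param]
            # The parameter is only active if its algorithm is selected.
            conditionals.append("%s | algorithm in {%s}" % (full_name, algo))
    return merged, conditionals


merged, conditionals = merge_spaces_sketch({
    "svm": {"C": "[0.1, 10] [1]"},
    "knn": {"k": "[1, 20] [5]i"},
})
```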