pysmac.utils package

pysmac.utils.java_helper module

pysmac.utils.java_helper.check_java_version(java_executable='java')[source]

Small function to ensure that Java (version >= 7) was found.

As SMAC requires a Java Runtime Environment (JRE), pysmac checks that an adequate version (>= 7) has been found. It raises a RuntimeError if no JRE or only an outdated version was found.

Parameters:java_executable (str) – callable Java binary. It is possible to pass additional options via this argument to the JRE, e.g. “java -Xmx128m” is a valid argument.
Raises:RuntimeError
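
The check described above can be sketched as follows. This is a minimal illustration and not pysmac's actual implementation: the names parse_java_major and require_java are hypothetical, and the sketch assumes the usual `java -version` output, where legacy releases report "1.7.0_80" and newer ones report "11.0.2".

```python
import re
import shlex
import subprocess


def parse_java_major(version_output):
    """Extract the major Java version from `java -version` output (hypothetical helper)."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        raise RuntimeError("could not determine the Java version")
    major = int(match.group(1))
    # Legacy scheme: "1.7.0_80" means Java 7; modern scheme: "11.0.2" means Java 11.
    if major == 1 and match.group(2) is not None:
        major = int(match.group(2))
    return major


def require_java(java_executable="java", minimum=7):
    """Raise RuntimeError if no JRE is found or its version is below `minimum`."""
    # shlex.split allows extra JRE options, e.g. "java -Xmx128m".
    command = shlex.split(java_executable) + ["-version"]
    try:
        # `java -version` writes to stderr, so both streams are merged.
        result = subprocess.run(command, stdout=subprocess.PIPE,
                                stderr=subprocess.STDOUT, universal_newlines=True)
    except OSError as exc:
        raise RuntimeError("no Java executable found") from exc
    major = parse_java_major(result.stdout)
    if major < minimum:
        raise RuntimeError("Java %d found, but version >= %d is required" % (major, minimum))
    return major
```

Splitting the argument with shlex.split is one way to support the "java -Xmx128m" form mentioned in the parameter description.
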
pysmac.utils.java_helper.smac_classpath()[source]

Small function gathering all information to build the java class path.

Returns:string representing the Java classpath for SMAC

pysmac.utils.multiprocessing_wrapper module

class pysmac.utils.multiprocessing_wrapper.MyPool(processes=None, initializer=None, initargs=(), maxtasksperchild=None)[source]

Bases: multiprocessing.pool.Pool

Subclass to use NoDaemonProcess instances as workers in a Pool.

Process

alias of NoDaemonProcess

class pysmac.utils.multiprocessing_wrapper.NoDaemonProcess(group=None, target=None, name=None, args=(), kwargs={})[source]

Bases: multiprocessing.process.Process

A subclass that prevents the subprocesses used for the individual SMAC runs from being daemons.

As it turns out, the Java processes running SMAC cannot be daemons. To change the default behavior of the multiprocessing module, one has to derive a subclass and override the _get_daemon and _set_daemon methods appropriately.

daemon
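
The subclassing pattern described above can be sketched like this. It is a minimal illustration under stated assumptions, not a copy of pysmac's code: the class name NonDaemonicProcess is hypothetical, and only the process subclass is shown, not its use inside a Pool. The motivation is that the multiprocessing module does not allow daemonic processes to create child processes.

```python
import multiprocessing


class NonDaemonicProcess(multiprocessing.Process):
    """Process whose ``daemon`` flag always reads False (hypothetical name)."""

    def _get_daemon(self):
        # Always report the process as non-daemonic.
        return False

    def _set_daemon(self, value):
        # Silently ignore attempts to mark the process as a daemon.
        pass

    # Replace the inherited property with the overriding getter and setter.
    daemon = property(_get_daemon, _set_daemon)


proc = NonDaemonicProcess(target=print, args=("hello",))
proc.daemon = True   # ignored by the overriding setter
flag = proc.daemon   # still False
```

Because the property is defined on the subclass, any code that tries to set the flag on such a process has no effect.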

pysmac.utils.smac_input_readers module

pysmac.utils.smac_input_readers.read_pcs(filename)[source]

Function to read a SMAC pcs file (format according to version 2.08).

Parameters:filename (str) – name of the pcs file to be read
Returns:tuple – (parameters as a dict, conditionals as a list, forbiddens as a list)
pysmac.utils.smac_input_readers.read_scenario_file(fn)[source]

Small helper function to read a SMAC scenario file.

Returns:dict – (key, value) pairs are (variable name, variable value)

pysmac.utils.smac_input_readers.write_pcs(pcs_filename, parameters, forbiddens, conditionals)[source]

Function to write a SMAC PCS file.

pysmac.utils.smac_output_readers module

pysmac.utils.smac_output_readers.convert_param_dict_types(param_dict, pcs)[source]
pysmac.utils.smac_output_readers.json_parse(fileobj, decoder=<json.decoder.JSONDecoder object>, buffersize=2048)[source]

Small function to parse a file containing JSON objects separated by newlines. This format is used in the live-rundata-xx.json files produced by SMAC.

Taken from http://stackoverflow.com/questions/21708192/how-do-i-use-the-json-module-to-read-in-one-json-object-at-a-time/21709058#21709058
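
One way to read such a stream object by object is sketched below. It is an illustration under stated assumptions rather than the function's actual body: the name iter_json_objects is hypothetical, and the sketch assumes every top-level value is a JSON object or array, since a bare number split across two buffer reads would be decoded prematurely.

```python
import io
import json


def iter_json_objects(fileobj, buffersize=2048):
    """Yield JSON values one at a time from a text stream (hypothetical helper)."""
    decoder = json.JSONDecoder()
    buffer = ""
    # Read fixed-size chunks until the stream is exhausted.
    for chunk in iter(lambda: fileobj.read(buffersize), ""):
        # raw_decode does not skip leading whitespace, so strip it first.
        buffer = (buffer + chunk).lstrip()
        while buffer:
            try:
                obj, end = decoder.raw_decode(buffer)
            except ValueError:
                break  # incomplete value, wait for the next chunk
            yield obj
            buffer = buffer[end:].lstrip()


stream = io.StringIO('{"a": 1}\n{"b": [1, 2]}\n')
objects = list(iter_json_objects(stream, buffersize=4))
```

The small buffersize in the usage lines only serves to show that values spanning several reads are reassembled before decoding.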

pysmac.utils.smac_output_readers.read_instance_features_file(fn)[source]

Function to read an instance_feature file.

Returns:tuple – first entry is a list of the feature names, second one is a dict with ‘instance name’ - ‘numpy array containing the features’ key-value pairs
pysmac.utils.smac_output_readers.read_instances_file(fn)[source]

Reads the instance names from an instance file.

Parameters:fn (str) – name of file to read
Returns:list – each element is a list where the first element is the instance name followed by additional information for the specific instance.
pysmac.utils.smac_output_readers.read_paramstrings_file(fn)[source]

Function to read a paramstring file. Every line in this file corresponds to a full configuration. Everything is stored as strings and without knowledge about the pcs, converting that into any other type would involve guessing, which we shall not do here.

Parameters:fn (str) – the name of the paramstring file
Returns:dict – with key-value pairs ‘parameter name’-‘value as string’
pysmac.utils.smac_output_readers.read_runs_and_results_file(fn)[source]

Converting a runs_and_results file into a numpy array.

Almost all entries in a runs_and_results file are numeric to begin with. Only the 14th column contains the status, which is encoded as an integer according to the following table:

Value      Representation
SAT        2
UNSAT      1
TIMEOUT    0
Others     -1
Returns:numpy_array(dtype = double) – the data
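
The status encoding from the table above can be sketched as a small mapping function. This is an illustration only: the name encode_run_status is hypothetical, and the substring matching is an assumption about how the status strings appear in the file.

```python
def encode_run_status(status):
    """Map a run status string to the integer codes listed in the table (hypothetical helper)."""
    status = status.upper()
    if "TIMEOUT" in status:
        return 0
    # UNSAT must be checked before SAT because "UNSAT" contains "SAT".
    if "UNSAT" in status:
        return 1
    if "SAT" in status:
        return 2
    # Everything else, e.g. CRASHED or ABORT.
    return -1
```
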
pysmac.utils.smac_output_readers.read_trajectory_file(fn)[source]

Reads a trajectory file and returns a list of dicts with all the information.

Due to the way SMAC stores every parameter’s value as a string, the configuration returned by this function also has every value stored as a string. All other values, such as “Estimated Training Performance”, are floats.

Parameters:fn (str) – name of file to read
Returns:list of dicts – every dict contains the keys: “CPU Time Used”, “Estimated Training Performance”, “Wallclock Time”, “Incumbent ID”, “Automatic Configurator (CPU) Time”, “Configuration”
pysmac.utils.smac_output_readers.read_validationCallStrings_file(fn)[source]

Reads a validationCallString file into a list of dictionaries.

Returns:list of dicts – each dictionary contains ‘parameter name’ and ‘parameter value as string’ key-value pairs
pysmac.utils.smac_output_readers.read_validationObjectiveMatrix_file(fn)[source]

Reads the run data of a validation run performed by SMAC.

For cases with instances, not necessarily every instance is used during the configuration phase to estimate a configuration’s performance. If validation is enabled, SMAC reruns parameter settings (usually just the final incumbent) on the whole instance set or a designated test set. The data from those runs is stored in separate files. This function reads one of these files.

Parameters:fn (str) – the name of the validationObjectiveMatrix file
Returns:dict – configuration ids as keys, list of performances on each instance as values.

Todo

testing of validation runs where more than the final incumbent is validated

pysmac.utils.state_merge module

pysmac.utils.state_merge.find_largest_file(glob_pattern)[source]

Function to find the largest file matching a glob pattern.

Old SMAC versions keep several versions of files as backups. This helper can be used to find the largest file (which should contain the final output). One could also go for the most recent file, but that might fail when the data is copied.

Parameters:glob_pattern (string) – a UNIX style pattern to apply
Returns:string – largest file matching the pattern
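
Selecting the largest match of a glob pattern can be sketched with the standard library. This is a minimal illustration, not pysmac's implementation: the name largest_file is hypothetical, and raising FileNotFoundError when nothing matches is an assumption of this sketch.

```python
import glob
import os
import tempfile


def largest_file(glob_pattern):
    """Return the path of the largest file matching a glob pattern (hypothetical helper)."""
    candidates = glob.glob(glob_pattern)
    if not candidates:
        # Assumption of this sketch: an empty match is treated as an error.
        raise FileNotFoundError("no file matches %r" % glob_pattern)
    return max(candidates, key=os.path.getsize)


# Usage: two back-up versions of a file, the larger one is selected.
with tempfile.TemporaryDirectory() as folder:
    for name, size in [("runs_and_results-it1.csv", 10), ("runs_and_results-it2.csv", 50)]:
        with open(os.path.join(folder, name), "w") as handle:
            handle.write("x" * size)
    pattern = os.path.join(folder, "runs_and_results-it*.csv")
    result = os.path.basename(largest_file(pattern))
```
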
pysmac.utils.state_merge.read_sate_run_folder(directory, rar_fn='runs_and_results-it*.csv', inst_fn='instances.txt', feat_fn='instance-features.txt', ps_fn='paramstrings-it*.txt')[source]

Helper function that reads all information from a state_run folder.

To get all information of a SMAC run, several different files have to be read. This function provides a short notation for gathering all data at once.

Parameters:
  • directory (str) – the location of the state_run_folder
  • rar_fn (str) – pattern to find the runs_and_results file
  • inst_fn (str) – name of the instance file
  • feat_fn (str) – name of the instance feature file. If this file is not found, pysmac assumes no instance features.
  • ps_fn (str) – name of the paramstrings file
Returns:

tuple – (configurations returned by read_paramstrings_file,
instance names returned by read_instances_file,
instance features returned by read_instance_features_file,
actual run data returned by read_runs_and_results_file)

pysmac.utils.state_merge.state_merge(state_run_directory_list, destination, check_scenario_files=True, drop_duplicates=False, instance_subset=None)[source]

Function to merge multiple state_run directories into a single run to be used in, e.g., the fANOVA.

To take advantage of the data gathered in multiple independent runs, the state_run folders have to be merged into a single directory that has the same structure. This allows easy application of the pyfANOVA on all runs_and_results files.

Parameters:
  • state_run_directory_list (list of str) – list of state_run folders to be merged
  • destination (str) – a directory to store the merged data. The folder is created if needed, and already existing data in that location is silently overwritten.
  • check_scenario_files (bool) – whether to ensure that all scenario files in all state_run folders are identical. This helps to avoid merging runs with different settings. Note: Command-line options given to SMAC are not compared here!
  • drop_duplicates (bool) – Defines how to handle runs with identical configurations. For deterministic algorithms the function’s response should be the same, so dropping duplicates is safe. Keep in mind that every duplicate effectively puts more weight on a configuration when estimating parameter importance.
  • instance_subset (list) – Defines a list of instances that are used for the merge. All other instances are ignored. (Default: None, all instances are used)

pysmac.utils.pcs_merge module

pysmac.utils.pcs_merge.merge_configuration_spaces(*args, **kwargs)[source]

Convenience function to merge several algorithms with their respective config spaces into a single one.

Using pySMAC to optimize the parameters of a single function/algorithm is a very important use case, but so is finding the best algorithm and its configuration across multiple choices, for example when several different algorithms can solve the same problem and it is unclear which one will be the best choice.

The arguments to this function are a number of tuples, each with the following content: (callable, pcs, conditionals, forbiddens). The callable is a valid Python function in the current namespace. The Parameter Configuration Space (pcs) is the definition of its parameters, while conditionals and forbiddens express dependencies and restrictions within that space.

Returns:the merged configuration space, a list of conditionals, a list of forbiddens, and a string that defines two functions (see Combining Parameter Configuration Spaces).