Configuration Options

Simulation setup options

The section that’s relevant for the simulation setup should look something like this:

 1binning_options:
 2   block_size: 10 # Number of trajectories to be processed in blocks
 3   center_freq: 1 # How frequently do we add new Voronoi centers?
 4   max_centers: 300 # Maximum number of Voronoi centers to be added
 5   traj_per_bin: 100 # Number of trajectories per Voronoi center
 6path_options: # this entire section should be automatically set by the tool
 7   WESTPA_path: /home/USER/westpa
 8   bng_path: /home/USER/apps/anaconda3/lib/python3.7/site-packages/bionetgen/bng-linux
 9   bngl_file: /home/USER/webng/testing/test.bngl
10   sim_name: /home/USER/webng/testing/test # you can adjust sim folder here
11propagator_options:
12   pcoords: # These should match observables in your model
13   - Atot
14   - Btot
15   propagator_type: libRoadRunner # this is the suggested propagator
16sampling_options:
17   dimensions: 2 # Dimensionality of the WESTPA progress coordinates
18   max_iter: 10 # Maximum number of WE iterations
19   pcoord_length: 10 # Number of data points per WE iteration
20   tau: 100 # Resampling frequency

you can change various aspects of the simulation setup in this file. Let’s look at each block separately.

Binning

1binning_options:
2   block_size: 10 # Number of trajectories to be processed in blocks
3   center_freq: 1 # How frequently do we add new Voronoi centers?
4   max_centers: 300 # Maximum number of Voronoi centers to be added
5   traj_per_bin: 100 # Number of trajectories per Voronoi center

block_size refers to how many trajectories will be ran at a time. This is important for multicore runs, try to keep the blocksize an integer multiple of the number of cores you have. center_freq refers to how frequently voronoi bins will be placed, in units of WE iterations. max_centers is the maximum number of voronoi centers that will be placed. Finally, traj_per_bin is the number of trajectories in each voronoi center.

Path Options

1path_options: # this entire section should be automatically set by the tool
2   WESTPA_path: /home/USER/westpa
3   bng_path: /home/USER/apps/anaconda3/lib/python3.7/site-packages/bionetgen/bng-linux
4   bngl_file: /home/USER/webng/testing/test.bngl
5   sim_name: /home/USER/webng/testing/test # you can adjust sim folder here

Most of these option should be set automatically if WESTPA and BNG are both python importable. WESTPA_path is the path to WESTPA to be used, bng_path is the path where BNG2.pl lives. bngl_file is the bngl model and sim_name is the folder that will be used for the WESTPA setup.

Propagator Options

1propagator_options:
2   pcoords: # These should match observables in your model
3   - Atot
4   - Btot
5   propagator_type: libRoadRunner # this is the suggested propagator

pcoords is the list progress coordinates to be used for WESTPA and should match the observables in your BNGL model. propagator_type is the type of propagator to be used. If available, use libRoadRunner since it’s currently significantly more efficient for WESTPA runs. If not, you can select “executable” propagator which uses BNG2.pl in combination with bash scripts for each walker.

Sampling Options

1sampling_options:
2   dimensions: 2 # Dimensionality of the WESTPA progress coordinates
3   max_iter: 10 # Maximum number of WE iterations
4   pcoord_length: 10 # Number of data points per WE iteration
5   tau: 100 # Resampling frequency

dimensions is the number of dimensions to be used for WESTPA progress coordinates and should match the number of BNGL observables you are using. max_iter is the maximum number of WE iterations to be ran (this can be changed later from within the setup). pcoord_length is the number of data points each walker will return. tau is the length of each BNGL simulation/walker.

Analysis options

When you first create a setup configuration file like mysim.yaml, you will see an analysis section like this

 1analyses:
 2   enabled: false
 3   work-path: /home/USER/webng/testing/test/analysis # the folder to run the analysis under
 4   average:
 5      dimensions: null # you can limit the tool to the first N dimensions
 6      enabled: false # this needs to be set to true to run the analysis
 7      first-iter: null # first iteration to start the averaging
 8      last-iter: null # first iteration to end the averaging
 9      mapper-iter: null # the iteration to pull the voronoi bin mapper from, last iteration by default
10      normalize: false # normalizes the distributions
11      output: average.png # output file name
12      plot-energy: false # plots -ln of probabilies
13      plot-opts: # various plotting options like font sizes and line width
14         name-font-size: 12
15         voronoi-col: 0.75
16         voronoi-lw: 1
17      plot-voronoi: false # true if you want to plot voronoi centers
18      smoothing: 0.5 # the amount of smoothing to apply
19   evolution:
20      avg_window: null # number of iterations to average for each point in the plot
21      dimensions: null # you can limit the tool to the first N dimensions
22      enabled: false # this needs to be set to true to run the analysis
23      normalize: false # normalizes the distributions
24      output: evolution.png # output file name
25      plot-energy: false # plots -ln of probabilies
26      plot-opts: # various plotting options like font sizes and line width
27         name-font-size: 12

Let’s take a look at individual sections.

1analyses:
2   enabled: false
3   work-path: /home/USER/webng/testing/test/analysis # the folder to run the analysis under

This is upper level analysis block and has a single option called enabled. If set to false, none of the analyses will run. Each analysis subsection will have the same enabled option to set if that particular analysis will be ran or not. work-path is the folder where all analysis will be ran.

Average

 1average:
 2   dimensions: null # you can limit the tool to the first N dimensions
 3   enabled: false # this needs to be set to true to run the analysis
 4   first-iter: null # first iteration to start the averaging
 5   last-iter: null # first iteration to end the averaging
 6   mapper-iter: null # the iteration to pull the voronoi bin mapper from, last iteration by default
 7   normalize: false # normalizes the distributions
 8   output: average.png # output file name
 9   plot-energy: false # plots -ln of probabilies
10   plot-opts: # various plotting options like font sizes and line width
11      name-font-size: 12
12      voronoi-col: 0.75
13      voronoi-lw: 1
14   plot-voronoi: false # true if you want to plot voronoi centers
15   smoothing: 0.5 # the amount of smoothing to apply

This is the block for Average analysis. dimensions is normally set to null which makes the tool plot all dimensions. If this is set to N the tool will plot the first N dimensions. first-iter and last-iter are the iterations to start and stop the averaging. mapper-iter is the iteration to pull the voronoi mapper from, if you don’t want the mapper from the final WE iteration. normalize can be used to enable normalization of probability distributions before plotting. output is the file name for the output and this can be set to a png or pdf file. plot-energy takes the -ln of the probabilities before plotting. plot-voronoi controls if the voronoi centers are plotted on top of the probability distributions. smoothing can be changed to reduce or increase the gaussian smoothing used for probability distributions. plot-opts contain some options for plotting. name-front-size is the font-size used in plotting. voronoi-col is the color to be used for voronoi bins and voronoi-lw is the line width for the same lines.

Evolution

1evolution:
2   avg_window: 1 # number of iterations to average for each point in the plot
3   dimensions: null # you can limit the tool to the first N dimensions
4   enabled: false # this needs to be set to true to run the analysis
5   normalize: false # normalizes the distributions
6   output: evolution.png # output file name
7   plot-energy: false # plots -ln of probabilies
8   plot-opts: # various plotting options like font sizes and line width
9      name-font-size: 12

This is the block for Evolution analysis. avg_window the number of iterations to average over for every data point. dimensions is normally set to null which makes the tool plot all dimensions. If this is set to N the tool will plot the first N dimensions. normalize can be used to enable normalization of probability distributions before plotting. output is the file name for the output and this can be set to a png or pdf file. plot-opts contain some options for plotting. name-front-size is the font-size used in plotting.

Cluster

 1cluster:
 2   assignments: null
 3   cluster-count: 4
 4   enabled: true
 5   first-iter: null
 6   last-iter: null
 7   metastable-states-file: null
 8   normalize: null
 9   states:
10   - coords:
11      - - 20.0
12      - 4.0
13      label: a
14   - coords:
15      - - 4.0
16      - 20.0
17      label: b
18   symmetrize: null
19   transition-matrix: null

This is the block for Cluster analysis. assignments is the assignment file to be used for clustering. This can be pointed to a assignment file you generated using w_assign or, if left null, the tool will attempt to generate an assignment file itself. states is where you can define states for w_assign if you want the tool to run it for you. cluster-count is the number PCCA+ will try to cluster the data into. first-iter and last-iter are WE iterations to pull the data for clustering. metastable-states-file is a python pickle file that contains a dictionary which defined which bin is assigned to which metastable state. normalize makes it so that the output text is normalized to percentages. symmetrize controls if the transition matrix is made symmetrical or not. transition-matrix can point to a binary numpy file where you give the tool a custom transition matrix or, if left null, the tool will generate one for you using the assignment file.

Network

1network:
2   enabled: true
3   metastable-states-file: null
4   pcca-pickle: null
5   state-labels: null

This is the block for Network generation. metastable-states-file is a python pickle file that contains a dictionary which defined which bin is assigned to which metastable state. pcca-pickle is the python pickle object that the cluster analysis generates (or you can use pyGPCCA to generate one yourself). state-labels is the labels you want to use for each cluster generated by Cluster