Configuration Options
Simulation setup options
The section that’s relevant for the simulation setup should look something like this:
binning_options:
  block_size: 10 # Number of trajectories to be processed in blocks
  center_freq: 1 # How frequently do we add new Voronoi centers?
  max_centers: 300 # Maximum number of Voronoi centers to be added
  traj_per_bin: 100 # Number of trajectories per Voronoi center
path_options: # this entire section should be automatically set by the tool
  WESTPA_path: /home/USER/westpa
  bng_path: /home/USER/apps/anaconda3/lib/python3.7/site-packages/bionetgen/bng-linux
  bngl_file: /home/USER/webng/testing/test.bngl
  sim_name: /home/USER/webng/testing/test # you can adjust sim folder here
propagator_options:
  pcoords: # These should match observables in your model
  - Atot
  - Btot
  propagator_type: libRoadRunner # this is the suggested propagator
sampling_options:
  dimensions: 2 # Dimensionality of the WESTPA progress coordinates
  max_iter: 10 # Maximum number of WE iterations
  pcoord_length: 10 # Number of data points per WE iteration
  tau: 100 # Resampling frequency
You can change various aspects of the simulation setup in this file. Let’s look at each block separately.
Binning
binning_options:
  block_size: 10 # Number of trajectories to be processed in blocks
  center_freq: 1 # How frequently do we add new Voronoi centers?
  max_centers: 300 # Maximum number of Voronoi centers to be added
  traj_per_bin: 100 # Number of trajectories per Voronoi center
block_size is the number of trajectories that will be run at a time. This matters for multicore runs: try to keep block_size an integer multiple of the number of cores you have. center_freq is how often new Voronoi centers are placed, in units of WE iterations. max_centers is the maximum number of Voronoi centers that will be placed. Finally, traj_per_bin is the number of trajectories per Voronoi center.
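The advice about keeping block_size a multiple of the core count can be sketched as a small helper. This is only an illustration and not part of the tool: the function name round_block_size is hypothetical, and it assumes the count returned by os.cpu_count() is the number of cores you will actually run on.

```python
import os


def round_block_size(desired, n_cores=None):
    """Round `desired` up to the nearest integer multiple of `n_cores`.

    Hypothetical helper for choosing block_size; not part of the tool.
    """
    if n_cores is None:
        # os.cpu_count() can return None, so fall back to a single core
        n_cores = os.cpu_count() or 1
    if desired < 1 or n_cores < 1:
        raise ValueError("desired and n_cores must be positive integers")
    # Ceiling division, then scale back up to a multiple of n_cores
    return -(-desired // n_cores) * n_cores


print(round_block_size(10, n_cores=4))  # 12
print(round_block_size(8, n_cores=4))   # 8
```

With the default block_size of 10 on a 4-core machine, this would suggest 12, so that no core sits idle while the last partial block finishes.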
Path Options
path_options: # this entire section should be automatically set by the tool
  WESTPA_path: /home/USER/westpa
  bng_path: /home/USER/apps/anaconda3/lib/python3.7/site-packages/bionetgen/bng-linux
  bngl_file: /home/USER/webng/testing/test.bngl
  sim_name: /home/USER/webng/testing/test # you can adjust sim folder here
Most of these options should be set automatically if WESTPA and BNG are both importable from Python. WESTPA_path is the path to the WESTPA installation to be used, and bng_path is the path where BNG2.pl lives. bngl_file is the BNGL model, and sim_name is the folder that will be used for the WESTPA setup.
Propagator Options
propagator_options:
  pcoords: # These should match observables in your model
  - Atot
  - Btot
  propagator_type: libRoadRunner # this is the suggested propagator
pcoords is the list of progress coordinates to be used for WESTPA and should match the observables in your BNGL model. propagator_type is the type of propagator to be used. If available, use libRoadRunner, since it is currently significantly more efficient for WESTPA runs. If not, you can select the “executable” propagator, which uses BNG2.pl in combination with bash scripts for each walker.
Sampling Options
sampling_options:
  dimensions: 2 # Dimensionality of the WESTPA progress coordinates
  max_iter: 10 # Maximum number of WE iterations
  pcoord_length: 10 # Number of data points per WE iteration
  tau: 100 # Resampling frequency
dimensions is the number of dimensions to be used for WESTPA progress coordinates and should match the number of BNGL observables you are using. max_iter is the maximum number of WE iterations to be run (this can be changed later from within the setup). pcoord_length is the number of data points each walker will return. tau is the length of each BNGL simulation/walker.
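Because dimensions has to agree with the number of entries in pcoords, it can be worth checking the two before starting a run. The sketch below is a hypothetical check, not something the tool provides; it assumes the configuration has already been loaded into nested dictionaries that mirror the YAML layout shown above.

```python
def check_dimensions(config):
    """Return a list of problems found in a loaded configuration.

    Hypothetical check, not part of the tool. `config` is assumed to be
    nested dictionaries with the same keys as the YAML file.
    """
    problems = []
    pcoords = config["propagator_options"]["pcoords"]
    dimensions = config["sampling_options"]["dimensions"]
    if dimensions != len(pcoords):
        problems.append(
            f"dimensions is {dimensions} but {len(pcoords)} pcoords are listed"
        )
    return problems


# Values copied from the example configuration in this section
example = {
    "propagator_options": {"pcoords": ["Atot", "Btot"]},
    "sampling_options": {
        "dimensions": 2,
        "max_iter": 10,
        "pcoord_length": 10,
        "tau": 100,
    },
}
print(check_dimensions(example))  # []
```

An empty list means the two settings agree; adding a third observable to pcoords without updating dimensions would produce one message.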
Analysis options
When you first create a setup configuration file like mysim.yaml, you will see an analysis section like this:
analyses:
  enabled: false
  work-path: /home/USER/webng/testing/test/analysis # the folder to run the analysis under
  average:
    dimensions: null # you can limit the tool to the first N dimensions
    enabled: false # this needs to be set to true to run the analysis
    first-iter: null # iteration to start the averaging
    last-iter: null # iteration to end the averaging
    mapper-iter: null # the iteration to pull the Voronoi bin mapper from, last iteration by default
    normalize: false # normalizes the distributions
    output: average.png # output file name
    plot-energy: false # plots -ln of probabilities
    plot-opts: # various plotting options like font sizes and line width
      name-font-size: 12
      voronoi-col: 0.75
      voronoi-lw: 1
    plot-voronoi: false # true if you want to plot Voronoi centers
    smoothing: 0.5 # the amount of smoothing to apply
  evolution:
    avg_window: null # number of iterations to average for each point in the plot
    dimensions: null # you can limit the tool to the first N dimensions
    enabled: false # this needs to be set to true to run the analysis
    normalize: false # normalizes the distributions
    output: evolution.png # output file name
    plot-energy: false # plots -ln of probabilities
    plot-opts: # various plotting options like font sizes and line width
      name-font-size: 12
Let’s take a look at individual sections.
analyses:
  enabled: false
  work-path: /home/USER/webng/testing/test/analysis # the folder to run the analysis under
This is the top-level analysis block. If enabled is set to false, none of the analyses will run. Each analysis subsection has its own enabled option that sets whether that particular analysis will be run. work-path is the folder where all analyses will be run.
Average
average:
  dimensions: null # you can limit the tool to the first N dimensions
  enabled: false # this needs to be set to true to run the analysis
  first-iter: null # iteration to start the averaging
  last-iter: null # iteration to end the averaging
  mapper-iter: null # the iteration to pull the Voronoi bin mapper from, last iteration by default
  normalize: false # normalizes the distributions
  output: average.png # output file name
  plot-energy: false # plots -ln of probabilities
  plot-opts: # various plotting options like font sizes and line width
    name-font-size: 12
    voronoi-col: 0.75
    voronoi-lw: 1
  plot-voronoi: false # true if you want to plot Voronoi centers
  smoothing: 0.5 # the amount of smoothing to apply
This is the block for Average analysis. dimensions is normally set to null, which makes the tool plot all dimensions. If it is set to N, the tool will plot the first N dimensions. first-iter and last-iter are the iterations at which to start and stop the averaging. mapper-iter is the iteration to pull the Voronoi mapper from, if you don’t want the mapper from the final WE iteration. normalize can be used to enable normalization of the probability distributions before plotting. output is the file name for the output and can be set to a png or pdf file. plot-energy takes the -ln of the probabilities before plotting. plot-voronoi controls whether the Voronoi centers are plotted on top of the probability distributions. smoothing can be changed to reduce or increase the Gaussian smoothing applied to the probability distributions.
plot-opts contains some options for plotting. name-font-size is the font size used in plotting. voronoi-col is the color to be used for the Voronoi bin lines, and voronoi-lw is the line width for the same lines.
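To make the plot-energy transform concrete, here is a minimal sketch of taking -ln of a list of probabilities. It is not the tool's plotting code; the function name to_energy is hypothetical, and mapping empty bins (probability 0) to infinity is an assumption about how such bins could be handled.

```python
import math


def to_energy(probabilities):
    """Convert probabilities to -ln(p).

    Hypothetical illustration of the plot-energy option. Bins with zero
    probability are mapped to infinity, which is an assumption here.
    """
    return [-math.log(p) if p > 0 else math.inf for p in probabilities]


# A more probable bin gets a lower value than a less probable one
print(to_energy([0.5, 0.25, 0.0]))
```

Highly populated regions therefore show up as low values and rarely visited regions as high values, which is often easier to read than the raw probabilities.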
Evolution
evolution:
  avg_window: 1 # number of iterations to average for each point in the plot
  dimensions: null # you can limit the tool to the first N dimensions
  enabled: false # this needs to be set to true to run the analysis
  normalize: false # normalizes the distributions
  output: evolution.png # output file name
  plot-energy: false # plots -ln of probabilities
  plot-opts: # various plotting options like font sizes and line width
    name-font-size: 12
This is the block for Evolution analysis. avg_window is the number of iterations to average over for every data point. dimensions is normally set to null, which makes the tool plot all dimensions. If it is set to N, the tool will plot the first N dimensions. normalize can be used to enable normalization of the probability distributions before plotting. output is the file name for the output and can be set to a png or pdf file. plot-opts contains some options for plotting. name-font-size is the font size used in plotting.
Cluster
cluster:
  assignments: null
  cluster-count: 4
  enabled: true
  first-iter: null
  last-iter: null
  metastable-states-file: null
  normalize: null
  states:
  - coords:
    - - 20.0
      - 4.0
    label: a
  - coords:
    - - 4.0
      - 20.0
    label: b
  symmetrize: null
  transition-matrix: null
This is the block for Cluster analysis. assignments is the assignment file to be used for clustering. It can point to an assignment file you generated using w_assign or, if left null, the tool will attempt to generate an assignment file itself. states is where you can define states for w_assign if you want the tool to run it for you. cluster-count is the number of clusters PCCA+ will try to group the data into. first-iter and last-iter are the WE iterations between which data is pulled for clustering.
metastable-states-file is a Python pickle file containing a dictionary that defines which bin is assigned to which metastable state. normalize makes the output text normalized to percentages. symmetrize controls whether the transition matrix is made symmetric. transition-matrix can point to a binary NumPy file if you want to give the tool a custom transition matrix or, if left null, the tool will generate one for you using the assignment file.
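As a rough illustration of what a metastable-states-file could look like, the sketch below pickles a dictionary and reads it back. The exact layout the tool expects is not spelled out here, so the mapping from integer bin index to integer state index, the four example bins, and the file name metastable_states.pkl are all assumptions.

```python
import os
import pickle
import tempfile

# Assumed layout: bin index -> metastable state index (hypothetical values)
bin_to_state = {0: 0, 1: 0, 2: 1, 3: 1}

# Write to a temporary folder so the example leaves the working directory alone
path = os.path.join(tempfile.mkdtemp(), "metastable_states.pkl")
with open(path, "wb") as handle:
    pickle.dump(bin_to_state, handle)

# Read the file back to confirm that it round-trips
with open(path, "rb") as handle:
    loaded = pickle.load(handle)

print(loaded == bin_to_state)  # True
```

The resulting path is what you would put in metastable-states-file, provided the dictionary layout matches what the tool actually reads.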
Network
network:
  enabled: true
  metastable-states-file: null
  pcca-pickle: null
  state-labels: null
This is the block for Network generation. metastable-states-file is a Python pickle file containing a dictionary that defines which bin is assigned to which metastable state. pcca-pickle is the Python pickle object that the cluster analysis generates (or you can use pyGPCCA to generate one yourself). state-labels are the labels you want to use for each cluster generated by Cluster