There are three classes of components: framework, driver, and general purpose (physics components fall into this category). In the IPS, each component executes in a separate process (a child of the framework) and implements a standard set of methods (chiefly init, step, and finalize) that are invoked over the course of a run.
The component writer uses the services API to perform the data, task, configuration, and event management activities needed to implement these methods.
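As a rough sketch (the class name and method bodies here are illustrative, and the exact import path for the Component base class depends on your IPS installation), a component is a Python class that provides these methods:

from component import Component

class MyComponent(Component):
    def __init__(self, services, config):
        Component.__init__(self, services, config)

    def init(self, timeStamp=0.0):
        # Set up the work directory, stage input files, initialize state.
        self.services.info('MyComponent.init() called')

    def step(self, timeStamp=0.0):
        # Prepare inputs, launch the physics task, process outputs.
        self.services.info('MyComponent.step() called')

    def finalize(self, timeStamp=0.0):
        # Clean up at the end of the simulation.
        self.services.info('MyComponent.finalize() called')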
This document focuses on helping (physics) component and driver writers successfully write new components. It will take the writer step-by-step through the process of writing basic components. Detailed discussions of multiple levels of parallelism, fault tolerance strategies, performance and resource utilization considerations, and asynchronous coordination of simulations can be found in the advanced topics documentation.
[1] Tasks are the binaries that are launched by components on compute nodes, whereas components are Python scripts that manage the data movement and execution of those tasks (with the help of IPS services). In general, the component is aware of the driver and of its existence within a coupled simulation, while the task is not.
[2] The IPS uses an agreed-upon file format and associated library, called the Plasma State, to manage global (shared) data for the simulation. It is made up of a set of netCDF files with a defined layout so that codes can access and share the data. At the beginning of each step the component gets a local copy of the current plasma state, executes based on those values, and then updates the values it changed in the global copy. There are data management services to perform these actions; see the Data Management API.
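In a component, that per-step pattern boils down to a pair of service calls around the task execution (a sketch only; check the Data Management API for the exact call names available in your IPS version):

# Inside a component's step() method:
self.services.stage_plasma_state()     # get a local copy of the shared state files
# ... prepare inputs, run the physics task, process outputs ...
self.services.update_plasma_state()    # merge changed values back into the global copy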
In this section, we take you through the steps of adding a new component to the IPS landscape: where to put source code, scripts, binaries, and inputs; how to construct the component; how to add the component to the IPS build system; and some tips to make this process smoother.
The location of the binary does not technically matter to the framework, as long as the component can construct the path and the permissions allow the binary to be launched when the time comes. There are two recommended ways to express the location of the binary to the component: build or install it into ${IPS_ROOT}/bin (the component's BIN_PATH) via the Makefile, or add a component-specific entry to the configuration file that gives the full path to a pre-built binary.
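In the component itself, either choice is only a line or two. In this sketch the binary and entry names are hypothetical:

import os

# Option 1: the binary is installed into BIN_PATH (usually ${IPS_ROOT}/bin) by the Makefile.
mycode_bin = os.path.join(self.BIN_PATH, 'mycode')

# Option 2: the full path is given as a component-specific configuration entry,
# e.g. MYCODE_BIN = /project/codes/mycode/bin/mycode in the component's section.
mycode_bin = self.MYCODE_BIN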
Once you have your binary built and available, it is time to work on the data coupling to the other components in a simulation. This is a component-specific task, but it often requires conversation with the other physicists in the group to determine which values need to be communicated and to develop an understanding of how they are used.
When the physics of interest is identified, adapters need to be written to translate IPS-style inputs (from the Plasma State) to the inputs the binary is expecting, and a similar adapter for the output files. More details on how to use the Plasma State and adapting binaries can be found in the Plasma State Guide.
Now it is time to start writing the component. At this point you should have an idea of how the component will fit into a coupled simulation and the types of activities that will need to happen during the init, step, and finalize phases of execution.
While writing your component, be sure to use try...except blocks [3] to catch problems and the services logging mechanisms to report critical errors, warnings, info, and debug messages. It is strongly recommended that you use exceptions and the services logging capability for debugging and output. If exceptions are not caught in the component, the driver or framework will catch them at some unexpected point, and it will likely take a long time to track down where the problem actually occurred. The logging mechanism in the IPS time stamps each event, identifies the component that produced the message, and provides a convenient way to format the message information. These messages are written to the log file (specified in the configuration file for the simulation) atomically, unlike normal print statements. Absolute ordering is not guaranteed across different components, but ordering within the same component is guaranteed. See the Logging API for more information on when to use the different logging levels.
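A sketch of this pattern inside a component's step() method follows; the task-launch and logging calls are from the services API, while the binary name is hypothetical:

import os

try:
    cwd = self.services.get_working_dir()
    binary = os.path.join(self.BIN_PATH, 'mycode')   # hypothetical binary name
    task_id = self.services.launch_task(self.NPROC, cwd, binary,
                                        logfile='mycode.log')
    if self.services.wait_task(task_id) != 0:
        self.services.error('mycode task failed')
        raise RuntimeError('mycode task failed')
except Exception:
    # Log the full traceback to the simulation log file, then re-raise so the
    # failure is visible to the driver at the point where it actually occurred.
    self.services.exception('error in step() of my component')
    raise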
At this point, it might be a good idea to start the documentation of the component in ips/doc/component_guides/. You will find a README file in ips/doc/ that explains how to build and write IPS documentation, and another in the ips/doc/component_guides/ on what information to include in your component documentation.
[3] Tutorial on exceptions
Once you are satisfied with the implementation of the component, it is time to construct and edit the Makefiles such that the component is built properly by the framework. The Makefile will build your executables and move scripts to ${IPS_ROOT}/bin.
If you do not already have a makefile in the directory for your new component, copy the examples (ips/doc/examples/Makefile and ips/doc/examples/Makefile.include) to your component directory.
List all executables to be compiled in EXECUTABLES and all scripts in SCRIPTS, for example:
EXECUTABLES = do_toric_init prepare_toric_input process_toric_output \
process_toric_output_mcmd # Ptoric.e Storic.e
SCRIPTS = rf_ic_toric.py rf_ic_toric_mcmd.py
TARGETS = $(EXECUTABLES)
Write a make target for each executable. Do not remove the targets all, install, clean, distclean, and .depend.
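For example, a target for one of the executables above might look like the following sketch. The compiler and flag variables (F90, F90FLAGS, LIBS) are placeholders for whatever Makefile.include and makeconfig.local define in your tree:

prepare_toric_input: prepare_toric_input.f90
	$(F90) $(F90FLAGS) -o $@ prepare_toric_input.f90 $(LIBS)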
Add any libraries that are needed to ips/config/makeconfig.local. (This is where LIBS and the various fortran flags are defined.)
Add component to top-level Makefile. Toric example:
TORIC_COMP_DIR=components/rf/toric/src
TORIC_COMP=.TORIC
Add the component directory to COMPONENTS_DIRS:
COMPONENTS_DIRS=$(AORSA_COMP_DIR) \
$(TORIC_COMP_DIR) \
$(BERRY_INIT_COMP_DIR) \
$(CHANGE_POWER_COMP_DIR) \
$(BERRY_CQL3D_INIT_COMP_DIR) \
$(CQL3D_COMP_DIR) \
$(ELWASIF_DRIVER_COMP_DIR) \
...
Add component to COMPONENTS:
COMPONENTS=$(AORSA_COMP) \
$(TORIC_COMP) \
$(BERRY_AORSA_INIT_COMP) \
$(BERRY_CQL3D_INIT_COMP) \
$(CHANGE_POWER_COMP) \
$(CQL3D_COMP) \
$(BERRY_INIT_COMP) \
$(ELWASIF_DRIVER_COMP) \
...
Now you should be able to build the IPS with your new component.
Now it is time to construct a simulation to test your new component. There are two ways to test a new component. The first is to have the IPS run just that single component without a driver, by specifying your component as the driver. The second is to plug it into an existing driver. The former tests only the task launching and data movement capabilities; the latter can also test the data coupling and the call interface to the component. This section describes how to test your component using an existing driver (including how to add the new component to the driver).
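For the first (standalone) approach, the ports section of a test configuration might look like the sketch below, where MY_COMPONENT stands for the label of your component's description section:

[PORTS]
NAMES = INIT DRIVER
[[INIT]]
IMPLEMENTATION = minimal_state_init
[[DRIVER]]
IMPLEMENTATION = MY_COMPONENT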
As you can see in the example component, almost everything is specified in the configuration file and read at run time. This means that the configuration of components is vitally important to their success or failure. The entries in the component's configuration section are made available to the component automatically, so a component can access them as self.<entry_name>. This is useful in many cases; the example component uses self.NPROC and self.BIN_PATH this way. Global configuration parameters can also be accessed using the services call get_config_param(<param_name>) (see the API).
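For example (a sketch; SIM_ROOT is one of the global parameters defined at the top of the configuration file):

# Entries from the component's own configuration section become attributes:
nproc = self.NPROC
bin_path = self.BIN_PATH
# Global (top-level) configuration parameters are fetched through the services API:
sim_root = self.services.get_config_param('SIM_ROOT')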
Drivers access components by their port names (as specified in the configuration file). To add a new component to the driver, you will either need to add a new port name or use an existing one. ips/components/drivers/dbb/generic_driver.py is a good all-purpose driver that most components should be able to use. If you are using an existing port name, the code should just work, though it is recommended to read through the driver code to make sure the component is being used in the expected manner. To add a new port name, you will need to add code to generic_driver.step():
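A sketch of what that addition might look like, assuming a new port named NEW_PORT and the standard method-invocation calls (get_port and call) from the services API:

# Obtain the new component by its port name (as listed in the [PORTS] section).
new_comp = self.services.get_port('NEW_PORT')

# Call it alongside the other components: once before the time loop ...
self.services.call(new_comp, 'init', t)
# ... once per iteration inside the time loop ...
self.services.call(new_comp, 'step', t)
# ... and once after the loop completes.
self.services.call(new_comp, 'finalize', t)

You will also need to add NEW_PORT to the NAMES field of the [PORTS] section and give it a corresponding subsection and component description, as described below.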
The following sections of the configuration file may need to be modified. If you are not adding the component to an existing simulation, you can copy a configuration file from the examples directory and modify it.
Plasma State (Shared Files) Section
You will need to modify this section to include any additional files needed by your component:
# Where to put plasma state files as the simulation evolves
PLASMA_STATE_WORK_DIR = ${SIM_ROOT}/work/plasma_state
CURRENT_STATE = ${SIM_NAME}_ps.cdf
PRIOR_STATE = ${SIM_NAME}_psp.cdf
NEXT_STATE = ${SIM_NAME}_psn.cdf
CURRENT_EQDSK = ${SIM_NAME}_ps.geq
CURRENT_CQL = ${SIM_NAME}_ps_CQL.nc
CURRENT_DQL = ${SIM_NAME}_ps_DQL.nc
CURRENT_JSDSK = ${RUN_ID}_ps.jso
# What files constitute the plasma state
PLASMA_STATE_FILES1 = ${CURRENT_STATE} ${PRIOR_STATE} ${NEXT_STATE}
PLASMA_STATE_FILES2 = ${PLASMA_STATE_FILES1} ${CURRENT_EQDSK} ${CURRENT_CQL} ${CURRENT_DQL}
PLASMA_STATE_FILES = ${PLASMA_STATE_FILES2} ${CURRENT_JSDSK}
Ports Section
You will need to add the component to the ports section so that it can be properly detected by the framework and driver. An entry for DRIVER must be specified; otherwise the framework will abort. A warning is produced if there is no INIT component. Note that every component added to the NAMES field must have a corresponding subsection.
[PORTS]
NAMES = INIT DRIVER MONITOR EPA RF_IC NB
[[DRIVER]]
IMPLEMENTATION = EPA_IC_FP_NB_DRIVER
[[INIT]]
IMPLEMENTATION = minimal_state_init
[[RF_IC]]
IMPLEMENTATION = model_RF_IC
...
Component Description Section
The ports section just defines which components are going to be used in this simulation and points to the sections where they are described. The component description sections are where those definitions take place:
[TSC]
CLASS = epa
SUB_CLASS =
NAME = tsc
NPROC = 1
BIN_PATH = ${IPS_ROOT}/bin
INPUT_DIR = ${IPS_ROOT}/components/epa/tsc
INPUT_FILES = inputa.I09001 sprsina.I09001 config_nbi_ITER.dat
OUTPUT_FILES = outputa tsc.cgm inputa log.tsc ${PLASMA_STATE_FILES}
SCRIPT = ${BIN_PATH}/epa_nb_iter.py
The component section starts with a label that matches what is listed as the IMPLEMENTATION in the ports section. These must match, or the framework will not find your component and the simulation will fail before it starts (or worse, use the wrong implementation!). CLASS and SUB_CLASS typically refer to the directory hierarchy and are sometimes used to identify the location of the source code and input files. Note that NAME must match the Python class name that implements the component. NPROC is the number of processes to use when the binary is launched on the compute nodes. BIN_PATH will almost always be ${IPS_ROOT}/bin and refers to the location of any binaries you wish to use in your component. The Makefile will move your component script to ${IPS_ROOT}/bin when you build the IPS, and should do the same for any binaries produced by the targets in the Makefile. If you have pre-built binaries that live in another location, an additional entry in the component description section is a convenient place to record that path. INPUT_DIR, INPUT_FILES, and OUTPUT_FILES specify the location and names of the input and output files, respectively. If the component requires only a subset of the plasma state files, that subset is specified here (PLASMA_STATE_FILES); if the entry is omitted, all of the plasma state files are used. Listing only the needed files prevents the full set from being copied to and from the component's work directory on every step, saving time and space. Lastly, SCRIPT is the Python script that contains the component code, specifically the Python class given in NAME. Any component-specific values may also be specified here to control logic or set data values that change often.
Time Loop Section
This may need to be modified for your component or the driver that uses your new component. During testing, a small number of steps is appropriate.
# Time loop specification (two modes for now) EXPLICIT | REGULAR
# For MODE = REGULAR, the framework uses the variables START, FINISH, and NSTEP
# For MODE = EXPLICIT, the framework uses the variable VALUES (space separated
# list of time values)
[TIME_LOOP]
MODE = EXPLICIT
VALUES = 75.000 75.025 75.050 75.075 75.100 75.125
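For comparison, and as suggested by the comments above, a REGULAR-mode specification might look like the following (the values are illustrative and assume the same time interval):

[TIME_LOOP]
MODE = REGULAR
START = 75.000
FINISH = 75.125
NSTEP = 5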
This section contains some useful tips on testing, debugging and documenting your new component.
The driver of the simulation manages the control flow and synchronization across components, via time stepping or implicit means, thus orchestrating the simulation. There is only one driver per simulation; it is invoked by the framework and is responsible for invoking the components that make up the simulation scenario it implements. It is also responsible for managing data at the simulation level, including checkpoint and restart activities.
Before writing a driver, it is a good idea to have the components already written. Once the components to be used are chosen, the data coupling and control flow must be addressed.
In order to couple components, the data that must be exchanged between them and the ordering of updates to the plasma state must be determined. Once the data dependencies are identified (which components have to run before others, and which can run at the same time), you can write the body of the driver. Before going through the steps of writing a driver, review the method invocation API and plan which methods to use during the main time loop. If you are writing a driver that uses the event service for synchronization, see Advanced Features for instructions and examples.
The framework will invoke the methods of the INIT and DRIVER components over the course of the simulation, defining the execution of the run: the INIT component's init, step, and finalize methods are called first to set up the simulation state, and then the driver's init, step, and finalize methods are called, with the driver's step method containing the main time loop that invokes the other components.
It is recommended that you start with ips/components/drivers/dbb/generic_driver.py and modify it as needed. You will most likely be changing how the components are called in the main loop (the generic driver calls each component in sequence), the pre-step logic phase, and which ports are used. The data management and checkpointing calls should remain unchanged, as their behavior is controlled through the configuration file.
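The overall shape of a driver's step() method, modeled loosely on generic_driver, might look like the sketch below. The port names (EPA, RF_IC) are illustrative, and the service calls shown (get_port, call, get_time_loop, checkpoint_components) should be checked against the method invocation API for your IPS version:

def step(self, timeStamp=0.0):
    # Look up the components (ports) named in the configuration file.
    epa = self.services.get_port('EPA')
    rf = self.services.get_port('RF_IC')

    # One-time initialization of each component.
    self.services.call(epa, 'init', timeStamp)
    self.services.call(rf, 'init', timeStamp)

    # Main time loop, driven by the TIME_LOOP section of the configuration file.
    for t in self.services.get_time_loop():
        # Pre-step logic would go here, followed by each component in sequence.
        self.services.call(epa, 'step', t)
        self.services.call(rf, 'step', t)
        # Checkpoint according to the settings in the configuration file.
        self.services.checkpoint_components([epa, rf], t)

    # Shut the components down cleanly.
    self.services.call(epa, 'finalize', timeStamp)
    self.services.call(rf, 'finalize', timeStamp)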
The process for adding a new driver to the IPS is the same as that for the component. See the appropriate sections above for adding a component.