Typical workflow
The spring module can be used to perform reconstructions in a standard sequential way, as described in the example Performing a reconstruction. It is, however, possible (and recommended) to use asynchronous execution via the spring.MPR.runasync() method, to take full advantage of running in an interactive Jupyter notebook, as presented in the example Running a reconstruction asynchronously. In this section we go into more detail on this mode, with some insights into the internal structure of the spring module.
Module import
In the following examples, it is assumed that the following objects, i.e. spring.Pattern, spring.Settings and spring.MPR, are imported.
from spring import Pattern, Settings, MPR
The control of asynchronous execution is achieved by importing the spring.hypervisor object.
from spring import hypervisor as hv
It is important to note that spring.hypervisor is not a type/class: it is already an instance of spring._Hypervisor.
Preparing data
The data-loading step aims at creating a spring.Pattern object that contains the experimental diffraction pattern. The usage is rather simple, as reported in Setting up a diffraction pattern. In a typical workflow, it is convenient to write a function that loads the data from the desired file, which can be in different data formats, and returns a spring.Pattern object ready for use.
The folder examples/data contains some exemplary diffraction patterns with a very simple structure. A function to load this type of data can be implemented as follows:
import numpy as np

def load_data(filename):
    data = np.load(filename)  # load raw data
    # create an instance of the Pattern object
    pattern = Pattern(pattern=data['pattern'],  # diffraction data
                      mask=data['mask'],      # mask matrix, with values 0 or 1; pixels where mask==1 are excluded from the analysis
                      center=data['center'],  # coordinates of the center
                      pid=data['pid'],        # pulse id
                      cropsize=1024,          # size to which the pattern is cropped
                      size=256,               # size to which the cropped pattern is rescaled
                      satvalue=3e4)           # pixel value above which pixels are considered saturated (especially relevant for pnCCD data)
    return pattern
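For reference, a file with the fields that load_data expects could be generated with numpy's savez; the shapes and values below are invented for illustration only:

```python
import numpy as np

# Write a synthetic .npz file with the same field names read by load_data.
# All shapes and values here are placeholders, not real experimental data.
rng = np.random.default_rng(0)
side = 1024
np.savez("synthetic_example.npz",
         pattern=rng.integers(0, 1000, size=(side, side)),  # fake diffraction image
         mask=np.zeros((side, side), dtype=np.uint8),       # 0 = used, 1 = excluded
         center=np.array([side / 2, side / 2]),             # center coordinates
         pid=np.array(42))                                  # fake pulse id

data = np.load("synthetic_example.npz")
print(sorted(data.files))  # ['center', 'mask', 'pattern', 'pid']
```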
The implementation of a data loader function is specific to the data format. The data subfolder in the examples contains the following files:
import os
print(*sorted(os.listdir("data")))
example_A_1.npz example_A_2.npz example_A_3.npz example_A_4.npz example_B_1.npz example_B_2.npz example_B_3.npz example_B_4.npz
The example patterns can now be quickly loaded via the custom load_data function, and the pattern displayed with:
pattern = load_data("data/example_B_2.npz")
pattern.plot()
The algorithm settings should also be defined. For more details, see the settings example and the reference documentation of the spring.Settings class.
settings = Settings()
Live status of the process
The methods hypervisor.livelog (see spring._Hypervisor.livelog()) and hypervisor.liveplot (see spring._Hypervisor.liveplot()) allow one to get live information about the process. When run in a specific IPython cell, they update the output of that cell over time. For the liveplot usage, see this example.
The livelog output before running any reconstruction looks like the following:
hv.livelog()
- - - - - - - - - - -
IDLE
When a new reconstruction process is launched with MPR(pattern=pattern, settings=settings).runasync(), the output of the livelog is updated and the log of the MPR process is shown. The status changes from IDLE to RUNNING, accompanied by the pulse id (pid) of the pattern under analysis and a progress bar that reports the progress of the reconstruction steps.
Selected GPUs:
- NVIDIA GeForce RTX 3090 (Id: 0, load: 0%)
Initializing solver... Done.
Initializing algorithms... Done.
Initializing population... Done.
... Running main loop ...
- - - - - - - - - - - - - - - - - - - - -
RUNNING 19-7-14541946795 [###-------] 34%
Hint
The livelog and liveplot functions update their output from time to time in the cell they were called from. This may lead to some jumps in the visualization. Furthermore, the user typically wants their output to be always visible, independently of the position in the .ipynb file. In JupyterLab, it is possible to create a dedicated view of a specific cell output by right-clicking on the cell and selecting “Create New View for Cell Output”. This creates a new page that displays only that cell output, which can be tiled to the side so that it is always visible. See this example.
It is then possible to hide the output of the live cell in the main document view by clicking on the blue bar at the side of the cell output.
Handling multiple reconstructions
In a typical workflow, the user wants to perform more than a single reconstruction. Several reconstructions can be achieved by multiple calls to the spring.MPR.runasync() method.
For example, suppose a reconstruction has already been started using data loaded from the examples/data/example_A_1.npz file:
MPR(pattern=load_data("data/example_A_1.npz"), settings=settings).runasync()
The cell where hv.livelog() was started will show something similar to:
Selected GPUs:
- NVIDIA GeForce RTX 3090 (Id: 0, load: 0%)
Initializing solver... Done.
Initializing algorithms... Done.
Initializing population... Done.
... Running main loop ...
- - - - - - - - - - - - - - - - - - - - -
RUNNING 197-10-14558540587 [#---------] 9%
When a new reconstruction is submitted, for example on example_A_2 with the command
MPR(pattern=load_data("data/example_A_2.npz"), settings=settings).runasync()
the hv.livelog() will still show the same output about the running reconstruction, but with the additional information that one reconstruction has been queued:
Selected GPUs:
- NVIDIA GeForce RTX 3090 (Id: 0, load: 0%)
Initializing solver... Done.
Initializing algorithms... Done.
Initializing population... Done.
... Running main loop ...
- - - - - - - - - - - - - - - - - - - - -
RUNNING 197-10-14558540587 [###-------] 26% (1 queued)
The queued job will be executed as soon as the currently running one is completed. There is no restriction on the number of reconstructions that can be submitted, i.e. on the length of the execution queue (apart from the limit set by the total installed memory of the system).
For example, the execution of
for examplename in ["A_1", "A_2", "A_3", "A_4"]:
    MPR(pattern=load_data("data/example_" + examplename + ".npz"), settings=settings).runasync()
will make the livelog update to:
Selected GPUs:
- NVIDIA GeForce RTX 3090 (Id: 0, load: 0%)
Initializing solver... Done.
Initializing algorithms... Done.
Initializing population... Done.
... Running main loop ...
- - - - - - - - - - - - - - - - - - - - -
RUNNING 197-10-14558540587 [#---------] 10% (3 queued)
while the execution of hv.info() will provide the following output:
Running: 197-10-14558540587
Execution queue (3 total):
0: 250-23-14565026066
1: 71-1-14546865407
2: 19-7-14541946795
The call to spring.MPR.runasync() (which is a wrapper around spring._Hypervisor.append()) puts the reconstruction at the back of the execution queue. A reconstruction process can be put at the front of the execution queue with the spring._Hypervisor.prepend() method, i.e.
hv.prepend(MPR(pattern=load_data("data/example_B_1.npz"), settings=settings))
This will modify the output of hv.info() to:
Running: 197-10-14558540587
Execution queue (4 total):
0: 190-1769647347
1: 250-23-14565026066
2: 71-1-14546865407
3: 19-7-14541946795
The execution of the current reconstruction can be interrupted via hv.stop(). In this case, the next reconstruction in the queue is automatically started.
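The queue behavior described in this section (runasync appending to the back, prepend putting a job at the front, the next job starting once the current one finishes or is stopped) can be mimicked with a plain Python deque. This is only a simplified model for illustration, not the actual hypervisor code:

```python
from collections import deque

# Simplified model of the hypervisor's execution queue;
# job names mimic the pid-based identifiers used in the text.
queue = deque()

def append(job):
    """Back of the queue, like MPR.runasync()."""
    queue.append(job)

def prepend(job):
    """Front of the queue, like hv.prepend()."""
    queue.appendleft(job)

def next_job():
    """The job that starts once the running one finishes or is stopped."""
    return queue.popleft() if queue else None

append("250-23-14565026066")
append("71-1-14546865407")
prepend("190-1769647347")     # jumps to the front of the queue

print(list(queue))   # ['190-1769647347', '250-23-14565026066', '71-1-14546865407']
print(next_job())    # '190-1769647347'
```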
Tagging reconstructions
By default, reconstruction processes are identified by concatenating their pid numbers (e.g. 190-1769647347 or 250-23-14565026066). However, it is sometimes necessary to perform multiple reconstruction procedures on the same diffraction data, for example to test the outcome as a function of different values of the spring.Settings. This can be achieved via the tag argument when creating the instance of spring.MPR. This tag, if different from an empty string, is appended to the reconstruction name (and to the filename of the mpr.Result).
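As a small illustration of the naming scheme just described, a hypothetical helper (the actual naming is done internally by spring) could look like:

```python
def make_name(pid_numbers, tag=""):
    """Hypothetical helper mirroring the naming scheme described above:
    pid numbers joined by '-', with a non-empty tag appended."""
    name = "-".join(str(p) for p in pid_numbers)
    if tag:
        name += "-" + tag
    return name

print(make_name([197, 10, 14558540587]))                     # '197-10-14558540587'
print(make_name([197, 10, 14558540587], tag="suppsize:80"))  # '197-10-14558540587-suppsize:80'
```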
In the following example, four different patterns are analyzed, each tested with three different starting support sizes, for a total of 12 processes:
for examplename in ["A_1", "A_2", "A_3", "A_4"]:
    for ss in [80, 100, 120]:
        sett = Settings().set('init', 'supportsize', ss)
        MPR(pattern=load_data("data/example_" + examplename + ".npz"), settings=sett, tag="suppsize:" + str(ss)).runasync()
The resulting queue is the following:
hv.info()
Running: 197-10-14558540587-suppsize:80
Execution queue (11 total):
0: 197-10-14558540587-suppsize:100
1: 197-10-14558540587-suppsize:120
2: 250-23-14565026066-suppsize:80
3: 250-23-14565026066-suppsize:100
4: 250-23-14565026066-suppsize:120
5: 71-1-14546865407-suppsize:80
6: 71-1-14546865407-suppsize:100
7: 71-1-14546865407-suppsize:120
8: 19-7-14541946795-suppsize:80
9: 19-7-14541946795-suppsize:100
10: 19-7-14541946795-suppsize:120
Removing jobs from the execution queue
The spring.hypervisor object allows for further simple operations on the execution queue; in particular, it allows for the removal of submitted reconstructions via the method spring._Hypervisor.remove().
The hv.remove(item) method accepts two different types for item. If item is an integer i, the element at position i of the queue reported by hv.info() is removed. Please note that the item position in the queue may change while the command is being typed.
If item is a string, the full name of the job has to be given, e.g. item='71-1-14546865407-suppsize:120'. This is safer, as it is independent of the position in the queue.
Note
The hv.remove() method only works on processes sitting in the execution queue. To cancel the running process, hv.stop() (see spring._Hypervisor.stop()) can be used.
The string given as the item parameter allows for the use of simple wildcards (the semantics follow the fnmatch module of the Python standard library).
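The wildcard semantics can be verified directly against Python's standard fnmatch module, using job names taken from this section:

```python
from fnmatch import fnmatch

# hv.remove() wildcard patterns follow fnmatch semantics:
# '*' matches any substring, '?' matches a single character.
jobs = ["197-10-14558540587-suppsize:80",
        "71-1-14546865407-suppsize:100",
        "71-1-14546865407-suppsize:120"]

print([j for j in jobs if fnmatch(j, "71-*")])
# ['71-1-14546865407-suppsize:100', '71-1-14546865407-suppsize:120']
print([j for j in jobs if fnmatch(j, "*suppsize:100")])
# ['71-1-14546865407-suppsize:100']
```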
For example, it is possible to remove multiple items from the previous example (four patterns with three different configurations each) using wildcards as follows:
# remove all jobs whose pid starts with 71-
# (as the first number in the pid list is the run number,
# this is like saying "remove all reconstructions from Run 71")
hv.remove("71-*")
Removing 71-1-14546865407-suppsize:80 from queue
Removing 71-1-14546865407-suppsize:100 from queue
Removing 71-1-14546865407-suppsize:120 from queue
# Remove all jobs whose name ends with "suppsize:100"
hv.remove("*suppsize:100")
Removing 197-10-14558540587-suppsize:100 from queue
Removing 250-23-14565026066-suppsize:100 from queue
Removing 19-7-14541946795-suppsize:100 from queue
After these operations, the output of hv.info() looks like:
Running: 197-10-14558540587-suppsize:80
Execution queue (5 total):
0: 197-10-14558540587-suppsize:120
1: 250-23-14565026066-suppsize:80
2: 250-23-14565026066-suppsize:120
3: 19-7-14541946795-suppsize:80
4: 19-7-14541946795-suppsize:120
It is possible to kill all the queued jobs and the running one with:
hv.kill()
# Equivalent to:
# hv.remove("*")
# hv.stop()
Removing 197-10-14558540587 from queue
Removing 250-23-14565026066 from queue
Removing 71-1-14546865407 from queue
Removing 19-7-14541946795 from queue
Terminating running MPR process... Done