Hello, and thank you in advance for your time!
I can't get MPI to work with a stereo setup. It looks like the stereo images are not being passed to the right ranks: in the log below, the failing rank reports an unrecognized image file type and prints an empty file name. Has MPI been tested against stereo setups, or is this a problem with my build? Please let me know if you need additional information and I'll include it. Thank you!
I can confirm the following:
- the input files work in serial ('dice -i input.xml' executes normally)
- MPI works and DICe is built with it enabled ('mpi -n 1 dice -i input.xml' executes normally)
- MPI with multiple cores works with a 2D setup
- on the stereo setup, -n 2 through -n 4 all fail the same way (log below)
My setup is as follows:
- version: v3.0-beta.8-patch
- system: Ubuntu 20.04
- I am running this over SSH, in case that matters
** git:
** MPI: enabled
** Data type: double
** Storage type: double
Note that this was built with a Nix flake, which was extremely convenient but could be related to the problem: https://github.com/cadkin/dice/tree/v3.0-beta.8-patch
I've also set full (absolute) paths to the images, so it shouldn't be a relative-path problem.
The output log is as follows:
mpiexec -n 2 dice -i input.xml -v
No protocol specified
No protocol specified
** Digital Image Correlation Engine (DICe)
** git:
** MPI: enabled
** Data type: double
** Storage type: double
** Copyright 2021 National Technology & Engineering Solutions of Sandia, LLC (NTESS)
** Report bugs and feature requests as issues at https://github.com/dicengine/dice
Input Parameters:
subset_file = /dev/shm/scops_dic/00000/roi.dat [unused]
correlation_parameters_file = /dev/shm/scops_dic/00000/param.xml [unused]
camera_system_file = /dev/shm/scops_dic/00000/cam_sys.xml [unused]
output_folder = /dev/shm/scops_dic/00000/output/
image_folder = [unused]
subset_size = 27 [unused]
step_size = 35 [unused]
output_stereo_files = 0 [unused]
reference_image = /dev/shm/scops_dic/00000/img_00000_0.tif [unused]
stereo_reference_image = /dev/shm/scops_dic/00000/img_00000_1.tif [unused]
use_tracklib = 0 [default]
deformed_images ->
/dev/shm/scops_dic/00000/img_00001_0.tif = 1 [unused]
/dev/shm/scops_dic/00000/img_00002_0.tif = 1 [unused]
stereo_deformed_images ->
/dev/shm/scops_dic/00000/img_00001_1.tif = 1 [unused]
/dev/shm/scops_dic/00000/img_00002_1.tif = 1 [unused]
--- Input read successfully ---
User specified correlation Parameters:
interpolation_method = 2 [unused]
sssig_threshold = 10 [unused]
initialization_method = 7 [unused]
cross_initialization_method = 9 [unused]
optimization_method = 1 [unused]
enable_translation = 1 [unused]
enable_rotation = 1 [unused]
enable_normal_strain = 1 [unused]
enable_shear_strain = 0 [unused]
output_delimiter = , [unused]
omit_output_row_id = 1 [unused]
post_process_vsg_strain ->
strain_window_size_in_pixels = 105 [unused]
output_spec ->
COORDINATE_X = 1 [unused]
COORDINATE_Y = 1 [unused]
DISPLACEMENT_X = 1 [unused]
DISPLACEMENT_Y = 1 [unused]
MODEL_COORDINATES_X = 1 [unused]
MODEL_COORDINATES_Y = 1 [unused]
MODEL_COORDINATES_Z = 1 [unused]
MODEL_DISPLACEMENT_X = 1 [unused]
MODEL_DISPLACEMENT_Y = 1 [unused]
MODEL_DISPLACEMENT_Z = 1 [unused]
ROTATION_Z = 1 [unused]
VSG_STRAIN_XX = 1 [unused]
VSG_STRAIN_YY = 1 [unused]
VSG_STRAIN_XY = 1 [unused]
SIGMA = 1 [unused]
GAMMA = 1 [unused]
BETA = 1 [unused]
MATCH = 1 [unused]
STATUS_FLAG = 1 [unused]
UNCERTAINTY = 1 [unused]
--- Correlation parameters read successfully ---
Reference image: /dev/shm/scops_dic/00000/img_00000_0.tif
Deformed image: /dev/shm/scops_dic/00000/img_00001_0.tif
Deformed image: /dev/shm/scops_dic/00000/img_00002_0.tif
--- List of images constructed successfuly ---
Image dimensions: 5496 x 3672
Output will be written to one file per frame with all subsets included
Number of global subsets: 261
Proc 0: subset global id: 0 global coordinates (3421,3281)
Proc 0: subset global id: 1 global coordinates (3421,3246)
Proc 0: subset global id: 2 global coordinates (3421,3211)
Proc 0: subset global id: 3 global coordinates (3421,3176)
Proc 0: subset global id: 4 global coordinates (3421,3141)
Proc 0: subset global id: 5 global coordinates (3421,3106)
Proc 0: subset global id: 6 global coordinates (3421,3071)
Proc 0: subset global id: 7 global coordinates (3421,3036)
Proc 0: subset global id: 8 global coordinates (3421,3001)
Proc 0: subset global id: 9 global coordinates (3421,2966)
...
Proc 0: subset global id: 130 global coordinates (3841,3246)
--- Calibration parameters read successfully ---
Processing cross correlation between left and right images
Error, unrecognized image file type for file:
Error, either the image path is not valid, the file does not exist, or the path contains non-UTF-8 characters (for example chinese characters)
/build/source/src/base/DICe_Image.cpp:145:
Throw number = 1
Throw test that evaluated to true: true
Error, image file read failure
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
mpiexec detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:
Process name: [[42963,1],1]
Exit code: 1
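To rule out the files themselves, the image paths can be checked from every rank independently of DICe. The following is only a minimal sketch under a few assumptions: the script name `check_images.py` is hypothetical, the rank is read from the `OMPI_COMM_WORLD_RANK` (Open MPI) or `PMI_RANK` (MPICH-style launchers) environment variable, and "looks like a TIFF" only means the first four bytes match a TIFF or BigTIFF header. It does not reproduce how DICe reads images.

```python
import os
import sys

# TIFF files start with "II*\0" (little-endian) or "MM\0*" (big-endian).
# BigTIFF uses version 43 ("+") instead of 42 ("*").
TIFF_MAGICS = (b"II*\x00", b"MM\x00*", b"II+\x00", b"MM\x00+")


def rank_label():
    """Return the MPI rank as a string if the launcher exported it, else '?'."""
    # Assumption: Open MPI sets OMPI_COMM_WORLD_RANK, MPICH-style launchers set PMI_RANK.
    for var in ("OMPI_COMM_WORLD_RANK", "PMI_RANK"):
        if var in os.environ:
            return os.environ[var]
    return "?"


def check_image(path):
    """Return a short status string for one image path."""
    if not os.path.isfile(path):
        return "missing"
    try:
        with open(path, "rb") as f:
            header = f.read(4)
    except OSError as exc:
        return f"unreadable ({exc.strerror})"
    return "ok" if header in TIFF_MAGICS else "not a TIFF header"


if __name__ == "__main__":
    # Each rank prints its own view of every path given on the command line.
    for path in sys.argv[1:]:
        print(f"rank {rank_label()}: {path}: {check_image(path)}")
```

Run under the same launcher, for example `mpiexec -n 2 python3 check_images.py /dev/shm/scops_dic/00000/img_00000_0.tif /dev/shm/scops_dic/00000/img_00000_1.tif`. If every rank reports "ok" for both cameras, the files are visible and readable on all ranks, which would point toward how the stereo file names reach the non-root ranks rather than toward the paths.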