RISA
|
This stage transfers the input data element from host to device. More...
#include <H2D.h>
Public Types | |
using | input_type = glados::Image< glados::cuda::HostMemoryManager< unsigned short, glados::cuda::async_copy_policy >> |
The input data type that needs to fit the output type of the previous stage. More... | |
using | output_type = glados::Image< glados::cuda::DeviceMemoryManager< unsigned short, glados::cuda::async_copy_policy >> |
The output data type that needs to fit the input type of the following stage. More... | |
using | deviceManagerType = glados::cuda::DeviceMemoryManager< unsigned short, glados::cuda::async_copy_policy > |
Public Member Functions | |
H2D (const std::string &configFile) | |
Performs all initialization that needs to be done only once. More... | |
~H2D () | |
Destroys everything that is not destroyed automatically. More... | |
auto | process (input_type &&sinogram) -> void |
Pushes the sinogram to the processor-threads. More... | |
auto | wait () -> output_type |
Takes one sinogram from the output queue results_ and passes it to the neighboring stage. More... | |
Private Member Functions | |
auto | processor (int deviceID) -> void |
Main data processing routine of this stage, executed in its own thread for each CUDA device. More... | |
auto | readConfig (const std::string &configFile) -> bool |
Read configuration values from configuration file. More... | |
Private Attributes | |
std::map< int, glados::Queue< input_type > > | sinograms_ |
one separate input queue for each available CUDA device More... | |
glados::Queue< output_type > | results_ |
the output queue in which the processed sinograms are stored More... | |
std::map< int, std::thread > | processorThreads_ |
stores the processor()-threads More... | |
std::map< int, cudaStream_t > | streams_ |
stores the cudaStreams that are created once More... | |
std::map< int, unsigned int > | memoryPoolIdxs_ |
stores the indices received when registering with the MemoryPool More... | |
double | worstCaseTime_ |
stores the worst-case time between the arrival of two consecutive sinograms More... | |
double | bestCaseTime_ |
stores the best-case time between the arrival of two consecutive sinograms More... | |
Timer | tmr_ |
used to measure the timings More... | |
std::size_t | lastIndex_ |
stores the index of the last sinogram. Used to analyze what percentage of the arrived sinograms could be reconstructed More... | |
std::size_t | lostSinos_ |
stores the number of sinograms that could not be reconstructed More... | |
std::size_t | count_ { 0 } |
counts the total number of reconstructed sinograms More... | |
int | lastDevice_ |
stores the device to which the most recently arrived sinogram was sent More... | |
int | numberOfDevices_ |
the number of available CUDA devices in the system More... | |
int | numberOfDetectors_ |
the number of detectors in the fan beam sinogram More... | |
int | numberOfProjections_ |
the number of projections in the fan beam sinogram More... | |
int | memPoolSize_ |
specifies how many elements the memory pool allocates More... | |
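The bookkeeping attributes above (worstCaseTime_, bestCaseTime_, lastIndex_, lostSinos_, count_) could interact roughly as in the following sketch. It is not taken from the RISA sources: the struct ArrivalStats, its member functions, and the rule that a gap in the sinogram index counts as lost sinograms are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>

// Hypothetical bookkeeping that mirrors the roles of the attributes above.
struct ArrivalStats {
    double worstCaseTime = 0.0;                                // longest gap seen so far
    double bestCaseTime = std::numeric_limits<double>::max();  // shortest gap seen so far
    std::size_t lastIndex = 0;
    std::size_t lostSinos = 0;
    std::size_t count = 0;

    // Records one arrived sinogram and the time elapsed since the previous one.
    void record(std::size_t index, double timeSincePrevious) {
        assert(timeSincePrevious >= 0.0);
        if (count > 0) {
            worstCaseTime = std::max(worstCaseTime, timeSincePrevious);
            bestCaseTime = std::min(bestCaseTime, timeSincePrevious);
            // Assumption: indices grow by one per sinogram, so a larger
            // step means the sinograms in between were skipped.
            if (index > lastIndex + 1) lostSinos += index - lastIndex - 1;
        }
        lastIndex = index;
        ++count;
    }

    // Percentage of sinograms that were handled rather than skipped.
    double processedPercentage() const {
        const std::size_t total = count + lostSinos;
        return total == 0 ? 0.0
                          : 100.0 * static_cast<double>(count) / static_cast<double>(total);
    }
};
```

The first recorded sinogram only initializes lastIndex, since there is no previous arrival to measure against.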
This stage transfers the input data element from host to device.
This class transfers the data to be processed from host to device. Furthermore, it performs the scheduling between the available devices. Scheduling is currently static. Heterogeneous systems can be supported by adapting the scheduling so that the more powerful device receives more input data.
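As an illustration of static scheduling, the sketch below shows a plain round-robin choice and a weighted variant for heterogeneous systems. Both functions (nextDeviceRoundRobin and deviceForInput) and the weights vector are hypothetical and not part of the documented interface; the actual scheduling code of this class may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: static round-robin device selection.
// Returns the device that follows lastDevice, wrapping around.
inline int nextDeviceRoundRobin(int lastDevice, int numberOfDevices) {
    assert(numberOfDevices > 0);
    return (lastDevice + 1) % numberOfDevices;
}

// Hypothetical weighted variant: out of every sum(weights) inputs,
// device i receives weights[i] consecutive inputs.
inline int deviceForInput(std::size_t inputIndex, const std::vector<int>& weights) {
    const int total = std::accumulate(weights.begin(), weights.end(), 0);
    assert(total > 0);
    int slot = static_cast<int>(inputIndex % static_cast<std::size_t>(total));
    for (std::size_t device = 0; device < weights.size(); ++device) {
        if (slot < weights[device]) return static_cast<int>(device);
        slot -= weights[device];
    }
    return 0;  // not reached when total > 0
}
```

With weights {2, 1}, device 0 would receive two of every three inputs, which matches the idea of sending more data to the more powerful device.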
using risa::cuda::H2D::deviceManagerType = glados::cuda::DeviceMemoryManager<unsigned short, glados::cuda::async_copy_policy> |
using risa::cuda::H2D::input_type = glados::Image<glados::cuda::HostMemoryManager<unsigned short, glados::cuda::async_copy_policy>> |
using risa::cuda::H2D::output_type = glados::Image<glados::cuda::DeviceMemoryManager<unsigned short, glados::cuda::async_copy_policy>> |
risa::cuda::H2D::H2D | ( | const std::string & | configFile | ) |
risa::cuda::H2D::~H2D | ( | ) |
auto risa::cuda::H2D::process | ( | input_type && | sinogram | ) | -> void |
auto risa::cuda::H2D::processor | ( | int | deviceID | ) | -> void |
private |
Main data processing routine of this stage, executed in its own thread for each CUDA device.
This method takes one sinogram from the input queue sinograms_. The sinogram is transferred to the device using the asynchronous cudaMemcpyAsync() operation. The resulting device structure is pushed into the output queue results_.
[in] | deviceID | specifies on which CUDA device to execute the device functions |
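The take, copy, push loop described above can be sketched in plain C++ so that it runs without a GPU. Everything here is an assumption made for illustration: BlockingQueue is a minimal stand-in for glados::Queue (whose real interface may differ), a std::vector stands in for device memory, std::copy takes the place of cudaMemcpyAsync(), and an empty sinogram serves as a hypothetical end-of-stream marker.

```cpp
#include <algorithm>
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Minimal blocking queue; a stand-in for glados::Queue.
template <class T>
class BlockingQueue {
public:
    void push(T item) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            items_.push(std::move(item));
        }
        ready_.notify_one();
    }
    // Blocks until an item is available, then removes and returns it.
    T take() {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return !items_.empty(); });
        T item = std::move(items_.front());
        items_.pop();
        return item;
    }
private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<T> items_;
};

using HostSinogram = std::vector<std::uint16_t>;
using DeviceSinogram = std::vector<std::uint16_t>;  // stand-in for device memory

// Worker loop for one device. An empty sinogram ends the loop (assumption).
void processorSketch(BlockingQueue<HostSinogram>& input,
                     BlockingQueue<DeviceSinogram>& results) {
    for (;;) {
        HostSinogram sinogram = input.take();
        if (sinogram.empty()) {
            results.push(DeviceSinogram{});  // forward the end-of-stream marker
            return;
        }
        DeviceSinogram onDevice(sinogram.size());
        // The real stage would issue cudaMemcpyAsync() on the device's stream here.
        std::copy(sinogram.begin(), sinogram.end(), onDevice.begin());
        results.push(std::move(onDevice));
    }
}

// Runs one worker thread over a single sinogram and returns the copied result.
DeviceSinogram runOnce(HostSinogram sinogram) {
    assert(!sinogram.empty());  // empty is reserved as the end marker
    BlockingQueue<HostSinogram> input;
    BlockingQueue<DeviceSinogram> results;
    std::thread worker(processorSketch, std::ref(input), std::ref(results));
    input.push(std::move(sinogram));
    input.push(HostSinogram{});  // ask the worker to stop
    DeviceSinogram out = results.take();
    worker.join();
    return out;
}
```

In the documented class there is one such input queue and one worker thread per CUDA device, while all workers share the single output queue results_.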
auto risa::cuda::H2D::readConfig | ( | const std::string & | configFile | ) | -> bool |
private |
Read configuration values from configuration file.
All values needed for setting up the class are read from the config file in this function.
[in] | configFile | path to config file |
true | configuration options were read successfully |
false | configuration options could not be read successfully |
auto risa::cuda::H2D::wait | ( | ) | -> output_type |