HPC meets AI 4 – HPC in data-intensive science

In the previous post I described a typical workflow for data-intensive science. Sizeable IT infrastructure is required to handle the data rates and data volumes along that workflow, from data acquisition and fast feedback through iterative analysis to the archive. The required infrastructure can be provided by high-performance computing (HPC), cloud computing, or a hybrid …

HPC meets AI 3 – Typical workflow for data-intensive science

In the previous post I characterized data-intensive science as research and engineering efforts where the storage, management and analysis of acquired data require special consideration to enable the overall scientific effort. So, what does a typical workflow in data-intensive science look like? The figure below depicts a typical workflow in data-intensive science. The …

HPC meets AI 2 – Data-intensive science

In the previous post I claimed that data-intensive science forces organizations to adopt high-performance computing (HPC). I first heard the phrase “data-intensive science” in 2012 when I worked with a research institution on large-scale file services for scientists. Since then I have heard “data-intensive science” from multiple clients, though in 2019 “data-intensive science” has …

HPC Lab 2 – Getting Started with MPI using a shell script

In the previous post I illustrated how to start MPI programs using the MPI command mpiexec. mpiexec provides each application instance with an execution environment that enables effective communication between all instances. MPI lets you organize application instances in groups and create communication objects for communication between members of the same group (intra-group communicator) …

HPC Lab 1 – Run an MPI Program with mpiexec

Message passing is an integral technique for parallel programs that run on a cluster computer. The Message Passing Interface (MPI) is a standardized and portable specification that is available in many HPC environments. In this post I explain how to start one or more instances of the same program on one or more compute nodes …