With the volume of Remote Sensing (RS) and Earth Observation (EO) data growing steadily, existing and future workflows often must be scaled up beyond the computational and storage resources available in workstations. In this regard, solutions using high-throughput and high-performance computing (HTC/HPC) systems, as an alternative to cloud-based solutions, are relevant to the academic community. Offering full control over available hardware, software, and data, these systems are well suited to the needs of researchers and can readily support the migration of existing workflows. Furthermore, they are generally available through national infrastructure providers on a merit-driven, no-cost basis.
This workshop, organized by the Netherlands eScience Center, will cover the basic tenets of using large academic computing resources and introduce participants to a Dask-based ecosystem. It will familiarize them with the Remote Sensing Deployable Analysis environmenT (RS-DAT) framework for scaling EO and RS data analysis on HTC/HPC systems and associated storage resources. The session will cover the tools for data access, retrieval, and storage, and demonstrate the scaling up of processing and analysis workflows focused on EO datasets. Participants will perform hands-on research using the RS-DAT framework on an HTC/HPC system.
Through this workshop, participants will learn how to:
- Set up and run analyses in a Jupyter environment on a SLURM-based supercomputer.
- Organize and retrieve geospatial datasets using tools from the SpatioTemporal Asset Catalog (STAC) ecosystem.
- Scale up computations using the Dask framework.
Prerequisites
- Basic understanding of the Unix command line and shell, the Python programming language, and the geospatial Python ecosystem*
- Affinity with high-performance computing
- Interest in scaling EO workflows
* We strongly advise participants to familiarize themselves with the contents of the (nascent) Carpentries Incubator lesson "Introduction to Geospatial Raster and Vector Data with Python".
Date
15 May 2023 (One-day training)
Venue
ITC Langezijds Building (ITC), Room 2105
Hallenweg 8, 7522 NH, Enschede
Registration
The workshop is open to UT staff and students. Priority will be given to ITC staff and students.
Registration is required to attend the workshop. Registration is closed.
The capacity is limited to 25 people. If more people register, participants will be selected randomly, taking into account the balance between staff and students.
Instructors
Schedule
09:30 - 09:45 | Welcome and icebreaker
09:45 - 10:00 | SURF services for research and SPIDER (Guest speaker: Lodewijk Nauta, SURF)
10:00 - 11:00 | HPC, RS-DAT, and the EO software ecosystem
11:00 - 11:15 | Coffee break
11:15 - 12:30 | Deployment with RS-DAT and data retrieval
12:30 - 13:30 | Lunch break
13:30 - 15:00 | Scaling EO workflows with HPC
15:00 - 15:15 | Coffee break
15:15 - 16:30 | Hands-on session
16:30 - 17:00 | Wrap-up
17:00 | End
This event is supported by the Netherlands eScience Center Fellowship Grant NSESC.ESCF.2022.013.
For more information or questions, please contact dr. ing. Serkan Girgin (s.girgin@utwente.nl).