Waleed Alzuhair, flickr

Big Geodata Newsletter, June 2021

Become a high-skilled geospatial professional

Greetings from the Big Geodata Newsletter!

In this issue you will find information on Microsoft's Planetary Computer, which is currently in private preview; how to process continental Sentinel-2 data with Dask; a Python package (geemap) for using Google Earth Engine within Jupyter-based environments; and a "big picture" from China on the spatiotemporal distribution of aquaculture activities! Our regular upcoming events, recent releases, and CRIB news sections are here as well. We also have a short survey on big geospatial data sets!

Happy reading! 

You can access the previous issues of the newsletter on our web portal. If you find the newsletter useful, please share the subscription link below with your network.

Cloud computing with petabytes of global EO data

Image credits: Planetary Computer, 2021

In April 2020, Microsoft first announced its plan for the Planetary Computer, which aims to employ machine learning and other techniques to better understand planetary challenges and provide answers that improve sustainability. Now it is operational!

The Planetary Computer consists of four major components: the Data Catalog, which includes petabytes of free-access data hosted on Azure (the AI for Earth data sets); APIs based on open standards (e.g. STAC), which allow users to search for the data they need across space and time; the Hub, a fully managed computing environment based on JupyterLab; and applications built by a network of partners. The APIs and the Hub are currently available in private preview, and you can request access if you are interested in using them. The Hub provides three development environments (Pangeo notebook, R geospatial, and GPU-enabled) with up to 8 cores and 64 GB of memory.
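To make "search across space and time" concrete, here is a minimal sketch of the JSON body that a STAC API item search accepts. The field names (collections, bbox, datetime, limit) come from the STAC API specification. The endpoint URL, the sentinel-2-l2a collection id, and the bounding box around Enschede are assumptions for illustration, and the helpers build_search_body and build_request are hypothetical. The request is only constructed, not sent, because access requires approval during the private preview.

```python
import json
from urllib import request

# Assumed endpoint; check the Planetary Computer documentation for the current URL.
STAC_SEARCH_URL = "https://planetarycomputer.microsoft.com/api/stac/v1/search"


def build_search_body(collection, bbox, start, end, limit=10):
    """Build a STAC API item-search body (hypothetical helper)."""
    west, south, east, north = bbox
    # Simplification: boxes that cross the antimeridian are not handled here.
    if not (west < east and south < north):
        raise ValueError("bbox must be (west, south, east, north)")
    return {
        "collections": [collection],
        "bbox": [west, south, east, north],
        "datetime": f"{start}/{end}",  # closed interval, RFC 3339 timestamps
        "limit": limit,
    }


def build_request(body, url=STAC_SEARCH_URL):
    """Wrap the body in a POST request object without sending it."""
    return request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Assumed collection id and an illustrative box around Enschede.
body = build_search_body(
    "sentinel-2-l2a",
    bbox=(6.8, 52.1, 7.0, 52.3),
    start="2021-06-01T00:00:00Z",
    end="2021-06-30T23:59:59Z",
)
req = build_request(body)
print(json.dumps(body, indent=2))
```

Once access is granted, passing req to urllib.request.urlopen should return a GeoJSON FeatureCollection of matching items. Client libraries for STAC wrap the same request in a friendlier interface.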

Microsoft is open to collaboration with ITC, especially on the development of data sets and applications, and requests from ITC (UT) e-mail addresses will be approved quickly. Early access to new technology may improve the visibility of research outcomes, so this can be a good opportunity if you already have research ideas and need computing infrastructure. Dan Morris (Principal Scientist, AI for Earth Programme) will be our guest for a Big Geodata Talk later this year for a detailed discussion.

PS: If you have not yet seen the GEO-Planetary Computer Programme call for research to address environmental challenges (access to the Planetary Computer + up to US $60,000 in financial support for 12 months + up to US $60,000 in Microsoft Azure credits), there is still time to apply until 25 June!

Imaging Entire Continents with Dask

Image credits: Coiled, 2021 and Kouzoubov, K., 2021

Dask is an open-source parallel computing library that has changed how big data is processed in Python. Matthew Rocklin, the creator of Dask, founded Coiled in 2020 to help researchers, data scientists, and engineers by providing a cloud-based managed Dask platform, which has so far processed more than 100 million tasks. Coiled also organizes "Science Thursday" events to showcase recent developments and data science studies, most of which are related to geoinformation and EO. In the next meeting (24 June, 18:00 EDT), Kirill Kouzoubov from Geoscience Australia will discuss patterns for large-scale temporal processing of geospatial data using Dask. Recently, Kirill and his team computed cloud-free Sentinel-2 Geometric Median and Median Absolute Deviation mosaics over the entire continent of Africa for Digital Earth Africa. If you want to learn how to process large data sets without running out of RAM or spending all your time on I/O, don't miss this opportunity. Free registration.

PS: If you have already got used to having meetings at "strange" hours, 18:00 EDT, which is 22:00 UTC (midnight in Enschede), should be fine. Otherwise, check Kouzoubov's Dask Summit 2021 presentation until the event recording becomes available.
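The idea behind such mosaics can be sketched in plain Python: split the image into tiles and, for each tile, reduce the time axis pixel by pixel while skipping cloud-masked observations. The sketch below computes a per-pixel temporal median and median absolute deviation (MAD) on a tiny, made-up single-band stack. It is not the geometric median used for Digital Earth Africa, which treats all bands jointly, and it does not use Dask itself; with Dask, each tile would correspond to a chunk that is processed in parallel. All names and values are illustrative assumptions.

```python
from statistics import median

# Made-up stack: 4 dates x 2 rows x 4 columns; None marks a cloud-masked pixel.
STACK = [
    [[10, 12, 30, 31], [11, None, 29, 33]],
    [[12, 14, None, 35], [13, 15, 31, 35]],
    [[90, 13, 32, 33], [None, 14, 30, 34]],  # 90 imitates a cloud the mask missed
    [[11, None, 31, 32], [12, 16, None, 36]],
]


def iter_tiles(n_rows, n_cols, tile):
    """Yield (row_slice, col_slice) windows that together cover the image."""
    for r in range(0, n_rows, tile):
        for c in range(0, n_cols, tile):
            yield slice(r, min(r + tile, n_rows)), slice(c, min(c + tile, n_cols))


def median_and_mad(values):
    """Median and median absolute deviation of the unmasked values."""
    valid = [v for v in values if v is not None]
    if not valid:
        return None, None
    med = median(valid)
    return med, median(abs(v - med) for v in valid)


def mosaic(stack, tile=2):
    """Reduce the time axis tile by tile and return (median, MAD) images."""
    n_rows, n_cols = len(stack[0]), len(stack[0][0])
    med_img = [[None] * n_cols for _ in range(n_rows)]
    mad_img = [[None] * n_cols for _ in range(n_rows)]
    for rows, cols in iter_tiles(n_rows, n_cols, tile):
        # Each tile is independent of the others, which is what makes
        # the work easy to distribute.
        for r in range(rows.start, rows.stop):
            for c in range(cols.start, cols.stop):
                series = [date[r][c] for date in stack]
                med_img[r][c], mad_img[r][c] = median_and_mad(series)
    return med_img, mad_img


med_img, mad_img = mosaic(STACK)
print(med_img)
print(mad_img)
```

For the top-left pixel the series is 10, 12, 90, 11: the median is 11.5 even though one observation is a bright outlier, which is why median-type composites are popular for cloud-free mosaics.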

Upcoming Meetings

Useful Tools: Geemap

Google Earth Engine (GEE) is a powerful platform for geospatial analysis, with both JavaScript and Python APIs available for development. Compared to the JavaScript API, which has an interactive code editor and comprehensive documentation, the Python API (currently) has limited documentation and functionality. The open-source geemap Python package was created to fill this gap. Built upon interactive widget libraries (e.g. ipyleaflet, ipywidgets), it enables users to analyze and visualize GEE datasets interactively within a Jupyter-based environment, such as ITC's Geospatial Computing Platform. The package has detailed documentation, hundreds of example notebooks, and video tutorials.

If you want to combine the capabilities of GEE with existing powerful geospatial tools and libraries, geemap might be the bridge you need. The YouTube channel of the main developer, Qiusheng Wu, also includes additional useful content, such as spatial data management with PostGIS and development of open-source geographic software. Definitely worth a look!
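As a rough illustration of what working with geemap looks like, the sketch below adds an SRTM elevation layer to an interactive map. It assumes that the geemap and earthengine-api packages are installed and that you have an authenticated Earth Engine account (newer versions may also require a project id in ee.Initialize). The helper names, the map centre near Enschede, and the colour palette are illustrative choices. The imports sit inside the function so that the block can be loaded even where those packages are missing.

```python
def elevation_vis_params(min_m=0, max_m=200):
    """Visualization parameters for a single-band elevation layer (illustrative)."""
    return {
        "min": min_m,
        "max": max_m,
        "palette": ["006633", "E5FFCC", "662A00", "D8D8D8", "F5F5F5"],
    }


def build_elevation_map(center=(52.22, 6.89), zoom=8):
    """Return an interactive geemap map with an SRTM elevation layer.

    Needs geemap, earthengine-api, and Earth Engine credentials, so it is
    defined here but not called.
    """
    import ee
    import geemap

    ee.Initialize()  # assumes credentials have been set up beforehand
    dem = ee.Image("USGS/SRTMGL1_003")  # SRTM digital elevation data in the GEE catalog
    m = geemap.Map(center=list(center), zoom=zoom)
    m.addLayer(dem, elevation_vis_params(), "SRTM elevation")
    return m


# In a Jupyter notebook cell, evaluating build_elevation_map() displays the map.
print(elevation_vis_params())
```

The visualization parameters are a plain dictionary, so the same settings can be reused for other layers or adjusted per region, for example elevation_vis_params(max_m=3000) for mountainous terrain.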

Survey: Big Geospatial Data Sets

Image credits: CRIB, 2021

There are many regional, continental, and global geospatial data sets that serve as base data for a wide range of analysis needs. By filling in this short anonymous survey, you can help us identify the data sets you use frequently, so that we can make them available on our computing platform in a format that best suits your needs. You can submit multiple data sets by updating and re-submitting the survey.


Recent Releases

The "Big" Picture

Image credits: Duan et al., 2021

According to FAO data, total fish production reached more than 178 million tons in 2018, of which 46% was from aquaculture. As a typical labor-intensive production mode, aquaculture boomed in China, where pond fishing has a history of thousands of years. In fact, China has provided more than half of the total global output of aquaculture since the 1990s. Duan et al. used Google Earth Engine (GEE) to quantify the spatiotemporal distribution of aquaculture ponds over the last 30 years along China's coastal zone, approximately 14,500 km in length with a bilateral buffer of 30 km, using Landsat data and a decision-tree classifier. The results show that the cumulative and holding areas of aquaculture ponds increased 3.7-fold and 1.6-fold, respectively, relative to the 1990 values. Coastal land reclamation played a critical role in the expansion of aquaculture ponds, cumulatively contributing approximately 22% of the land resource in the past 30 years. However, there is a "sharply" shrinking period after 2017, which the authors expect to continue as a result of increasing land competition.

Duan, Y., Tian, B., Li, X., Liu, D., Sengupta, D., Wang, Y. and Peng, Y. (2021) Tracking changes in aquaculture ponds on the China coast using 30 years of Landsat images, International Journal of Applied Earth Observation and Geoinformation, 102:102383, doi:10.1016/j.jag.2021.102383

One more thing...

Image credits: OSCT, 2021

The Open Science Community Twente (OSCT) continues its Open Science Kitchen talks: regular, virtual meetings of around 60 minutes that aim to provide a space to discuss Open Science topics, keep people thinking about Open Science, and present current developments people should know about. The next event will be on Thursday 24 June at 14:00, when Dennie Hebels from Maastricht University will talk about "Engaging society: How public outreach meets Open Science".

Click here to join the meeting or add it to your calendar!

Emīls Bernhards
Student Assistant

We have a new team member!

Hello everyone! My name is Emīls Bernhards and I am a Business Information Technology student here at the University of Twente. I have recently joined CRIB as a student assistant to support the activities of the Geospatial Computing Platform and Big Geodata Newsletter. I am excited to join this community and contribute to providing you with the latest and greatest in Big Data.

For any comments or suggestions about the newsletter, or if you want to contribute, please simply send us an e-mail.

If you find it useful, please share this link with your colleagues so that they can also subscribe. Thanks!