Democratising EO Intelligence: CORSA and Major Tom Now Live on Terrascope

By Bart Beusen 24 June 2025
Every day, petabytes of Earth observation (EO) data stream into global archives. With Sentinel satellites capturing the planet in extraordinary detail, we are living in an age where data is abundant—but storing, sharing, and leveraging this data is a growing challenge. This is especially true for developers and researchers who want to build Artificial Intelligence (AI) solutions on top of EO imagery but are constrained by bandwidth, storage, or limited labelled data.
 
That’s why at VITO Remote Sensing, we developed CORSA—a lightweight AI-based compression model that does much more than shrink file sizes. It acts as both a high-efficiency storage solution and a ready-to-use encoder that unlocks fast AI prototyping, even for those with limited resources or training data.
 
In this blog, Remote Sensing expert Bart Beusen showcases how CORSA is made accessible through the Terrascope platform, using a curated version of the Major Tom dataset. Together, they form a powerful stack that brings down barriers to entry in AI for EO, democratising geospatial intelligence with less cost, less latency, and less energy consumption.

What is Major Tom?

Major Tom is an open-access dataset developed by ESA's Φ-lab. It divides the globe into a 10 km by 10 km grid, assigning a high-resolution Sentinel-2 patch to each cell. Each patch is a multispectral cube with 12 bands, each stored as a separate GeoTIFF file (B01–B12). This structured approach allows standardised benchmarking and fair comparisons of AI models across geographies and tasks.
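
To make the grid idea concrete, here is a minimal sketch that interprets a cell identifier such as "571U_29R" (the patch used later in this post). It rests on assumptions that are not stated in this article: that U/D count rows up or down from the equator, that R/L count columns right or left of the prime meridian, and that cells are spaced exactly 10 km apart. The function names are hypothetical, and the official Major Tom grid definition may differ in detail, so treat the result as a rough location only.

```python
import math

KM_PER_DEGREE = 111.32  # approximate length of one degree of latitude
CELL_SIZE_KM = 10.0     # grid spacing stated in the text


def parse_cell_id(cell_id: str) -> tuple[int, int]:
    """Split an ID such as '571U_29R' into signed (row, column) offsets.

    Assumption: U/D count rows up/down from the equator and
    R/L count columns right/left of the prime meridian.
    """
    row_part, col_part = cell_id.split("_")
    row = int(row_part[:-1]) * (1 if row_part[-1] == "U" else -1)
    col = int(col_part[:-1]) * (1 if col_part[-1] == "R" else -1)
    return row, col


def approximate_origin(cell_id: str) -> tuple[float, float]:
    """Rough (lat, lon) of a cell, assuming equal 10 km steps in both directions."""
    row, col = parse_cell_id(cell_id)
    lat = row * CELL_SIZE_KM / KM_PER_DEGREE
    # A degree of longitude shrinks with the cosine of the latitude.
    km_per_lon_degree = KM_PER_DEGREE * math.cos(math.radians(lat))
    lon = col * CELL_SIZE_KM / km_per_lon_degree
    return lat, lon


lat, lon = approximate_origin("571U_29R")
print(f"571U_29R is near {lat:.2f} N, {lon:.2f} E")
```

Under these assumptions the cell lands in the Antwerp area, which matches the region described in this post.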

For this demonstration, we focused on a regional subset covering Flanders and parts of the Netherlands. We made these patches available in both their original and CORSA-compressed forms.

Enter CORSA: Compression Meets Intelligence

CORSA isn't just another image compression tool. It’s built on a Vector Quantized Variational Auto-Encoder (VQVAE) architecture, trained to compress multispectral satellite images while preserving their semantic content. Instead of storing the original image, CORSA represents it through indices pointing to a learned codebook of feature vectors—drastically reducing storage size while maintaining rich visual information.
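
The codebook idea can be illustrated with a toy vector-quantisation step. This is a sketch of the general technique only, not CORSA's actual implementation: the two-dimensional codebook, the feature values, and the function names are all made up for illustration, and a real model would learn the codebook and operate on large arrays rather than Python lists.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantise(features, codebook):
    """Return, for each feature vector, the index of its nearest codebook entry."""
    return [
        min(range(len(codebook)), key=lambda k: squared_distance(f, codebook[k]))
        for f in features
    ]


def reconstruct(indices, codebook):
    """Look the stored indices up in the codebook to recover approximate features."""
    return [codebook[i] for i in indices]


# Hypothetical toy codebook with three entries.
codebook = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
# Hypothetical encoder outputs for four image locations.
features = [[0.1, -0.1], [0.9, 1.2], [0.2, 0.8], [0.05, 0.1]]

indices = quantise(features, codebook)   # only these integers need to be stored
approx = reconstruct(indices, codebook)  # approximate features after lookup
print(indices)
```

Only the integer indices are stored per location, which is where the storage saving comes from; the same indices also give direct access to the feature vectors for downstream models.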

Figure 1: Reconstructing original image from compressed CORSA embedding.

This dual role of CORSA—compression and feature extraction—means developers can train downstream models (like land cover classifiers or change detectors) directly on the compressed features, bypassing the need to decode or reprocess the full original image.

CORSA + Terrascope = Ready-to-Use AI Stack

Terrascope is the Belgian open EO platform, funded by the Belgian Science Policy, offering on-demand access to a wide range of geospatial datasets and processing capabilities. By integrating CORSA outputs as a public data collection, Terrascope now enables anyone to build EO applications using precomputed embeddings, with no GPU required and no downloads of multi-gigabyte files.

On Terrascope, we published:

Speed and Efficiency: A Quick Comparison

Let’s take a group of 27 grid cells located between Antwerp and Rotterdam (see Figure 2) and compare performance:

Figure 2: Map showing Major Tom grid patches over Flanders and the Netherlands.

Format | File Size | Download Time | Reconstruct Time per Tile
Original S2 (12 bands) | 382.8 MB | 60.6 s | 0.2 s
CORSA (2 feature levels) | 10.1 MB | 5.9 s | 1.8 s (decode) + 0.5 s (scaling)

Figure 3: Bar chart comparing download time and file size for original vs CORSA format.

That's:

  • 10× faster downloads
  • ~38× smaller files (382.8 MB vs 10.1 MB)
  • And more importantly: ready-to-use feature vectors without needing to retrain a model.
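
As a quick check, the ratios can be computed directly from the figures in the comparison table. The variable names are illustrative; the numbers are the ones reported for the 27 grid cells.

```python
# Figures reported in the comparison for the 27 grid cells.
original_mb, corsa_mb = 382.8, 10.1  # file size in MB
original_s, corsa_s = 60.6, 5.9      # download time in seconds

size_ratio = original_mb / corsa_mb      # how many times smaller the CORSA files are
download_speedup = original_s / corsa_s  # how many times faster the download is

print(f"size ratio: {size_ratio:.1f}x, download speed-up: {download_speedup:.1f}x")
```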

From Colour to Classification

To explore the semantic structure of CORSA embeddings, we visualised them in two ways:

  1. Codebook-inherent colour: Based on the 3D arrangement of vectors during training.
  2. t-SNE-based colourisation: A non-linear projection of codebook vectors into 3D space, normalised and mapped to RGB.
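
The "normalised and mapped to RGB" step can be sketched as a per-axis min-max scaling. The example assumes the 3D coordinates of each codebook vector are already available (for instance from a t-SNE projection, which is not implemented here); the coordinates, the index map, and the function name are hypothetical.

```python
def to_rgb(points):
    """Map 3D points to 8-bit RGB by min-max scaling each axis independently."""
    axes = list(zip(*points))
    lows = [min(axis) for axis in axes]
    # Guard against a zero range so a constant axis maps to 0 instead of dividing by zero.
    spans = [(max(axis) - min(axis)) or 1.0 for axis in axes]
    return [
        tuple(round(255 * (v - lo) / span) for v, lo, span in zip(p, lows, spans))
        for p in points
    ]


# Hypothetical 3D coordinates, one per codebook vector.
projected = [(-2.0, 0.0, 5.0), (1.0, 1.0, 5.0), (2.0, 4.0, 5.0)]
palette = to_rgb(projected)

# Colourise a tiny map of codebook indices by looking up each index in the palette.
index_map = [[0, 1], [2, 0]]
image = [[palette[i] for i in row] for row in index_map]
print(image)
```

Because every codebook index gets a fixed colour, the whole patch can be colourised with a simple lookup, without decoding the image.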

Figure 4: Side-by-side visualisation of the “571U_29R” patch using codebook colour and t-SNE colourisation.

These visualisations give a striking view of the ‘semantic texture’ of the Earth, as learned by CORSA.

As a toy example, we trained a lightweight land cover classification model using only 541 samples from the Dynamic World dataset, leveraging CORSA embeddings directly as input. This drastically reduces the need for annotated data and training time—perfect for rapid prototyping or deployment in low-resource settings.
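
To show why a small labelled set can be enough when the inputs are already meaningful features, here is a deliberately simple stand-in: a nearest-centroid classifier. The article does not say which model was used, so this is not the author's method; the embeddings, labels, and function names are invented for illustration.

```python
def fit_centroids(embeddings, labels):
    """Average the embeddings of each class to get one centroid per label."""
    sums, counts = {}, {}
    for vec, label in zip(embeddings, labels):
        if label not in sums:
            sums[label] = [0.0] * len(vec)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {label: [s / counts[label] for s in total] for label, total in sums.items()}


def predict(centroids, vec):
    """Return the label whose centroid is closest to the given embedding."""
    return min(
        centroids,
        key=lambda label: sum((c - v) ** 2 for c, v in zip(centroids[label], vec)),
    )


# Hypothetical 2D embeddings with land cover labels.
train_x = [[0.9, 0.1], [1.1, -0.1], [0.0, 1.0], [0.2, 0.8]]
train_y = ["water", "water", "trees", "trees"]

centroids = fit_centroids(train_x, train_y)
print(predict(centroids, [0.8, 0.2]), predict(centroids, [0.1, 0.7]))
```

With only two parameters' worth of information per class, a model like this needs very few labelled samples, which is the property the toy experiment relies on.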

Figure 5: Land cover classification map for grid cell 571U_29R using CORSA features.

Why CORSA is Unique: One Solution, Many Wins

CORSA stands apart from traditional compression or AI feature extractors because it solves multiple challenges at once:

  • Storage-efficient: Achieves 25–40× compression on Sentinel-2 imagery
  • Bandwidth-friendly: Smaller files = faster downloads
  • Energy-saving: Reduces server-side and client-side compute
  • Model-ready: Feature embeddings usable out of the box
  • Few-shot friendly: Enables training with fewer labels
  • Sensor-adaptable: Can be retrained for other satellites or sensors in a self-supervised way

For developers and researchers, this makes CORSA a Swiss-army knife for EO workflows, from data handling to AI deployment.

Toward an Inclusive Future for AI4EO

What we’re seeing is a transition in the EO world: from data hoarding to data accessibility, from big compute to smart compute. By combining CORSA’s intelligence-preserving compression with the cloud-native accessibility of Terrascope, we make it easier for more people—researchers, NGOs, startups, and students—to work with remote sensing data and build impactful AI solutions.

This is a step toward the democratisation of AI4EO—bringing down barriers like cost, compute, and data availability to unlock innovation for all.

Figure 6: Walkthrough of the Terrascope notebook.

Join Us at Living Planet Symposium

Curious to learn more about data compression and how it can support your work in Earth observation (EO)? Visit the VITO booth (U31) at the Living Planet Symposium 2025 in Vienna from 23 to 27 June. Our Remote Sensing experts are looking forward to answering your questions and showing how CORSA can support data accessibility. And don't miss our presentations, demo, and poster on Thursday 26 June and Friday 27 June to learn more about the latest CORSA updates:

Timing | Type / Session | Topic | Speaker | Location
Thursday, 26 June, 14:00-15:30 | Oral Presentation, D.02.06 | From Edge to Insights: Transforming Earth Observation with Lightweight Foundation Models and Embeddings-as-a-Service | Tanja Van Achteren | Hall G1
Thursday, 26 June, 15:45-16:15 | Demo at VITO Booth | From Orbit to Insights: CORSA Live on Edge, Insights via Terrascope Compressed Embeddings. In Collaboration with Unibap. | Tanja Van Achteren | VITO Booth (U31), EO Arena
Thursday, 26 June, 17:45-19:00 | Poster, D.04.03 | Unlocking ML and Foundation Models within openEO | Hans Vanrompay | X5 - Poster Area
Friday, 27 June, 14:30-16:00 | Oral Presentation, C.01.03 | Efficient On-Board Processing Using a Shared AI Backbone Across Multiple Tasks | Bart Beusen, Andreas Luyts | Room 1.85/1.86

Let’s connect in Vienna and discuss EO intelligence! Cannot make it to Vienna? Feel free to contact us online.
