Flexible cloud processing with OpenEO

Earth observation processing made easy

To process Copernicus data, we can currently choose between more than five public cloud providers and a myriad of platforms offering processing services. To say that this is hard to navigate for a researcher who just wants to process some data is an understatement.
To resolve this, the European Commission has started the H2020 project openEO. The project is developing an open API that connects R, Python, JavaScript and other clients to big Earth observation cloud back-ends in a simple and unified way.

The data in an openEO service is exposed as a ‘data cube’, irrespective of how it is stored internally. As a result, users of openEO no longer need to deal with individual files, formats, and EO product catalogs. openEO makes the user’s life easy. More technical details can be found on this openEO web page.
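To make the data cube idea concrete, here is a minimal sketch in plain Python. The class `ToyDataCube` and its sample values are hypothetical and only illustrate the concept: the user selects data by time and band labels, never by file. The method names mirror the openEO processes `filter_temporal` and `filter_bands`, but this is not the openEO client API.

```python
from datetime import date


class ToyDataCube:
    """Hypothetical in-memory cube with values indexed as values[time][band][y][x]."""

    def __init__(self, times, bands, values):
        self.times = list(times)
        self.bands = list(bands)
        self.values = values

    def filter_temporal(self, start, end):
        """Keep the time steps t with start <= t < end."""
        keep = [i for i, t in enumerate(self.times) if start <= t < end]
        return ToyDataCube(
            [self.times[i] for i in keep],
            self.bands,
            [self.values[i] for i in keep],
        )

    def filter_bands(self, names):
        """Keep only the named bands, in the requested order."""
        idx = [self.bands.index(name) for name in names]
        return ToyDataCube(
            self.times,
            names,
            [[step[i] for i in idx] for step in self.values],
        )


# Three acquisition dates, two bands, a 2 x 2 pixel grid (made-up reflectances).
cube = ToyDataCube(
    times=[date(2019, 6, 1), date(2019, 6, 11), date(2019, 6, 21)],
    bands=["B04", "B08"],
    values=[
        [[[0.10, 0.12], [0.11, 0.13]], [[0.40, 0.42], [0.41, 0.43]]],
        [[[0.09, 0.10], [0.10, 0.11]], [[0.45, 0.47], [0.46, 0.48]]],
        [[[0.08, 0.09], [0.09, 0.10]], [[0.50, 0.52], [0.51, 0.53]]],
    ],
)

# Select by labels only; where the pixels are stored is of no concern to the user.
subset = cube.filter_temporal(date(2019, 6, 1), date(2019, 6, 21)).filter_bands(["B08"])
```

In a real service the back-end would assemble this view from many files and formats; the sketch only shows what the user-facing abstraction looks like.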

OpenEO
The OpenEO architecture

Uptake of OpenEO

Now, two years after the project’s kick-off, we are already able to run Earth observation tasks on various systems, such as the PROBA-V MEP, DIAS and Google Earth Engine, without code modifications or complex deployment processes. Users can work with various back-ends and easily compare them in terms of capacity, cost, and results. We also see some other big advantages:

openEO is easy to use, while hiding the complex technical details of distributed cloud processing
Users can trace how results are generated, so that their work is reproducible
Well-documented interfaces make it easy to integrate openEO in any application

Several projects are already using openEO, including some in our own project portfolio. This is not only because we like to ‘eat our own dog food’ and get some much-needed real-world testing, but also because there is a clear demand from multiple research projects.

The CropSAR team, for instance, mostly wanted to extract data in a fast, easy and flexible manner. This was of course an easy match for the data cube capabilities that openEO offers, but the distributed processing and automatic parallelization were also useful when a researcher wanted to request data for hundreds of thousands of parcels.
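The per-parcel extraction described above can be pictured with a small sketch. It is not how CropSAR or any openEO back-end is implemented; the parcel identifiers and pixel values are made up, and a local thread pool stands in for distributed processing. The point is only that each parcel is an independent unit of work, which is what makes automatic parallelization possible.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import mean

# Hypothetical pixel values per parcel; a real back-end would read them from the data cube.
parcels = {
    "parcel-1": [0.61, 0.64, 0.58],
    "parcel-2": [0.22, 0.25],
    "parcel-3": [0.80],
}


def parcel_mean(item):
    """Reduce the pixels of one parcel to a single mean value."""
    parcel_id, pixel_values = item
    return parcel_id, mean(pixel_values)


# Parcels do not depend on each other, so they can be processed concurrently.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = dict(pool.map(parcel_mean, parcels.items()))
```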

OpenEO CropSAR Parcel Inspector
CropSAR validation tool that integrates OpenEO

In the SieuSoil H2020 project, we are using predefined and user-defined functions in openEO to preprocess Sentinel-2 data for phenology analysis and soil management. Some first results were demonstrated during the FOSS4G conference and ESA’s Phi-week. Click here to view a recording of the presentation.

From predefined to user-defined functions

Earth observation research often involves processing huge amounts of data. To enable this, openEO offers a number of predefined functions, which every back-end has to support (e.g. computing an NDVI from spectral bands). This ensures that users can easily switch between service providers and compare their results.
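As an illustration of how predefined functions keep a workflow portable, the sketch below describes an NDVI computation as a graph of named processes and serializes it to JSON. It loosely follows the node layout of an openEO process graph (`process_id`, `arguments`, `from_node`, `result`); the helper `build_ndvi_graph`, the collection id, the extents and the exact argument names are assumptions and may differ between API versions and back-ends.

```python
import json


def build_ndvi_graph(collection_id, bbox, time_range):
    """Describe 'load data, compute NDVI, save result' as linked process nodes."""
    return {
        "load": {
            "process_id": "load_collection",
            "arguments": {
                "id": collection_id,
                "spatial_extent": bbox,
                "temporal_extent": time_range,
                "bands": ["B04", "B08"],
            },
        },
        "ndvi": {
            "process_id": "ndvi",
            # The input refers to the output of the "load" node.
            "arguments": {"data": {"from_node": "load"}, "nir": "B08", "red": "B04"},
        },
        "save": {
            "process_id": "save_result",
            "arguments": {"data": {"from_node": "ndvi"}, "format": "GTiff"},
            "result": True,  # marks the node whose output is returned
        },
    }


# Placeholder collection id and extents, chosen only for illustration.
graph = build_ndvi_graph(
    "SENTINEL2_L2A",
    {"west": 5.0, "south": 51.2, "east": 5.1, "north": 51.3},
    ["2019-06-01", "2019-07-01"],
)
payload = json.dumps({"process_graph": graph}, indent=2)
```

Because the description contains only process names and arguments, the same payload could in principle be sent to any back-end that supports these processes.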

Next to the predefined functions, openEO also offers user-defined functions. They allow users to reuse the wealth of algorithms that already exist, for instance those that are available as open source in the Python and R communities. A user-defined function is sent to the back-end in the form of a simple script, for instance a Python script that loads a machine learning model to classify pixels. The function is then executed on the back-end. This capability, combined with its ease of use, is truly unique to openEO, and it will increase the uptake of cloud-based processing even more.
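The round trip of a user-defined function can be sketched as follows. This is a simplified stand-in, not the openEO UDF interface: the names `UDF_SOURCE`, `classify_pixel` and `run_udf_on_backend` are hypothetical, and a fixed threshold rule takes the place of a trained model. It only shows the principle that the user ships a script as text and the back-end loads and applies it to the data.

```python
# The user-side script, shipped to the back-end as plain text.
UDF_SOURCE = '''
def classify_pixel(red, nir):
    """Label a pixel from two reflectances (toy rule, not a trained model)."""
    total = nir + red
    ndvi = (nir - red) / total if total != 0 else 0.0
    return "vegetation" if ndvi > 0.4 else "other"
'''


def run_udf_on_backend(source, function_name, pixels):
    """Stand-in for a back-end: load the script, then apply it to every pixel."""
    namespace = {}
    exec(source, namespace)  # a real service would sandbox user code
    udf = namespace[function_name]
    return [udf(red, nir) for red, nir in pixels]


# Made-up (red, nir) reflectance pairs.
pixels = [(0.05, 0.45), (0.20, 0.25), (0.0, 0.0)]
labels = run_udf_on_backend(UDF_SOURCE, "classify_pixel", pixels)
```

Running user code next to the data avoids downloading the imagery first, which is the main reason to execute the function on the back-end rather than on the client.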

OpenEO web editor
OpenEO visual interface (editor.openeo.org)

Community driven and open source

The openEO specification is built by an open community of research institutions and companies. In parallel, multiple open source client and back-end implementations are being developed to ensure that the standard is validated in realistic scenarios. This approach makes the specification sustainable, as it can continue to evolve freely as long as there are interested parties.

The open source nature of the implementations ensures that results are reproducible and traceable. The implementation of the various processing functions can be verified and improved independently, which is important in a world where information derived from Earth observation data is increasingly used to drive policies and decisions.

Open source code also supports the long-term sustainability of the project. For instance, if we decide to use Google Earth Engine through the openEO implementation, we can continue to improve that implementation ourselves even if the original authors are no longer involved. Likewise, people who are interested in our implementation (which runs on DIAS and has a SentinelHub connector for global data access) can simply run and improve it on their own if needed.

OpenEO interface
Open source repository - https://github.com/Open-EO