Read the interview with Dr. Zhiming Zhao, one of the editors of the freshly published ENVRI data management book.

Today we spoke with Dr. Zhiming Zhao, one of the two main editors of the ENVRI data management book, which was recently published by Springer. The book is described in detail and ready for download here.

But what is the story behind this book? What was the main motivation, why was its development so challenging, and why should you read it? Read the interview below to find out.

Dr. Zhiming Zhao is an assistant professor at the University of Amsterdam (UvA). He leads the “Quality Critical Distributed Computing” research team in the group of Multiscale Networked Systems (MNS) at the System and Networking Lab (SNE). His research interests include big data management, cloud and edge computing, software engineering, and blockchain. He coordinated the EU SWITCH project and the data for science theme in the ENVRIplus project. He leads the common implementation and support activities in the ENVRI-FAIR project, and the VRE development in the LifeWatch-ERIC Dutch Virtual Laboratory Innovation Center. He is also the UvA PI in ARTICONF, CLARIFY, SWITCH, VRE4EIC and several other projects. He is an IEEE senior member and co-chairs the RDA VRE IG. He has published more than 100 research papers in peer-reviewed international journals and conferences. https://staff.fnwi.uva.nl/z.zhao/

Zhiming, what inspired you to initiate the development of this book together with Margareta?

The idea of writing this book first appeared in 2018, when we organized the first ENVRIplus summer school. One of the essential objectives of the summer school was to transfer the knowledge of the technical results we developed in the ENVRIplus data for science theme to the community of research infrastructure developers. The first edition of the school got very positive feedback from the students. However, we soon realized that we needed to prepare better-structured reading material that would be available before and after the school.

I wondered, why don’t we have a book to document all the knowledge we developed in the project? I started to talk to the students and other teachers during the summer school. Dr. Margareta Hellström was one of the first teachers I talked to. I immediately got lots of positive and valuable feedback. People were all very keen on the idea.

What was the writing process like? The book includes 20 chapters from more than 70 authors; that is a lot of coordination and editing.

Many of the data for science theme colleagues were very excited about the idea, but none of us was sure about the amount of effort the book would need. It was crucial to take action.

We first proposed the basic structure and scope of the book. We decided to focus mainly on the work we did in the data for science theme, while taking into account state-of-the-art technologies and work from other communities, e.g., from e-Infrastructures (EGI, Indigo and EUDAT). In the meantime, we started to look into different publishers. When I heard that Springer was preparing a special edition on state-of-the-art results delivered by different EU projects, we decided to submit the proposal to them. The positive feedback came immediately.

The development of the book was then officially kicked off in October 2018. We made a very ambitious plan: three months of writing, two months of review, and one month of finalization. We aimed to publish it before the end of the ENVRIplus project.

It did not go as we planned: not all chapters were finished in time, particularly the ones involving many co-authors. Many chapter contributors got extremely busy due to the finalization of ENVRIplus and the kickoff of the new ENVRI-FAIR project. To get back on track, we started to organize weekly teleconferences with the authors to review progress and discuss emerging issues.

Thanks to the hard work of all our colleagues, we did it! It took longer than expected, but great things do take time.

Who should read this book, and why?

Our book’s expected readers are developers, managers, operators, and potential users of research infrastructures in environmental and earth sciences.

The book can also serve as a textbook for young researchers and data managers. With the book, we can train them in data management skills, research infrastructure service development and operation practices, and in using research infrastructures for data-centric research.

Our book effectively structures all the technical knowledge delivered by ENVRIplus and provides a consistent story across the different subjects involved in data management and service development. Compared to the project deliverables, the book provides a more integrated view of those technical results. Moreover, the book presents content provided by experts outside the project, which broadens its scope.

What chapter makes you especially proud?

The first chapter explains visionary cases for cluster-level collaboration among research infrastructures. The last one summarizes all the developments and highlights future directions. They both involve contributors from different backgrounds and took more effort to develop than the other chapters. I am especially proud of these two.

What needs to be done next in terms of the interdisciplinarity of data and environmental research? And how does ENVRI-FAIR fit into all this?

This book provides research infrastructure developers in ENVRI-FAIR with a good summary of the technical results of the previous project. The knowledge base team is continuously working on a machine-readable and discoverable representation of that knowledge. We hope the ENVRI Reference Model, the book, and the ENVRI knowledge base will provide practical IT support for the development of FAIR data and services, and for making them ready for operation in the emerging ENVRI Hub, a crucial component of the European Open Science Cloud.