Open science is fundamentally changing how scientists and researchers approach scholarly communication and collaboration, from publishing preprints and interactive research results in new formats, to sharing methods, code, or data, and holding open research meetings and seminars. At Curvenote we strive to support scientists as they conduct their research, compose and develop their findings, and share ideas with collaborators and the world. Working more openly can bring many benefits, but it can also be daunting and complex for researchers who are adopting these practices.
To better understand how Curvenote can help in the transition towards open science, we regularly talk with researchers who have moved their own work into the open.
For this blog post, we talked with Dr. Jiajia Sun, Assistant Professor of Geophysics at the University of Houston, about his experiences with open science, open-source programming, and open-educational resources. Throughout our discussion we learned more about Dr. Sun’s transition into open-science practices, as well as the questions that arise about the benefits, challenges, and tools necessary to practice open science.
Dr. Sun’s introduction to open science started with open-educational materials. In his first semester at the University of Houston, he was asked to create a machine learning course. He knew machine learning theory, but hadn’t personally implemented all the algorithms. He discovered the book Hands-On Machine Learning with Scikit-Learn and TensorFlow by Aurélien Géron, which included openly available code examples. Modifying these examples, Dr. Sun created a collection of Jupyter notebooks that served as lab exercises for his students. When asked how he has benefited from others sharing their work, he explained:
“I feel like I have benefited so much from the open-source community that I feel morally obligated to give back. I would feel guilty if I don’t. They helped me and I should contribute back.”
True to his word, all of Dr. Sun’s course materials are openly available on GitHub. Dr. Sun’s experiences in educational resources, where he had transparent access to methods, data, and previous work to easily build upon, have influenced his passion for open science in many other parts of his team’s research process, specifically around how he uses tools like Jupyter, GitHub, and Curvenote.
Both Jupyter and GitHub are popular within open-source coding and research communities. Jupyter, built within an open-source community, provides a set of tools for scientific computing that has become one of the go-to resources for teaching and learning scientific programming. GitHub provides collaborative features built around version control and is used by many open-source communities. Curvenote is designed to integrate directly with Jupyter, maintaining an active link between research code and the resulting outputs, even when they appear outside of a Jupyter environment. The collaboration and version control features within Curvenote are built specifically for Jupyter notebooks, overcoming many issues of versioning notebooks in Git. Curvenote presents an approachable alternative for sharing and collaborating on scientific work for those not familiar with code development, or who may find GitHub intimidating.
Dr. Sun has continued to develop courses with open-source materials, including an undergraduate geophysics course on electromagnetics using EM GeoSci, which was originally created for the SEG 2017 Distinguished Instructor Short Course on “Geophysical Electromagnetics: Fundamentals and Applications.” For the lab portion of the course, Dr. Sun “used a lot of the [Jupyter] Notebooks that they developed and modified [them] for undergraduates.” He added gratefully that “the core part, background (the hard part), was taken care of by the SimPEG community.” Dr. Sun’s EM course materials are also openly available on GitHub.
As we continued to discuss education, Dr. Sun mentioned that the field of geoscience is experiencing a steep global decline in student enrollment at the university level (Geoscience on the chopping block, 2021). Many fields of geoscience have also been slower to adopt open-science practices, perhaps in part due to the field’s proprietary past in both the oil & gas and mining industries.
Dr. Sun remarked, “On the other side, if you look at the machine learning community, [it] has been exploding! One thing that they have is a really good community that are willing to share. If you look at the papers published in top machine learning conferences almost all of them share their code on GitHub so you can immediately follow up, reproduce, and improve upon their work.”
Open science is an opportunity to get engagement and detailed constructive feedback. By sharing our own contributions we provide resources for fellow scientists, and by allowing others to provide feedback we can help to foster a community. These conversations can lead to collaboration on a global scale, working to move science and discovery forward. Given these ideas, Dr. Sun continued with a call for increased and continued openness in the geosciences with a “mindset of collaboration and sharing.”