Collaborative Software Engineering Project in Computational Physics (9 ECTS)
Interested participants from academia and industry are most welcome to apply for the course Collaborative Software Engineering Project in Computational Physics, given by the Materials Design and Informatics unit at Linköping University. The course is offered within the Swedish e-Science education (SeSE) initiative and as part of our participation in the Data-driven computational materials design (DCMD) Multidisciplinary collaboration programme of the Swedish e-Science Research Centre.
- Next course start: August 30, 2021. Full digital participation possible.
- Course outline: 10 lectures, 4 hands-on exercises, and group work on a collaborative software project in computational physics.
- Successful completion of the course gives 9 ECTS credits.
- Teacher: Rickard Armiento, associate professor in Physical Modelling and head of the Materials Design and Informatics unit at Linköping University.
- Course homepage: https://mdi.gitlab-pages.liu.se/collab_proj_course.html
- I'm happy to answer any questions: contact Rickard Armiento, rickard.armiento [at] liu.se
Course contents at a glance
- Learn how to engineer software with collaborative software tools, including git and GitHub/GitLab. Evolve your skills from programming → software development → collaborative software engineering.
- Try an industry-relevant agile project model, using sprints and keeping track of tasks on a virtual Kanban board.
- Apply what you learn in a group project to develop molecular dynamics software based on the ASE and ASAP libraries. Run the software in (semi-)high-throughput mode on supercomputers and explore the data with visual data analysis.
- The project adds to your software project portfolio.
- Train your presentation skills with an oral presentation and a written final report.
- Four hands-on sessions:
  - Version control with git, collaborative development on GitHub/GitLab.
  - Exploratory Visual Data Analysis (+ automated software documentation systems).
  - Molecular dynamics in Python with ASE and ASAP (+ automated testing/CI).
  - High-throughput computations using supercomputers.
The course takes place in the autumn term, starting at the end of August or the beginning of September.
- August 30 - October 15: Introductory part with 10 lectures (2 h each, including a 15 min break) and 4 practical hands-on exercises (4 h each). During this part, the project groups organize, plan, and prepare their projects.
- November 1 - December 17: Project execution part, in which the project groups do the primary part of the project work. The work is coordinated over the Internet using the tools for collaborative software engineering covered in the course.
- The project execution part ends with an oral presentation and a written final report. These are meant to be completed before the holidays, but in case the final report needs further revision, there is an absolute final deadline in mid-January.
Prerequisites
- Good communication skills in English.
- University-level education in mathematical analysis and linear algebra.
- University-level programming experience and preferably some knowledge of programming in Python.
Course Content Details
The course is aimed at those who want to elevate their skills beyond "programming" and get experience with modern practices in collaborative software development and software engineering. The course covers methods, tools, and workflows that enable working together on large software projects. These topics are covered in a series of lectures, hands-on exercises, and a group project in computational physics. The exercises and the project work primarily use Python.
The lectures span both theoretical and practical aspects of software engineering as well as computational physics. They introduce agile project models, version control of software, documentation, software testing (automated unit and integration tests, CI/CD), parallel and concurrent execution, databases, exploratory data analysis with visualization, molecular dynamics, and computer simulation of materials.
The hands-on computer exercises provide practical training in version control in collaborative software projects, automated testing (CI), visualization, and computer simulations based on computational physics.
The participants then apply the acquired skills in practice in a collaborative project to develop a software package for molecular dynamics simulations according to an agile project model. The software will be implemented, tested, documented, and run in high-throughput mode to generate big data, which is explored, visualized, and inserted into a database that is made externally accessible via an open API. The primary part of the project work takes place in November and December. It ends with a final examination in the form of an oral presentation by the group and a final report.
The course provides sufficient training on essential computational physics and molecular dynamics to allow students with less experience in these topics to participate. Nevertheless, the project also allows engaging more deeply in the computational physics aspects for those with more experience in the topic.
After completing the course, the participants will be able to:
- identify and apply central concepts of collaborative software development and engineering, and be familiar with the basic functions of standard tools.
- design, model, implement, test, document, and deliver a software system using modern practices and methodologies in software engineering, using an agile project model.
- implement, operate, and explore results of software for computational physics simulations.
Preliminary Outline of Lectures and Exercises
Lecture 1: Course Introduction and Project Models
- Overview, background, course plan, waterfall vs. agile project models, LIPS, Scrum.
Lecture 2: Software Versioning and Collaborative Development
- Version control systems (git, svn), commits/branching/merging, collaborative workflows with pull requests and reviews (GitHub).
Hands-on exercise 1: Git and GitHub
- Working with a local repository: commits, branches, merging.
- Collaborative software development with pull requests, reviews, and approvals.
- Creating the shared online repository for your project.
Lecture 3: Exploratory Data Analysis by Visualization and Introduction to Computer Simulations
- Single and multi-property exploration, identifying outliers, descriptors, heatmaps, PCA.
- Introduction to materials simulations in computational physics.
- Implementation considerations: representations of periodic structures, boundary conditions.
Lecture 4: Software Documentation and Licensing
- Documentation: UML, source code comments, embedded documentation, Sphinx.
- Software licensing: Open and closed source licenses (GPL, MIT, BSD, CC, etc.), CLAs.
Lecture 5: Software Engineering in Industry (Guest lecture)
Hands-on exercise 2: Exploratory Data Analysis and Documentation
- Visualization and data exploration in Python (matplotlib and more).
- Extracting inline software documentation with Sphinx.
Lecture 6: Introduction to Computational Physics and Molecular Dynamics
- Theoretical modeling of solid-state properties.
- The anatomy of a molecular dynamics program: interaction potentials, integration of equations of motion.
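As a rough illustration of the two ingredients named above, the following is a minimal sketch in plain Python, not the course's ASE/ASAP code. It assumes a single Lennard-Jones pair in reduced units described by its separation coordinate, and uses the velocity Verlet scheme to integrate the equation of motion. All function names and parameter values are illustrative assumptions.

```python
# Minimal sketch (assumption: one Lennard-Jones pair, reduced units,
# described only by the separation r and its velocity v).


def lj_energy(r, epsilon=1.0, sigma=1.0):
    """Lennard-Jones pair potential U(r)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def lj_force(r, epsilon=1.0, sigma=1.0):
    """Force along the separation coordinate, F = -dU/dr."""
    sr6 = (sigma / r) ** 6
    return 24.0 * epsilon * (2.0 * sr6 * sr6 - sr6) / r


def velocity_verlet(r, v, mass, dt, n_steps, force=lj_force):
    """Integrate the equation of motion; return a list of (r, v) per step."""
    f = force(r)
    trajectory = [(r, v)]
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        r = r + dt * v_half                # full-step position update
        f = force(r)                       # force at the new position
        v = v_half + 0.5 * dt * f / mass   # second half-step velocity update
        trajectory.append((r, v))
    return trajectory


def total_energy(r, v, mass):
    """Kinetic plus potential energy of the pair coordinate."""
    return 0.5 * mass * v * v + lj_energy(r)


# Hypothetical starting point: slightly stretched pair at rest.
trajectory = velocity_verlet(r=1.2, v=0.0, mass=1.0, dt=0.001, n_steps=5000)
e_start = total_energy(*trajectory[0], mass=1.0)
e_end = total_energy(*trajectory[-1], mass=1.0)
print(f"energy drift over the run: {abs(e_end - e_start):.2e}")
```

Checking that the total energy stays nearly constant is a common sanity test for the integrator and the chosen timestep.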
Lecture 7: Software Testing, Debugging, and Profiling
- Unit/integration/system/acceptance tests, black/white box, (non-)functional, test-driven development, coverage, CI/CD.
- Debuggers, profiling tools, algorithmic complexity.
Lecture 8: Molecular Dynamics (cont.)
- Calculating instantaneous properties, timesteps, thermalization.
- More advanced interaction potentials.
- Time and ensemble averages; pressure, heat capacity, MSD, Lindemann criterion, self-diffusion coefficient.
- Finding the equilibrium structure.
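To give a flavour of the property calculations listed above, here is a minimal sketch in plain Python of the mean squared displacement (MSD) relative to the first frame, together with an estimate of the self-diffusion coefficient from the Einstein relation MSD ≈ 6Dt in three dimensions. It assumes unwrapped positions (no jumps from periodic boundary conditions); the function names and the data layout are illustrative assumptions.

```python
# Minimal sketch (assumption: positions[k][i] is the unwrapped (x, y, z)
# position of atom i in frame k).


def mean_squared_displacement(positions):
    """MSD of every frame relative to the first frame, averaged over atoms."""
    reference = positions[0]
    msd = []
    for frame in positions:
        total = 0.0
        for (x, y, z), (x0, y0, z0) in zip(frame, reference):
            total += (x - x0) ** 2 + (y - y0) ** 2 + (z - z0) ** 2
        msd.append(total / len(reference))
    return msd


def diffusion_coefficient(times, msd):
    """Least-squares slope of MSD versus time, divided by 6 (3D Einstein relation)."""
    n = len(times)
    t_mean = sum(times) / n
    m_mean = sum(msd) / n
    covariance = sum((t - t_mean) * (m - m_mean) for t, m in zip(times, msd))
    variance = sum((t - t_mean) ** 2 for t in times)
    return covariance / variance / 6.0


# Hypothetical toy trajectory: one atom drifts along x, the other stays put.
frames = [
    [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)],
    [(1.0, 0.0, 0.0), (1.0, 1.0, 1.0)],
    [(2.0, 0.0, 0.0), (1.0, 1.0, 1.0)],
]
print(mean_squared_displacement(frames))
```

In practice the linear fit would be restricted to the long-time, diffusive part of the MSD curve, after the initial ballistic regime.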
Hands-on exercise 3: Molecular Dynamics and Software Testing
- Molecular dynamics with ASE and ASAP.
- Unit tests and continuous integration (CI) with GitHub Actions.
Lecture 9: Concurrency and Parallelism
- Concurrency with coroutines; parallelism with threads (OpenMP) and processes (MPI).
Hands-on exercise 4: Supercomputing
- Supercomputer usage, queue scripts, high-throughput computations, etc.
- Running ASAP simulations on supercomputers.
Lecture 10: Databases and Wrap-up
- Relational databases, normalization, transactions/ACID, SQL; NoSQL, MongoDB.
- Making data available via open APIs.
- Final remarks about the project execution and final phases.
There will also be some "extra credit" material distributed on advanced programming concepts (programming paradigms, multi-paradigm programming, programming patterns) and on computer security aspects of software development.