Introduction to Git and GitHub for Musicology Editorial Work
Why this matters
In digital musicology, we often work with files that change over time:
- digitally encoded files (MEI, MusicXML, MIDI, etc.)
- metadata tables
- documentation
- editorial notes
- scripts for processing musical sources
A common problem is that files get copied, renamed, emailed around, or stored as:
- final.xml
- final_revised.xml
- final_revised2.xml
- final_REAL_final.xml
This quickly becomes confusing.
Git helps us keep a structured history of changes. GitHub makes it easier to synchronize work between computers and collaborate with others.
For musicological editorial work, this is especially useful because:
- we need to track who changed what
- we need to document editorial decisions
- we often work on the same corpus over long periods
- we need a safe way to synchronize data across machines
- we may want to collaborate on corrections
Learning goals
After this introduction, students should be able to:
- explain what Git is
- understand the difference between Git and GitHub
- create a local Git repository
- save changes with meaningful commit messages
- connect a local repository to GitHub
- synchronize editorial work between computers
- understand a basic collaboration workflow for MEI and OMR data
- avoid common mistakes when working with Git
What is Git?
Git is a version control system.
This means it records the history of a folder and its files over time.
With Git, you can:
- save snapshots of your work
- see what changed between two versions
- return to an earlier state
- work on new ideas without breaking the main version
- merge work from different collaborators
Git is not only for programmers. It is useful for any research project where files evolve over time.
What is GitHub?
GitHub is a web platform built around Git.
Git itself works on your computer. GitHub adds:
- online hosting of repositories
- synchronization between computers
- collaboration tools
- issue tracking
- pull requests for reviewing changes
- backup in the cloud
A useful short formula is:
- Git = version control
- GitHub = online hosting and collaboration for Git repositories
Core concepts
Repository
A repository (or "repo") is a project folder managed by Git.
It contains:
- your files
- a hidden Git history
- information about changes over time
Commit
A commit is a saved snapshot of your work.
A commit should record a meaningful step, for example:
- corrected clef encoding in measure 1
- normalized staff labels in manuscript A
- added measure numbers to MEI files
- fixed pitch errors in soprano part
GitHub remote
A remote is an online copy of your repository, for example on GitHub.
This lets you:
- upload your local work
- download changes made elsewhere
- keep multiple computers synchronized
Push and pull
- push = send your local commits to GitHub
- pull = download changes from GitHub to your local machine
Branch
A branch is a parallel line of work.
Branches are useful when:
- testing a new editorial strategy
- correcting one source separately
- working on a feature without changing the stable main version
Why Git is useful for OMR and MEI editorial work
Git works especially well when files are text-based.
MEI files are XML, so Git can track line-by-line changes very well.
This means Git can help you see:
- where measures were changed
- where attributes were added or removed
- where note encodings were corrected
- which editor made which change
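As a small sketch of what this looks like in practice, the commands below create a throwaway repository, commit a one-note MEI fragment, change the pitch, and ask Git for the difference. The file name `motet_01.mei`, the note content, and the editor identity are invented for illustration.

```shell
# Work in a throwaway folder so no real project is touched.
repo="$(mktemp -d)"
cd "$repo"
git init -q
# Placeholder identity, used only inside this demo repository.
git config user.name "Example Editor"
git config user.email "editor@example.org"

# First version: a single note encoded as C4 (hypothetical fragment).
printf '%s\n' '<note pname="c" oct="4" dur="4"/>' > motet_01.mei
git add motet_01.mei
git commit -q -m "Add first note of motet_01"

# Correct the pitch from C to D, then show the line-by-line difference.
printf '%s\n' '<note pname="d" oct="4" dur="4"/>' > motet_01.mei
git --no-pager diff motet_01.mei
```

In the output, the line starting with `-` is the old encoding and the line starting with `+` is the new one.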
Important limitation: binary files
Git works best with plain text files:
- .mei
- .xml
- .csv
- .txt
- .md
- code files such as .py
Git works less well with large binary files such as:
- .pdf
- .docx
- .png
- .jpg
- .tif
This does not mean you cannot store such files in a repository, but:
- changes are harder to compare
- repositories become larger
- collaboration becomes less efficient
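To illustrate why changes in binary files are harder to compare, here is a minimal sketch in a throwaway repository. The file `scan.tif` is not a real image: it is a few raw bytes, invented for this demo, that Git will treat as binary.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
# Placeholder identity, used only inside this demo repository.
git config user.name "Example Editor"
git config user.email "editor@example.org"

# Stand-in for a scanned image: the NUL byte (\000) makes Git
# classify the file as binary.
printf 'abc\000def' > scan.tif
git add scan.tif
git commit -q -m "Add placeholder scan"

# Change the bytes and ask for a diff.
printf 'abc\000xyz' > scan.tif
git --no-pager diff scan.tif
```

Instead of changed lines, Git only reports that the two binary files differ.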
Installing Git
What is needed:
- Git installed on your computer
- a GitHub account
- optionally GitHub Desktop, if you prefer a graphical interface
For long-term understanding, command-line Git is better. For quick adoption, GitHub Desktop may be easier.
mei-friend has a sophisticated GitHub integration, which we will probably use most of the time, but the Git commands shown here can be used in any tool or environment.
First-time Git setup
After installing Git, set your name and email in the terminal:
git config --global user.name "Your Name"
git config --global user.email "your.email@example.org"
This identifies your commits.
Check the configuration:
git config --global --list
Starting a repository
Move into your project folder:
cd path/to/music-edition-project
Initialize Git:
git init
Git now treats this folder as a repository. No files are tracked yet; that happens once you add and commit them.
Checking repository status
The most important command for beginners is:
git status
This shows:
- which files are new
- which files were changed
- which files are ready to be committed
Students should get used to running git status very often.
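The three cases above can be seen side by side in a small sketch. It uses a throwaway repository and invented file names, and the compact `--short` form of `git status`.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
# Placeholder identity, used only inside this demo repository.
git config user.name "Example Editor"
git config user.email "editor@example.org"

# A committed file that is edited afterwards: reported as changed.
echo "source,siglum" > metadata.csv
git add metadata.csv
git commit -q -m "Add metadata table"
echo "Manuscript A,A" >> metadata.csv

# A new file that has been staged: ready to be committed.
echo "Editorial notes" > notes.md
git add notes.md

# A new file Git has not been told about yet: untracked.
echo "scratch" > todo.txt

git status --short
```

In the short format, `??` marks an untracked file, `A` in the first column marks a newly staged file, and `M` in the second column marks a tracked file with edits that are not yet staged.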
Adding files
To prepare all current changes for a commit:
git add .
To add one specific file:
git add data/mei/motet_01.mei
Making a commit
After adding files, save a snapshot:
git commit -m "Corrected note durations in motet_01"
A good commit message is:
- short
- specific
- meaningful
Good examples:
- Corrected mensuration encoding in Kyrie
- Normalized staff labels in source B
- Fixed barline errors in movement 3
- Added editorial note on unclear accidentals
Bad examples:
- changes
- update
- stuff
- final
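When an editorial decision needs more explanation than fits in one short line, a commit can carry a subject and a longer body. The following is a minimal sketch in a throwaway repository; the file `kyrie.mei`, the measure number, and the rationale are invented. Passing `-m` twice makes the first text the subject and the second a separate paragraph.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
# Placeholder identity, used only inside this demo repository.
git config user.name "Example Editor"
git config user.email "editor@example.org"

echo '<accid accid="f"/>' > kyrie.mei
git add kyrie.mei
git commit -q \
  -m "Added editorial flat in Kyrie, measure 12" \
  -m "The accidental is unclear in the source; the reading follows the parallel passage."

# Subject lines only, one commit per line.
git --no-pager log --oneline
# Full message of the latest commit, subject and body.
git --no-pager log -1 --format=%B
```

The short subject keeps the history readable, while the body preserves the reasoning behind the decision.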
Connecting to GitHub
Create an empty repository on GitHub.
Then connect your local repository to it:
git remote add origin https://github.com/USERNAME/REPOSITORY.git
Check that the remote was added:
git remote -v
Uploading your work to GitHub
Make sure your main branch is called main (the first command renames the current branch if needed), then push it:
git branch -M main
git push -u origin main
After that, future uploads are usually just:
git push
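The connect-and-push steps can be rehearsed without a GitHub account. In the sketch below, a local bare repository stands in for the empty GitHub repository; this stand-in, the folder names, and the editor identity are assumptions made for the demo. With a real GitHub repository, only the address given to `git remote add` changes.

```shell
work="$(mktemp -d)"
# Stand-in for the empty repository on GitHub (assumption: local bare repo).
git init -q --bare "$work/remote.git"

git init -q "$work/edition"
cd "$work/edition"
# Placeholder identity, used only inside this demo repository.
git config user.name "Example Editor"
git config user.email "editor@example.org"

echo "Music edition project" > README.md
git add README.md
git commit -q -m "Add project README"

# Connect, name the branch main, and upload with tracking enabled.
git remote add origin "$work/remote.git"
git branch -M main
git push -q -u origin main

# The local branch now tracks origin/main, so a plain "git push" is enough later.
git --no-pager branch -vv
```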
Downloading changes from GitHub
Before you start working, download any new changes:
git pull
This is especially important if:
- you work on multiple computers
- a collaborator may have updated the repository
- you edited files on GitHub directly
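The multiple-computer case can be sketched with two folders that play the roles of two machines. Everything here is a stand-in invented for the demo: a local bare repository replaces GitHub, and the folders `office` and `home` replace the two computers.

```shell
work="$(mktemp -d)"
git init -q --bare "$work/remote.git"   # stand-in for GitHub

# "Office" computer: first commit, pushed to the remote.
git init -q "$work/office"
cd "$work/office"
git config user.name "Example Editor"
git config user.email "editor@example.org"
echo "Checked measures 1-10" > notes.md
git add notes.md
git commit -q -m "Add editorial notes"
git branch -M main
git remote add origin "$work/remote.git"
git push -q -u origin main

# "Home" computer: clone the repository once.
git clone -q -b main "$work/remote.git" "$work/home"

# Back at the office: more work is committed and pushed.
echo "Checked measures 11-20" >> notes.md
git commit -q -am "Extend editorial notes"
git push -q

# At home, before starting to work: pull brings in the new commit.
cd "$work/home"
git pull -q
cat notes.md
```

After the pull, the copy at home contains both lines of notes.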
A minimal daily workflow
git pull
git status
git add .
git commit -m "Describe the editorial change"
git push
This can be explained as:
- pull before work
- edit files
- status to inspect changes
- add to stage them
- commit to record them
- push to synchronize them
Collaboration basics
When multiple people work on one repository, Git can help organize collaboration.
A simple shared workflow:
- each editor pulls before starting
- each editor commits small changes regularly
- each editor writes clear commit messages
- each editor pushes finished work
- the team agrees on folder structure and editorial conventions
For slightly more advanced collaboration, use:
- branches
- pull requests
- code review or editorial review on GitHub
Branches for safer experimentation
Branches are useful when you want to try something without changing the stable version.
Example uses:
- testing a new encoding strategy
- revising one source separately
- trying automated clean-up on MEI files
- preparing a larger correction batch
Create a new branch:
git checkout -b revise-source-A
Work and commit as usual.
Later, the branch can be merged into the main branch.
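A minimal sketch of the whole cycle, from creating the branch to merging it back, might look as follows. It runs in a throwaway repository, and the file `source_A.txt` and its content are invented.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
# Placeholder identity, used only inside this demo repository.
git config user.name "Example Editor"
git config user.email "editor@example.org"

echo "Source A, draft" > source_A.txt
git add source_A.txt
git commit -q -m "Add source A"
git branch -M main

# Revise the source on its own branch.
git checkout -q -b revise-source-A
echo "Source A, revised" > source_A.txt
git commit -q -am "Revise source A"

# Back on main, the stable version is untouched.
git checkout -q main
cat source_A.txt

# Bring the revision into main.
git merge -q revise-source-A
cat source_A.txt
```

The first `cat` still prints the draft, the second prints the revised text, because the merge has brought the branch's commit into main.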
Merge conflicts
A merge conflict happens when Git cannot automatically combine two competing changes.
Example:
- two people edit the same passage in the same MEI file
- both commit their changes
- Git does not know which version should win
This is normal and not a disaster.
The usual solution is:
- inspect the conflicting file
- decide which reading to keep
- edit the file manually
- save the corrected version
- commit the resolution
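The steps above can be rehearsed safely in a throwaway repository. In this sketch, two branches stand in for two editors who disagree about one note; the file name, the readings, and the choice to keep D are all invented for the demo.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
# Placeholder identity, used only inside this demo repository.
git config user.name "Example Editor"
git config user.email "editor@example.org"

echo '<note pname="c" oct="4"/>' > motet_01.mei
git add motet_01.mei
git commit -q -m "Add note"
git branch -M main

# Editor 1, on a branch, reads the note as D.
git checkout -q -b editor-1
echo '<note pname="d" oct="4"/>' > motet_01.mei
git commit -q -am "Read note as D"

# Editor 2, on main, reads the same note as E.
git checkout -q main
echo '<note pname="e" oct="4"/>' > motet_01.mei
git commit -q -am "Read note as E"

# The merge stops and reports a conflict in motet_01.mei.
git merge editor-1 || true
cat motet_01.mei   # both readings, separated by conflict markers

# Decide on a reading, write the file without markers, and commit.
echo '<note pname="d" oct="4"/>' > motet_01.mei
git add motet_01.mei
git commit -q -m "Resolve conflict: keep reading D"
```

During the conflict, Git writes both readings into the file between lines beginning with `<<<<<<<`, `=======`, and `>>>>>>>`. Resolving means replacing that whole block with the reading you want to keep.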
For beginners, the best prevention is:
- communicate who edits which files
- make small commits
- pull frequently
- avoid long unsynchronized work sessions
GitHub features that are useful for editorial projects
Useful GitHub features include:
Issues
Use issues to track:
- uncertain readings
- missing metadata
- files needing review
- recurring OMR errors
- editorial questions
Pull requests
A pull request lets someone propose changes before merging them.
This is useful for:
- checking corrections before accepting them
- discussing editorial choices
- reviewing larger changes
History and blame
GitHub also allows you to inspect file history and see who last changed a line.
This can be very helpful when asking:
- when was this corrected?
- who changed this encoding?
- why was this element added?
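The same questions can be asked on the command line with `git log` and `git blame`. The sketch below uses a throwaway repository with two invented editors, "Editor One" and "Editor Two", and a hypothetical two-line MEI fragment.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q

# Editor One adds a clef and a first note.
printf '%s\n' '<clef shape="G" line="2"/>' '<note pname="c" oct="4"/>' > motet_01.mei
git add motet_01.mei
git -c user.name="Editor One" -c user.email="one@example.org" \
  commit -q -m "Add clef and first note"

# Editor Two corrects the pitch on the second line.
printf '%s\n' '<clef shape="G" line="2"/>' '<note pname="d" oct="4"/>' > motet_01.mei
git -c user.name="Editor Two" -c user.email="two@example.org" \
  commit -q -am "Corrected pitch of first note"

# When was this file changed, and with which messages?
git --no-pager log --oneline -- motet_01.mei
# Who last changed each line?
git --no-pager blame motet_01.mei
```

The blame output lists Editor One for the clef line and Editor Two for the note line, each with the commit that last touched it.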
Glossary
| Term | Meaning |
|---|---|
| Git | Version control system |
| GitHub | Online platform for hosting Git repositories |
| Repository | A project folder tracked by Git |
| Commit | A saved snapshot of changes |
| Push | Upload local commits to GitHub |
| Pull | Download changes from GitHub |
| Branch | A separate line of development |
| Merge | Combine changes from different branches |
| Conflict | A situation where Git cannot automatically combine changes |