Introduction to Git and GitHub for Musicology Editorial Work

In digital musicology, we often work with files that change over time:

  • digitally encoded files (MEI, MusicXML, MIDI, etc.)
  • metadata tables
  • documentation
  • editorial notes
  • scripts for processing musical sources

A common problem is that files get copied, renamed, emailed around, or stored as:

  • final.xml
  • final_revised.xml
  • final_revised2.xml
  • final_REAL_final.xml

This quickly becomes confusing.

Git helps us keep a structured history of changes. GitHub makes it easier to synchronize work between computers and collaborate with others.

For musicological editorial work, this is especially useful because:

  • we need to track who changed what
  • we need to document editorial decisions
  • we often work on the same corpus over long periods
  • we need a safe way to synchronize data across machines
  • we may want to collaborate on corrections

After this introduction, students should be able to:

  • explain what Git is
  • understand the difference between Git and GitHub
  • create a local Git repository
  • save changes with meaningful commit messages
  • connect a local repository to GitHub
  • synchronize editorial work between computers
  • understand a basic collaboration workflow for MEI and OMR (optical music recognition) data
  • avoid common mistakes when working with Git

Git is a version control system.

This means it records the history of a folder and its files over time.

With Git, you can:

  • save snapshots of your work
  • see what changed between two versions
  • return to an earlier state
  • work on new ideas without breaking the main version
  • merge work from different collaborators

Git is not only for programmers. It is useful for any research project where files evolve over time.

GitHub is a web platform built around Git.

Git itself works on your computer. GitHub adds:

  • online hosting of repositories
  • synchronization between computers
  • collaboration tools
  • issue tracking
  • pull requests for reviewing changes
  • backup in the cloud

A useful short formula is:

  • Git = version control
  • GitHub = online hosting and collaboration for Git repositories

A repository (or "repo") is a project folder managed by Git.

It contains:

  • your files
  • a hidden Git history
  • information about changes over time

A commit is a saved snapshot of your work.

A commit should record a meaningful step, for example:

  • corrected clef encoding in measure 1
  • normalized staff labels in manuscript A
  • added measure numbers to MEI files
  • fixed pitch errors in soprano part

A remote is an online copy of your repository, for example on GitHub.

This lets you:

  • upload your local work
  • download changes made elsewhere
  • keep multiple computers synchronized

The two basic commands for this are:

  • push = send your local commits to GitHub
  • pull = download changes from GitHub to your local machine

A branch is a parallel line of work.

Branches are useful when:

  • testing a new editorial strategy
  • correcting one source separately
  • working on a feature without changing the stable main version

Git works especially well when files are text-based.

MEI files are XML, so Git can track line-by-line changes very well.

This means Git can help you see:

  • where measures were changed
  • where attributes were added or removed
  • where note encodings were corrected
  • which editor made which change
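
As a minimal sketch of what such a line-by-line comparison looks like, the commands below build a throwaway repository in a temporary folder and change one pitch in a tiny MEI fragment. The file name, the fragment, and the editor identity are invented for illustration and are not part of any real edition.

```shell
# Throwaway demo repository; nothing here touches your real project.
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Editor A"              # hypothetical identity, stored in this demo repo only
git config user.email "editor.a@example.org"

# First version: the note is encoded as c.
printf '<measure n="1">\n  <note pname="c" oct="4" dur="4"/>\n</measure>\n' > motet_01.mei
git add motet_01.mei
git commit -q -m "Added measure 1 of motet_01"

# Correction: the pitch should be d, not c.
printf '<measure n="1">\n  <note pname="d" oct="4" dur="4"/>\n</measure>\n' > motet_01.mei

# Show the uncommitted change line by line.
diff_output=$(git diff -- motet_01.mei)
printf '%s\n' "$diff_output"
```

In the output, the line starting with - is the old encoding and the line starting with + is the new one. The unchanged measure lines appear as context.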

Git works best with plain text files:

  • .mei
  • .xml
  • .csv
  • .txt
  • .md
  • code files such as .py

Git works less well with large binary files such as:

  • .pdf
  • .docx
  • .png
  • .jpg
  • .tif

This does not mean you cannot store such files in a repository, but:

  • changes are harder to compare
  • repositories become larger
  • collaboration becomes less efficient
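
To illustrate why binary changes are harder to compare, the sketch below commits a small stand-in for a scan and then modifies it. The file is not a real TIFF image; it is only a few bytes containing a NUL byte, which is enough for Git to treat it as binary. The file name is hypothetical.

```shell
# Throwaway demo repository.
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Editor A"              # hypothetical identity for this demo only
git config user.email "editor.a@example.org"

# Stand-in for a scan: bytes with an embedded NUL, so Git classifies the file as binary.
printf 'scan\000version1' > page_001.tif
git add page_001.tif
git commit -q -m "Added scan of page 1"

# Replace the "scan" with a different version.
printf 'scan\000version2' > page_001.tif

# Git can tell that the file changed, but not show what changed inside it.
binary_diff=$(git diff -- page_001.tif)
printf '%s\n' "$binary_diff"
```

Instead of added and removed lines, Git only reports that the two binary files differ.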

What is needed:

  • Git installed on your computer
  • a GitHub account
  • optionally GitHub Desktop, if you prefer a graphical interface

For long-term understanding, command-line Git is better. For quick adoption, GitHub Desktop may be easier.

mei-friend has a sophisticated GitHub integration, which we will probably use most of the time, but Git commands work the same way in every tool and environment.

After installing Git, set your name and email in the terminal:

git config --global user.name "Your Name"
git config --global user.email "your.email@example.org"

This identifies your commits.

Check the configuration:

git config --global --list

Move into your project folder:

cd path/to/music-edition-project

Initialize Git:

git init

The folder is now a Git repository. Git does not track individual files yet; that happens once you add and commit them.
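
A quick way to see what initialization did, sketched in a temporary folder so that no real project is affected. The folder layout is an assumption made for this example.

```shell
# Hypothetical project folder inside a temporary directory.
project="$(mktemp -d)/music-edition-project"
mkdir -p "$project/data/mei"
cd "$project"

git init -q

# The hidden .git folder holds the history; your own files stay where they are.
ls -a

# Ask Git whether the current folder belongs to a repository.
inside=$(git rev-parse --is-inside-work-tree)
printf '%s\n' "$inside"
```

Deleting the hidden .git folder would delete the recorded history, so it should be left alone.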

The most important command for beginners is:

git status

This shows:

  • which files are new
  • which files were changed
  • which files are ready to be committed

Students should get used to running git status very often.
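
The three cases above can be tried out in a throwaway repository. This sketch uses the compact form git status --short; the file names and the editor identity are invented for the example.

```shell
# Throwaway demo repository.
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Editor A"              # hypothetical identity for this demo only
git config user.email "editor.a@example.org"

# One file that is already committed.
printf '<mei/>\n' > kyrie.mei
git add kyrie.mei
git commit -q -m "Added Kyrie encoding"

printf '<mei><!-- corrected --></mei>\n' > kyrie.mei   # changed, not yet staged
printf 'Editorial notes\n' > notes.txt                 # new, still unknown to Git
printf '<mei/>\n' > gloria.mei
git add gloria.mei                                     # new and ready to be committed

status_output=$(git status --short)
printf '%s\n' "$status_output"
```

In the short format, A marks a newly staged file, M a modified file, and ?? a file Git does not track yet. Plain git status gives the same information in full sentences.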

To prepare all current changes for a commit:

git add .

To add one specific file:

git add data/mei/motet_01.mei

After adding files, save a snapshot:

git commit -m "Corrected note durations in motet_01"

A good commit message is:

  • short
  • specific
  • meaningful

Good examples:

  • Corrected mensuration encoding in Kyrie
  • Normalized staff labels in source B
  • Fixed barline errors in movement 3
  • Added editorial note on unclear accidentals

Bad examples:

  • changes
  • update
  • stuff
  • final
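
Commit messages matter most when you read the history later. The sketch below records two commits in a throwaway repository and lists them with git log --oneline. File contents and the editor identity are placeholders.

```shell
# Throwaway demo repository.
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Editor A"              # hypothetical identity for this demo only
git config user.email "editor.a@example.org"

printf 'first version\n' > kyrie.mei
git add kyrie.mei
git commit -q -m "Added Kyrie encoding from source B"

printf 'second version\n' > kyrie.mei
git add kyrie.mei
git commit -q -m "Corrected mensuration encoding in Kyrie"

# One line per commit, newest first: abbreviated ID followed by the message.
log_output=$(git log --oneline)
printf '%s\n' "$log_output"
```

With messages like these, the history reads as a list of editorial steps. With messages like "update" or "final", it would tell you nothing.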

Create an empty repository on GitHub.

Then connect your local repository to it:

git remote add origin https://github.com/USERNAME/REPOSITORY.git

Check that the remote was added:

git remote -v

Name the current branch main and push it for the first time (this requires at least one commit):

git branch -M main
git push -u origin main

After that, future uploads are usually just:

git push
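
The same steps can be rehearsed without a GitHub account. In this sketch a local bare repository stands in for the empty GitHub repository. That substitution, the folder names, and the editor identity are assumptions for practice only; with GitHub you would use the https URL shown above instead of a local path.

```shell
work=$(mktemp -d)

# Stand-in for the empty repository you would create on GitHub.
git init -q --bare "$work/github-stand-in.git"

# Local repository with one commit.
git init -q "$work/edition"
cd "$work/edition"
git config user.name "Editor A"              # hypothetical identity for this demo only
git config user.email "editor.a@example.org"
printf '<mei/>\n' > motet_01.mei
git add motet_01.mei
git commit -q -m "Added motet_01 encoding"

# Connect, check, and push, as described above.
git remote add origin "$work/github-stand-in.git"
git remote -v
git branch -M main
git push -q -u origin main

# The stand-in remote now contains the commit.
remote_log=$(git --git-dir="$work/github-stand-in.git" log --oneline main)
printf '%s\n' "$remote_log"
```

The -u option records origin/main as the upstream of the local main branch, which is why a plain git push is enough afterwards.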

Before you start working, download any new changes:

git pull

This is especially important if:

  • you work on multiple computers
  • a collaborator may have updated the repository
  • you edited files on GitHub directly
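
A typical working session, with the actual editing done between git pull and git status, looks like this:

git pull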
git status
git add .
git commit -m "Describe the editorial change"
git push

This can be explained as:

  • pull before work
  • edit files
  • status to inspect changes
  • add to stage them
  • commit to record them
  • push to synchronize them
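
The cycle can be tried end to end with two folders playing the role of two computers. This is a practice sketch under stated assumptions: a local bare repository replaces GitHub, and the folder names office and laptop, the file, and the identities are invented.

```shell
work=$(mktemp -d)

# Stand-in for the GitHub repository, with main as its default branch.
git init -q --bare "$work/shared.git"
git -C "$work/shared.git" symbolic-ref HEAD refs/heads/main

# "Office computer": first commit and first push.
git init -q "$work/office"
cd "$work/office"
git config user.name "Editor A"
git config user.email "editor.a@example.org"
printf '<note pname="c"/>\n' > kyrie.mei
git add .
git commit -q -m "Added Kyrie encoding"
git branch -M main
git remote add origin "$work/shared.git"
git push -q -u origin main

# "Laptop": get a copy, then run the cycle.
git clone -q "$work/shared.git" "$work/laptop"
cd "$work/laptop"
git config user.name "Editor A"
git config user.email "editor.a@example.org"
git pull -q                                        # pull before work
printf '<note pname="d"/>\n' > kyrie.mei           # edit files
git status --short                                 # status to inspect changes
git add .                                          # add to stage them
git commit -q -m "Corrected first pitch in Kyrie"  # commit to record them
git push -q                                        # push to synchronize them

# Back at the office: pulling brings in the laptop's correction.
cd "$work/office"
git pull -q
office_copy=$(cat kyrie.mei)
printf '%s\n' "$office_copy"
```

After the final pull, both folders contain the corrected file and the same history.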

When multiple people work on one repository, Git can help organize collaboration.

A simple shared workflow:

  • each editor pulls before starting
  • each editor commits small changes regularly
  • each editor writes clear commit messages
  • each editor pushes finished work
  • the team agrees on folder structure and editorial conventions

For slightly more advanced collaboration, use:

  • branches
  • pull requests
  • code review or editorial review on GitHub

Branches are useful when you want to try something without changing the stable version.

Example uses:

  • testing a new encoding strategy
  • revising one source separately
  • trying automated clean-up on MEI files
  • preparing a larger correction batch

Create a new branch:

git checkout -b revise-source-A

Work and commit as usual.

Later, the branch can be merged into the main branch.
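
A minimal sketch of the whole branch life cycle, from creation to merge, in a throwaway repository. The file name, its contents, and the identity are placeholders. The merge here is the simplest case, where main has not changed in the meantime.

```shell
# Throwaway demo repository.
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Editor A"              # hypothetical identity for this demo only
git config user.email "editor.a@example.org"

printf 'source A, first pass\n' > source_A.mei
git add source_A.mei
git commit -q -m "Added first pass of source A"
git branch -M main

# Create the branch and work on it.
git checkout -q -b revise-source-A
printf 'source A, revised\n' > source_A.mei
git add source_A.mei
git commit -q -m "Revised source A"

# Switch back: main still has the stable version.
git checkout -q main
main_before=$(cat source_A.mei)

# Merge the branch into main.
git merge -q revise-source-A
main_after=$(cat source_A.mei)

printf 'before merge: %s\nafter merge:  %s\n' "$main_before" "$main_after"
```

Until the merge, the main branch is unaffected by anything committed on revise-source-A.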

A merge conflict happens when Git cannot automatically combine two competing changes.

Example:

  • two people edit the same passage in the same MEI file
  • both commit their changes
  • Git does not know which version should win

This is normal and not a disaster.

The usual solution is:

  1. inspect the conflicting file
  2. decide which reading to keep
  3. edit the file manually
  4. save the corrected version
  5. commit the resolution
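
The five steps can be practiced safely by provoking a conflict on purpose. In this sketch two branches stand in for two editors who read the same note differently. The branch name editor-b, the file, and the readings are invented, and the "decision" in favor of d is arbitrary.

```shell
# Throwaway demo repository.
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Editor A"              # hypothetical identity for this demo only
git config user.email "editor.a@example.org"

printf '<note pname="c"/>\n' > motet_01.mei
git add motet_01.mei
git commit -q -m "Added first note of motet_01"
git branch -M main

# Editor B changes the note on a separate branch.
git checkout -q -b editor-b
printf '<note pname="d"/>\n' > motet_01.mei
git commit -q -a -m "Editor B reads d"

# Editor A changes the same line on main.
git checkout -q main
printf '<note pname="e"/>\n' > motet_01.mei
git commit -q -a -m "Editor A reads e"

# Git cannot combine the two readings and stops.
if git merge editor-b >/dev/null 2>&1; then
  merge_result="merged"
else
  merge_result="conflict"
fi

# Step 1: inspect the file; Git has marked both readings with <<<<<<< ======= >>>>>>>.
conflicted=$(cat motet_01.mei)
printf '%s\n' "$conflicted"

# Steps 2 to 4: decide on a reading, edit the file, save it (here: keep d).
printf '<note pname="d"/>\n' > motet_01.mei

# Step 5: stage the resolved file and commit the resolution.
git add motet_01.mei
git commit -q -m "Resolved conflicting readings in motet_01"
```

In real work, steps 2 to 4 happen in your editor: you delete the marker lines and leave only the reading you decided on.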

For beginners, the best prevention is:

  • communicate who edits which files
  • make small commits
  • pull frequently
  • avoid long unsynchronized work sessions

GitHub offers several features that are useful for editorial work: issues, pull requests, and file history.

Use issues to track:

  • uncertain readings
  • missing metadata
  • files needing review
  • recurring OMR errors
  • editorial questions

A pull request lets someone propose changes before merging them.

This is useful for:

  • checking corrections before accepting them
  • discussing editorial choices
  • reviewing larger changes

GitHub also allows you to inspect file history and see who last changed a line.

This can be very helpful when asking:

  • when was this corrected?
  • who changed this encoding?
  • why was this element added?
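
The same questions can be answered on the command line with git log and git blame. The sketch below is a throwaway example with two invented editors; on GitHub, the file history and blame views show comparable information.

```shell
# Throwaway demo repository.
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.email "editors@example.org"  # hypothetical address for this demo only

# Editor A adds two notes.
git config user.name "Editor A"
printf '<note pname="c"/>\n<note pname="e"/>\n' > motet_01.mei
git add motet_01.mei
git commit -q -m "Added first two notes"

# Editor B corrects the first one.
git config user.name "Editor B"
printf '<note pname="d"/>\n<note pname="e"/>\n' > motet_01.mei
git commit -q -a -m "Corrected first pitch after checking source A"

# When and why was this file changed? One line per commit that touched it.
file_history=$(git log --oneline -- motet_01.mei)
printf '%s\n' "$file_history"

# Who last changed line 1, and who last changed line 2?
blame_line1=$(git blame -L 1,1 -- motet_01.mei)
blame_line2=$(git blame -L 2,2 -- motet_01.mei)
printf '%s\n%s\n' "$blame_line1" "$blame_line2"
```

The "why" only appears if the commit message states it, which is one more reason to write specific messages.
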

Term          Meaning
Git           Version control system
GitHub        Online platform for hosting Git repositories
Repository    A project folder tracked by Git
Commit        A saved snapshot of changes
Push          Upload local commits to GitHub
Pull          Download changes from GitHub
Branch        A separate line of development
Merge         Combine changes from different branches
Conflict      A situation where Git cannot automatically combine changes

  • en/git.txt
  • Last modified: 2026/04/01 08:33
  • by egorpoly