====== Introduction to Git and GitHub for Musicology Editorial Work ======
===== Why this matters =====
In digital musicology, we often work with files that change over time:
* digitally encoded files (MEI, MusicXML, MIDI, etc.)
* metadata tables
* documentation
* editorial notes
* scripts for processing musical sources
A common problem is that files get copied, renamed, emailed around, or stored as:
* ''final.xml''
* ''final_revised.xml''
* ''final_revised2.xml''
* ''final_REAL_final.xml''
This quickly becomes confusing.
**Git** helps us keep a structured history of changes.
**GitHub** makes it easier to synchronize work between computers and collaborate with others.
For musicological editorial work, this is especially useful because:
* we need to track who changed what
* we need to document editorial decisions
* we often work on the same corpus over long periods
* we need a safe way to synchronize data across machines
* we may want to collaborate on corrections
===== Learning goals =====
After this introduction, students should be able to:
* explain what Git is
* understand the difference between Git and GitHub
* create a local Git repository
* save changes with meaningful commit messages
* connect a local repository to GitHub
* synchronize editorial work between computers
* understand a basic collaboration workflow for MEI and OMR (optical music recognition) data
* avoid common mistakes when working with Git
===== What is Git? =====
Git is a **version control system**.
This means it records the history of a folder and its files over time.
With Git, you can:
* save snapshots of your work
* see what changed between two versions
* return to an earlier state
* work on new ideas without breaking the main version
* merge work from different collaborators
Git is **not** only for programmers.
It is useful for any research project where files evolve over time.
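To make "snapshots" and "returning to an earlier state" concrete, here is a minimal sketch that runs entirely in a throwaway temporary folder. The file name ''kyrie.mei'', the one-line note fragments, and the editor identity are made up for illustration; the individual commands are explained in the sections that follow.

```shell
# Throwaway repository in a temporary folder (all names are illustrative).
work=$(mktemp -d)
cd "$work"
git init -q edition
cd edition
git config user.name "Example Editor"        # identity for this demo only
git config user.email "editor@example.org"

# Snapshot 1: a quarter note.
echo '<note pname="g" oct="4" dur="4"/>' > kyrie.mei
git add kyrie.mei
git commit -q -m "Add first note of Kyrie"

# Snapshot 2: an experimental dotted reading.
echo '<note pname="g" oct="4" dur="8" dots="1"/>' > kyrie.mei
git add kyrie.mei
git commit -q -m "Try dotted reading in Kyrie"

# See what changed between the two snapshots.
git diff HEAD~1 HEAD

# Return the file to its state one snapshot earlier.
git checkout HEAD~1 -- kyrie.mei
restored=$(cat kyrie.mei)
echo "$restored"
```

Both snapshots remain in the history; only the working copy of the file goes back to the earlier reading.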
===== What is GitHub? =====
GitHub is a web platform built around Git.
Git itself works on your computer.
GitHub adds:
* online hosting of repositories
* synchronization between computers
* collaboration tools
* issue tracking
* pull requests for reviewing changes
* backup in the cloud
A useful short formula is:
* **Git = version control**
* **GitHub = online hosting and collaboration for Git repositories**
===== Core concepts =====
==== Repository ====
A **repository** (or "repo") is a project folder managed by Git.
It contains:
* your files
* a hidden Git history
* information about changes over time
==== Commit ====
A **commit** is a saved snapshot of your work.
A commit should record a meaningful step, for example:
* corrected clef encoding in measure 1
* normalized staff labels in manuscript A
* added measure numbers to MEI files
* fixed pitch errors in soprano part
==== GitHub remote ====
A **remote** is a copy of your repository stored somewhere else, usually online, for example on GitHub.
This lets you:
* upload your local work
* download changes made elsewhere
* keep multiple computers synchronized
==== Push and pull ====
* **push** = send your local commits to GitHub
* **pull** = download changes from GitHub and merge them into your local branch
==== Branch ====
A **branch** is a parallel line of work.
Branches are useful when:
* testing a new editorial strategy
* correcting one source separately
* working on a feature without changing the stable main version
===== Why Git is useful for OMR and MEI editorial work =====
Git works especially well when files are text-based.
MEI files are XML, which is plain text, so Git can show exactly which lines changed between two versions.
This means Git can help you see:
* where measures were changed
* where attributes were added or removed
* where note encodings were corrected
* which editor made which change
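The following sketch shows what such a line-by-line comparison can look like. It uses a simplified fragment rather than a complete, valid MEI document, and the file name, note values, and editor identity are assumptions for the example.

```shell
work=$(mktemp -d)
cd "$work"
git init -q edition
cd edition
git config user.name "Example Editor"        # identity for this demo only
git config user.email "editor@example.org"

# A simplified fragment, not a complete MEI file.
cat > motet_01.mei <<'EOF'
<measure n="1">
  <note pname="g" oct="4" dur="4"/>
  <note pname="a" oct="4" dur="4"/>
</measure>
EOF
git add motet_01.mei
git commit -q -m "Add first measure of motet_01"

# Correct the duration of the second note (quarter -> half).
sed 's/pname="a" oct="4" dur="4"/pname="a" oct="4" dur="2"/' motet_01.mei > tmp.mei
mv tmp.mei motet_01.mei

# Lines starting with "-" were removed, lines starting with "+" were added.
changes=$(git diff)
echo "$changes"
```

Because each note sits on its own line, the diff isolates exactly the corrected note. Files that keep many elements on one long line produce much less readable diffs.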
===== Important limitation: binary files =====
Git works best with **plain text files**:
* ''.mei''
* ''.xml''
* ''.csv''
* ''.txt''
* ''.md''
* code files such as ''.py''
Git works less well with large binary files such as:
* ''.pdf''
* ''.docx''
* ''.png''
* ''.jpg''
* ''.tif''
This does not mean you cannot store such files in a repository, but:
* changes are harder to compare
* repositories become larger
* collaboration becomes less efficient
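A small experiment illustrates the first point. The file ''page_001.tif'' below is only a stand-in: a few bytes containing a NUL byte, which is enough for Git to treat it as binary. It is not a real image, and all names are made up for the demonstration.

```shell
work=$(mktemp -d)
cd "$work"
git init -q edition
cd edition
git config user.name "Example Editor"        # identity for this demo only
git config user.email "editor@example.org"

# Stand-in for a scan: a few bytes including a NUL byte.
printf 'II*\000first-scan' > page_001.tif
git add page_001.tif
git commit -q -m "Add facsimile page 1"

# "Replace" the scan with a different version.
printf 'II*\000second-scan' > page_001.tif

# Git can only report that the file changed, not what changed inside it.
report=$(git diff)
echo "$report"
```

The output contains a line saying that the binary files differ, without any detail about the content.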
===== Installing Git =====
What is needed:
* Git installed on your computer
* a GitHub account
* optionally GitHub Desktop, if you prefer a graphical interface
For long-term understanding, learning Git on the command line is the better investment.
For quick adoption, GitHub Desktop may be easier.
mei-friend has a sophisticated GitHub integration, which we will probably use most of the time, but Git commands work the same way in every tool and environment.
===== First-time Git setup =====
After installing Git, set your name and email in the terminal:
git config --global user.name "Your Name"
git config --global user.email "your.email@example.org"
This identifies your commits.
Check the configuration:
git config --global --list
===== Starting a repository =====
Move into your project folder:
cd path/to/music-edition-project
Initialize Git:
git init
The folder is now a Git repository. Git does not track any files yet; that happens once you add and commit them.
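As a quick check of what ''git init'' actually does, the sketch below initializes a repository in a temporary folder and lists its contents. The temporary location is an assumption for the demonstration; in practice you would use your real project folder.

```shell
# Work in a throwaway folder so nothing on your disk is touched.
work=$(mktemp -d)
mkdir "$work/music-edition-project"
cd "$work/music-edition-project"

git init -q

# The hidden .git folder holds the entire history; your own files stay where they are.
ls -a

# A fresh repository has no commits and tracks no files yet.
git status
```

Deleting the ''.git'' folder would delete the history, so it is best left alone.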
===== Checking repository status =====
The most important command for beginners is:
git status
This shows:
* which files are new
* which files were changed
* which files are ready to be committed
Students should get used to running ''git status'' very often.
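For a more compact overview, ''git status --short'' prints one line per file. The sketch below sets up a small demo repository (file names and contents are invented) with one modified file and one new file.

```shell
work=$(mktemp -d)
cd "$work"
git init -q edition
cd edition
git config user.name "Example Editor"        # identity for this demo only
git config user.email "editor@example.org"

echo '<note pname="c" oct="4" dur="4"/>' > kyrie.mei
git add kyrie.mei
git commit -q -m "Add Kyrie encoding"

# Change the tracked file and create a new, untracked one.
echo '<note pname="d" oct="4" dur="4"/>' >> kyrie.mei
echo "Check accidentals in bar 12" > notes.txt

# Short format: " M" = modified but not yet staged, "??" = untracked.
status=$(git status --short)
echo "$status"
```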
===== Adding files =====
To prepare all current changes for a commit:
git add .
To add one specific file:
git add data/mei/motet_01.mei
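The difference between adding everything and adding one file can be checked before committing. In this sketch, which assumes two placeholder files in the ''data/mei'' folder, only one file is staged, and ''git diff --staged --name-only'' lists what the next commit would contain.

```shell
work=$(mktemp -d)
cd "$work"
git init -q edition
cd edition

# Two placeholder files (contents are not valid MEI).
mkdir -p data/mei
echo '<mei/>' > data/mei/motet_01.mei
echo '<mei/>' > data/mei/motet_02.mei

# Stage only the first file; the second stays out of the next commit.
git add data/mei/motet_01.mei

# List the files that are staged for the next commit.
staged=$(git diff --staged --name-only)
echo "Staged: $staged"

# motet_02.mei still appears as untracked here.
git status --short
```

Staging files selectively is a simple way to keep unrelated corrections in separate commits.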
===== Making a commit =====
After adding files, save a snapshot:
git commit -m "Corrected note durations in motet_01"
A good commit message is:
* short
* specific
* meaningful
Good examples:
* ''Corrected mensuration encoding in Kyrie''
* ''Normalized staff labels in source B''
* ''Fixed barline errors in movement 3''
* ''Added editorial note on unclear accidentals''
Bad examples:
* ''changes''
* ''update''
* ''stuff''
* ''final''
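Good messages pay off when you read the history later. The sketch below makes two commits in a demo repository (the file contents are invented) and then prints the history with ''git log --oneline''.

```shell
work=$(mktemp -d)
cd "$work"
git init -q edition
cd edition
git config user.name "Example Editor"        # identity for this demo only
git config user.email "editor@example.org"

echo '<note pname="c" oct="4" dur="4"/>' > motet_01.mei
git add motet_01.mei
git commit -q -m "Add initial encoding of motet_01"

echo '<note pname="c" oct="4" dur="2" dots="1"/>' > motet_01.mei
git add motet_01.mei
git commit -q -m "Corrected note durations in motet_01"

# One line per commit, newest first: short ID plus message.
git log --oneline
```

With messages like ''update'' or ''stuff'', the same listing would tell you nothing about what happened to the edition.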
===== Connecting to GitHub =====
Create an empty repository on GitHub, without a README, license, or ''.gitignore'', so that your first push is not rejected.
Then connect your local repository to it:
git remote add origin https://github.com/USERNAME/REPOSITORY.git
Check that the remote was added:
git remote -v
===== Uploading your work to GitHub =====
Name your local branch ''main'' (the default branch name on GitHub) and upload it:
git branch -M main
git push -u origin main
The ''-u'' option links your local ''main'' to the branch on GitHub, so future uploads are usually just:
git push
===== Downloading changes from GitHub =====
Before you start working, download any new changes:
git pull
This is especially important if:
* you work on multiple computers
* a collaborator may have updated the repository
* you edited files on GitHub directly
===== A minimal daily workflow =====
git pull
git status
git add .
git commit -m "Describe the editorial change"
git push
This can be explained as:
* ''pull'' before work
* edit files
* ''status'' to inspect changes
* ''add'' to stage them
* ''commit'' to record them
* ''push'' to synchronize them
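The whole cycle can be rehearsed without a GitHub account. In the sketch below, a local bare repository called ''hub.git'' stands in for GitHub, and two clones called ''office'' and ''laptop'' stand in for two computers. All of these names are assumptions for the demonstration.

```shell
work=$(mktemp -d)
cd "$work"

# A bare repository plays the role of GitHub.
git init -q --bare hub.git

# Computer 1: first commit and first push.
git clone -q hub.git office
cd office
git config user.name "Example Editor"
git config user.email "editor@example.org"
echo '<note pname="c" oct="4" dur="4"/>' > kyrie.mei
git add .
git commit -q -m "Add Kyrie encoding"
git branch -M main
git push -q -u origin main
cd ..

# Computer 2: clone once, then follow the daily workflow.
git clone -q -b main hub.git laptop
cd laptop
git config user.name "Example Editor"
git config user.email "editor@example.org"
git pull -q                                   # pull before work
echo '<note pname="d" oct="4" dur="4"/>' >> kyrie.mei   # edit files
git status --short                            # inspect changes
git add .                                     # stage
git commit -q -m "Add second note to Kyrie"   # record
git push -q                                   # synchronize
cd ..

# Back on computer 1: pulling brings in the work done on computer 2.
cd office
git pull -q
cat kyrie.mei
```

After the final pull, both clones contain the same two notes and the same history.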
===== Collaboration basics =====
When multiple people work on one repository, Git can help organize collaboration.
A simple shared workflow:
* each editor pulls before starting
* each editor commits small changes regularly
* each editor writes clear commit messages
* each editor pushes finished work
* the team agrees on folder structure and editorial conventions
For slightly more advanced collaboration, use:
* branches
* pull requests
* code review or editorial review on GitHub
===== Branches for safer experimentation =====
Branches are useful when you want to try something without changing the stable version.
Example uses:
* testing a new encoding strategy
* revising one source separately
* trying automated clean-up on MEI files
* preparing a larger correction batch
Create a new branch:
git checkout -b revise-source-A
Work and commit as usual.
Later, the branch can be merged into the main branch.
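A minimal sketch of the full branch cycle, from creating the branch to merging it back, could look like this. The branch name follows the example above; the file name and contents are invented.

```shell
work=$(mktemp -d)
cd "$work"
git init -q edition
cd edition
git config user.name "Example Editor"        # identity for this demo only
git config user.email "editor@example.org"

echo '<note pname="c" oct="4" dur="4"/>' > source_A.mei
git add source_A.mei
git commit -q -m "Add source A encoding"
git branch -M main

# Create the branch and switch to it.
git checkout -q -b revise-source-A
echo '<note pname="d" oct="4" dur="2"/>' >> source_A.mei
git commit -q -a -m "Add second note to source A"

# The main branch is untouched until the merge.
git checkout -q main
git merge -q revise-source-A

# List the branches whose work is already contained in main.
git branch --merged main
```

Recent Git versions also offer ''git switch -c'' as an alternative to ''git checkout -b'' for creating a branch.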
===== Merge conflicts =====
A **merge conflict** happens when Git cannot automatically combine two competing changes.
Example:
* two people edit the same passage in the same MEI file
* both commit their changes
* Git does not know which version should win
This is normal and not a disaster.
The usual solution is:
- inspect the conflicting file
- decide which reading to keep
- edit the file manually and remove the conflict markers
- save the corrected version and stage it with ''git add''
- commit the resolution
For beginners, the best prevention is:
* communicate who edits which files
* make small commits
* pull frequently
* avoid long unsynchronized work sessions
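To see a conflict in a safe setting, the sketch below provokes one on purpose. Two hypothetical editors change the same line of ''kyrie.mei'', one on a branch called ''editor-1'' and one on ''main''. The names, the readings, and the final resolution are all invented for the example.

```shell
work=$(mktemp -d)
cd "$work"
git init -q edition
cd edition
git config user.name "Example Editor"        # identity for this demo only
git config user.email "editor@example.org"

echo '<note pname="f" oct="4" dur="4"/>' > kyrie.mei
git add kyrie.mei
git commit -q -m "Add disputed note"
git branch -M main

# Editor 1 (on a branch) reads the note as F sharp.
git checkout -q -b editor-1
echo '<note pname="f" oct="4" dur="4" accid="s"/>' > kyrie.mei
git commit -q -a -m "Read note as F sharp"

# Editor 2 (on main) reads the same note as a dotted half note.
git checkout -q main
echo '<note pname="f" oct="4" dur="2" dots="1"/>' > kyrie.mei
git commit -q -a -m "Read note as dotted half note"

# Both changed the same line, so the merge stops with a conflict.
git merge editor-1 || echo "Merge stopped: conflict in kyrie.mei"

# The file now holds both readings between <<<<<<<, ======= and >>>>>>> markers.
cat kyrie.mei

# Resolve: write the agreed reading, stage the file, commit the resolution.
echo '<note pname="f" oct="4" dur="2" dots="1" accid="s"/>' > kyrie.mei
git add kyrie.mei
git commit -q -m "Resolve conflict: dotted half note with sharp"
```

In real work you would edit the marked passage in an editor rather than overwrite the file, but the sequence of steps is the same.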
===== GitHub features that are useful for editorial projects =====
Useful GitHub features include:
==== Issues ====
Use issues to track:
* uncertain readings
* missing metadata
* files needing review
* recurring OMR errors
* editorial questions
==== Pull requests ====
A pull request lets someone propose changes before merging them.
This is useful for:
* checking corrections before accepting them
* discussing editorial choices
* reviewing larger changes
==== History and blame ====
GitHub also allows you to inspect file history and see who last changed a line.
This can be very helpful when asking:
* when was this corrected?
* who changed this encoding?
* why was this element added?
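The same information is available on the command line through ''git log'' and ''git blame''. In this sketch, "Editor A" and "Editor B" are hypothetical contributors, set with the ''--author'' option so that one demo repository can show two names.

```shell
work=$(mktemp -d)
cd "$work"
git init -q edition
cd edition
git config user.name "Example Editor"        # committer identity for this demo only
git config user.email "editor@example.org"

printf '%s\n' '<note pname="c" oct="4" dur="4"/>' '<note pname="d" oct="4" dur="4"/>' > kyrie.mei
git add kyrie.mei
git commit -q -m "Add two notes" --author="Editor A <a@example.org>"

# Editor B corrects only the second line.
printf '%s\n' '<note pname="c" oct="4" dur="4"/>' '<note pname="d" oct="4" dur="2" dots="1"/>' > kyrie.mei
git commit -q -a -m "Correct duration of second note" --author="Editor B <b@example.org>"

# History of one file: author and message of each commit that touched it.
git log --format='%an: %s' -- kyrie.mei

# For every line: the commit, author, and date of its last change.
git blame kyrie.mei
```

The commit message answers the "why", which is another reason to write it carefully.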
===== Glossary =====
^ Term ^ Meaning ^
| Git | Version control system |
| GitHub | Online platform for hosting Git repositories |
| Repository | A project folder tracked by Git |
| Commit | A saved snapshot of changes |
| Push | Upload local commits to GitHub |
| Pull | Download changes from GitHub and merge them into the local branch |
| Branch | A separate line of development |
| Merge | Combine changes from different branches |
| Conflict | A situation where Git cannot automatically combine changes |