====== Introduction to Git and GitHub for Musicology Editorial Work ======

===== Why this matters =====

In digital musicology, we often work with files that change over time:

  * digitally encoded files (MEI, MusicXML, MIDI, etc.)
  * metadata tables
  * documentation
  * editorial notes
  * scripts for processing musical sources

A common problem is that files get copied, renamed, emailed around, or stored as:

  * ''final.xml''
  * ''final_revised.xml''
  * ''final_revised2.xml''
  * ''final_REAL_final.xml''

This quickly becomes confusing.

**Git** helps us keep a structured history of changes. **GitHub** makes it easier to synchronize work between computers and collaborate with others.

For musicological editorial work, this is especially useful because:

  * we need to track who changed what
  * we need to document editorial decisions
  * we often work on the same corpus over long periods
  * we need a safe way to synchronize data across machines
  * we may want to collaborate on corrections

===== Learning goals =====

After this introduction, students should be able to:

  * explain what Git is
  * understand the difference between Git and GitHub
  * create a local Git repository
  * save changes with meaningful commit messages
  * connect a local repository to GitHub
  * synchronize editorial work between computers
  * understand a basic collaboration workflow for MEI and OMR data
  * avoid common mistakes when working with Git

===== What is Git? =====

Git is a **version control system**. This means it records the history of a folder and its files over time.

With Git, you can:

  * save snapshots of your work
  * see what changed between two versions
  * return to an earlier state
  * work on new ideas without breaking the main version
  * merge work from different collaborators

Git is **not** only for programmers. It is useful for any research project where files evolve over time.

===== What is GitHub? =====

GitHub is a web platform built around Git. Git itself works on your computer.
GitHub adds:

  * online hosting of repositories
  * synchronization between computers
  * collaboration tools
  * issue tracking
  * pull requests for reviewing changes
  * backup in the cloud

A useful short formula is:

  * **Git = version control**
  * **GitHub = online hosting and collaboration for Git repositories**

===== Core concepts =====

==== Repository ====

A **repository** (or "repo") is a project folder managed by Git. It contains:

  * your files
  * a hidden Git history
  * information about changes over time

==== Commit ====

A **commit** is a saved snapshot of your work. A commit should record a meaningful step, for example:

  * corrected clef encoding in measure 1
  * normalized staff labels in manuscript A
  * added measure numbers to MEI files
  * fixed pitch errors in soprano part

==== GitHub remote ====

A **remote** is an online copy of your repository, for example on GitHub. This lets you:

  * upload your local work
  * download changes made elsewhere
  * keep multiple computers synchronized

==== Push and pull ====

  * **push** = send your local commits to GitHub
  * **pull** = download changes from GitHub to your local machine

==== Branch ====

A **branch** is a parallel line of work. Branches are useful when:

  * testing a new editorial strategy
  * correcting one source separately
  * working on a feature without changing the stable main version

===== Why Git is useful for OMR and MEI editorial work =====

Git works especially well when files are text-based. MEI files are XML, so Git can track their changes line by line, provided the XML is saved with line breaks rather than as one long line.
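A line-by-line comparison can be tried out safely in a throwaway repository. The following is only a minimal sketch: the file name ''motet_01.mei'', the one-note MEI fragment, and the editor identity are invented for illustration, and the temporary folder is assumed to be disposable.

```shell
# Hypothetical demo: a throwaway repository with one tiny MEI-like fragment.
set -e
demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init -q
git config user.name "Demo Editor"
git config user.email "demo@example.org"

# First version: the note is encoded as c4.
printf '%s\n' '<measure n="1">' '  <note pname="c" oct="4" dur="4"/>' '</measure>' > motet_01.mei
git add motet_01.mei
git commit -q -m "Add first measure of motet_01"

# Editorial correction: the pitch should be d, not c.
printf '%s\n' '<measure n="1">' '  <note pname="d" oct="4" dur="4"/>' '</measure>' > motet_01.mei

# Compare the working file with the last commit.
diff_output="$(git diff -- motet_01.mei)"
printf '%s\n' "$diff_output"
```

In the output, the line starting with ''-'' is the old reading and the line starting with ''+'' is the corrected one; the unchanged ''<measure>'' lines appear only as context.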
This means Git can help you see:

  * where measures were changed
  * where attributes were added or removed
  * where note encodings were corrected
  * which editor made which change

===== Important limitation: binary files =====

Git works best with **plain text files**:

  * ''.mei''
  * ''.xml''
  * ''.csv''
  * ''.txt''
  * ''.md''
  * code files such as ''.py''

Git works less well with large binary files such as:

  * ''.pdf''
  * ''.docx''
  * ''.png''
  * ''.jpg''
  * ''.tif''

This does not mean you cannot store such files in a repository, but:

  * changes are harder to compare
  * repositories become larger
  * collaboration becomes less efficient

===== Installing Git =====

You will need:

  * Git installed on your computer
  * a GitHub account
  * optionally GitHub Desktop, if you prefer a graphical interface

For long-term understanding, command-line Git is better. For quick adoption, GitHub Desktop may be easier. mei-friend has built-in GitHub integration, which we will probably use most of the time, but the Git commands shown here work in any tool or environment.

===== First-time Git setup =====

After installing Git, set your name and email in the terminal:

  git config --global user.name "Your Name"
  git config --global user.email "your.email@example.org"

This identifies your commits.

Check the configuration:

  git config --global --list

===== Starting a repository =====

Move into your project folder:

  cd path/to/music-edition-project

Initialize Git:

  git init

Git now treats this folder as a repository. Individual files are only tracked once you add and commit them.

===== Checking repository status =====

The most important command for beginners is:

  git status

This shows:

  * which files are new
  * which files were changed
  * which files are ready to be committed

Students should get used to running ''git status'' very often.

===== Adding files =====

To prepare all current changes for a commit:

  git add .
To add one specific file:

  git add data/mei/motet_01.mei

===== Making a commit =====

After adding files, save a snapshot:

  git commit -m "Corrected note durations in motet_01"

A good commit message is:

  * short
  * specific
  * meaningful

Good examples:

  * ''Corrected mensuration encoding in Kyrie''
  * ''Normalized staff labels in source B''
  * ''Fixed barline errors in movement 3''
  * ''Added editorial note on unclear accidentals''

Bad examples:

  * ''changes''
  * ''update''
  * ''stuff''
  * ''final''

===== Connecting to GitHub =====

Create an empty repository on GitHub. Then connect your local repository to it:

  git remote add origin https://github.com/USERNAME/REPOSITORY.git

Check that the remote was added:

  git remote -v

===== Uploading your work to GitHub =====

Rename your current branch to ''main'', the default branch name on GitHub, and push it. The ''-u'' option links the local branch to the one on GitHub:

  git branch -M main
  git push -u origin main

After that, future uploads are usually just:

  git push

===== Downloading changes from GitHub =====

Before you start working, download any new changes:

  git pull

This is especially important if:

  * you work on multiple computers
  * a collaborator may have updated the repository
  * you edited files on GitHub directly

===== A minimal daily workflow =====

  git pull
  git status
  git add .
  git commit -m "Describe the editorial change"
  git push

This can be explained as:

  * ''pull'' before work
  * edit files
  * ''status'' to inspect changes
  * ''add'' to stage them
  * ''commit'' to record them
  * ''push'' to synchronize them

===== Collaboration basics =====

When multiple people work on one repository, Git can help organize collaboration.
A simple shared workflow:

  * each editor pulls before starting
  * each editor commits small changes regularly
  * each editor writes clear commit messages
  * each editor pushes finished work
  * the team agrees on folder structure and editorial conventions

For slightly more advanced collaboration, use:

  * branches
  * pull requests
  * code review or editorial review on GitHub

===== Branches for safer experimentation =====

Branches are useful when you want to try something without changing the stable version.

Example uses:

  * testing a new encoding strategy
  * revising one source separately
  * trying automated clean-up on MEI files
  * preparing a larger correction batch

Create a new branch:

  git checkout -b revise-source-A

Work and commit as usual. Later, the branch can be merged into the main branch.

===== Merge conflicts =====

A **merge conflict** happens when Git cannot automatically combine two competing changes.

Example:

  * two people edit the same passage in the same MEI file
  * both commit their changes
  * Git does not know which version should win

This is normal and not a disaster.

The usual solution is:

  - inspect the conflicting file
  - decide which reading to keep
  - edit the file manually
  - save the corrected version
  - commit the resolution

For beginners, the best prevention is:

  * communicate who edits which files
  * make small commits
  * pull frequently
  * avoid long unsynchronized work sessions

===== GitHub features that are useful for editorial projects =====

Useful GitHub features include:

==== Issues ====

Use issues to track:

  * uncertain readings
  * missing metadata
  * files needing review
  * recurring OMR errors
  * editorial questions

==== Pull requests ====

A pull request lets someone propose changes before merging them.

This is useful for:

  * checking corrections before accepting them
  * discussing editorial choices
  * reviewing larger changes

==== History and blame ====

GitHub also allows you to inspect file history and see who last changed a line.
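The same information is available locally with ''git log'' and ''git blame''. The following is a minimal sketch in a throwaway repository; the file ''kyrie.mei'', its content, and the two editors are hypothetical and exist only for this demonstration.

```shell
# Hypothetical demo: two editors change the same line of one file.
set -e
demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init -q
git config user.name "Editor A"
git config user.email "editor.a@example.org"

# Editor A adds a staff definition without a clef.
printf '%s\n' '<staffDef n="1" lines="5"/>' > kyrie.mei
git add kyrie.mei
git commit -q -m "Add staff definition for Kyrie"

# Editor B (identity set only for this one commit) adds the clef.
printf '%s\n' '<staffDef n="1" lines="5" clef.shape="C" clef.line="3"/>' > kyrie.mei
git -c user.name="Editor B" -c user.email="editor.b@example.org" \
    commit -q -am "Add clef to staff definition in Kyrie"

# File history: one line per commit that touched the file, newest first.
history="$(git log --oneline -- kyrie.mei)"
printf '%s\n' "$history"

# Blame: for each line, the commit and author that last changed it.
blame="$(git blame kyrie.mei)"
printf '%s\n' "$blame"
```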
This can be very helpful when asking:

  * when was this corrected?
  * who changed this encoding?
  * why was this element added?

===== Glossary =====

^ Term ^ Meaning ^
| Git | Version control system |
| GitHub | Online platform for hosting Git repositories |
| Repository | A project folder tracked by Git |
| Commit | A saved snapshot of changes |
| Push | Upload local commits to GitHub |
| Pull | Download changes from GitHub |
| Branch | A separate line of development |
| Merge | Combine changes from different branches |
| Conflict | A situation where Git cannot automatically combine changes |
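
===== Worked example: resolving a merge conflict =====

The steps listed under "Merge conflicts" can be practised without any risk in a throwaway repository. The sketch below is only an illustration: the file ''passage.mei'', the competing readings, and the decision to keep ''d'' are invented, and it simulates both editors on one machine by using the branch name ''revise-source-A'' from the branching section.

```shell
# Hypothetical demo: provoke and resolve a conflict in a temporary repository.
set -e
demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init -q
git config user.name "Demo Editor"
git config user.email "demo@example.org"

# Shared starting point on the main branch.
printf '%s\n' '<note pname="c" oct="4" dur="4"/>' > passage.mei
git add passage.mei
git commit -q -m "Add passage"
git branch -M main

# On a branch, the pitch is corrected to d.
git checkout -q -b revise-source-A
printf '%s\n' '<note pname="d" oct="4" dur="4"/>' > passage.mei
git commit -q -am "Read pitch as d in source A"

# Meanwhile, the same line is changed to e on main.
git checkout -q main
printf '%s\n' '<note pname="e" oct="4" dur="4"/>' > passage.mei
git commit -q -am "Read pitch as e"

# The merge stops because both branches changed the same line.
git merge revise-source-A || echo "Merge stopped: conflict in passage.mei"

# Step 1: inspect the conflict markers Git wrote into the file.
cat passage.mei

# Steps 2-4: decide on a reading (here: d) and save the clean line.
printf '%s\n' '<note pname="d" oct="4" dur="4"/>' > passage.mei

# Step 5: stage the file and commit the resolution.
git add passage.mei
git commit -q -m "Resolve conflict: keep d from source A"
```

Between the ''<<<<<<<'' and ''>>>>>>>'' markers, Git shows both competing readings; the file must not contain any of these markers when the resolution is committed.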