====== Corpora Work Progression ======

This page tracks the current state of corpus work, known issues, recurring MEI encoding patterns, and links to general documentation.

===== General documentation =====

==== Git and GitHub ====

  * [[git|Introduction to Git and GitHub]]

==== MEI and XML ====

  * [[mei|Introduction to MEI and XML schema]]

===== Current corpus status =====

tba

===== Current corpus assignments =====

tba

===== Open bugs =====

  * mei-friend may create new branches if it cannot properly save work on the **main** branch.
  * **Always check which branch you are saving your work on.**

==== Requested features / improvements ====

tba

===== Quick reference: recurring MEI topics =====

==== Rests ====

In MEI, ordinary rests and full-bar rests are not the same thing.

  * ''<rest>'' = a non-sounding event with a specific written duration
  * ''<mRest>'' = a complete measure rest, independent of meter
  * ''<multiRest>'' = multiple consecutive complete-measure rests compressed into one symbol, typically in parts
  * ''<mSpace>'' = an explicitly empty measure, where no musical content is encoded but nothing is considered missing

=== When do we use ''<rest>''? ===

Use ''<rest>'' when the source shows a rest with a specific notated duration inside the rhythmic flow of the layer.

Typical cases:

  * a quarter rest inside a 4/4 measure
  * a half rest followed by notes
  * a rest that is part of ongoing voice activity
  * silence in one voice while another voice continues

Example:

=== When do we use ''<mRest>''? ===

Use ''<mRest>'' when the layer is silent for the entire measure and the source expresses that as a full-bar rest. This is the preferred encoding for a complete measure rest because it does not depend on the current meter.
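
To make the difference between ''<rest>'' and ''<mRest>'' concrete, here is a minimal Python sketch, not an official project tool. It assumes a hypothetical, namespace-free MEI fragment (real MEI files declare the MEI namespace), and the function name ''classify_layers'' and its labels are invented for illustration. It flags layers that contain only duration-based rests so that an editor can check the source and decide whether ''<mRest>'' is the better encoding.

```python
import xml.etree.ElementTree as ET

# Hypothetical, namespace-free MEI fragment (assumption: real files carry
# the MEI namespace, which this sketch does not handle).
MEI = """
<section>
  <measure n="1">
    <staff n="1">
      <layer n="1">
        <note pname="c" oct="4" dur="4"/>
        <rest dur="4"/>
        <note pname="e" oct="4" dur="2"/>
      </layer>
    </staff>
  </measure>
  <measure n="2">
    <staff n="1">
      <layer n="1">
        <mRest/>
      </layer>
    </staff>
  </measure>
  <measure n="3">
    <staff n="1">
      <layer n="1">
        <rest dur="1"/>
      </layer>
    </staff>
  </measure>
</section>
"""


def classify_layers(mei_text):
    """Map (measure n, staff n, layer n) to a label describing the layer."""
    root = ET.fromstring(mei_text.strip())
    labels = {}
    for measure in root.iter("measure"):
        for staff in measure.iter("staff"):
            for layer in staff.iter("layer"):
                tags = [child.tag for child in layer]
                if tags == ["mRest"]:
                    label = "full-bar rest"
                elif tags and all(tag == "rest" for tag in tags):
                    # Only duration-based rests: possibly a full-bar rest
                    # that should be <mRest>; the source decides.
                    label = "review: only <rest> in layer"
                else:
                    label = "other content"
                labels[(measure.get("n"), staff.get("n"), layer.get("n"))] = label
    return labels


labels = classify_layers(MEI)
for key, label in labels.items():
    print(key, label)
```

The "review" label is only a hint: whether a layer of rests is really a full-bar rest depends on what the source shows.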

Example:

=== Local recommendation ===

For our corpus work:

  * use ''<rest>'' for ordinary rests with explicit duration inside a measure
  * use ''<mRest>'' when a whole measure is silent in that layer
  * do **not** replace a full-bar rest with a duration-based ''<rest>'' just because the meter happens to allow it
  * prefer ''<mRest>'' when the notation is semantically a “whole measure rest”

=== Important restriction ===

A layer containing ''<mRest>'' should not also contain notes or ordinary rests. Control events such as fermatas may still occur alongside it.

=== Full-bar silence in different meters ===

A full-bar rest should usually still be encoded as ''<mRest/>''.

This remains true in different meters:

  * 2/4
  * 3/4
  * 4/4
  * 3/8
  * 6/8
  * etc.

The point of ''<mRest>'' is that it means “this whole measure is silent”, regardless of how many beats the measure contains.

=== What about multi-measure rests? ===

Use ''<multiRest>'' when the source compresses several complete silent measures into a single multiple-rest symbol.

Example:

=== Local recommendation for multi-measure rests ===

For our project:

  * use ''<multiRest>'' only when the source actually presents a compressed multi-measure rest
  * avoid ''<multiRest>'' in score-like encodings
  * if consecutive silent measures are shown individually in the source, encode them as separate measures with separate ''<mRest>'' elements

=== Empty measure vs. rest ===

Do not confuse:

  * ''<rest>'' = actual notated rest event
  * ''<mRest>'' = actual complete-measure rest
  * ''<mSpace>'' = explicitly empty measure with no encoded content

Use ''<mSpace>'' only when the layer is intentionally empty and this emptiness is itself what needs to be represented.

=== Practical examples ===

== Example 1: ordinary rest ==

== Example 2: full-bar rest ==

== Example 3: multiple-rest in a part ==

=== Project rule of thumb ===

Ask:

  * is this a rest with its own written duration inside the measure? → use ''<rest>''
  * is the whole measure silent? → use ''<mRest>''
  * are several full silent measures compressed into one sign? → use ''<multiRest>''

==== Voice handling ====

In MEI, a ''<layer>'' is an independent stream of events on a staff. A staff may contain more than one layer in order to represent multiple voices.

=== What is a layer? ===

A layer is best understood as a single rhythmic and event stream within one staff.

For beginners:

  * one staff may have one layer
  * one staff may also have two or more layers
  * multiple layers are usually used when distinct voices must be represented independently

=== When does one staff need more than one ''<layer>''? ===

Use more than one layer when the notation clearly contains multiple independent voices on the same staff.

Typical cases:

  * soprano and alto on one staff
  * tenor and bass on one staff
  * independent rhythms in upper and lower voices
  * overlapping note values that cannot be represented cleanly in a single stream
  * voice-specific ties, rests, slurs, or cues that belong to different voices

Do **not** create extra layers just because stems point in different directions once or twice if the passage is still best understood as one continuous voice.

=== How do we distinguish voices clearly? ===

At minimum:

  * give each layer its own ''n'' value
  * keep one voice consistently in the same layer as far as possible
  * avoid switching the same musical voice back and forth between layers without a good reason

Example:

=== Local project policy for layer numbering ===

Recommended local convention:

  * ''layer n="1"'' = upper voice or primary voice
  * ''layer n="2"'' = lower voice or secondary voice
  * keep this convention stable across the corpus
  * only add ''n="3"'', ''n="4"'', etc. when genuinely required

=== Shared stems and polyphonic overlap ===

Shared stems and polyphonic overlap are often visually complex, but the encoding priority should be:

  * represent the musical voices clearly
  * keep each independent voice in its own layer when needed
  * avoid forcing polyphonic notation into one layer if that makes duration or tie logic unclear

A useful practical rule is:

  * if two events behave like separate voices, encode them in separate layers
  * if the notation is only visually compressed but musically still one stream, keep one layer

=== Why this matters for ties and other relations ===

This matters because some relations are layer-sensitive. For example:

  * a tie that starts in one layer should also end in that same layer

Unstable or inconsistent layer assignment can therefore create problems later.

=== Recommended workflow ===

When deciding whether to split into layers, ask:

  - are there two independent rhythmic streams?
  - are there overlapping note values that imply separate voices?
  - are rests voice-specific?
  - do ties or slurs belong to separate voices?
  - would one-layer encoding become confusing?

If the answer to several of these is yes, use multiple layers.

==== Generalbass ====

MEI supports figured bass through harmonic indication markup. The key elements are:

  * ''<harm>'' = the harmonic indication as the attached object
  * ''<fb>'' = the figured-bass container
  * ''<f>'' = one individual figure or component inside the figured bass sign

=== Which elements do we use? ===

For our corpus, the default pattern should be:

''<harm><fb><f>6</f></fb></harm>''

This means:

  * ''harm'' provides the attachment point
  * ''fb'' says this is figured bass / Generalbass
  * ''f'' holds the visible figure component

=== How do we align figured bass with notes or harmonic events? ===

''harm'' must define a point of attachment using one of these attributes:

  * ''startid''
  * ''tstamp''
  * ''tstamp.ges''
  * ''tstamp.real''

The most common attachment methods are ''startid'' and ''tstamp''.
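
The attachment requirement above lends itself to a quick consistency check. The following Python sketch is illustrative only: it assumes a hypothetical, namespace-free MEI fragment, and the name ''check_harms'', the ''xml:id'' values, and the figures are invented. It lists, for each ''harm'', its figures in document order and the attachment attributes it carries, so that a ''harm'' without any attachment point stands out.

```python
import xml.etree.ElementTree as ET

# The four attachment attributes named in the text above.
ATTACHMENT_ATTRS = ("startid", "tstamp", "tstamp.ges", "tstamp.real")

# Hypothetical, namespace-free MEI fragment (assumption for illustration).
MEI = """
<measure n="1">
  <staff n="1">
    <layer n="1">
      <note xml:id="n1" pname="c" oct="3" dur="2"/>
      <note xml:id="n2" pname="d" oct="3" dur="2"/>
    </layer>
  </staff>
  <harm staff="1" tstamp="1"><fb><f>6</f></fb></harm>
  <harm staff="1" startid="#n2"><fb><f>6</f><f>4</f></fb></harm>
  <harm staff="1"><fb><f>7</f></fb></harm>
</measure>
"""


def check_harms(mei_text):
    """Return (figures, attachment attributes present) for every <harm>."""
    root = ET.fromstring(mei_text.strip())
    report = []
    for harm in root.iter("harm"):
        attached_by = [a for a in ATTACHMENT_ATTRS if harm.get(a) is not None]
        # iter() walks in document order, so figure order is preserved.
        figures = [f.text for f in harm.iter("f")]
        report.append((figures, attached_by))
    return report


report = check_harms(MEI)
# Figures whose <harm> has no attachment point at all.
missing = [figures for figures, attached_by in report if not attached_by]
print(report)
print(missing)
```

The sketch only tests whether an attachment attribute is present. It does not verify that a ''startid'' actually points to an existing note.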

For practical work, I recommend:

  * use ''startid'' when the figure clearly belongs to a specific encoded note or event
  * use ''tstamp'' when the figure is best attached to a beat position in the measure
  * prefer ''startid'' when stable note-level linking matters for editorial or computational reuse

=== Example with ''tstamp'' ===

''<harm tstamp="…"><fb><f>6</f></fb></harm>''

=== Example with ''startid'' ===

''<harm startid="…"><fb><f>6</f><f>4</f></fb></harm>''

=== Ordering of figures ===

The order of ''f'' elements is significant. Figures should be encoded in the order they appear, usually top to bottom on the page.

So this:

''<fb><f>6</f><f>4</f></fb>''

is not just an arbitrary list; the order carries meaning.

=== Accidentals in figured bass ===

Accidentals can be encoded directly in the figure content.

Example:

''<f>7♭</f>''

=== Recommended local policy ===

  * use ''harm'' + ''fb'' + ''f'' as the default structure
  * prefer ''startid'' for note-bound corpus work when possible
  * use ''tstamp'' when note-level linking is not practical
  * preserve figure order as written
  * keep editorial additions explicitly distinguishable from source readings

==== Ties ====

A tie connects two notes of the same pitch so that the first note sounds for the combined duration of both notes.

=== Basic principle ===

Use ties only when:

  * the connected notes have the same pitch
  * the notation indicates a tie rather than a slur
  * the sounding duration is continued across noteheads

=== How are ties encoded? ===

The simplest MEI method uses the ''tie'' attribute.

Allowed values:

  * ''i'' = initial
  * ''m'' = medial
  * ''t'' = terminal

Example:

=== Ties across barlines ===

A tie may continue into the following measure.

Example:

=== Ties and layers ===

This point is crucial for corpus consistency:

  * a tie that starts in one layer must end in the same layer

So for local practice:

  * never begin a tie in layer 1 and end it in layer 2
  * if the voice continues, keep it in the same layer
  * check layer assignment before debugging a “broken” tie

=== Ties on chords ===

The ''tie'' attribute can also be used on ''chord''.
When used on a chord, it acts as shorthand for multiple ties on all unchanged pitches in the chord.

Example:

=== Local recommendation ===

For our project:

  * use ''tie'' on ''note'' for simple cases
  * use ''tie'' on ''chord'' only when the shorthand is genuinely clear
  * if only some notes of a chord are tied, encode ties on the individual ''note'' elements instead
  * check pitch identity carefully before calling something a tie

=== Tie vs slur ===

Do not confuse:

  * tie = same pitch, sustained duration
  * slur = phrasing or articulation grouping, usually across different pitches

This distinction matters both musically and computationally.

=== Minimal checklist ===

Before encoding a tie, ask:

  * are the connected notes the same pitch?
  * is this really a tie, not a slur?
  * does the tied continuation stay in the same layer?
  * if the notes are in a chord, are all pitches tied or only some?
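
Parts of the checklist above can be automated. The Python sketch below is a rough illustration under stated assumptions, not a project tool: the MEI fragment is hypothetical and namespace-free, the name ''check_ties'' is invented, pitch identity is reduced to ''pname'' plus ''oct'', and the ''tie'' shorthand on ''chord'' is not handled. It follows ''tie'' values on ''note'' elements per staff and layer and reports ties that end without a start in the same layer, or that start and never end there.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical, namespace-free MEI fragment (assumption for illustration).
# The tie on e4 starts in layer 2 but ends in layer 1, which is the error
# the checklist warns about.
MEI = """
<section>
  <measure n="1">
    <staff n="1">
      <layer n="1">
        <note pname="c" oct="5" dur="2"/>
        <note pname="c" oct="5" dur="2" tie="i"/>
      </layer>
      <layer n="2">
        <note pname="e" oct="4" dur="1" tie="i"/>
      </layer>
    </staff>
  </measure>
  <measure n="2">
    <staff n="1">
      <layer n="1">
        <note pname="c" oct="5" dur="2" tie="t"/>
        <note pname="e" oct="4" dur="2" tie="t"/>
      </layer>
      <layer n="2">
        <mRest/>
      </layer>
    </staff>
  </measure>
</section>
"""


def check_ties(mei_text):
    """Return (staff n, layer n, pitch, problem) tuples for suspicious ties."""
    root = ET.fromstring(mei_text.strip())
    open_ties = defaultdict(set)  # (staff n, layer n) -> pitches still tied
    problems = []
    for measure in root.iter("measure"):
        for staff in measure.iter("staff"):
            for layer in staff.iter("layer"):
                key = (staff.get("n"), layer.get("n"))
                for note in layer.iter("note"):
                    tie = note.get("tie")
                    if tie is None:
                        continue
                    # Assumes every tied note carries pname and oct.
                    pitch = note.get("pname") + note.get("oct")
                    if tie in ("m", "t"):
                        if pitch not in open_ties[key]:
                            problems.append(key + (pitch, "end-without-start"))
                        elif tie == "t":
                            open_ties[key].discard(pitch)
                    elif tie == "i":
                        open_ties[key].add(pitch)
    # Anything still open was started but never closed in its layer.
    for key, pitches in open_ties.items():
        for pitch in sorted(pitches):
            problems.append(key + (pitch, "start-without-end"))
    return problems


problems = check_ties(MEI)
for problem in problems:
    print(problem)
```

A report like this only points to places worth checking. Whether a curve in the source is a tie or a slur remains an editorial decision.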