Corpora Work Progression
This page tracks the current state of corpus work, known issues, recurring MEI encoding patterns, and links to general documentation.
General documentation
Git and GitHub
MEI and XML
Current corpus status
Current corpus assignments
Open bugs
- mei-friend may create new branches if it cannot properly save work on the main branch.
- Always check which branch you are saving your work on.
Requested features / improvements
tba
Quick reference: recurring MEI topics
Rests
In MEI, ordinary rests and full-bar rests are not the same thing.
- <rest> = a non-sounding event with a specific written duration
- <mRest> = a complete measure rest, independent of meter
- <multiRest> = multiple consecutive complete-measure rests compressed into one symbol, typically in parts
- <mSpace> = an explicitly empty measure, where no musical content is encoded but nothing is considered missing
When do we use <rest>?
Use <rest> when the source shows a rest with a specific notated duration inside the rhythmic flow of the layer.
Typical cases:
- quarter rest inside a 4/4 measure
- half rest followed by notes
- a rest that is part of ongoing voice activity
- silence in one voice while another voice continues
Example:
<layer n="1">
  <note pname="c" oct="4" dur="4"/>
  <rest dur="4"/>
  <note pname="d" oct="4" dur="2"/>
</layer>
When do we use <mRest>?
Use <mRest/> when the layer is silent for the entire measure and the source expresses that as a full-bar rest.
This is the preferred encoding for a complete measure rest because it does not depend on the current meter.
Example:
<measure n="12">
  <staff n="1">
    <layer n="1">
      <mRest/>
    </layer>
  </staff>
</measure>
Local recommendation
For our corpus work:
- use <rest> for ordinary rests with explicit duration inside a measure
- use <mRest/> when a whole measure is silent in that layer
- do not replace a full-bar rest with a duration-based <rest> just because the meter happens to allow it
- prefer <mRest/> when the notation is semantically a “whole measure rest”
Important restriction
A layer containing <mRest/> should not also contain notes or ordinary rests.
Control events such as fermatas may still occur alongside it.
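The restriction above can be checked mechanically. The following is a minimal sketch, assuming the file parses with Python's standard xml.etree.ElementTree; the helper names local and layers_mixing_mrest and the sample data are illustrative only and not part of any project tooling.

```python
import xml.etree.ElementTree as ET

def local(tag):
    """Strip a namespace such as '{http://www.music-encoding.org/ns/mei}note'."""
    return tag.rsplit("}", 1)[-1]

def layers_mixing_mrest(root):
    """Return (measure n, staff n, layer n) for layers that combine
    <mRest> with notes, chords, or ordinary rests."""
    problems = []
    for measure in root.iter():
        if local(measure.tag) != "measure":
            continue
        for staff in measure:
            if local(staff.tag) != "staff":
                continue
            for layer in staff:
                if local(layer.tag) != "layer":
                    continue
                # Collect the names of everything nested inside this layer.
                names = {local(el.tag) for el in layer.iter() if el is not layer}
                if "mRest" in names and names & {"note", "chord", "rest"}:
                    problems.append((measure.get("n"), staff.get("n"), layer.get("n")))
    return problems

# Hypothetical sample: measure 2 wrongly mixes <mRest/> with a <rest>.
SAMPLE = """<section>
  <measure n="1"><staff n="1"><layer n="1"><mRest/></layer></staff></measure>
  <measure n="2"><staff n="1"><layer n="1"><mRest/><rest dur="4"/></layer></staff></measure>
</section>"""

print(layers_mixing_mrest(ET.fromstring(SAMPLE)))  # prints [('2', '1', '1')]
```

Control events such as fermatas sit outside the layer and are therefore not flagged by this sketch.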
Full-bar silence in different meters
A full-bar rest should usually still be encoded as:
<mRest/>
This remains true in different meters:
- 2/4
- 3/4
- 4/4
- 3/8
- 6/8
- etc.
The point of <mRest> is that it means “this whole measure is silent”, regardless of how many beats the measure contains.
What about multi-measure rests?
Use <multiRest> when the source compresses several complete silent measures into a single multiple-rest symbol.
Example:
<layer n="1"> <multiRest num="9"/> </layer>
Local recommendation for multi-measure rests
For our project:
- use <multiRest> only when the source actually presents a compressed multi-measure rest
- avoid <multiRest> in score-like encodings
- if consecutive silent measures are shown individually in the source, encode them as separate measures with separate <mRest/> elements
Empty measure vs. rest
Do not confuse:
- <rest> = actual notated rest event
- <mRest> = actual complete-measure rest
- <mSpace> = explicitly empty measure with no encoded content
Use <mSpace> only when the layer is intentionally empty and this emptiness is itself what needs to be represented.
Practical examples
Example 1: ordinary rest
<measure n="5">
  <staff n="1">
    <layer n="1">
      <rest dur="4"/>
      <note pname="e" oct="4" dur="4"/>
      <note pname="f" oct="4" dur="2"/>
    </layer>
  </staff>
</measure>
Example 2: full-bar rest
<measure n="6">
  <staff n="1">
    <layer n="1">
      <mRest/>
    </layer>
  </staff>
</measure>
Example 3: multiple-rest in a part
<measure n="20">
  <staff n="1">
    <layer n="1">
      <multiRest num="8"/>
    </layer>
  </staff>
</measure>
Project rule of thumb
Ask:
- is this a rest with its own written duration inside the measure? → use <rest>
- is the whole measure silent? → use <mRest>
- are several full silent measures compressed into one sign? → use <multiRest>
Voice handling
In MEI, a <layer> is an independent stream of events on a staff. A staff may contain more than one layer in order to represent multiple voices.
What is a layer?
A layer is best understood as a single rhythmic and event stream within one staff.
For beginners:
- one staff may have one layer
- one staff may also have two or more layers
- multiple layers are usually used when distinct voices must be represented independently
When does one staff need more than one <layer>?
Use more than one layer when the notation clearly contains multiple independent voices on the same staff.
Typical cases:
- soprano and alto on one staff
- tenor and bass on one staff
- independent rhythms in upper and lower voices
- overlapping note values that cannot be represented cleanly in a single stream
- voice-specific ties, rests, slurs, or cues that belong to different voices
Do not create extra layers just because stems point in different directions once or twice if the passage is still best understood as one continuous voice.
How do we distinguish voices clearly?
At minimum:
- give each layer its own n value
- keep one voice consistently in the same layer as far as possible
- avoid switching the same musical voice back and forth between layers without a good reason
Example:
<staff n="1">
  <layer n="1">
    <note pname="g" oct="4" dur="2"/>
    <note pname="a" oct="4" dur="2"/>
  </layer>
  <layer n="2">
    <rest dur="2"/>
    <note pname="e" oct="4" dur="2"/>
  </layer>
</staff>
Local project policy for layer numbering
Recommended local convention:
- layer n="1" = upper voice or primary voice
- layer n="2" = lower voice or secondary voice
- keep this convention stable across the corpus
- only add n="3", n="4", etc. when genuinely required
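A small consistency check can support this convention. The sketch below assumes Python's standard xml.etree.ElementTree; the function name layer_numbering_issues and the sample are hypothetical. It only reports layers without an n value and duplicate n values within one staff; whether layer 1 really holds the upper voice still needs a human eye.

```python
import xml.etree.ElementTree as ET
from collections import Counter

def local(tag):
    """Strip a namespace such as '{http://www.music-encoding.org/ns/mei}layer'."""
    return tag.rsplit("}", 1)[-1]

def layer_numbering_issues(root):
    """Report staves whose layers lack n or reuse the same n."""
    issues = []
    for measure in root.iter():
        if local(measure.tag) != "measure":
            continue
        for staff in measure:
            if local(staff.tag) != "staff":
                continue
            ns = [layer.get("n") for layer in staff if local(layer.tag) == "layer"]
            if None in ns:
                issues.append((measure.get("n"), staff.get("n"), "layer without n"))
            dupes = sorted(n for n, c in Counter(ns).items() if n is not None and c > 1)
            if dupes:
                issues.append((measure.get("n"), staff.get("n"), "duplicate n=" + ",".join(dupes)))
    return issues

# Hypothetical sample: measure 2 uses n="1" twice on the same staff.
SAMPLE = """<section>
  <measure n="1"><staff n="1"><layer n="1"/><layer n="2"/></staff></measure>
  <measure n="2"><staff n="1"><layer n="1"/><layer n="1"/></staff></measure>
</section>"""

print(layer_numbering_issues(ET.fromstring(SAMPLE)))  # prints [('2', '1', 'duplicate n=1')]
```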
Shared stems and polyphonic overlap
Shared stems and polyphonic overlap are often visually complex, but the encoding priority should be:
- represent the musical voices clearly
- keep each independent voice in its own layer when needed
- avoid forcing polyphonic notation into one layer if that makes duration or tie logic unclear
A useful practical rule is:
- if two events behave like separate voices, encode them in separate layers
- if the notation is only visually compressed but musically still one stream, keep one layer
Why this matters for ties and other relations
This matters because some relations are layer-sensitive.
For example:
- a tie that starts in one layer should also end in that same layer
So unstable or inconsistent layer assignment can create problems later.
Recommended workflow
When deciding whether to split into layers, ask:
- are there two independent rhythmic streams?
- are there overlapping note values that imply separate voices?
- are rests voice-specific?
- do ties or slurs belong to separate voices?
- would one-layer encoding become confusing?
If yes to several of these, use multiple layers.
Generalbass
MEI supports figured bass through harmonic indication markup.
The key elements are:
- <harm> = the harmonic indication as the attached object
- <fb> = the figured-bass container
- <f> = one individual figure or component inside the figured bass sign
Which elements do we use?
For our corpus, the default pattern should be:
<harm tstamp="1">
  <fb>
    <f>6</f>
  </fb>
</harm>
This means:
- harm provides the attachment point
- fb says this is figured bass / Generalbass
- f holds the visible figure component
How do we align figured bass with notes or harmonic events?
harm must define a point of attachment using one of these attributes:
- startid
- tstamp
- tstamp.ges
- tstamp.real
The most common attachment methods are startid and tstamp.
For practical work, I recommend:
- use startid when the figure clearly belongs to a specific encoded note or event
- use tstamp when the figure is best attached to a beat position in the measure
- prefer startid when stable note-level linking matters for editorial or computational reuse
Example with tstamp
<measure n="1">
  <staff n="1">
    <layer n="1">
      <note pname="c" oct="3" dur="1"/>
    </layer>
  </staff>
  <harm tstamp="1">
    <fb>
      <f>6</f>
    </fb>
  </harm>
</measure>
Example with startid
<measure n="1">
  <staff n="1">
    <layer n="1">
      <note pname="c" oct="3" dur="1" xml:id="b1"/>
    </layer>
  </staff>
  <harm startid="#b1">
    <fb>
      <f>6</f>
      <f>4</f>
    </fb>
  </harm>
</measure>
Ordering of figures
The order of f elements is significant. Figures should be encoded in the order they appear, usually top to bottom on the page.
So this:
<fb> <f>6</f> <f>4</f> </fb>
is not just an arbitrary list; the order carries meaning.
Accidentals in figured bass
Accidentals can be encoded directly in the figure content.
Example:
<harm tstamp="1">
  <fb>
    <f>7♭</f>
  </fb>
</harm>
Recommended local policy
- use harm + fb + f as the default structure
- prefer startid for note-bound corpus work when possible
- use tstamp when note-level linking is not practical
- preserve figure order as written
- keep editorial additions explicitly distinguishable from source readings
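The attachment rules for harm lend themselves to an automated check. This is a minimal sketch using Python's standard xml.etree.ElementTree; the function name harm_attachment_issues and the sample measure are illustrative assumptions. It reports harm elements that carry none of the four attachment attributes and startid values that point to no xml:id in the file.

```python
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"  # how ElementTree exposes xml:id
ATTACH = ("startid", "tstamp", "tstamp.ges", "tstamp.real")

def local(tag):
    """Strip a namespace such as '{http://www.music-encoding.org/ns/mei}harm'."""
    return tag.rsplit("}", 1)[-1]

def harm_attachment_issues(root):
    """Report <harm> elements without an attachment point or with a dangling startid."""
    ids = {el.get(XML_ID) for el in root.iter() if el.get(XML_ID)}
    issues = []
    for harm in root.iter():
        if local(harm.tag) != "harm":
            continue
        if not any(harm.get(attr) for attr in ATTACH):
            issues.append("harm without attachment attribute")
        startid = harm.get("startid")
        if startid and startid.lstrip("#") not in ids:
            issues.append(f"startid {startid} does not resolve")
    return issues

# Hypothetical sample: the second harm points to a missing id, the third has no attachment.
SAMPLE = """<measure n="1">
  <staff n="1"><layer n="1"><note xml:id="b1" pname="c" oct="3" dur="1"/></layer></staff>
  <harm startid="#b1"><fb><f>6</f></fb></harm>
  <harm startid="#b9"><fb><f>4</f></fb></harm>
  <harm><fb><f>7</f></fb></harm>
</measure>"""

for issue in harm_attachment_issues(ET.fromstring(SAMPLE)):
    print(issue)
```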
Ties
A tie connects two notes of the same pitch so that the first note sounds for the combined duration of both notes.
Basic principle
Use ties only when:
- the connected notes have the same pitch
- the notation indicates a tie rather than a slur
- the sounding duration is continued across noteheads
How are ties encoded?
The simplest MEI method uses the tie attribute.
Allowed values:
- i = initial
- m = medial
- t = terminal
Example:
<layer n="1">
  <note pname="f" oct="4" dur="2" tie="i"/>
  <note pname="f" oct="4" dur="4" dots="1" tie="t"/>
</layer>
Ties across barlines
A tie may continue into the following measure.
Example:
<measure n="1">
  <staff n="1">
    <layer n="1">
      <note pname="g" oct="4" dur="2" tie="i"/>
    </layer>
  </staff>
</measure>
<measure n="2">
  <staff n="1">
    <layer n="1">
      <note pname="g" oct="4" dur="2" tie="t"/>
    </layer>
  </staff>
</measure>
Ties and layers
This point is crucial for corpus consistency:
- a tie that starts in one layer must end in the same layer
So for local practice:
- never begin a tie in layer 1 and end it in layer 2
- if the voice continues, keep it in the same layer
- check layer assignment before debugging a “broken” tie
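The same-layer rule can be tested before a “broken” tie is debugged by hand. The sketch below is a simplified check written with Python's standard xml.etree.ElementTree; tie_issues is a hypothetical name, and the logic assumes monophonic layers that use the tie attribute on note. Ties on chord, ties encoded as separate tie elements, and notes without pname/oct are not handled.

```python
import xml.etree.ElementTree as ET

def local(tag):
    """Strip a namespace such as '{http://www.music-encoding.org/ns/mei}note'."""
    return tag.rsplit("}", 1)[-1]

def tie_issues(root):
    """Follow each (staff n, layer n) stream across measures and report ties
    that are not continued in the same layer or that change pitch."""
    pending = {}  # (staff n, layer n) -> (pname, oct) of the note that opened the tie
    issues = []
    for measure in root.iter():
        if local(measure.tag) != "measure":
            continue
        for staff in measure:
            if local(staff.tag) != "staff":
                continue
            for layer in staff:
                if local(layer.tag) != "layer":
                    continue
                key = (staff.get("n"), layer.get("n"))
                where = f"measure {measure.get('n')}, staff {key[0]}, layer {key[1]}"
                for el in layer.iter():
                    name = local(el.tag)
                    if name not in ("note", "rest", "mRest"):
                        continue
                    tie = el.get("tie") if name == "note" else None
                    pitch = (el.get("pname"), el.get("oct"))
                    if key in pending:
                        opened = pending.pop(key)
                        if tie not in ("m", "t"):
                            issues.append(f"{where}: tie not continued")
                        elif pitch != opened:
                            issues.append(f"{where}: tied notes differ in pitch")
                    elif tie in ("m", "t"):
                        issues.append(f"{where}: tie ends without a start")
                    if tie in ("i", "m"):
                        pending[key] = pitch
    for staff_n, layer_n in pending:
        issues.append(f"staff {staff_n}, layer {layer_n}: tie never closed")
    return issues

# Hypothetical sample: the tie starts in layer 1 but ends in layer 2.
BROKEN = """<section>
  <measure n="1"><staff n="1">
    <layer n="1"><note pname="g" oct="4" dur="2" tie="i"/></layer>
  </staff></measure>
  <measure n="2"><staff n="1">
    <layer n="2"><note pname="g" oct="4" dur="2" tie="t"/></layer>
  </staff></measure>
</section>"""

for issue in tie_issues(ET.fromstring(BROKEN)):
    print(issue)
```

On this sample the sketch reports both symptoms of the same mistake: a tie end without a start in layer 2 and an unclosed tie in layer 1.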
Ties on chords
The tie attribute can also be used on chord.
When used on a chord, it acts as shorthand for multiple ties on all unchanged pitches in the chord.
Example:
<chord dur="4" tie="i">
  <note pname="f" oct="4"/>
  <note pname="a" oct="4"/>
  <note pname="c" oct="5"/>
</chord>
<chord dur="4" tie="t">
  <note pname="f" oct="4"/>
  <note pname="a" oct="4"/>
  <note pname="c" oct="5"/>
</chord>
Local recommendation
For our project:
- use tie on note for simple cases
- use tie on chord only when the shorthand is genuinely clear
- if only some notes of a chord are tied, encode ties on the individual note elements instead
- check pitch identity carefully before calling something a tie
Tie vs slur
Do not confuse:
- tie = same pitch, sustained duration
- slur = phrasing or articulation grouping, usually across different pitches
This distinction matters both musically and computationally.
Minimal checklist
Before encoding a tie, ask:
- are the connected notes the same pitch?
- is this really a tie, not a slur?
- does the tied continuation stay in the same layer?
- if the notes are in a chord, are all pitches tied or only some?
Barline
When barlines should run through some staves, then break, and then continue through a lower group, encode this with nested staffGrp elements.
Local recommendation:
- use one outer staffGrp for the full system
- create one child staffGrp bar.thru="true" for each continuous barline span
- put the lower brace or bracket on the child group it actually belongs to
- do not rely on two top-level sibling groups if the system belongs together visually
Example: barline through staves 1-3, break, then through staves 4-5:
<staffGrp>
  <staffGrp bar.thru="true">
    <staffDef n="1"/>
    <staffDef n="2"/>
    <staffDef n="3"/>
  </staffGrp>
  <staffGrp bar.thru="true">
    <grpSym symbol="brace"/>
    <staffDef n="4"/>
    <staffDef n="5"/>
  </staffGrp>
</staffGrp>
Minimal checklist:
- where should the barline continue without interruption?
- where should it stop?
- does each continuous span have its own child staffGrp?
- is the brace or bracket attached to the correct subgroup?
System and page breaks
For the current editorial task, also check the position of <sb> and <pb> carefully.
This is important because misplaced system or page breaks can change the number of bars per system or page and create mismatches with the facsimile layout.
Local recommendation:
- always compare encoded <sb> and <pb> positions with the facsimile
- check whether the number of measures per system matches the source
- check whether the number of measures per page matches the source
- if the bar count looks wrong, inspect break positions before changing musical content
Minimal checklist:
- does each encoded system begin where the facsimile begins?
- does each encoded page begin where the facsimile page begins?
- do the measures per system and per page still align with the source image?
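Counting measures between breaks gives a quick list to hold against the facsimile. This is a minimal sketch with Python's standard xml.etree.ElementTree; measures_per_system is a hypothetical helper, and it assumes that sb and pb sit between measures rather than inside them. A pb is treated as starting a new system, and a pb directly followed by an sb does not produce an empty system.

```python
import xml.etree.ElementTree as ET

def local(tag):
    """Strip a namespace such as '{http://www.music-encoding.org/ns/mei}sb'."""
    return tag.rsplit("}", 1)[-1]

def measures_per_system(root):
    """Return the number of measures in each system, in document order."""
    counts, current = [], 0
    for el in root.iter():
        name = local(el.tag)
        if name in ("sb", "pb"):
            if current:  # close the running system, skip empty ones
                counts.append(current)
            current = 0
        elif name == "measure":
            current += 1
    if current:
        counts.append(current)
    return counts

# Hypothetical sample: two systems on the first page, one on the second.
SAMPLE = """<section>
  <pb/><sb/>
  <measure n="1"/><measure n="2"/><measure n="3"/>
  <sb/>
  <measure n="4"/><measure n="5"/>
  <pb/><sb/>
  <measure n="6"/>
</section>"""

print(measures_per_system(ET.fromstring(SAMPLE)))  # prints [3, 2, 1]
```

If the printed counts differ from the facsimile, the break positions are the first thing to inspect, in line with the recommendation above.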
Facsimile zones
When a measure still has no facsimile mapping, add a new <zone> in the relevant <surface> and then link that zone to the measure with the facs attribute.
Local recommendation:
- create the new <zone> inside the correct <surface>
- give it a unique xml:id
- set the bounding box with ulx, uly, lrx, and lry
- use type="measure" for measure zones
- add or update facs="#zone_id" on the matching <measure>
Example:
<surface>
  <zone xml:id="zone_new123" ulx="100" uly="200" lrx="500" lry="800" type="measure"/>
</surface>
<measure n="12" facs="#zone_new123"> ... </measure>
Minimal checklist:
- is the new zone on the correct page surface?
- does the bounding box match the visible measure in the facsimile?
- does the facs value point to the correct zone id?
- does each measure point to exactly the intended measure zone?
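Part of this checklist can be automated. The sketch below uses Python's standard xml.etree.ElementTree; facs_issues is a hypothetical name, and the wrapper element in the sample only stands in for a full MEI file. It lists measures that have no facs attribute and facs values that match no zone id. Whether the bounding box fits the image still has to be checked visually.

```python
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"  # how ElementTree exposes xml:id

def local(tag):
    """Strip a namespace such as '{http://www.music-encoding.org/ns/mei}zone'."""
    return tag.rsplit("}", 1)[-1]

def facs_issues(root):
    """Report measures without facs and facs references that match no <zone>."""
    zones = {el.get(XML_ID) for el in root.iter() if local(el.tag) == "zone"}
    issues = []
    for measure in root.iter():
        if local(measure.tag) != "measure":
            continue
        refs = (measure.get("facs") or "").split()  # facs may list several references
        if not refs:
            issues.append(f"measure {measure.get('n')}: no facs attribute")
        for ref in refs:
            if ref.lstrip("#") not in zones:
                issues.append(f"measure {measure.get('n')}: facs {ref} has no matching zone")
    return issues

# Hypothetical sample: measure 13 points to a missing zone, measure 14 has no facs.
SAMPLE = """<mei>
  <surface>
    <zone xml:id="zone_a" ulx="100" uly="200" lrx="500" lry="800" type="measure"/>
  </surface>
  <measure n="12" facs="#zone_a"/>
  <measure n="13" facs="#zone_b"/>
  <measure n="14"/>
</mei>"""

for issue in facs_issues(ET.fromstring(SAMPLE)):
    print(issue)
```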
Duration encoding: dur vs. dur.ppq
Please encode written musical duration with dur and, where needed, dots.
dur records the notated value, for example dur="4" for a quarter note or dur="8" for an eighth note. dur.ppq records a calculated playback/timing value in pulses per quarter note. It is not the editorial duration and does not represent the visual notation directly.
Local directive:
- use dur and dots as the authoritative duration encoding
- do not add dur.ppq to notes, rests, chords, or spaces
- do not add ppq to staffDef or scoreDef
- remove existing dur.ppq and ppq values from imported files during cleanup
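The cleanup step can be scripted. This is a minimal sketch with Python's standard xml.etree.ElementTree; strip_ppq is a hypothetical helper, not an existing project tool. It removes dur.ppq from any element that carries it and ppq from staffDef and scoreDef, and returns how many attributes were removed. Writing the tree back to disk is left out here.

```python
import xml.etree.ElementTree as ET

def local(tag):
    """Strip a namespace such as '{http://www.music-encoding.org/ns/mei}note'."""
    return tag.rsplit("}", 1)[-1]

def strip_ppq(root):
    """Remove dur.ppq everywhere and ppq on staffDef/scoreDef; return the count removed."""
    removed = 0
    for el in root.iter():
        if "dur.ppq" in el.attrib:
            del el.attrib["dur.ppq"]
            removed += 1
        if local(el.tag) in ("staffDef", "scoreDef") and "ppq" in el.attrib:
            del el.attrib["ppq"]
            removed += 1
    return removed

# Hypothetical sample resembling import residue.
SAMPLE = """<score>
  <scoreDef ppq="48"><staffGrp><staffDef n="1" ppq="48"/></staffGrp></scoreDef>
  <section><measure n="1"><staff n="1"><layer n="1">
    <note pname="c" oct="4" dur="4" dur.ppq="48"/>
    <rest dur="4" dur.ppq="48"/>
  </layer></staff></measure></section>
</score>"""

root = ET.fromstring(SAMPLE)
print(strip_ppq(root))  # prints 4
```

The written dur values are left untouched, so the result can be reviewed as a plain attribute-removal diff.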
Reason:
- the edition is based on the facsimile and should preserve written notation, not generated timing data
- dur.ppq is usually import or playback residue and can disagree with the written dur/dots
- rhythmic proofreading should check whether the written durations fill the measure
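Rhythmic proofreading on written values alone can be sketched as follows. A note with dur=d and n dots has a written length of (1/d) × (2 − 1/2^n) of a whole note. The sketch uses Python's standard library only; the function names are hypothetical, the meter is passed in by hand rather than read from scoreDef, and tuplets, non-numeric dur values such as breve, and pickup measures are not handled.

```python
import xml.etree.ElementTree as ET
from fractions import Fraction

def local(tag):
    """Strip a namespace such as '{http://www.music-encoding.org/ns/mei}note'."""
    return tag.rsplit("}", 1)[-1]

def written_length(el):
    """Written duration as a fraction of a whole note, from dur and dots."""
    dots = int(el.get("dots", "0"))
    return Fraction(1, int(el.get("dur"))) * (2 - Fraction(1, 2 ** dots))

def layer_length(parent):
    """Sum written durations, descending into containers such as <beam>."""
    total = Fraction(0)
    for child in parent:
        if local(child.tag) in ("note", "rest", "chord", "space"):
            if child.get("dur") and not child.get("grace"):
                total += written_length(child)
        else:
            total += layer_length(child)
    return total

def misfilled_layers(root, count, unit):
    """Return (measure n, staff n, layer n, length) for layers that do not fill count/unit."""
    expected = Fraction(count, unit)
    issues = []
    for measure in root.iter():
        if local(measure.tag) != "measure":
            continue
        for staff in measure:
            if local(staff.tag) != "staff":
                continue
            for layer in staff:
                if local(layer.tag) != "layer":
                    continue
                # Whole-measure elements fill the bar by definition.
                if any(local(el.tag) in ("mRest", "mSpace", "multiRest") for el in layer):
                    continue
                got = layer_length(layer)
                if got != expected:
                    issues.append((measure.get("n"), staff.get("n"), layer.get("n"), str(got)))
    return issues

# Hypothetical 4/4 sample: measure 2 holds a half note plus a dotted quarter (7/8).
SAMPLE = """<section>
  <measure n="1"><staff n="1"><layer n="1">
    <note pname="c" oct="4" dur="4"/><rest dur="4"/><note pname="d" oct="4" dur="2"/>
  </layer></staff></measure>
  <measure n="2"><staff n="1"><layer n="1">
    <note pname="f" oct="4" dur="2"/><note pname="f" oct="4" dur="4" dots="1"/>
  </layer></staff></measure>
</section>"""

print(misfilled_layers(ET.fromstring(SAMPLE), 4, 4))  # prints [('2', '1', '1', '7/8')]
```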
Minimal checklist:
- does every timed event have a clear written dur where required?
- are dotted values encoded with dots?
- have dur.ppq and ppq been removed?