Corpora Work Progression

This is an old revision of the document!

This page tracks the current state of corpus work, known issues, recurring MEI encoding patterns, and links to general documentation.

Introduction to Git and GitHub

Introduction to MEI and XML schema

tba

mei-friend may create new branches if it cannot properly save work on the main branch.
Always check which branch you are saving your work on.

tba

Rests

In MEI, ordinary rests and full-bar rests are not the same thing.

<rest> = a non-sounding event with a specific written duration
<mRest> = a complete measure rest, independent of meter
<multiRest> = multiple consecutive complete-measure rests compressed into one symbol, typically in parts
<mSpace> = an explicitly empty measure, where no musical content is encoded but nothing is considered missing

When do we use <rest>?

Use <rest> when the source shows a rest with a specific notated duration inside the rhythmic flow of the layer.

Typical cases:

quarter rest inside a 4/4 measure
half rest followed by notes
a rest that is part of ongoing voice activity
silence in one voice while another voice continues

Example:

<layer n="1">
  <note pname="c" oct="4" dur="4"/>
  <rest dur="4"/>
  <note pname="d" oct="4" dur="2"/>
</layer>

When do we use <mRest>?

Use <mRest/> when the layer is silent for the entire measure and the source expresses that as a full-bar rest.

This is the preferred encoding for a complete measure rest because it does not depend on the current meter.

Example:

<measure n="12">
  <staff n="1">
    <layer n="1">
      <mRest/>
    </layer>
  </staff>
</measure>

Local recommendation

For our corpus work:

use <rest> for ordinary rests with explicit duration inside a measure
use <mRest/> when a whole measure is silent in that layer
do not replace a full-bar rest with a duration-based <rest> just because the meter happens to allow it
prefer <mRest/> when the notation is semantically a “whole measure rest”

Important restriction

A layer containing <mRest/> should not also contain notes or ordinary rests. Control events such as fermatas are encoded at measure level, outside the layer, so they may still be attached to the measure rest.
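
A minimal sketch of this restriction, assuming MEI 4 or later: the layer holds only the full-bar rest, and the fermata is a separate control event that points to it. The xml:id value mr12 is hypothetical.

```xml
<measure n="12">
  <staff n="1">
    <layer n="1">
      <!-- the layer contains the full-bar rest and nothing else -->
      <mRest xml:id="mr12"/>
    </layer>
  </staff>
  <!-- control events are children of <measure>, not of <layer> -->
  <fermata startid="#mr12" place="above"/>
</measure>
```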

Full-bar silence in different meters

A full-bar rest should usually still be encoded as:

<mRest/>

This remains true in different meters:

2/4
3/4
4/4
3/8
6/8
etc.

The point of <mRest> is that it means “this whole measure is silent”, regardless of how many beats the measure contains.
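
As an illustration, the sketch below shows the same <mRest/> in a 3/4 bar and, after a meter change, in a 6/8 bar. It assumes that the meter is declared with meter.count and meter.unit on <scoreDef>; the measure numbers are invented for the example.

```xml
<section>
  <!-- assumed context: an earlier scoreDef sets meter.count="3" meter.unit="4" -->
  <measure n="7">
    <staff n="1">
      <layer n="1">
        <!-- silent bar in 3/4: no dur attribute is needed -->
        <mRest/>
      </layer>
    </staff>
  </measure>
  <!-- meter change to 6/8 -->
  <scoreDef meter.count="6" meter.unit="8"/>
  <measure n="8">
    <staff n="1">
      <layer n="1">
        <!-- silent bar in 6/8: the encoding is identical -->
        <mRest/>
      </layer>
    </staff>
  </measure>
</section>
```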

What about multi-measure rests?

Use <multiRest> when the source compresses several complete silent measures into a single multiple-rest symbol.

Example:

<layer n="1">
  <multiRest num="9"/>
</layer>

Local recommendation for multi-measure rests

For our project:

use <multiRest> only when the source actually presents a compressed multi-measure rest
avoid <multiRest> in score-like encodings
if consecutive silent measures are shown individually in the source, encode them as separate measures with separate <mRest/> elements
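
For instance, silent bars that the source prints one by one could be encoded as in this sketch (measure numbers are illustrative):

```xml
<measure n="21">
  <staff n="1">
    <layer n="1">
      <mRest/>
    </layer>
  </staff>
</measure>
<measure n="22">
  <staff n="1">
    <layer n="1">
      <!-- each printed bar keeps its own measure and its own mRest -->
      <mRest/>
    </layer>
  </staff>
</measure>
```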

Empty measure vs. rest

Do not confuse:

<rest> = actual notated rest event
<mRest> = actual complete-measure rest
<mSpace> = explicitly empty measure with no encoded content

Use <mSpace> only when the layer is intentionally empty and this emptiness is itself what needs to be represented.
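
A hedged sketch of a case where <mSpace/> might apply, assuming a 4/4 bar in which the source prints music for the first voice and nothing at all, not even a rest, for the second:

```xml
<measure n="9">
  <staff n="1">
    <layer n="1">
      <!-- voice 1 carries the music -->
      <note pname="c" oct="5" dur="1"/>
    </layer>
    <layer n="2">
      <!-- voice 2 is deliberately left empty in this bar -->
      <mSpace/>
    </layer>
  </staff>
</measure>
```

If the source did print a full-bar rest for the second voice, <mRest/> would be the appropriate choice instead.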

Practical examples

Example 1: ordinary rest

<measure n="5">
  <staff n="1">
    <layer n="1">
      <rest dur="4"/>
      <note pname="e" oct="4" dur="4"/>
      <note pname="f" oct="4" dur="2"/>
    </layer>
  </staff>
</measure>

Example 2: full-bar rest

<measure n="6">
  <staff n="1">
    <layer n="1">
      <mRest/>
    </layer>
  </staff>
</measure>

Example 3: multiple-rest in a part

<measure n="20">
  <staff n="1">
    <layer n="1">
      <multiRest num="8"/>
    </layer>
  </staff>
</measure>

Project rule of thumb

Ask:

is this a rest with its own written duration inside the measure? → use <rest>
is the whole measure silent? → use <mRest>
are several full silent measures compressed into one sign? → use <multiRest>

Voice handling

In MEI, a <layer> is an independent stream of events on a staff. A staff may contain more than one layer in order to represent multiple voices.

What is a layer?

A layer is best understood as a single rhythmic and event stream within one staff.

For beginners:

one staff may have one layer
one staff may also have two or more layers
multiple layers are usually used when distinct voices must be represented independently

When does one staff need more than one <layer>?

Use more than one layer when the notation clearly contains multiple independent voices on the same staff.

Typical cases:

soprano and alto on one staff
tenor and bass on one staff
independent rhythms in upper and lower voices
overlapping note values that cannot be represented cleanly in a single stream
voice-specific ties, rests, slurs, or cues that belong to different voices

Do not create extra layers just because stems point in different directions once or twice if the passage is still best understood as one continuous voice.

How do we distinguish voices clearly?

At minimum:

give each layer its own n value
keep one voice consistently in the same layer as far as possible
avoid switching the same musical voice back and forth between layers without a good reason

Example:

<staff n="1">
  <layer n="1">
    <note pname="g" oct="4" dur="2"/>
    <note pname="a" oct="4" dur="2"/>
  </layer>
  <layer n="2">
    <rest dur="2"/>
    <note pname="e" oct="4" dur="2"/>
  </layer>
</staff>

Local project policy for layer numbering

Recommended local convention:

layer n="1" = upper voice or primary voice
layer n="2" = lower voice or secondary voice
keep this convention stable across the corpus
only add n="3", n="4", etc. when genuinely required

Shared stems and polyphonic overlap

Shared stems and polyphonic overlap are often visually complex, but the encoding priority should be:

represent the musical voices clearly
keep each independent voice in its own layer when needed
avoid forcing polyphonic notation into one layer if that makes duration or tie logic unclear

A useful practical rule is:

if two events behave like separate voices, encode them in separate layers
if the notation is only visually compressed but musically still one stream, keep one layer
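
The two cases can be sketched as follows, assuming 4/4 bars with invented pitches. In the first staff the two pitches share stem and rhythm, so a chord in one layer is enough; in the second the upper part moves independently over a held note, so two layers are used.

```xml
<!-- one stream: same rhythm on a shared stem, encoded as chords in one layer -->
<staff n="1">
  <layer n="1">
    <chord dur="2">
      <note pname="e" oct="4"/>
      <note pname="g" oct="4"/>
    </chord>
    <chord dur="2">
      <note pname="f" oct="4"/>
      <note pname="a" oct="4"/>
    </chord>
  </layer>
</staff>

<!-- two voices: quarter notes above a held whole note, one layer per voice -->
<staff n="1">
  <layer n="1">
    <note pname="g" oct="4" dur="4"/>
    <note pname="a" oct="4" dur="4"/>
    <note pname="b" oct="4" dur="4"/>
    <note pname="c" oct="5" dur="4"/>
  </layer>
  <layer n="2">
    <note pname="e" oct="4" dur="1"/>
  </layer>
</staff>
```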

Why this matters for ties and other relations

This matters because some relations are layer-sensitive.

For example:

a tie that starts in one layer should also end in that same layer

So unstable or inconsistent layer assignment can create problems later.

Recommended workflow

When deciding whether to split into layers, ask:

are there two independent rhythmic streams?
are there overlapping note values that imply separate voices?
are rests voice-specific?
do ties or slurs belong to separate voices?
would one-layer encoding become confusing?

If yes to several of these, use multiple layers.

Figured bass (Generalbass)

MEI supports figured bass through harmonic indication markup.

The key elements are:

<harm> = the harmonic indication as the attached object
<fb> = the figured-bass container
<f> = one individual figure or component inside the figured bass sign

Which elements do we use?

For our corpus, the default pattern should be:

<harm tstamp="1">
  <fb>
    <f>6</f>
  </fb>
</harm>

This means:

harm provides the attachment point
fb says this is figured bass / Generalbass
f holds the visible figure component

How do we align figured bass with notes or harmonic events?

Every <harm> must define a point of attachment using at least one of these attributes:

startid
tstamp
tstamp.ges
tstamp.real

The most common attachment methods are startid and tstamp.

For practical work, we recommend:

use startid when the figure clearly belongs to a specific encoded note or event
use tstamp when the figure is best attached to a beat position in the measure
prefer startid when stable note-level linking matters for editorial or computational reuse

Example with tstamp

<measure n="1">
  <staff n="1">
    <layer n="1">
      <note pname="c" oct="3" dur="1"/>
    </layer>
  </staff>
 
  <harm tstamp="1">
    <fb>
      <f>6</f>
    </fb>
  </harm>
</measure>

Example with startid

<measure n="1">
  <staff n="1">
    <layer n="1">
      <note pname="c" oct="3" dur="1" xml:id="b1"/>
    </layer>
  </staff>
 
  <harm startid="#b1">
    <fb>
      <f>6</f>
      <f>4</f>
    </fb>
  </harm>
</measure>

Ordering of figures

The order of f elements is significant. Figures should be encoded in the order they appear, usually top to bottom on the page.

So this:

<fb>
  <f>6</f>
  <f>4</f>
</fb>

is not just an arbitrary list; the order carries meaning.

Accidentals in figured bass

Accidentals can be encoded directly in the figure content as Unicode characters (♭, ♮, ♯).

Example:

<harm tstamp="1">
  <fb>
    <f>7♭</f>
  </fb>
</harm>
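
A figure may also consist of an accidental alone, which by convention applies to the third above the bass. A minimal sketch, assuming the source stacks a 6 over a sharp sign:

```xml
<harm tstamp="1">
  <fb>
    <f>6</f>
    <!-- accidental without a number, encoded as its own figure -->
    <f>♯</f>
  </fb>
</harm>
```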

Uncertain or editorially supplied figures

For local corpus policy, use the following distinction:

source-visible figures: encode directly in fb / f
editorially supplied figures: encode them, but mark editorial responsibility explicitly
uncertain readings: record the uncertainty in a way that remains visible in project documentation

Suggested local rule:

supplied figures must be marked as editorial
uncertain figures must be noted in comments or project documentation
do not silently normalize ambiguous source notation
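
One possible way to make these distinctions explicit in the encoding is MEI's editorial markup. The sketch below assumes that the project schema allows <supplied> and <unclear> inside <fb>; the identifiers b1, b2 and ed1 are hypothetical, and the choice of elements is a suggestion rather than settled project policy.

```xml
<harm startid="#b1">
  <fb>
    <!-- figure visible in the source -->
    <f>6</f>
    <!-- figure added by the editor; resp points to a hypothetical editor ID -->
    <supplied resp="#ed1">
      <f>4</f>
    </supplied>
  </fb>
</harm>

<harm startid="#b2">
  <fb>
    <!-- the reading is hard to make out in the source -->
    <unclear cert="medium">
      <f>7</f>
    </unclear>
  </fb>
</harm>
```

Whichever markup is used, the uncertainty should also be recorded in the project documentation, as the rule above requires.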

Recommended local policy

use harm + fb + f as the default structure
prefer startid for note-bound corpus work when possible
use tstamp when note-level linking is not practical
preserve figure order as written
keep editorial additions explicitly distinguishable from source readings

Ties

A tie connects two notes of the same pitch so that the first note sounds for the combined duration of both notes.

Basic principle

Use ties only when:

the connected notes have the same pitch
the notation indicates a tie rather than a slur
the sounding duration is continued across noteheads

How are ties encoded?

The simplest MEI method uses the tie attribute.

Allowed values:

i = initial
m = medial
t = terminal

Example:

<layer n="1">
  <note pname="f" oct="4" dur="2" tie="i"/>
  <note pname="f" oct="4" dur="4" dots="1" tie="t"/>
</layer>
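
The medial value is needed when a tie chain runs through more than two notes. A short sketch, assuming a 4/4 bar:

```xml
<layer n="1">
  <!-- the tie chain starts here -->
  <note pname="f" oct="4" dur="2" tie="i"/>
  <!-- this note both ends the first tie and begins the next -->
  <note pname="f" oct="4" dur="4" tie="m"/>
  <!-- the tie chain ends here -->
  <note pname="f" oct="4" dur="4" tie="t"/>
</layer>
```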

Ties across barlines

A tie may continue into the following measure.

Example:

<measure n="1">
  <staff n="1">
    <layer n="1">
      <note pname="g" oct="4" dur="2" tie="i"/>
    </layer>
  </staff>
</measure>
 
<measure n="2">
  <staff n="1">
    <layer n="1">
      <note pname="g" oct="4" dur="2" tie="t"/>
    </layer>
  </staff>
</measure>

Ties and layers

This point is crucial for corpus consistency:

a tie that starts in one layer must end in the same layer

So for local practice:

never begin a tie in layer 1 and end it in layer 2
if the voice continues, keep it in the same layer
check layer assignment before debugging a “broken” tie
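
A sketch of the intended practice, with invented pitches and measure numbers: the lower voice is tied across the barline and stays in layer 2 in both measures.

```xml
<measure n="3">
  <staff n="1">
    <layer n="1">
      <note pname="c" oct="5" dur="1"/>
    </layer>
    <layer n="2">
      <!-- the tie starts in layer 2 -->
      <note pname="e" oct="4" dur="1" tie="i"/>
    </layer>
  </staff>
</measure>
<measure n="4">
  <staff n="1">
    <layer n="1">
      <note pname="d" oct="5" dur="1"/>
    </layer>
    <layer n="2">
      <!-- the continuation stays in layer 2 -->
      <note pname="e" oct="4" dur="1" tie="t"/>
    </layer>
  </staff>
</measure>
```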

Ties on chords

The tie attribute can also be used on chord.

When used on a chord, it acts as shorthand for multiple ties on all unchanged pitches in the chord.

Example:

<chord dur="4" tie="i">
  <note pname="f" oct="4"/>
  <note pname="a" oct="4"/>
  <note pname="c" oct="5"/>
</chord>
<chord dur="4" tie="t">
  <note pname="f" oct="4"/>
  <note pname="a" oct="4"/>
  <note pname="c" oct="5"/>
</chord>

Local recommendation

For our project:

use tie on note for simple cases
use tie on chord only when the shorthand is genuinely clear
if only some notes of a chord are tied, encode ties on the individual note elements instead
check pitch identity carefully before calling something a tie
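
When only some chord notes are tied, the attribute moves from <chord> to the individual notes. A minimal sketch, assuming that f and a are held while the top note changes from c to d:

```xml
<chord dur="2">
  <note pname="f" oct="4" tie="i"/>
  <note pname="a" oct="4" tie="i"/>
  <!-- the top note is not tied because the pitch changes -->
  <note pname="c" oct="5"/>
</chord>
<chord dur="2">
  <note pname="f" oct="4" tie="t"/>
  <note pname="a" oct="4" tie="t"/>
  <note pname="d" oct="5"/>
</chord>
```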

Tie vs slur

Do not confuse:

tie = same pitch, sustained duration
slur = phrasing or articulation grouping, usually across different pitches

This distinction matters both musically and computationally.
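
The difference also shows in the encoding. In this sketch (a 4/4 bar with hypothetical xml:id values s1 and s2), the tie uses the tie attribute on two notes of the same pitch, while the slur is a separate <slur> element that connects two notes of different pitch.

```xml
<measure n="1">
  <staff n="1">
    <layer n="1">
      <!-- tie: same pitch, the g sounds once for the combined duration -->
      <note pname="g" oct="4" dur="4" tie="i"/>
      <note pname="g" oct="4" dur="4" tie="t"/>
      <!-- slur: different pitches, both notes sound -->
      <note xml:id="s1" pname="a" oct="4" dur="4"/>
      <note xml:id="s2" pname="b" oct="4" dur="4"/>
    </layer>
  </staff>
  <!-- the slur is a control event that points to its first and last note -->
  <slur startid="#s1" endid="#s2"/>
</measure>
```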

Minimal checklist

Before encoding a tie, ask:

are the connected notes the same pitch?
is this really a tie, not a slur?
does the tied continuation stay in the same layer?
if the notes are in a chord, are all pitches tied or only some?
