Transcriptional Protocols
General Introduction
Note:
The Transcriptional Protocols are under revision. If you are currently preparing an edition that you wish to conform to PPEA/SEENET standards, please contact the Archive to receive a draft copy of the current protocols.
What To Do First
The transcriptional protocols are written more as a reference work than as a continuous narrative, and were not intended to introduce new archivists to the arts of electronic editing. As a result, a great deal of redundancy has been built into many entries, which in some cases will duplicate material elsewhere. The narrative introduction to the protocols linked below is not yet complete, but will soon serve as a means of familiarizing archivists with the general features of the Transcriptional Protocols, as well as with common pitfalls and important caveats culled from long experience.
Conventions Used in This Documentation
Directly under the header of each section containing a direct citation of a TEI-conformant element, or any other type of code or routine with a standard, documented specification, you will also find a callout box with links to the specification itself, in this form:
External Specifications:
Generally, these links are intended for advanced users. They will take you not to the basic PPEA/SEENET documentation (which in any case will immediately follow the callout box in question), but rather to the international standard lying behind PPEA/SEENET practice. Following these links will not be necessary for a basic understanding of the Protocols, but rather will serve to inform more advanced discussion of proper usage and questions of the need for extension as these may arise. Even if there is no external link associated with a specification, hovering over its name will invoke a tooltip that will explain briefly what the standard is.
General examples illustrating some aspect of markup application, whether they are images or blocks of code, are contained in boxes with a darker background:
<sic>Dig Before You Call</sic>
<corr>Call Before You Dig</corr>
Specifications of markup such as the standard attributes of a given tag, and the standard values for those attributes, are contained within a similar box, shifted further to the right, with text in list form:
Standard values for the "agent" attribute of the <dig> tag:
- bulldozer
- backHoe
- spade
- trowel
- teaspoon
- sableBrush
The values listed in such boxes must be used rigorously in your markup, in exactly the form in which they appear in the bulleted list. Otherwise, later processing and display will be adversely affected. Should you need new attributes or values (or even whole new elements), as you would in this case if you had dug with a toothpick or lancet, you will need to review the TEI discussion on Conformance.
Finally, issues of special importance are highlighted in boxed paragraphs of their own, in this form:
Note:
Paragraphs such as these should always be read carefully, as they will contain tips on avoiding pitfalls that can cost you a great deal of time.
Beginning the Transcription
Version Control and Copy of Record
Maintaining Copy of Record, commonly known as "COR," is a matter of paramount importance, because failing to do so can lose you weeks, months or even years of work.
The problem arises when two or more people work on different but initially identical copies of the same file. In that situation, the work of only one of the people editing the file can be saved, because each person's copy will overwrite the original file when it is returned to the drive on which the edition is stored.
Transcribing each passus as an individual file has the advantage of allowing for greater ease in giving a team parts of an edition to work on individually and simultaneously, but even this method has proven subject to corruption of COR when more than one person was issued a copy of the same passus, but for a different work process.
As a result, if you allow anyone else to work on your edition, you will need to keep careful records of who has what files, when they were "checked out," when they were returned, how you vetted them before you allowed them to overwrite the old copies, and so forth.
It is wise to back up the state of your work, in more than one copy and in more than one location, before each COR transfer, and it is absolutely essential to record COR transfers and work done in comments at the head of each file.
Sometimes, but not always, corrupted COR can be repaired using an application such as Beyond Compare, which has a most generous trial use policy, and which is in any case quite inexpensive and very powerful.
Note:
Develop and adhere to policies for maintaining copy of record. Never vary from your policies once you have found them to work. Such policies should include documentation within the altered files and an exchange of emails detailing the transfer.
It also does not hurt to include some clear-cut physical symbol or gesture, or both, to accompany the transfer, such as a handshake and the handing over of some small token object (a poker chip with the manuscript sigil written on it, for example), since this will fix the event in memory better than the mere receipt of an email will. Such a token should be used in addition to email documentation, not in place of it. This is an easily mocked but anthropologically sound approach.
Always storing the COR of your edition on the same machine in the same directory will also increase the chances of clean COR transfers, as will always transferring COR over the same medium or media - a CD with a red label inside the jewelcase, for example.
Finally, if you do not wish to take maintenance of COR into account, do all work on paper or in separate .txt files and have one and only one person key or cut-and-paste that work into COR.
A New File for Each Passus
External Specifications:
Transcribe each passus into its own file. Because of the way the Archive's document type definition and its associated entity files have been written, you must name these files on the following pattern, where X is a hypothetical manuscript's sigil, and "passxx" is an abbreviation of "passus" and the passus number:
Xpass00.sgm, Xpass01.sgm, Xpass18.sgm, Xpass20.sgm, etc. (SGML, in old transcriptions)
OR
Xpass00.xml, Xpass01.xml, Xpass18.xml, Xpass20.xml, etc. (XML, for new transcriptions)
Designating the prologue as passus 00 (two zeros), and putting a zero before single-digit passus numbers, will result in the files being sorted in order in any directory in the DOS, Windows and Unix environments, which sort in ASCII order. Otherwise, Xprol.sgm/Xprol.xml will always appear at the end of the list, Xpass2.sgm/Xpass2.xml will be sorted with Xpass20.sgm/Xpass20.xml instead of following Xpass1.sgm/Xpass1.xml immediately (which will instead be followed by Xpass11.sgm/Xpass11.xml), and so forth.
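The effect of zero-padding on ASCII ordering can be checked with a short Python sketch. The helper name passus_filename is hypothetical and is not part of the Archive's tooling; it only illustrates the naming pattern described above.

```python
def passus_filename(sigil: str, passus: int, ext: str = "xml") -> str:
    """Build a file name such as Xpass01.xml (passus 0 stands for the prologue)."""
    return f"{sigil}pass{passus:02d}.{ext}"

# Zero-padded names sort in reading order under plain ASCII comparison.
padded = sorted(passus_filename("X", n) for n in (20, 2, 11, 0, 1))

# Unpadded names do not: passus 11 precedes passus 2, and the prologue lands last.
unpadded = sorted(["Xprol.xml", "Xpass1.xml", "Xpass2.xml", "Xpass11.xml", "Xpass20.xml"])
```

Printing the two lists side by side shows why the padded pattern is required rather than merely preferred.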
Setting up your edition by transcribing each passus into a separate file may seem a needless complication, but it has significant advantages over a single-file method, including management of Copy of Record when you want to have more than one person working on your edition at one time, and economies in use of memory during later machine processing.
Headers and Closing Tags for Each Passus
External Specifications:
Each passus, and each line group within any given passus, must be opened and closed with a tagset on the following model, which represents a one-line passus, the first passus in a hypothetical manuscript with sigil X:
<div1 type="passus" n="X1">
<head><hi rend="BinR"><foreign lang="lat"><hi
rend="rb">Passus primus de
visione</hi></foreign></hi></head>
<lg type="strophe">
<l id="X1.1" n="KD1.1"><hi rend="o5"><hi
rend="bl">W</hi></hi>hat þis Mountaigne bymeneþ
&punctus; and þe m<expan>er</expan>ke dale</l>
</lg>
<trailer><foreign><hi rend="rb">Explicit hic passus
paruissimvs Petri Ploghman</hi></foreign></trailer>
</div1>
Note:
The extra spaces between the lines in this example are for ease in reading the code only. They should not be added as hard returns in your transcription, or in any part of your markup. The Elwood browser and several scripts used for preparing editions rely on there being no extra hard returns.
If any line-wrap appears in the example above, it is invoked by your browser, and likewise does not represent part of the intended transcription or markup.
See also Passus Breaks.
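Because every opening tag in a passus file must be closed in the reverse order in which it was opened, a quick nesting check before parsing can save time. The following Python sketch is an illustration only, not one of the Archive's scripts: the function name first_nesting_error is hypothetical, it assumes XML-style markup (an unclosed SGML empty tag would be reported as unclosed), it ignores entity references, and it does not validate against the DTD.

```python
import re

# Matches an opening, closing, or self-closing tag; attributes may span lines.
TAG = re.compile(r"<(/?)([A-Za-z][\w.-]*)([^<>]*?)(/?)>")

def first_nesting_error(markup: str):
    """Return a message for the first mis-nested tag, or None if all tags balance."""
    stack = []
    for match in TAG.finditer(markup):
        closing, name, _attrs, self_closing = match.groups()
        if self_closing:
            continue  # e.g. <space/> opens and closes in one step
        if not closing:
            stack.append(name)
        elif not stack:
            return f"unexpected </{name}>"
        elif stack[-1] != name:
            return f"expected </{stack[-1]}> but found </{name}>"
        else:
            stack.pop()
    if stack:
        return f"unclosed <{stack[-1]}>"
    return None
```

A check of this kind catches crossed closing tags and forgotten forward slashes, the small errors that otherwise surface later as long lists of parser messages.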
Using NoteTab Pro from the Start
It is crucial to use a raw text editor for your transcription instead of a standard word processing program, because programs such as Microsoft Word and WordPerfect can introduce underlying code into your files that is difficult to remove later, even if you save as plain text.
NoteTab Pro is not only a clean raw text editor on the model of Wordpad or Notepad, but also has numerous features that can make your transcription much easier to do. You can for example open all the passus files at once and search them all simultaneously for a given form - and replace that form with something else, either globally or one instance at a time.
NoteTab Pro also has a clip library function - similar to the macros in Microsoft Word - that can be docked to the right or the left side of the workspace. The PPEA and SEENET have developed a custom clip library for their editors that allows automatic application of almost all tags to selected strings of text - a feature that prevents small errors, such as forgetting the forward slash in a closing tag, that can lead to large numbers of parsing errors in files.
NoteTab comes in both a paid Pro version for an astonishingly modest price, and in a somewhat reduced free version - NoteTab Light. Both are available at the NoteTab web site.
Text
Individual Characters and Graphs
Thorn, Yogh, and Other Non-ASCII Characters
External Specifications:
If you have access to an SGML or XML browser, use the appropriate entity references to represent these characters. The most common special characters in the Latin and Greek alphabets appear in both SGML and XML (Unicode) format in the Entity References section of the Technical Introduction.
It is imperative to use NoteTab Clips for the SGML or XML entity references of frequently used non-ASCII letters, both upper and lower case, and of some marks of punctuation, so as to keep them perfectly regular. If you do not have access to an SGML or XML browser, however, you may find proofreading easier if you make an initial transcription using the short and completely unambiguous alternate representations in the following list:
- thorn = @
- capital thorn = &T;
- yogh = #
- capital yogh = &#;
- eth = %
- capital eth = &%;
Each can later be converted to the appropriate Unicode entity reference with a global search/replace.
Note:
Under no circumstances should you use the numeral three (3) as a substitute for the entity reference for a yogh, since when it comes time to search and replace the three (3) with the proper entity reference, the numeral threes that appear in your line numbers, headers and notes will all be changed to yogh.
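The global search/replace of the stand-ins listed above can be sketched in Python as follows. The target values are an assumption: numeric character references for the corresponding Unicode letters are used here for illustration, and the Archive's own entity names may differ. The function name convert_placeholders is likewise hypothetical. The substitution is done in a single pass so that the "#" inside a newly written reference is never mistaken for a yogh stand-in.

```python
import re

# Assumed targets: numeric character references for the Unicode letters.
PLACEHOLDERS = {
    "&T;": "&#xDE;",   # capital thorn
    "&#;": "&#x21C;",  # capital yogh
    "&%;": "&#xD0;",   # capital eth
    "@": "&#xFE;",     # thorn
    "#": "&#x21D;",    # yogh
    "%": "&#xF0;",     # eth
}

# Longer tokens come first in the alternation so "&#;" is never read as "#".
_PATTERN = re.compile(
    "|".join(re.escape(key) for key in sorted(PLACEHOLDERS, key=len, reverse=True))
)

def convert_placeholders(text: str) -> str:
    """Replace every stand-in in one pass; replaced text is never re-scanned."""
    return _PATTERN.sub(lambda match: PLACEHOLDERS[match.group(0)], text)
```

Such a routine should be run only on the text of an initial transcription: a literal "%" or "#" elsewhere in a file (in an attribute value, for instance) would be converted as well.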
A Special Case
The manuscript form <3> may represent either <z> or yogh. Transcribe it as <z> when it stands for /z/ or /s/ and yogh when it represents /j/ or the velar spirant.
Allographs
Editors must decide whether it is worthwhile to record allographic forms in any given edition. For instance, the F scribe uses a long <s> as well as a sigma <s> and one other form. He has three forms of <r> which are in free variation and thus carry no information. If you decide that variant letter forms have significance and want to tag them, entity references are the most efficient way to do that; e.g. &sigmas;, &longs;, etc.
Note:
Check the charts of available characters on the Unicode web site, examine the characters available in the Junicode font, and confer with the other editors before you decide that it is necessary to make up your own entity.
Concerning the allographs <i> and <j>: Transcribe <i> as <i> and long <I> as <I>. Use <j> only at the end of series of minims, e.g. <iij> and <hij>, not <iiI> or <hiI>.
Note:
It is not necessary to use entity references for punctuation marks which are represented in the lower ASCII keyboard; all of the special punctuation entity references we presently use are listed in the Punctuation section.
Upper- and Lower-Case Letters
It is of great importance that you resolve how to handle transcription of ambiguous upper and lower case letter forms from the start.
Our policy is to follow the scribe's letterforms, but the issue is problematic because some characters have no distinctive upper and lower case forms. For instance, in many late ME hands there is no discernible upper and lower case distinction for <w>, <h>, or sigma <s>. Policy decisions with regard to capitalization can be made only after analysis of each individual manuscript.
For example, the F scribe clearly intended to emphasize the first letter in each line. After the second folio he marked each with a touch of red ink and in general, when distinctive case forms were available, he used upper case forms for the first character in each line. We have therefore used the modern typographic upper case character when, as in the case of <h>, the forms are indistinguishable. In the case of manuscript G, the scribe almost never began a line with a capital letter, so we normalize to the lower case when the letter forms are ambiguous.
Note:
Editors must decide upon a policy, document the policy explicitly, and then transcribe consistently.
Proper Nouns
The capitalization or non-capitalization of proper nouns will need to be decided on a case-by-case basis, judging by the scribal forms in each manuscript. In cases of non-dimorphic letters such as <w>, the editor should reason on the basis of the scribe's general usage. If a scribe appears to use capitalization for proper nouns in free variation with lower case forms, the editor should determine which the scribe chooses most often, and apply that form consistently to doubtful cases.
Punctuation
Use of Entity References
Use entity references for any scribal pointing not appearing in the lower ASCII keyboard, including the following:
- &emdash; - — - em dash
- &punctuselevatus; - punctus elevatus
- &raisedpoint; - &#xB7; - raised point
- &tilde; - ~ - tilde
- &sol; - / - solidus or virgule
- &para; - ¶ - paraph marker
- &tildeamp; - &̃ - tilded ampersand representing "and" against "et" in some manuscripts
Editorial Punctuation
Punctuation introduced by the editor, such as that in the notes, should use the lower ASCII keyboard. In addition, four characters that were formerly represented by entity references in SGML will be represented by their simple lower ASCII equivalents in XML. These are:
- &para; - ¶ - paraph marker
- &thorn; and &Thorn; - þ and Þ - lower and upper case thorn
- &tilde; - ~ - tilde
- &sol; - / - solidus or virgule
Note:
Additionally, the entity references &lpar; and &rpar; (SGML) or the corresponding numeric character references (XML/Unicode) should be used within notes to represent left and right parentheses, as doing so will distinguish them from the plain ASCII left and right parentheses used in the initial transcription to set off expanded suspensions. In this way, the ASCII parentheses can easily be replaced globally with opening and closing <expan> tags without also changing the parentheses in notes.
A full chart of the entity references used by the Archive is available in the Technical Matters page.
Shadow Hyphens and the TEI <seg> Element
Note:
One special case of editorial pointing in the text portion of the transcription (outside of notes) occurs in cases of ambiguous spacing of compounds and participles such as "a bout" or "y nempned." Here a hyphen is added, but it is tagged as a "shadow hyphen," thus:
a<seg type="shadowHyphen">-</seg>bout
For more on this sort of hyphenation, see Word Division.
The general rule for marking up punctuation is that any form that might need to be suppressed in a given stylesheet will need to be marked up with a <seg> tag. Other forms of punctuation will not be marked up.
Types of Punctuation that Are Marked Up
- shadow hyphens (the hyphens between the parts of compounds such as a-bout)
- swung dash and other line fillers (possibly)
- any other form of punctuation that might need to be suppressed by one or more stylesheets (possibly)
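The reason for wrapping suppressible punctuation in a tag is that later processing can then remove or display it mechanically. A minimal Python sketch, assuming the shadow hyphen markup shown above; the function names are hypothetical and stand in for whatever a stylesheet would do:

```python
import re

# The attribute may be separated from the tag name by a space or a line break.
SHADOW_HYPHEN = re.compile(r'<seg\s+type="shadowHyphen">-</seg>')

def suppress_shadow_hyphens(markup: str) -> str:
    """Drop editorial shadow hyphens; untagged scribal hyphens are untouched."""
    return SHADOW_HYPHEN.sub("", markup)

def show_shadow_hyphens(markup: str) -> str:
    """Keep the hyphen itself but strip its <seg> wrapper."""
    return SHADOW_HYPHEN.sub("-", markup)
```

An untagged hyphen could not be handled this way, since nothing would distinguish it from a hyphen written by the scribe.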
Tilded and Plain Ampersand
Some scribes appear to distinguish between <&> with and without tilde, using the one with a tilde in the English text for "and/ond" and the other without in the Latin and French text for "et." We can record these with &~ if it seems useful, replacing this later with the &tildeamp; (SGML) or &̃ (XML) entity reference. Or we can simply describe the practice in the introduction.
Note:
See the special note at Editorial Correction to a Witness on how to handle a double or single tick, a paraph marker or a "cc" indicator in the margin where an intended rubricated paraph was never drawn.
Note:
We represent scribal punctuation with a space on either side of medial points, and on the left side of terminal points.
Displaying Angle Brackets as Such
You will occasionally want to use angle brackets in notes. Entity references must be used so that the browser will not mistake the contents of the brackets for a botched SGML or XML tag. To display <X>, for example, enter &lt;X&gt;.
These entity references are not as cryptic as they might seem, since "&lt;" refers to a "less than," and "&gt;" to a "greater than" symbol. Likewise, "&lpar;" and "&rpar;" refer to "left parenthesis" and "right parenthesis."
Un-rubrished Paraph Indicators
Note:
In some manuscripts, paraphs are sometimes indicated by one or more indicators, such as a full paraph marker or single or double ticks, in plain text ink, which were then missed by the rubrisher.
In such cases, simply transcribe a &para; (SGML) or ¶ (XML) without adding color tags such as <hi rend="rb"></hi>.
Roman Numerals
External Specifications:
Roman numerals in the text or titles are tagged to appear as numerals in the diplomatic text and as words in the critical text. Use the <orig> (=original) and <reg> (=regularized) tags in tandem as follows:
<orig>xij</orig><reg>twelf</reg>
<orig>xij</orig><reg>duodecimus</reg>
Roman numerals in formework or marginalia require no <reg> tags.
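How a tandem <orig>/<reg> pair yields two different displays can be sketched in Python. This is an illustration of the principle only: the function names are hypothetical, and the Archive's actual stylesheets are not reproduced here.

```python
import re

# An <orig> element immediately followed by its <reg> partner.
PAIR = re.compile(r"<orig>(.*?)</orig><reg>(.*?)</reg>", re.DOTALL)

def diplomatic(markup: str) -> str:
    """Keep the scribal form and discard the regularization."""
    return PAIR.sub(r"\1", markup)

def critical(markup: str) -> str:
    """Keep the regularized form and discard the scribal form."""
    return PAIR.sub(r"\2", markup)
```

The same pairing is what allows a single transcription to serve both the diplomatic and the critical text.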
Ambiguous or Illegible Characters
External Specifications:
Use the <unclear> tag to indicate where characters are unclear or ambiguous. If they are unclear due to a scribe's attempt at deletion, usually by erasure or overwriting, <unclear> tags should be nested within <del> tags. (See Deletions, especially Example 5.) If the characters cannot be discerned at all, use <damage> or <supplied> tags instead.
Standard attributes of <unclear> are as follows:
- "reason" indicates why the material is hard to transcribe. Standard attribute values are:
  - ill-formed
  - torn
  - faded
  - rubbed
  - smeared
  - overbound
  - stained
- "resp" indicates the editor responsible for the transcription of the unclear text. The default will be the initials of the named editor(s) as they appear in the TEI header, and need not be recorded.
- "cert" signifies the degree of certainty ascribed to the unclear text.
- "hand" indicates the hand responsible for action that created the difficulty in transcription, where determinable. See the Identifying Hands section.
- "agent" signifies the causative agent for the difficulty. Standard values include:
  - water
  - mildew
Sample tagging:
<unclear reason="ill-formed" hand="hand1">unclear material</unclear>
<unclear reason="faded" cert="60%">unclear material</unclear>
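Since standard attribute values must be used in exactly the form listed, a simple check for stray values can be useful before an edition is processed. The following Python sketch tests only the "reason" attribute of <unclear> against the standard list above; the function name nonstandard_reasons is hypothetical, and the pattern assumes double-quoted attribute values.

```python
import re

STANDARD_REASONS = {
    "ill-formed", "torn", "faded", "rubbed", "smeared", "overbound", "stained",
}

# Captures the value of reason="..." wherever it sits inside an <unclear> tag.
UNCLEAR_REASON = re.compile(r'<unclear\b[^<>]*?\breason="([^"]*)"')

def nonstandard_reasons(markup: str) -> list:
    """Return each "reason" value on <unclear> that is not in the standard list."""
    return [
        value
        for value in UNCLEAR_REASON.findall(markup)
        if value not in STANDARD_REASONS
    ]
```

The comparison is deliberately case-sensitive, so a value such as "Faded" is reported rather than silently accepted.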
Spaces and Gaps
External Specifications:
Note:
For spaces between words, see Word Division.
Use the <space> tag (<space/> in XML) to indicate where space is left vacant for characters (most often seen where an intended ornamental capital was never made). If the space is due to an erasure, see the Deletions section. Standard attributes for <space> are as follows:
- "dim" indicates whether the space is horizontal or vertical.
- "resp" indicates the editor who identified and measured the space. The default will be the initials of the named editor(s) as they appear in the TEI header, and need not be recorded.
- "extent" indicates the size of the gap. We have thus far used the imprecise unit of the space required for a character in the scribal hand, although you may describe the area affected in inches, millimeters, folios, or whatever makes sense.
Sample tagging:
<space dim="horizontal" extent="6"> (SGML)
<space dim="vertical" extent="2 lines"/> (XML)
Note that this tag has no content and therefore need not be closed in SGML, but will need the closing forward slash in XML (<space/>).
Stray Marks, Blots, Stains, and Flourishes
External Specifications:
Most flourishes deemed by the editor not to be textually significant are neither transcribed nor tagged, though they should always be mentioned in the description of the paleographic features of the hand in the Front matter, and may be described in a paleographical note in the transcription. Unintentional marks, ink trails, blots, smears, bleed-through, or offset ink are neither noted nor transcribed in any way unless they are textually significant.
In F, for example, we recorded in provisional notes a number of apparently otiose curls written erratically over various letters throughout the manuscript, for though we could not determine what they meant, they appear to have been intended. We did not, however, tag them.
Note:
Provisional notes or tagging can be developed and applied to marks that may or may not have a significance that will emerge only upon a broader inspection of the text in relation to them. Such provisional notes or tagging must be rigorous and absolutely regular, in order to make it possible to strip out the markup in the event that the marks are indeed insignificant.
A number of recent projects have presented markings which, although of ambiguous significance, proved to be worthy of markup, in some cases experimentally, in others with standard TEI/PPEA markup. In each instance, a different approach was taken, as our thinking on this topic has developed gradually. Such instances include the small bars added in the margins of L (as in F, these were recorded in the introduction and in notes, but not marked up), the large line filler dashes in M (marked up with a standard TEI <seg> tag), and several types of markings in Ht (tagged experimentally as a means of analysis). Earlier, in the edition of manuscript L, prominent marginal bars, clearly deliberate but of no clear significance, were recorded only in a table in the introduction. Careful consideration should always be given before deciding to mark up something not usually tagged under the protocols.
Words
Word Division
External Specifications:
Overview
Medieval word division is not consistent with modern usage, and scribal inconsistency of spacing further complicates the matter. Handwritten documents do not present the uniform spacing that modern readers of print have come to expect. In order both to facilitate machine collation and to assist users who will search for specific words, we have decided to resolve the problems of word division by reference to probable scribal meaning. Even in the diplomatic transcriptions, we will not attempt to represent scribal spacing unmediated by reference to meaning.
We follow the word-division of the manuscript as far as practicable, though no attempt is made to represent the variety of spacing between words and letters. The interpretation of the scribe's word-division, though it is generally unambiguous, is occasionally a matter of fine judgment. There is sometimes no obvious space between the indefinite article and the following noun, e.g. afreman (M20.145), but there seems no good purpose in recording these as a single word. A hyphen in the transcription (indicated with <seg type="shadowHyphen">-</seg>) indicates a space in the manuscript within a word, or a compound or phrase conventionally hyphenated today. In doubtful cases we have followed OED and MED to distinguish compounds from phrases. Conversely, some phrases, in particular or elles, at ese and at ones, are written as one word: orelles, atese, atones. Phrases like those are marked with <orig> and <reg> tags; e.g.,
<orig>orelles</orig><reg>or elles</reg>
Although we will not attempt to represent every nuance of scribal spacing in the transcript, we will include detailed information in the linguistic descriptions which will accompany each text. It is, therefore, important for each transcriber to collect data during the initial transcription. Provisional tagging and notes will call attention to problematic instances or spacing which requires the transcriber's interpretation. The <orig> / <reg> and <note> tags will be useful at this stage, and may be removed or retained when final decisions about the manuscript are made.
Examples:
We decided not to represent the scribal "tothe" produced (once) by the scribe of MS F but silently regularized it to "to the." Although not tagged in any way in the transcription, this interpretation is noted in the linguistic information.
Markup
Note:
Since it is necessary to distinguish the hyphens used by a scribe from those introduced by the editor as a convenience for identifying compounds, each instance of an editorial hyphen in the transcription should be tagged as follows:
a<seg type="shadowHyphen">-</seg>bout
Compounds
In general, we will attempt to represent -emic significance. Compounds, therefore, present a special case. Historically in a state of transition, they may not always concur with modern usage. In addition, within the same hand, a given "word" may appear variously as a single word or two. In such cases, we will indicate scribal spacing as accurately as possible by using a hyphen to indicate the spaces that separate the morphs. For example, we will follow the scribe in transcribing either "beleve" or "be<seg type="shadowHyphen">-</seg>leve."
Close Calls
Close calls: When there is a small space, smaller than that between words and larger than that between letters, after one- and two-letter prefixes such as bi-, by-, to-, or a- (apeyre, among, aboue, etc.), the initial transcriber will decide whether to use the shadow hyphen (<seg type="shadowHyphen">-</seg>) or to represent one word.
Provisional notes can provide important data about scribal patterns. See the Editorial Notes section.
Note:
Line fillers should not be transcribed in these or any other instances.
Participles
Grammatical considerations: The graphs "y" and "i" as markers of the past participle are morphemic. When they appear to be separated from the verb stem by a space, we will represent that space by a shadow hyphen (<seg type="shadowHyphen">-</seg>), thus y<seg type="shadowHyphen">-</seg>wroȝt, I<seg type="shadowHyphen">-</seg>blessed, y<seg type="shadowHyphen">-</seg>nempned, etc.
Note:
Be careful not to represent instances of the first person singular pronoun + preterite verb with a hyphen.
Allegorical Names
Allegorical Names: In the documentary texts, shadow hyphens will indicate instances where a scribe has separated the morphs of the central allegorical names Do<seg type="shadowHyphen">-</seg>wel, Do<seg type="shadowHyphen">-</seg>bet, and Do<seg type="shadowHyphen">-</seg>best. Regardless of spacing, other allegorical names will not be hyphenated. This policy applies only to the documentary editions, not to later critical editions.
Abbreviations
External Specifications:
Use the <expan> tags to indicate the resolution of standard abbreviations; e.g. p<expan>ro</expan>p<expan>ter</expan>. Editors without an SGML or XML browser may find that proofreading is made easier if the expanded material is put between parentheses initially; e.g. p(ro)p(ter). The essential rule here is that one records the interpretation in tags or parentheses, leaving unambiguous graphs outside them. Eventually parentheses will be replaced with <expan> and </expan>. Highly unusual abbreviations can be placed between tags or parentheses, but each should be followed by a paleographic note. See the Editorial Notes section.
Note:
If you use lower ASCII parentheses to indicate <expan> elements that will be globally substituted later, you must indicate any parentheses that are to remain as such with the entities &lpar; (left parenthesis) and &rpar; (right parenthesis), in order to keep them from being replaced with <expan> tags.
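The later global substitution of parentheses can be sketched in Python as below. This is a minimal illustration rather than the Archive's own routine: the function name parens_to_expan is hypothetical, the pattern assumes that provisional parentheses are never nested, and it assumes that parentheses meant to survive have been entered as the entities &lpar; and &rpar;.

```python
import re

# A pair of plain ASCII parentheses with no parentheses inside.
PAREN_GROUP = re.compile(r"\(([^()]*)\)")

def parens_to_expan(text: str) -> str:
    """Turn provisional (…) expansions into <expan>…</expan> elements."""
    return PAREN_GROUP.sub(r"<expan>\1</expan>", text)
```

Because the pattern looks only for literal "(" and ")", parentheses entered as entity references pass through unchanged, which is the point of the precaution described in the note above.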
The various forms of <&c.> and <&> will be represented by the entity reference &amp;. We will indicate scribal spacing, either joining or separating the <&> and <c>. We will also indicate a dot if it appears. The <c.> should be expanded. Some possible combinations are as follows:
- &amp; c<expan>etera</expan>
- &amp;c<expan>etera</expan>
- &amp; c<expan>etera</expan>.
- &amp;c<expan>etera</expan>.
Some scribes distinguish the ampersand indicating the English "and" from one indicating the Latin "et" by introducing a tilde over the ampersand for the English word. This may be indicated with its own entity reference:
- &tildeamp; (SGML)
- &̃ (XML/Unicode)
Note:
The spelling of expansions will have to be regularized based on the spellings of words that are written out. A frequent case in which variation occurs is in the plural, where the scribe's dialect may motivate -es, -is or even -us. The regularization should represent the majority form.
Word Brevigraphs
External Specifications:
SGML and XML allow us to identify the brevigraph we are expanding in the <expan> tag with the attribute abbr. Examples of some of the most common are:
- <expan abbr="Ihu&tilde;">Iesu</expan>
- <expan abbr="xpi&tilde;">christi</expan>
- <expan abbr="Ihs&tilde;">Iesus</expan>
- <expan abbr="xpo&tilde;">christo</expan>
If you choose not to identify the brevigraph, use parentheses or <expan> without abbr as in the Abbreviations section; e.g. (Iesu) would eventually become <expan>Iesu</expan>.
Note:
If you use lower ASCII parentheses to indicate <expan> elements that will be globally substituted later, you must indicate any parentheses that are to remain as such with the entities &lpar; (left parenthesis) and &rpar; (right parenthesis), in order to keep them from being replaced with <expan> tags.
Note:
The spelling of word brevigraphs will have to be regularized based on the spellings of words that are written out. A frequent case in which variation occurs is in the word "Christ," which may also be spelled "Crist." The regularization should represent the majority form.
Language Shifts
External Specifications:
Use <foreign> tags to mark Latin/French/German text:
- <foreign lang="lat">nota</foreign>
- <foreign lang="fre">plus chaud</foreign>
- <foreign lang="ger">Schriftsprache</foreign>
Note:
Various means of highlighting text (by changes of script or ink or by underlining, etc.) using the <hi> tag are often associated with changes in the language. The <hi> tags we use to indicate such highlighting may be nested within the <foreign> tags, or vice versa, though arbitrary nesting of these tags is not without consequences for later processing and display.
The following example shows tagging for "Anima" written in textura, in red ink, in a red box:
<hi rend="BinR"><foreign lang="lat"><hi rend="tx"><hi
rend="rb">Anima</hi></hi></foreign></hi>
The order of tagging in this example - first boxing, then foreign language, then other <hi> - is preferable because of the way in which all or most display technologies such as CSS, XSL, Perl and any other language that relies on the matching of regular patterns will locate and style or process such features. A regular - and thus an expected and predictable - order of nesting will greatly facilitate later display and analysis of your edition.
Occasionally the foreign text and the highlighting are not conterminous, and this introduces a common complication regarding tag nesting. The example below shows <hi> tags nesting within <foreign> tags, where the phrase "in Infernum" appears with the Latin "in" outside the red box enclosing the rubricated, textura "Infernum."
<foreign lang="lat">in <hi rend="BinR"><hi rend="tx"><hi
rend="rb">Infernum</hi></hi></hi></foreign>
If <foreign>
tags are nested within <hi>
tags, as in
the first of the following examples (where one English word and one Latin word are in a red
box), the result will parse perfectly, since the DTD allows for such nesting, but in
the cases of boxing, underlining, or any other highlighting that forms a continuous line
across white space, the second order of nesting is necessary in order to make the styling
appear to be continuous, the way it most likely would in a manuscript:
<hi rend="BinR"><foreign
lang="lat">Satisfaccio</foreign>dobest</hi>
produces
(Red box)Satisfaccio(Red box) (Red box)dobest(Red box)
<foreign lang="lat"><hi
rend="BinR">Satisfaccio</hi></foreign><hi rend="BinR">
dobest</hi>
produces
(Red box)Satisfaccio dobest(Red box)
Note:
Be aware that <foreign>
and <hi>
tags do not carry
over past
</l>
tags, so each Latin line will need to be tagged
separately, though breaking a line solely with <lb> tags - i.e. when line numbering several manuscript lines as a single
Latin line in relation to Kane-Donaldson - does not require the use of additional
<foreign>
tags. For an in-depth explanation of the use of
<lb>
tags in conjunction with <foreign>
tags,
as well as sample code, see the section on line numbering
pitfalls in the Line Breaks section that follows.
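A rough check for tags that run past a line boundary can be sketched in Python. The function name is hypothetical; the sketch assumes one transcription held in a single string and simply compares the number of opening and closing tags inside each pair of line tags, so it is a heuristic and not a parser.

```python
import re

# Content of each <l>...</l>; "<l\b" does not match <lb> or <lg>.
_LINE = re.compile(r"<l\b[^>]*>(.*?)</l>", re.DOTALL)

def unbalanced_lines(text, tags=("foreign", "hi")):
    """Return the content of every <l> whose open/close counts differ for a tag."""
    problems = []
    for match in _LINE.finditer(text):
        body = match.group(1)
        for tag in tags:
            opens = len(re.findall(rf"<{tag}\b[^>]*>", body))
            closes = len(re.findall(rf"</{tag}>", body))
            if opens != closes:
                problems.append(body)
                break
    return problems
```

A line broken only by an empty line break tag still counts as one line here, so a single pair of language tags around it passes the check.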
Unique Readings
External Specifications:
The TEI <app>
element is used to encode unique readings. In the case of
unique readings, the "wit" attribute of the <lem>
tag contains only the
sigil of the manuscript in question. The "wit" attribute of the <rdg>
tag
may contain only the following standard values:
- all other mss
- most mss
When the unique reading is clearly the result of a scribal error, <sic>
tags nest inside of the <lem>
tag to indicate that the variant is
unintentional. Recording a scribe's unintentional errors is a means of determining their
type and frequency, leading to a clearer picture of the scribe and his habits.
A complete TEI <app>
tag array is as follows:
<app><lem wit="Dx">kynge</lem><rdg wit="all other
mss">knyght</rdg></app>
Patterns such as frequent omission of unstressed monosyllables, miscounting of minims, or transposition of words or letters may aid in distinguishing between two scribes with similar handwriting, or they may corroborate the identification of additional samples of the same hand in other manuscripts.
A textual note may also accompany this tagging, explaining the significance of the reading.
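The tag array for a unique reading can be assembled mechanically. The following Python sketch uses a hypothetical helper name and argument names; it only restates the rules given in this section (one sigil on the lemma, one of the two standard values on the reading, an optional nested error tag).

```python
from xml.sax.saxutils import escape, quoteattr

# The only two values this section allows on the reading's "wit" attribute.
ALLOWED_RDG_WIT = {"all other mss", "most mss"}

def unique_reading(sigil, lemma, reading, rdg_wit="all other mss",
                   scribal_error=False):
    """Build the <app> array for a unique reading in manuscript `sigil`."""
    if rdg_wit not in ALLOWED_RDG_WIT:
        raise ValueError(f"non-standard wit value: {rdg_wit!r}")
    lem_text = escape(lemma)
    if scribal_error:
        # An unintentional variant is wrapped in <sic> inside <lem>.
        lem_text = f"<sic>{lem_text}</sic>"
    return (f"<app><lem wit={quoteattr(sigil)}>{lem_text}</lem>"
            f"<rdg wit={quoteattr(rdg_wit)}>{escape(reading)}</rdg></app>")
```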
Layout
Line Breaks
External Specifications:
Each line in each manuscript in the Archive will be assigned its own number, which will
become the "id" attribute value for that line, and must therefore be globally unique. Hence,
even if the line is repeated elsewhere verbatim, each of these instances will receive its
own unique number. An additional set of attribute values, those in the "n" or "name"
attribute of the <l>
element will correlate the line numbers to those in
the Athlone editions (initially), and eventually to those in the archetype and the critical
text. The "n" attribute may therefore repeat at times, if the same line has been copied into
a manuscript more than once.
<l ID="Q1.1" n="KDP.400">First instance of line in MsQ corresponding to
hypothetical KDP.400</l>
<l ID="Q1.2" n="KDP.400">Second instance of line in MsQ corresponding to
hypothetical KDP.400</l>
Note:
A concordance of parallel lines in the A/Ax, B/Bx and C/Cx texts is under development by the Archive, and is currently in the alpha test stage, to be published as a reference by SEENET after its beta test.
As in the example above, the format for a tag at the beginning of a line always has both an "id" and an "n" attribute, the "id" corresponding to the absolute numerical position of the line in the manuscript, and the "n" (name) corresponding to the line number of the parallel line in the relevant Athlone edition:
<l id="F1.3" n="KDP.4">
Our line number (the "id" of <l>
) in this example is F1.3 which
corresponds to KD Prologue line 4 (recorded in the "n" attribute).
Note:
F has skipped a line, causing its line numbers (like its passus numbers) to be out of sync with Kane-Donaldson. The line number and passus number may frequently differ from those of the parallel line in an Athlone edition.
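Since the "id" must be globally unique while the "n" value may repeat, a quick duplicate check is useful before submitting a file. This Python sketch is not the Archive's numbering program; the function name is hypothetical, and it assumes quoted attribute values and treats the attribute name case-insensitively.

```python
import re
from collections import Counter

# The id (or ID) attribute of each <l> start tag; <lb> and <lg> are skipped.
_L_ID = re.compile(r'<l\b[^>]*?\bid="([^"]+)"', re.IGNORECASE)

def duplicate_line_ids(text):
    """Return, sorted, every line id that occurs more than once."""
    counts = Counter(_L_ID.findall(text))
    return sorted(line_id for line_id, n in counts.items() if n > 1)
```

Repeated "n" values are deliberately not reported, because the same Athlone line may legitimately be copied twice.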
Each line is ended with the tag </l>
. We have in Charlottesville a program
for inserting both the line numbers and the line terminal tag.
Since the introduction and placement of line break tags is predicated on the assignment of line numbers and editorial decisions as to what constitutes a line in a given manuscript, further discussion appears under the head Line Numbering in the next section.
Line Numbering Special Issues
External Specifications:
Assigning line numbers is fraught with potential pitfalls. The first is
in determining what constitutes one line. In Latin passages, it is not always clear whether
the scribes intended to write prose, or even verse, as separate lines or as run-ons; i.e., a
physical line break may represent the medieval equivalent of either a "soft return" or "hard
return." The scribe's use of boxes, upper or lowercase letters, terminal punctuation and
grammatical structures may provide clues to his conception of line divisions in long Latin
quotations. Each intended line will be enclosed in
<l>
tags as shown above, without regard to its physical arrangement.
Where the physical line breaks do not correspond to the scribe's perception of a new line, we
will insert a TEI line break element,
<lb>
(SGML) or
<lb/>
(XML). Consider this example from MS L:
<l id="L13.47" n="KD13.45&agr;"> Vos qui peccata hominum comeditis nisi
pro eis lacrimas & oraciones <lb/> effunderitis . ea que in delicijs
comeditis . in tormentis euometis</l>
The sentence structure seems to dictate that the line would not end at "oraciones." That the
scribe would agree is demonstrated by the indentation of "effunderitis" and his decidedly
lower case <e>. (The above example has been stripped of all tags but the ones under
discussion. In fact, a <foreign>
tag is opened before "Vos" and closed
after "euometis.")
The line break element, <lb>
or <lb/>
, should be used
to represent a line break within all <marginalia>
,
<fw>
, <add>
and <l>
tags as
well as within notes citing more than one line of text. Since <lb>
/
<lb/>
is an empty element, it does not need a closing tag in its SGML form, though it does need the
special form with the forward slash in XML. As a
default, it will cause a line break to occur at the point at which it is inserted, under any
stylesheet we may develop. In special cases, however, such as in notes, an
<lb/>
may be given an "n" attribute value that can be used to
distinguish it from other line breaks, making it possible to suppress or add a line
break, or insert a pipe character as needed.
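The difference between the SGML and XML forms of an empty element is mechanical, so a file can be converted in one pass. The Python sketch below is an illustration with a hypothetical function name; it assumes attribute values contain no angle brackets and handles only the element names passed to it.

```python
import re

def sgml_empty_to_xml(text, names=("lb", "milestone")):
    """Rewrite <lb> as <lb/> (and likewise for the other names given)."""
    # Group 1 is the element name, group 2 its attributes (possibly empty).
    pattern = re.compile(r"<(%s)\b([^<>]*?)\s*/?>" % "|".join(names))
    return pattern.sub(lambda m: f"<{m.group(1)}{m.group(2)}/>", text)
```

Tags that already carry the forward slash are left as they are, so the function can safely be run twice.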
As soon as the transcription has been finished and properly prepared, the PERL scripts for line numbering and tagging should be run.
Paragraph and Strophic Breaks
External Specifications:
The tag <lg type="strophe">
is to be inserted at the beginning of a
strophe and </lg>
at the end.
In some manuscripts, strophes are marked with paraphs or skipped lines or both. Record these,
with <lb>
for skipped lines and ¶ for paraph markers. In most
manuscripts, these paraphs are in red, green, and blue. Where the editor has access to the
manuscript or a color facsimile, the colors should be recorded in
<hi>
tags; e.g. <hi
rend="bl">¶</hi>
.
Note:
See the special note at Editorial Correction to a Witness on how to handle a single or double paraph tick, a parasign, or a "cc" paraph indicator in the margin where the paraph was never drawn or rubricated.
Passus Breaks
External Specifications:
The tag <div1 type="passus" n="Xpass[number]">
--where "X" is the sigil of
the manuscript and "[number]" is the number of the passus--should precede the transcription
of each passus. The final item, the content of the "n" attribute, will of course change with
each passus. The closing tag </div1>
is inserted at the end of the passus
after any trailer, if there is one. <div1>
</div1>
is always the outermost container of each passus file.
Where non-standard passus divisions occur, indicate where passus divisions appear in the
archetype with this tag: <milestone unit="Bpassus" n="[number]">
(SGML) or <milestone
unit="Bpassus" n="[number]"/>
(XML). The
milestone tag is always empty, so it does not need to be closed in SGML, but in an XML document, it requires the special forward-slash
format: <milestone/>
.
Note:
The line numbering program ignores <milestone>
tags, so it is safe to
insert them before sending the file for line assignment.
Foliation
External Specifications:
Between the bottom of each leaf and the top of the next, supply a tag such as this:
<milestone unit="fol." n="36v" entity="M036v">
(SGML)
<milestone unit="fol." n="36v" entity="M036v"/>
(XML)
The transcription of folio 36v follows the tag. Since the <milestone>
or
<milestone/>
tag is always empty, marking only the beginning point of
the folio boundary, it need not be closed in SGML, but requires the forward slash format in XML.
Note:
The line numbering program ignores <milestone>
tags, so it is safe to
insert them before sending the file for line assignment.
The entity attribute value of the <milestone>
/
<milestone/>
above refers to the hyperlinked image for folio 36v of
manuscript M. Make certain that your images are named on the regular pattern
sigil-folio-side, since this will make it easier to set up a regular and accurate pattern of
entity naming for the images and their links in the edition.
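A regular naming pattern makes the foliation tag easy to generate. In this Python sketch the helper name is hypothetical, and the three-digit zero padding of the folio number is an assumption inferred from the single example "M036v" above; adjust it to whatever pattern your image files actually follow.

```python
import re

def folio_milestone(sigil, folio, xml=True):
    """Build a foliation milestone for, e.g., sigil 'M' and folio '36v'."""
    match = re.fullmatch(r"(\d+)([rv])", folio)
    if match is None:
        raise ValueError(f"expected a folio such as '36v', got {folio!r}")
    number, side = match.groups()
    # Assumed pattern: sigil + folio number padded to three digits + side.
    entity = f"{sigil}{int(number):03d}{side}"
    close = "/>" if xml else ">"
    return f'<milestone unit="fol." n="{folio}" entity="{entity}"{close}'
```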
Forme Work
External Specifications:
The <fw>
element identifies material added by the scribes or printers to
indicate codicological structure, such as headings, top-of-page titles, catchwords,
corrector's marks, guide words for the rubricator in the margin, etc. Attributes of
<fw>
include:
An "id" attribute is always added as a unique identifier for each instance of formework. This id will be assigned by a Perl script after the other editorial work is completed.
For "type" use only the following categories:
- running head
- page (for the page number)
- fol (for the folio number)
- sig (=signature)
- qSig (=quire signature)
- lSig (=leaf signature)
- catch (=catchword)
- cor (where the corrector "signs off" on a gathering)
- guideWords (where scribe has written instructions for the rubricator)
- guideLetters (where the scribe has inserted a guide for the ornamented capital)
For "place" use only the following categories:
- inline
- supralinear
- sublinear
- marginLeft
- marginRight
- topLeft
- topCenter
- topRight
- bottomLeft
- bottomCenter
- bottomRight
Note that the categories are written as one word, camel cased in all of these sample forme work tags except for "running head":
-
<fw type="catch" place="bottomRight">And then</fw>
-
<fw type="sig" place="bottomCenter">g iij</fw>
-
<fw type="cor" place="bottomLeft">coret</fw>
-
<fw type="running head" place="topCenter">Piers Plowman</fw>
-
<fw type="guideWords" place="marginRight"><foreign lang="lat">Passus primus de visione</foreign></fw>
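Because only the listed "type" and "place" values are permitted, and their spelling and casing matter, a simple lookup can catch slips. The Python sketch below copies the two lists from this section; the function name is hypothetical.

```python
FW_TYPES = {
    "running head", "page", "fol", "sig", "qSig", "lSig",
    "catch", "cor", "guideWords", "guideLetters",
}
FW_PLACES = {
    "inline", "supralinear", "sublinear", "marginLeft", "marginRight",
    "topLeft", "topCenter", "topRight",
    "bottomLeft", "bottomCenter", "bottomRight",
}

def check_fw(fw_type, place):
    """Return a list of problems with a forme work tag's attribute values."""
    errors = []
    if fw_type not in FW_TYPES:
        errors.append(f"unknown type: {fw_type!r}")
    if place not in FW_PLACES:
        errors.append(f"unknown place: {place!r}")
    return errors
```

The comparison is case-sensitive on purpose, so "bottomright" is reported even though "bottomRight" is valid.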
We do not include modern foliation in our transcription but characterize it in the description of the manuscript in the introduction.
Highlighting and Appearance
The <hi>
element is used to describe the various ways scribes might call
attention to text, such as by changes in script, size, or color, by underlining or boxing,
etc. The only attribute we will need is "rend."
Be aware that <hi>
tags, like <foreign>
tags, do not
carry over past </l>
tags, so each line will need to be tagged
separately. (See the note on <hi>
tag nesting for
examples of how <hi>
and <foreign>
tags may be
nested.) If the highlighted text was added after initial copying, the
<hi>
tags should be nested within
<add>
tags.
Ornamental Capitals
External Specifications:
<hi rend="o8">N</hi>Ow
This tag marks an ornamental capital "N" of 8 lines height followed by a capital "O" and lower case "w". Note that the <o> is the letter <o>, not the digit zero. We do not specify width.
Changes of Script
External Specifications:
In the following example, "Danyel" is written in textura:
<hi rend="tx">Danyel</hi>
We are interpreting shifts in type of script as being for the purpose of emphasis or
highlighting. The TEI actually has a
<handShift>
(SGML) /
<handShift/>
(XML) tag, but as an empty tag, it is less
suitable to our use than one might imagine.
<emph>
tags can be used instead of
<hi>
tags. I chose the latter because it is more non-committal (HND).
Note:
Standard reference works on scripts include:
- Michelle P. Brown, A Guide to Western Historical Scripts from Antiquity to 1600, London: The British Library, 1990
- M. B. Parkes, English Cursive Book Hands, 1250-1500, Oxford: Clarendon, 1969
- Jean F. Preston and Laetitia Yeandle, English Handwriting, 1400-1650: An Introductory Manual, Binghamton, N.Y.: Medieval & Renaissance Texts & Studies, 1992
Rubricated and Other Color-Highlighted Words and Phrases, and Otherwise Highlighted Text
External Specifications:
-
<hi rend="rb">Dowel</hi>
-
<hi rend="tr">Dowel</hi>
-
<hi rend="tr">D</hi>owel
-
<hi rend="bl">D</hi>owel
-
<hi rend="gr">D</hi>owel
In the first example, "Dowel" is rubricated. In the second, the black letters are touched with red ink. In the third example, the "D" alone is touched in red. In the last two examples the initial <D> is written in blue and green ink, respectively.
Underlined Words
External Specifications:
The tagging in the following example indicates that "Glotoun" is underlined in text ink (or the color is unknown to the transcriber):
<hi rend="ul">Glotoun</hi>
If a scribe clearly intends to underline a word, tag the whole word even if the line begins after the first letter or ends before the last.
Boxed Words and Phrases
External Specifications:
Examples of <hi>
tagging for boxing are:
<hi rend="boxed">Repentaunce</hi>
<hi rend="BinR">Repentaunce</hi>
Common abbreviations for contents of rend attributes
- lc = Lombard Cap
- o[number] = ornamented capital, N lines high
- bigger[number] = taller than usual letter, N lines high
- br = brown ink
- gr = green ink
- bl = blue ink
- rb = rubricated
- tr = touched in red
- tg = touched in green
- tx = textura
- ul = underlined with color unspecified or text ink
- ur = underlined in red
- ulANDol = underlined and overlined with color unspecified or text ink
- ulrANDolr = underlined and overlined in red
- boxed = boxed with color unspecified or text ink
- BinR = boxed in red
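The abbreviations above can be held in a small lookup table, for example to generate tooltips or to catch mistyped values. This Python sketch is illustrative only; the function name is hypothetical, and the descriptions are copied from the list.

```python
import re

REND = {
    "lc": "Lombard Cap", "br": "brown ink", "gr": "green ink",
    "bl": "blue ink", "rb": "rubricated", "tr": "touched in red",
    "tg": "touched in green", "tx": "textura",
    "ul": "underlined with color unspecified or text ink",
    "ur": "underlined in red",
    "ulANDol": "underlined and overlined with color unspecified or text ink",
    "ulrANDolr": "underlined and overlined in red",
    "boxed": "boxed with color unspecified or text ink",
    "BinR": "boxed in red",
}

def describe_rend(value):
    """Expand one rend abbreviation, including o8 / bigger3 style values."""
    if value in REND:
        return REND[value]
    match = re.fullmatch(r"(o|bigger)(\d+)", value)
    if match:
        kind = ("ornamented capital" if match.group(1) == "o"
                else "taller than usual letter")
        return f"{kind}, {int(match.group(2))} lines high"
    raise ValueError(f"unknown rend value: {value!r}")
```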
Add example here of boxing versus flourished underline, including picture.
Three Special Instances of <hi>
Tag Use
External Specifications:
Note:
There are three values for the "rend" attribute of <hi>
that we will use in notes only. They are: "bold" (for the A, B and C of
A-Text, B-Text and C-Text), "sup" (for superscript characters, usually in manuscript
sigils) and "it" (italic, for quotation from the transcription). Do not use these values
in the transcription itself.
Example: "Other <hi rend="bold">B</hi> manuscripts read . . . ."
For a complete discussion of the handling of notes within the transcription, please see the section on Editorial Notes.
Scribal Changes to a Manuscript
Damage
External Specifications:
In general, we record only damage made after the manuscript was first written. Those defects already in or on the vellum or paper and written around are not textually significant. We record damage only if it makes the text unclear or illegible.
If the damaged text cannot be transcribed with certainty, use
<unclear>
tags.
In a s<unclear agent="water">omer se</unclear>soun
If it is completely illegible (cropped, for example), use
<supplied>
tags to record the damage and supply the missing text,
though only if you wish to supply such text.
The <unclear>
or <supplied>
elements may be used
instead of or in addition to the <damage>
tag. Possible attributes and
their values are as follows:
- "type" describes the damage. Standard attribute values are:
- torn
- cropped
- faded
- rubbed
- smeared
- stained
- overbound
- creased
- "agent" signifies the cause of the damage. Standard attribute values are:
- water
- mildew
- "extent" indicates the size of the damaged area. We have thus far used the imprecise unit of the space required for a character in the scribal hand, although you may describe the area affected in inches, millimeters, folios, or whatever makes sense.
- "resp" refers to the transcriber who makes the decision about the existence, type, and extent of the damage. The default will be the initials of the named editor(s) as they appear in the TEI header, and need not be recorded.
- "hand" indicates the scribal hand responsible for the damage, where determinable. So far, we have had no occasion to use this attribute, but sample values would follow the hand designations declared in the TEI header, as follows:
- hand1
- hand2
- handx
Examples:
sapien<damage type="cropped"><supplied source="other B
manuscripts">ter</supplied></damage>
<damage type="stained">wandrynge</damage>
Additions
External Specifications:
Note:
If you have a transcription with markup finished before the publication of manuscripts L
and O (November 2004), the marginalia are most likely recorded in
<add>
tags within notes, and will have to be moved into
<marginalia>
tags.
Six Elements Mark-Up Additions
There are six elements that we use to tag additions, <add>
,
<addSpan>
, <fw>
, <head>
,
<trailer>
and <marginalia>
.
<add>
and <addSpan>
The <add>
tag serves to mark up phrase level text and
<addSpan>
to mark up larger blocks.
Note:
Effective as of July 2003
Use <add>
tags only for words and phrases introduced into the text
after the initial copying (whether by the original scribe, contemporary or later
scribes).
Use
<marginalia>
for all marginalia copied into the manuscript at the initial time of its
production.
Do not in any case use <add>
for material you have added to the text.
(See the Editorial Intervention section.)
Use add tags for textual matter added to the text after the initial transcription. Forme work
and marginalia tags can also be marked when necessary with <add>
tags,
though in many cases it will be impossible to identify the hand responsible for the addition
to the text.
The attribute "place" designates the point at which the addition is made. Use only the following values:
- inline
- supralinear
- sublinear
- marginLeft
- marginRight
- topLeft
- topCenter
- topRight
- bottomLeft
- bottomCenter
- bottomRight
Note that the designations are written as one word, with camel-casing, exactly as they
appear. This is of importance for later processing and display. Other attributes of the
<add>
element are:
- "hand" identifies the scribe who made the addition. See the Identifying Hands section.
- "resp" identifies the editor or transcriber who identified the hand. The default will be the initials of the named editor(s) as they appear in the TEI header, and need not be recorded.
- "cert" signifies the transcriber's degree of certainty as to the identification of the hand.
The following examples illustrate tagging for words of the poem omitted during initial copying but subsequently supplied, the first above the line by the original scribe, the second careted and written in the right margin by an unidentified hand:
<add place="supralinear" hand="hand1">for</add>
<add place="marginRight" hand="handx">Dowell</add>
You may wish to reiterate or supplement the information in <add>
tags with
a note, as in the sample line below:
Right so <add place="supralinear" hand="hand1">bi</add><note
type="textual"> All other manuscripts omit <hi rend="it">bi</hi>,
added above the line in W.</note> persons and preestes
See the Editorial Notes section for further discussion on the tagging and nesting of notes.
Marginalia Cited in Legacy Notes
Note:
Because the <add>
tag used to be used to mark up marginalia, recorded
inside codicological notes, legacy markup of this kind will have to be moved into
<marginalia>
tags, which are documented in the Marginalia section.
If the added material is not meant to be part of the text, put it into
<marginalia>
or <fw>
tags as appropriate. You will
in many cases wish to attach an explanatory note as in the following instance where
"Stretford" is written in the left margin by a later hand.
<marginalia id="XP.14m1" place="marginLeft" hand="hand3">Stretford<note
id="XP.14m1n1" type="codicological"><ref>XP.14:</ref> A
sixteenth-century hand has added <hi rend="it">Stretford</hi> in the
left margin.</note></marginalia> <l id="XP14" n="etc.
Since hand3 is already identified in the header and introduction as a sixteenth-century hand and the information about place and hand appears in the display if asked for, the discursive note is probably unnecessary.
<addSpan>
The <add>
tag is used for short sequences of text, single words, or
phrases. <addSpan>
must be used for larger level additions because
<add>
tags do not carry over past structural boundaries like
</l>
. (See the Line Breaks section.)
<addSpan>
has the same attributes as <add>
, with
the addition of the attribute "to," which refers to the spot where the added material ends.
(There is also the possibility of using the attribute "type," but that would be used only if
the added text is not on an original manuscript page.) Instead of the expected closing tag,
<addSpan>
tags are closed by an <anchor>
tag
placed at the end of the span of added text. If, for example, two lines were omitted in the
body of the text and added by the original scribe in the bottom margin, the tags might
appear as follows:
<l id="L5.257" n="KD5.252"><addSpan place="bottomCenter" hand="hand1"
TO=addend01> And haue ymade many a knyƷte . bothe mercere &
draper<expan>e</expan></l>
<l id="L5.258" n="KD5.253"> þat payed neuere for his prentishode .
nouƷte a peire gloues <anchor id=addend01></l>
Note:
The value in the "to" attribute and <anchor>
id may not be a line
number or any other element present elsewhere. We will use "addend" + a number. Also,
this value is not in quotation marks like all others we use. Finally, the
<addSpan>
and <anchor>
tags must each be
within <l>
tags, as in the above example.
<head>
and <trailer>
Headers and trailers such as the passus headers and explicits are marked up with
<head>
and
<trailer>
tags, which are documented in the Headers and Closing Tags for Each
Passus section.
<fw>
Running titles, guide words, catchwords and signatures and any other forme work are marked up
with <fw>
tags, documented in the Forme Work
section.
<marginalia>
Marginalia are marked up with the <marginalia>
tag, documented in the
Marginalia section.
Deletions
External Specifications:
As with <add>
and <addSpan>
above, we use two elements
to tag deletions, <del>
and <delSpan>
.
<del>
serves to mark up phrase level text and
<delSpan>
to mark up larger blocks.
Use <del>
</del>
tags where a word or passage is deleted or marked for deletion by
a scribe, annotator, or corrector. The content of these tags may be either the characters
that were deleted, if they are legible either under white or ultraviolet light, or symbolic
if they are not legible. Symbolic representations of deleted characters should be supplied
as follows: one period (.) per character up to five characters when it is possible to
determine or guess the number of characters deleted, ...?... for deletions of six to a dozen
characters, and ...?...?... for deletions of one half-line or more. In some cases, readers
should be told in a paleographical note to consult the manuscript
images, and should be given a link to the relevant image.
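The symbolic representations can be summarized in a small Python helper. The function and parameter names are hypothetical, and the sketch reads "six to a dozen" as six to twelve characters. The rule above gives no symbol for a deletion longer than twelve characters but shorter than a half-line, so the sketch refuses that case and leaves the decision to the editor.

```python
def deletion_placeholder(chars=None, half_line_or_more=False):
    """Return the symbolic content for an illegible deletion."""
    if half_line_or_more:
        return "...?...?..."
    if chars is None or chars < 1:
        raise ValueError("give an estimated number of deleted characters")
    if chars <= 5:
        return "." * chars          # one period per character
    if chars <= 12:
        return "...?..."            # six to a dozen characters
    raise ValueError("no symbol is defined for this length short of a half-line")
```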
If the deleted text is unclear,
<unclear>
tags may be nested within
<del>
tags (as shown in Example 4 below).
<unclear>
tags give the option of expressing your degree of
confidence in the reading. If the deleted text can be easily read, or, at the opposite
extreme, cannot be read at all, there is no need to insert <unclear>
tags.
Note:
The <del>
tag will be followed by <add>
tags (previous section) where a scribe has deleted and then
substituted text. (See Examples 2-5 below.) This ordering is not
only logical, but also has consequences for later processing and display.
Standard attributes and attribute values for the <del>
tag are as
follows:
- "rend" indicates how the deletion was made in the text. Standard attribute values are:
- subpunction
- erasure
- overwritten
- linedThrough
- bracketed
- ul
The "intended" attribute value, which indicates a deletion that was intended but not realized, must always be accompanied by a note. (See Example 5.)
- "type" is synonymous with rend. We have chosen to use rend.
- "resp" indicates the editor responsible for identifying the hand of the deletion. The default will be the initials of the named editor(s) as they appear in the TEI header, and need not be recorded.
- "hand" identifies the scribe responsible for the deletion. See the Identifying Hands section.
- "cert" indicates the degree of certainty in attributing the deletion to a hand.
Example 1) The word "Plowman" struck through with nothing
substituted: <del rend="linedThrough"
hand="hand1">Plowman</del>
Example 2) The character "t" subpuncted in the word "clept" and
replaced in the right margin with "l": c<del
rend="subpunction" hand="hand2">t</del><add place="marginRight"
hand="hand2">l</add>ept
Example 3) The letter "e" erased and replaced with "y":
<del rend="erasure">e</del><add
place="inline">y</add>
Example 4) Some letter, probably an "o," overwritten with "n":
<del rend="overwritten" hand="hand1"> <unclear
cert="80%">o</unclear></del><add place="inline"
hand="hand1">n</add>
Example 5) The original scribe wrote "for egre" where all other
manuscripts have "ful egre." A partially scraped corrector's mark in the margin
indicates that the error was noticed. A scribe (not the text scribe) has added correct
"ful" in the right margin, without deleting "for." <del hand="hand3"
rend="intended">for</del><add place="marginRight"
hand="hand3">ful</add><note resp="hnd" type="codicological">A
partially scraped corrector's cross in the left margin indicates that the correction
was intended but not carried out.</note>
The <del>
tag is used for short sequences of text, single words, or
phrases. <delSpan>
must be used for larger level deletions because
<del>
tags do not carry over past structural boundaries like
</l>
. <delSpan>
has the same attributes as
<del>
, with the addition of the "to" attribute, which refers to the
spot where the deletion ends. Because <delSpan>
and
<addSpan>
are similar, the example shown in the previous section for
<addSpan>
should be helpful.
Editorial Intervention
Editorial Alterations to the Text
Text Supplied
External Specifications:
You may use <supplied>
tags where text is missing or completely illegible
and you can supply it by reference to another source or by conjecture.
Standard attributes for <supplied>
are as follows:
- "reason" indicates why the material had to be supplied. Standard values for this attribute are:
- torn
- faded
- rubbed
- smeared
- overbound
- stained
- patched
- "resp" indicates the editor responsible for supplying the letter, word, or passage
contained within the
<supplied>
element. The default will be the initials of the named editor(s) as they appear in the TEI header, and need not be recorded.
- "source" states the source of the supplied text (the editor's initials in the case of conjecture).
- "hand" indicates the scribal hand responsible for the damage that obliterated the text, where determinable. So far, we have had no occasion for this attribute.
- "agent" signifies the causative agent for the loss of text, where determinable. Standard values are:
- mildew
- stained
- water
Note:
Note the crucial distinction between the easily confused "hand" and "agent" attributes.
Sample <supplied>
tags:
<supplied reason="cropped" source="other B
manuscripts">Wh</supplied>er
Not<supplied reason="overbound" source="hnd">a</supplied>
Editorial Correction to a Witness
External Specifications:
TEI-conformant
SGML and XML permit several different ways of marking corrections to the base manuscript by
an editor. We have chosen to use the <sic>
<corr>
tags in tandem to show the manuscript reading and our emendation,
respectively, because this will provide the widest range of display options. Note the use of
square brackets in the following examples:
<sic>for</sic><corr>for [hem]</corr>
<sic>kaue</sic><corr>k[n]aue</corr>
Note:
In instances where a single or double tick, a paraph marker, or a "cc" indicator was
written in the margin to indicate to the rubricator where the parasign should go, but it
was never drawn, simply record a parasign--¶
--without adding
<hi rend="rb">
(rubric) tagging.
The <corr>
tag may include the "resp" and "cert" attributes if this more
complicated markup seems necessary or useful:
<sic>kaue</sic><corr resp="hnd"
cert="100%">k[n]aue</corr>
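When an emendation only inserts text, as in both examples above, the square brackets allow a simple consistency check: removing the bracketed material from the correction should give back the manuscript reading. This Python sketch uses a hypothetical function name and does not apply to emendations that replace or delete letters.

```python
import re

def bracketed_matches_sic(sic, corr):
    """True if deleting the [bracketed] insertions from corr gives back sic."""
    stripped = re.sub(r"\[[^\]]*\]", "", corr)
    # Compare with runs of white space collapsed and trimmed.
    return " ".join(stripped.split()) == " ".join(sic.split())
```

A mismatch usually means either that a bracket was forgotten or that the manuscript reading was mistyped.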
Editor Refrains from Correcting a Witness He Thinks Is Mistaken
External Specifications:
A <sic>
tag may be used without <corr>
if the editor
elects not to correct the text. <sic>seten to seten to</sic>
A more complicated tag can be used if the editor wishes not to display a correction but does want to record his opinion.
<sic resp="hnd" corr="my[s]chief" cert="99%">mychief</sic>
In this example, "cert" indicates the degree of certainty ascribed by a spineless HND to what he believes is the correct reading, but the displayed text will contain the erroneous reading "mychief." Note that a style sheet can be crafted that will display either the tag or the attribute, so the emendation is in fact made here but not displayed.
Editorial Notes
External Specifications:
Editorial Notes within <l>
Tags
The Medieval Academy of America and the Chicago Style Manual
Note:
Bibliographies and notes must conform to the style manual of our publisher, the Medieval Academy of America. In doubtful cases, the Medieval Academy refers authors to the Chicago Manual of Style, 15th ed. (2003).
Note:
Protocols regarding when to make a note, when to record in markup only, and when to have both are forthcoming.
IDs and <ref>
Tags for Notes within <l>
Tags
We will use <note>
tags with nested <ref>
tags
indicating line number, as follows:
<note type="textual">
<ref>M20.65:</ref>
Content of note.</note>
All notes must also be assigned an ID number, the value of which is based on the line number in which the note appears. Thus the note above, if it were the first note in the line, would receive the following ID:
<note
id="M20.65n1"
type="textual"><ref>M20.65:</ref> Content of
note.</note>
Subsequent notes on this line would be numbered "M20.65n2," "M20.65n3," and so forth.
Note:
Since these ID values are best assigned by a script, they can be added to an edition at the final stage of work, rather than by hand as the editor works.
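Such a script might look like the following Python sketch. It is not the Archive's own script; the function name is hypothetical, and it assumes that it is given the markup of one line at a time and that none of the notes in it has an ID yet.

```python
import re

def number_notes(line_id, line_markup):
    """Give each <note> in one line the id <line_id>n1, <line_id>n2, ..."""
    counter = 0

    def add_id(match):
        nonlocal counter
        counter += 1
        # Keep any existing attributes (group 1) after the new id.
        return f'<note id="{line_id}n{counter}"{match.group(1)}>'

    return re.sub(r"<note\b([^>]*)>", add_id, line_markup)
```

Numbering follows the order in which the notes occur in the line, matching the "n1," "n2," "n3" sequence described above.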
So far we have designated the following note types: codicological, paleographic, linguistic, lexical, historical, source, theological, and textual.
Order of Sigils Listed in Textual Notes
Sigils should be listed in textual notes with the base text sigil first, followed by the beta and alpha sigils in the following order:
WHmCrGYOC2CBLMRF becomes WLMCrHmCGOC2YBRF, where "W" would be the base manuscript.
This sigil order is based on the stemma constructed by Robert Adams.
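The reordering can be done mechanically. In this Python sketch the function name is hypothetical; the order list is copied from the example above, and a sigil is assumed to be one capital letter optionally followed by one lowercase letter or digit, which covers every sigil in that example.

```python
import re

# Order taken from the example above, with W as the base manuscript.
STEMMA_ORDER = ["W", "L", "M", "Cr", "Hm", "C", "G", "O", "C2", "Y", "B", "R", "F"]

def order_sigils(sigils, base):
    """Return the sigils with the base first and the rest in stemma order."""
    tokens = re.findall(r"[A-Z][a-z0-9]?", sigils)
    if base not in tokens:
        raise ValueError(f"base sigil {base!r} is not in {sigils!r}")
    rank = {sigil: i for i, sigil in enumerate(STEMMA_ORDER)}
    rest = sorted((s for s in tokens if s != base), key=rank.__getitem__)
    return base + "".join(rest)

if __name__ == "__main__":
    print(order_sigils("WHmCrGYOC2CBLMRF", "W"))  # WLMCrHmCGOC2YBRF
```

A sigil missing from the order list raises a KeyError, which is a prompt to extend the list and not a silent guess.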
We will follow the convention of displaying the Piers Plowman A, B, and C designations in
bold type: <hi rend="bold">B</hi>
. Other useful values for
<hi rend>
in notes are "sup" (=superscript) and "it" (=italic).
<note type="textual"><ref>W3.83:</ref> W alone reads <hi
rend="it">enpoisone</hi>. All other <hi
rend="bold">B</hi> manuscripts read <hi
rend="it">poisone</hi>, except OC<hi rend="sup">2</hi> which
have <hi rend="it">punyschen</hi>.</note>
See the Punctuation section for how to display angled brackets in notes.
A provisional note one you intend to remove after some issue is resolved may take the following simplified form:
<note> Content of note.</note>
Note:
For notes attached to marginalia, formework, guideletters and other secondary matter in the transcription, see the section on Editorial Notes on Matter Other than Primary Text.
Editorial Notes on Matter Other than Primary Text
External Specifications:
Often, you will need to make a note on an element of the manuscript other than the main body of the text, such as formework, headers, trailers and marginalia. Such notes need to be kept to a consistent format as in any other note, except that their <ref> cannot be keyed to the ID of the <l> element in which they appear, since they do not in fact appear inside of an <l>.
Note:
Since ID values are best assigned by a script, they can be added to an edition at the final stage of work, rather than by hand as the editor works. <ref> content, however, should be added by the editor at the time the note is generated.
Four Basic Conventions: Headers, Interlinear Elements, Formework & Trailers
In such cases, we have developed a single convention with four variants: one for note elements appearing before or within the header of a passus, a second for elements appearing after the headers, either before the first line or between the <l> elements within a passus, a third for formework, and a fourth for those appearing in conjunction with trailers.
Note:
In every case detailed below, the note should be nested inside the element on which it comments, just as notes on passages inside of <l> elements are always nested inside the <l>.
Notes Before or Within a <head> Element
We encode notes appearing before or within the <head> element of a passus with the fictitious line number zero (0) as the content of their <ref> element. Thus, the note in MsM on the marginalium Assit principio... that appears before the first line of the poem is encoded with the content of the <ref> element set to "MP.0:" and with IDs based on this fictitious line number zero (0):
<marginalia id="MP.0m1"
place="topCenter" hand="handx">[element content]
<note id="MP.0mn1"
type="codicological" place="unspecified" anchored="yes">
<ref>MP.0:</ref>
The heading is written in a similar ink to that of the
text...</note></marginalia>
The ID values are not as cryptic as they may seem. "MP.0m1" represents "Manuscript M, Prologue, line zero, marginalium number 1." Likewise, "MP.0mn1" represents "Manuscript M, Prologue, line zero, marginalium number 1, note number 1."
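The anatomy of these ID values can be spelled out in code. The following is a sketch under stated assumptions, not part of the Archive's tooling: the function name parse_id and the SUFFIXES table are hypothetical, the suffix meanings are inferred from the examples in this section, and the manuscript sigil is passed in explicitly because a sigil such as "C2" could otherwise be confused with the passus number.

```python
import re

# Suffixes seen in the examples of this section, with inferred meanings.
SUFFIXES = {
    "n": "note",
    "m": "marginalium",
    "mn": "note on a marginalium",
    "h": "head",
    "hn": "note on a head",
    "fw": "formework",
    "t": "trailer",
    "tn": "note on a trailer",
}


def parse_id(value, sigil="M"):
    """Split an ID such as 'MP.0mn1' into its labelled parts."""
    pattern = re.escape(sigil) + r"(P|\d+)\.(\d+)([a-z]+)(\d+)"
    match = re.fullmatch(pattern, value)
    if match is None or match.group(3) not in SUFFIXES:
        raise ValueError(f"not a recognised ID: {value!r}")
    passus, line, suffix, number = match.groups()
    return {
        "manuscript": sigil,
        "passus": "Prologue" if passus == "P" else int(passus),
        "line": int(line),
        "kind": SUFFIXES[suffix],
        "number": int(number),
    }


parsed = parse_id("MP.0mn1")
print(parsed)
```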
The note itself should be nested inside of the element on which it comments, in this case, inside the <marginalia> element. A note on the content of the header itself would be encoded on exactly the same model, and would also be nested inside of the <head> element:
<head id="M2.0h1">
[element content]
<note id="M2.0hn1"
type="codicological" place="unspecified" anchored="yes">
<ref>M2.0:</ref>
No blank line follows this rubric, which is
centered.</note></head>
Notes on Marginalia Appearing After a <head> Element
Notes that need to be made on marginalia appearing after the <head> element (generally between <l> elements) should be assigned the line number of the line nearest to them, with <marginalia> tags and their contents placed immediately above that line in the transcription.
<l id="M2.114" n="KD2.112">Munde þe Mellere...</l>
<marginalia id="M2.115m1"
place="marginRight" hand="hand3">[marginalium]
<note
id="M2.115mn1"><ref>M2.115:</ref>
[note]</note></marginalia>
<l id="M2.115" n="KD2.113">In þe date...</l>
Note:
In this example and the one following, line wraps have been added to clarify where elements begin and end. Additional line wraps may also be invoked by your browser if your screen is set to a resolution below 1024x768. In no case should such extra "hard returns" be added to your transcription.
Notes on Formework Appearing After a <head> Element
Formework appearing at the top of a leaf takes the ID of the first line on the leaf, but the catchwords and signatures of various kinds at the foot of a leaf take the number of the last <l> on the leaf:
<l id="M2.130" n="KD2.128">Ȝe shul abiggen it boþe . by god þat me made .
</l>
</lg>
<fw id="M2.130fw1"
type="catch" place="bottomRight">Wel ȝe wyten wernardus</fw>
<fw id="M2.130fw2"
type="cor" place="bottomRight"><hi
rend="ur">ex<expan>aminatur</expan></hi></fw>
<fw id="M2.130fw3"
type="cor" place="bottomLeft">coret</fw>
<fw id="M2.130fw4"
type="cor" place="bottomCenter">coret</fw>
<fw id="M2.130fw5"
type="quire signature"
place="bottomRight">I<expan>us</expan></fw>
<milestone
n="9r" unit="fol." entity="B.M9r"/>
<fw id="M2.131fw1"
type="runningHead" hand="hand5"
place="topRight">ij<expan>us</expan>
p<expan>assus</expan></fw>
<lg type="strophe"
org="uniform" sample="complete">
<l id="M2.131" n="KD2.129">
<note id="M2.131n1"
type="codicological">
<ref>M2.131:</ref>
The <//> is to indicate...</note> Wel...faille</l>
Notes on Trailers (<trailer>)
Since trailers follow the last line contained within a <div>, and are typically the last element to appear within any <div>, they cannot take as part of their ID value or <ref> content the line number of a following line. Hence, they simply take that of the preceding one, following the rest of the conventions exactly as in the other cases:
<trailer id="M20.386t1">
<foreign lang="lat"><hi rend="display"><hi
rend="tr">E</hi>xplicit hic dialogus...</hi></foreign>
<note id="M20.386tn1"
type="textual">
<ref>M20.386:</ref>
This form of explicit...</note></trailer>
<trailer id="M20.386t2">
<foreign lang="lat"><hi rend="display">Penna precor...;</hi></foreign>
<note id="M20.386tn2"
type="textual">
<ref>M20.386:</ref>
The Colophons de...</note></trailer>
Marginalia
External Specifications:
<marginalia> is a PPEA extension element.
Note:
For a detailed discussion of the distinctions between marginalia, formework and corrector's marks, see the Important Distinctions section of the General Introduction.
The <marginalia> element is used to tag matter that is neither part of the original poetic text nor formework. That includes rubrics or glosses intended by the original scribe to be part of the original page, as well as annotations, rubrics, glosses, etc. supplied by later hands.
Attributes for the marginalia tag include the following:
- "place" This attribute should always be supplied. For place, use only the following values:
  - inline
  - supralinear
  - sublinear
  - marginLeft
  - marginRight
  - topLeft
  - topCenter
  - topRight
  - bottomLeft
  - bottomCenter
  - bottomRight
- "hand" This attribute should always be supplied. It identifies the scribe responsible for the marginalia.
- "id" A unique identifier. We use these for hypertextual linkages involving marginalia, so they must always be present. Under normal circumstances, we will add them in Charlottesville after completion of the edition.
- "type" Identifies the type of marginalia. No immediate plans for implementation.
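The attribute rules above can be checked automatically before an edition is submitted. This is a minimal sketch rather than an official validator: it assumes well-formed XML input, the function name check_marginalia is hypothetical, and it checks only "place" and "hand", since "id" values are normally added later.

```python
import xml.etree.ElementTree as ET

# The only values the protocols allow for the "place" attribute.
ALLOWED_PLACES = {
    "inline", "supralinear", "sublinear",
    "marginLeft", "marginRight",
    "topLeft", "topCenter", "topRight",
    "bottomLeft", "bottomCenter", "bottomRight",
}


def check_marginalia(root):
    """Return a list of human-readable problems found on <marginalia> tags."""
    problems = []
    for elem in root.iter("marginalia"):
        label = elem.get("id", "(no id)")
        place = elem.get("place")
        if place is None:
            problems.append(f"{label}: missing place")
        elif place not in ALLOWED_PLACES:
            problems.append(f"{label}: unexpected place {place!r}")
        if elem.get("hand") is None:
            problems.append(f"{label}: missing hand")
    return problems


# Hypothetical sample: the second marginalium breaks both rules.
sample = (
    "<div>"
    '<marginalia id="M2.115m1" place="marginRight" hand="hand3">nota</marginalia>'
    '<marginalia id="M2.115m2" place="margin">nota</marginalia>'
    "</div>"
)
problems = check_marginalia(ET.fromstring(sample))
print(problems)
```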
If the marginal material is pictorial, the most common being a pointing hand, use <figDesc> tags within <figure> tags within <add> tags in a note:
<note type="codicological">A scribe has drawn a <add place="marginRight"
hand="handx"><figure><figDesc>pointing
hand</figDesc></figure></add> in the right
margin.</note>
Note:
Notes pertaining to any feature of the text contained within marginalia tags must also be contained within the marginalia tags. Many, if not all, will be codicological.
Note:
Each <marginalia> tag must be placed directly above the <l> tag of the first line to which it is pertinent. In the case of marginalia pertaining to several lines of text, put the <marginalia> tag above the first line to which it pertains and supply a codicological note inside the <marginalia> tag explaining which additional lines it might be applicable to. In cases where two or more marginal comments are associated with a line, each must be recorded in its own marginalia element.
Identifying Hands
External Specifications:
For each manuscript, we will identify and describe as many contributing scribal hands as we can distinguish with confidence. A hand recognizably the same in two or more additions or changes to the text (whether by way of marginalia or corrections) should be given an identifying number, and that number will be identified in the document header.
The primary copyists in each manuscript are designated in order of appearance as hand1, hand2, etc.
A hand not thus characterized will be labeled "handx" in the SGML or XML tags (as neither SGML nor XML permits the use of a question mark, as in "hand?").
Beyond that, editors may use whatever designations they find useful. For example, if the editor is unable to recognize repeated instances of materials written by a hand, various hands may be lumped together, either simply as "handx" or identified by century or style. For instance, one might label one of several fifteenth-century hands as "hand15x" or secretary hands as "hand16saecx" or some other convenient designator.
In cases in which single instances of hands from entirely different, clearly identifiable eras appear, a designation such as hand19 and hand16 might be clearer than a simple handx referring to all such hands taken together. The addition of "x" to the hand designation is intended to make clear the ambiguity of the identification.
Note:
"handx" can also be used as a simple placeholder hand identifier pending later decisions.
Revision Dates
Revised on the following dates: 13 January 1994, 29 November 1995, 18 January 1997, 19 May 1997, 3 November 1997, 1 June 1998, 30 October 1998, 25 January 1999, 26 March 1999, 17 June 1999, 28 March 2001, 16 April 2001, 5 June 2001, 19 June 2003, 18 October 2004, 28 July 2005, 5 September 2005, 23 September 2005. Markup revisions: 23-25 May 2017.