OTAP: Reversible Transcription

  1. Punctuation of an Electronic Version of a 19c Ottoman Arabic Script Text
    1. Punctuation Conventions of 19c Arabic Script Printers
      1. Models
        1. MS scribal tradition
        2. Latin script typesetting
        3. Arabic script typesetting
          • Many characters in Arabic script typesetting begin on the baseline and descend below it, unlike Latin script, where most characters sit on or above the line and only a small number (y, g, j, etc.) descend partially below it.
          • The position of a character within a word determines its relation to the other characters in the word and also to the baseline. This means that a given character may be on the line sometimes and above the line other times.
          • Certain characters in the Arabic script are always on or below the baseline, regardless of their position within a word. (e.g. ر , ز , د , ذ)
      2. Changing Practice: 1876-1928
        1. Early printing
        2. Changes after 1910
      3. Guidance Offered by Recent Work in Arabic Typography
        1. Design-centered focus
        2. Little exploration/consideration of historical practice
    2. Technical Issues in the Unicode Representation of Punctuation in Arabic Script Texts
      1. RTL Punctuation Codes in the Arabic Registers
        1. Two Arabic script punctuators
          1. Question mark: ؟ - ؟
          2. Arabic comma: ، - ،
        2. Borrowed LTR Punctuators
          1. Opening quotation mark: ›› - ››
          2. Closing quotation mark: ‹‹ - ‹‹
          3. Dot: . - . / · - · (used as a comma on p. 3)
          4. Em dash (widest): — - —
          5. Dot and em dash: —. - —. / —· - —·
          6. Colon: : - :
          7. Opening parenthesis: ) - )
          8. Closing parenthesis: ( - (
          9. Exclamation mark: ! - !
          10. En dash (skinnier): – - –
          11. Figure dash (skinniest): ‒ - ‒
          12. Dot and en dash: –. - –. / –· - –·
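The code points behind several of the marks listed above can be inspected with Python's standard unicodedata module. The following is a minimal sketch and not part of the OTAP tooling: the MARKS dictionary, its labels, and the helper names are illustrative assumptions, and the raised dot is assumed to be U+00B7 MIDDLE DOT.

```python
import unicodedata

# Labels are this sketch's own; the characters come from the lists above.
# Assumption: the raised dot in the source is U+00B7 MIDDLE DOT.
MARKS = {
    "question mark": "\u061F",
    "Arabic comma": "\u060C",
    "dot": ".",
    "raised dot": "\u00B7",
    "em dash": "\u2014",
    "en dash": "\u2013",
    "figure dash": "\u2012",
    "colon": ":",
    "exclamation mark": "!",
}


def describe(ch):
    """Return (code point, Unicode name, bidirectional class) for one character."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.bidirectional(ch))


def in_arabic_block(ch):
    """True if the character lies in the Arabic block, U+0600 to U+06FF."""
    return 0x0600 <= ord(ch) <= 0x06FF


if __name__ == "__main__":
    for label, ch in MARKS.items():
        code, name, bidi = describe(ch)
        source = "Arabic block" if in_arabic_block(ch) else "borrowed"
        print(f"{label:18} {code}  {name:24} bidi={bidi:3} {source}")
```

Of the marks in this sketch, only the question mark and the Arabic comma fall inside the Arabic block; the others come from the Latin and General Punctuation ranges.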
        3. Accounting for the absence of additional punctuation marks in the Arabic registers
          1. lack of input of scholars working with historical texts
          2. restricted use of punctuation in modern Arabic texts
          3. existence of electronic typesetting and fonts developed as national standards before the widespread adoption of the Unicode standard.
            • The development and adoption of the Unicode standard (a unique code point for each character in a writing system) occurred rather late in the adoption of computer-based document creation and typesetting. Prior to that time, multiple font encodings existed for various operating systems and, indeed, it was common practice for scholars and others to use font creation software packages, such as Fontographer or Fontmonger, to create fonts for specialized typesetting needs.
            • Further, many communities using non-Latin writing systems, such as CJK, had already developed national standards (for example, Big5 for Chinese as used in Taiwan, Hong Kong, and Macau). Because freely available software and operating systems supported these national encodings, such communities had little need to convert to the Unicode standard, a conversion that would have entailed tremendous costs.
            • Something similar may have occurred in the development of computer-based Arabic script typesetting, with well-to-do Gulf states funding the development and adoption of contemporary, functional Arabic script software that had restricted punctuation needs.
      2. Use of Latin Punctuation to Supplement "Missing" Glyphs
        1. Need for accuracy in the representation of historical texts
        2. Mixing of RTL/LTR as an issue in typesetting Unicode texts.
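One way to see why mixing RTL and LTR punctuation complicates typesetting is to check which characters in a transcription carry no strong direction of their own, and which are mirrored when displayed in a right-to-left run. The sketch below uses only Python's standard unicodedata module; the function names and the sample strings are hypothetical, and it inspects character properties only, without implementing the full Unicode Bidirectional Algorithm.

```python
import unicodedata

# Bidirectional classes that give a character its own strong direction.
STRONG = {"L", "R", "AL"}


def mirrored_marks(text):
    """Return the characters whose glyph is mirrored in right-to-left runs."""
    return [ch for ch in text if unicodedata.mirrored(ch)]


def direction_dependent(text):
    """Return non-space characters that take their direction from context."""
    return [
        ch
        for ch in text
        if not ch.isspace() and unicodedata.bidirectional(ch) not in STRONG
    ]


if __name__ == "__main__":
    # Hypothetical sample: alef, em dash, full stop, Arabic question mark.
    sample = "\u0627\u2014.\u061F"
    for ch in direction_dependent(sample):
        print(f"U+{ord(ch):04X} {unicodedata.name(ch)} depends on context")
    for ch in mirrored_marks("(\u0627)"):
        print(f"U+{ord(ch):04X} {unicodedata.name(ch)} is mirrored in RTL runs")
```

In this sketch the Arabic question mark behaves like an Arabic letter (class AL), while the borrowed dash and dot depend on their neighbors. This is consistent with the parentheses in the list above appearing visually reversed in a right-to-left line.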
    4. Intibah Sample Punctuation Chart