OTAP: Reversible Transcription
- Punctuation of an Electronic Version of a
19c Ottoman Arabic Script Text
- Punctuation Conventions of 19c Arabic Script Printers
- MS scribal tradition
- Latin script typesetting
- Arabic script typesetting
Changing Practice: 1876-1928
- Many characters in Arabic script typesetting begin on the baseline and descend below it, unlike Latin script, where characters sit on the baseline and only a small number (y, g, j, etc.) descend partially below it.
- The position of a character within a word determines its relation to the other characters in the word and also to the baseline. This means that a given character may be on the line sometimes and above the line other times.
- Certain characters in the Arabic script are always on or below the baseline, regardless of their position within a word. (e.g. ر , ز , د , ذ)
Guidance Offered by Recent Work in Arabic Typography
- Early printing
- Changes after 1910
Technical Issues in the Unicode Representation of Punctuation
in Arabic Script Texts
- Design centered focus
- Little exploration/consideration of historical practice
- RTL Punctuation Codes in the Arabic Registers
- Two Arabic script punctuators
- Question mark: &#x061f; - ؟
- Arabic comma: &#x060c; - ،
- Borrowed LTR Punctuators
- Opening quotation mark: &#x203a;&#x203a; - ››
- Closing quotation mark: &#x2039;&#x2039; - ‹‹
- Dot: &#x002e; - . / &#x00b7; - · (used as a comma on p. 3)
- Em dash (widest): &#x2014; - —
- Dot and em dash: &#x2014;&#x002e; - —. / &#x2014;&#x00b7; - —·
- Colon: &#x003a; - :
- Opening parenthesis: &#x0029; - )
- Closing parenthesis: &#x0028; - (
- Exclamation mark: &#x0021; - !
- En dash (skinnier): &#x2013; - –
- Figure dash (skinniest): &#x2012; - ‒
- Dot and en dash: &#x2013;&#x002e; - –. / &#x2013;&#x00b7; - –·
- Accounting for the absence of additional punctuation marks
in the Arabic registers
- lack of input from scholars working with historical texts
- restricted use of punctuation in modern Arabic texts
- existence of electronic typesetting
and fonts developed as national standards before
the widespread adoption of the Unicode standard.
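The split between the two Arabic-block punctuators and the borrowed marks in the chart above can be checked mechanically. The following is a minimal sketch using Python's standard `unicodedata` module; the `CHART` dictionary, its labels, and the helper names `in_arabic_block` and `describe` are illustrative assumptions, not part of the OTAP tooling.

```python
import unicodedata

# Marks taken from the chart above; the labels are illustrative only.
CHART = {
    "Question mark": "\u061F",
    "Arabic comma": "\u060C",
    "Dot": "\u002E",
    "Middle dot": "\u00B7",
    "Em dash": "\u2014",
    "En dash": "\u2013",
    "Figure dash": "\u2012",
    "Colon": "\u003A",
    "Exclamation mark": "\u0021",
}


def in_arabic_block(ch):
    """Return True if ch lies in the basic Arabic block (U+0600-U+06FF)."""
    return 0x0600 <= ord(ch) <= 0x06FF


def describe(ch):
    """Return the code point and official Unicode name of ch."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


if __name__ == "__main__":
    for label, ch in CHART.items():
        origin = "Arabic block" if in_arabic_block(ch) else "borrowed"
        print(f"{label:18} {describe(ch):30} {origin}")
```

Run as a script, this prints one line per mark; only the question mark and the comma fall inside the Arabic block, and every other mark comes from the Latin or General Punctuation ranges.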
Use of Latin Punctuation to Supplement "Missing" Glyphs
- The development and adoption of the Unicode
standard -- a unique code for each character in
a writing system -- occurred rather late in the
adoption of computer-based document creation and
typesetting. Prior to that time, multiple font
encodings existed for various operating systems
and, indeed, it was common practice for scholars
and others to use font creation software packages,
such as Fontographer or FontMonger, to create
fonts for specialized typesetting needs.
- Further, many communities using non-Latin writing
systems, such as the CJK languages, had developed
national standards (for example, Big5 for Chinese as
used in Taiwan, Hong Kong, and Macau). With freely
available software and operating systems supporting
those national encoding standards, these communities
had little need to convert to the Unicode standard,
which would have entailed tremendous costs.
- Something similar may have occurred in the
development of computer-based Arabic script
support, with well-to-do Gulf states funding the
development and adoption of functional software
for contemporary Arabic script use, which had
restricted punctuation needs.
- Need for accuracy in the representation of historical texts
- Mixing of RTL/LTR as an issue in typesetting Unicode texts.
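One way to see why mixing RTL and LTR marks complicates typesetting is to inspect the bidirectional properties Unicode assigns to each mark. The sketch below uses Python's standard `unicodedata` module; the `MARKS` selection and the helper name `bidi_info` are illustrative assumptions.

```python
import unicodedata

# A selection of marks from the punctuation chart, plus one Arabic letter
# (REH) for comparison. The selection is illustrative only.
MARKS = ["\u0631", "\u061F", "\u060C", "\u0028", "\u0029",
         "\u2014", "\u003A", "\u0021"]


def bidi_info(ch):
    """Return (bidi class, mirrored flag) for a single character."""
    return unicodedata.bidirectional(ch), bool(unicodedata.mirrored(ch))


if __name__ == "__main__":
    for ch in MARKS:
        bidi_class, mirrored = bidi_info(ch)
        print(f"U+{ord(ch):04X} {unicodedata.name(ch):28} "
              f"class={bidi_class:3} mirrored={mirrored}")
```

Arabic letters carry the strong class `AL`, while most borrowed marks are neutral or weak and take their direction from the surrounding text. Parentheses are also flagged as mirrored, so a renderer draws them flipped inside an RTL run. That is worth keeping in mind when reading the parenthesis entries in the chart.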
Intibah Sample Punctuation Chart
- Namık Kemal, İntibah, 1876.
- Unicode Consortium, The Unicode Standard, Version 2.0,
Reading, Mass.: Addison-Wesley Developers Press, 1996
[QA268 .U56 1996]
- AbiFarès, Huda Smitshuijzen,
Arabic Typography: A Comprehensive Sourcebook,
London: Saqi, 2001 [Z251.A6 A25 2001]
- Tom Milo, "Some comments on the Arabic block in Unicode," DecoType
- Walter Andrews and Pierre MacKay, "The Ottoman Texts Project,"
TeXniques, No. 5.