Tool Description


For each control button on the Reversible Transcription Intake page (e.g., Ottoman Arabic), a PHP script is invoked when the button is clicked. Each script receives the contents of the Input Text Area and of the Metadata fields and returns the formatted output. These script files are as follows:


The scripts all work in a very similar fashion. Each loads a file of substitution rules that drive the transcription. Any tags (e.g., "<italic>") and HTML entity references (e.g., "&#1576;") in the input text area are removed and saved for later reinsertion, so that applying the substitution rules will not mangle them. Incidental whitespace in the input is discarded. The substitution rules are then applied. The rules are in the form:

"B7" -> "+1576;", /* Arabic Letter Beh */

In this example, a "B7" in the input is replaced by "+1576;", which stands for the entity reference for the Arabic Letter Beh: the "+" is replaced by "&#" right before the final output. "*" and "**" mark line breaks and line group breaks, respectively. Each line is wrapped in a div tag, and each line group is wrapped in a p tag that surrounds the div tags of its lines. The code in each script is commented if more detail is needed about its specific operation.
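The steps described above can be sketched as follows. This is a minimal illustration in Python, not the code of the site's PHP scripts. The function names, the use of private-use characters as placeholders, and the standard "&#1576;" form of the entity reference are all assumptions made for the example. The sketch does not discard incidental whitespace, and format_lines is shown on its own rather than wired into transcribe.

```python
import re

# Assumption: tags look like <italic> and entities like &#1576;.
PROTECTED = re.compile(r"<[^>]+>|&#\d+;")

def protect(text):
    """Swap each tag or entity for one private-use character and save it."""
    saved = []
    def stash(match):
        saved.append(match.group(0))
        return chr(0xE000 + len(saved) - 1)
    return PROTECTED.sub(stash, text), saved

def restore(text, saved):
    """Put the saved tags and entities back in place of their placeholders."""
    for i, original in enumerate(saved):
        text = text.replace(chr(0xE000 + i), original)
    return text

def transcribe(text, rules):
    """Apply (source, target) rules in order to the unprotected text."""
    text, saved = protect(text)
    for source, target in rules:
        text = text.replace(source, target)
    text = text.replace("+", "&#")  # done right before the final output
    return restore(text, saved)

def format_lines(text):
    """Wrap each line in a div and each '**'-separated group in a p."""
    return "".join(
        "<p>" + "".join(f"<div>{line}</div>" for line in group.split("*")) + "</p>"
        for group in text.split("**")
    )

RULES = [("B7", "+1576;")]  # Arabic Letter Beh
print(transcribe("<italic>B7</italic> &#1576;", RULES))
print(format_lines("a*b**c"))
```

The entity already present in the input passes through untouched because it is replaced by a placeholder before any rule runs and restored only after the "+" replacement.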

In the file of substitution rules, the rules are listed in the order they will be applied, separated by commas. Anything between "/*" and "*/" is a comment and can be used to add documentation. Each part of a rule is in quotes so that whitespace in the rule can be interpreted correctly as needed. These are two distinct substitution rules:

" B7" -> " +1576;",
"B7" -> "+1576;",

The rules are applied in the order they appear in the rules file, so when adding a rule one must consider where to place it in the list. In the two rules shown above, applying the rule without the space first would almost certainly prevent the one with the space from ever being applied, because every "B7" would already have been replaced by the time the second rule runs.
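The effect of rule order can be checked with a small sketch. This is an illustration in Python, not the site's PHP code. The parser assumes that rules contain no embedded quotation marks or comment markers, and the function names are hypothetical.

```python
import re

COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)
RULE = re.compile(r'"([^"]*)"\s*->\s*"([^"]*)"')

def parse_rules(text):
    """Strip /* ... */ comments, then return (source, target) pairs in file order."""
    return RULE.findall(COMMENT.sub("", text))

def apply_rules(text, rules):
    """Apply rules in order and record which rule sources actually matched."""
    fired = []
    for source, target in rules:
        if source in text:
            fired.append(source)
            text = text.replace(source, target)
    return text, fired

RULES_TEXT = '''
" B7" -> " +1576;", /* with a leading space */
"B7" -> "+1576;",   /* Arabic Letter Beh */
'''

rules = parse_rules(RULES_TEXT)
print(apply_rules("A B7B7", rules))        # both rules match
print(apply_rules("A B7B7", rules[::-1]))  # only "B7" matches
```

With the rules reversed, the rule with the leading space never fires: the rule without the space has already consumed every "B7" in the text.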

The rules files are as follows:


The XML PHP script uses the same rules file as the Ottoman Arabic script. There is also a set of rules in the file "dynamic_chart_rules.txt". This file is used to dynamically generate a table of all the substitution rules, each with a brief descriptive note. The chart is intended primarily as a debugging aid for the person modifying or adding rules in the transcription rules files.

Dynamic Rules Chart

To modify the chart, one must add an entry in the "dynamic_chart_rules.txt" file for each change. For example:

"k0" -> "an (optional) descriptive note",

Whitespace before or after a conversion code is represented in this chart by [space].
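A chart of this kind could be generated along the following lines. This is a sketch in Python under stated assumptions, not the site's PHP code: the HTML table layout and the function names are hypothetical, and the second entry in the sample text is invented for illustration.

```python
import html
import re

RULE = re.compile(r'"([^"]*)"\s*->\s*"([^"]*)"')

def show_spaces(code):
    """Make leading and trailing spaces visible as [space]."""
    core = code.strip(" ")
    if core == "":
        return "[space]" * len(code)
    lead = len(code) - len(code.lstrip(" "))
    trail = len(code) - len(code.rstrip(" "))
    return "[space]" * lead + core + "[space]" * trail

def chart_rows(chart_text):
    """Return (displayed code, note) pairs in file order."""
    return [(show_spaces(code), note) for code, note in RULE.findall(chart_text)]

def render_table(rows):
    """Render the rows as a simple HTML table, escaping cell contents."""
    body = "".join(
        f"<tr><td>{html.escape(code)}</td><td>{html.escape(note)}</td></tr>"
        for code, note in rows
    )
    return f"<table>{body}</table>"

CHART_TEXT = '''
"k0" -> "an (optional) descriptive note",
" B7" -> "hypothetical entry with a leading space",
'''

print(render_table(chart_rows(CHART_TEXT)))
```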

Functional Files

A complete list of the functional files (with a descriptive note) is as follows:

  The intake page where text is entered to get a reverse transcription.

PHP scripts:

  Script to return Ottoman Arabic.
  Script to return Ottoman Latin.
  Script to return Turkish.
  Script to return the Cyrillic output. This is currently in development.
  Script to return XML. This is currently in development.
  Auxiliary functions, shared by all the scripts, for reading in the
  rules files.

Substitution Rules text files:


These are very minimal stylesheets:


Site Map
  First page of the Reverse Transcription Tool, featuring a project description and a link to an article about reverse transcription.
  The above mentioned article.
  Page with the following links for more information and a link to the tool itself.
  Instructions for using the reverse transcription tool.
  Chart of consonant conversions in English order.
  Chart of consonant conversions in Arabic order.
  Chart of vowel conversion codes.
  Chart of special character codes.
  Dynamically generated chart that displays all of the substitution rules.
  Incomplete article.
  The intake page where text is entered to get a reverse transcription.
  Script to return Ottoman Arabic. Run this by clicking on the Ottoman Arabic button.
  Script to return Ottoman Latin. Run this by clicking on the Ottoman Latin button.
  Script to return Turkish. Run this by clicking on the Turkish button.
  Script to return the Other output. This is currently in development.
  Script to return XML. This is currently in development.

Send email to:


Non-commercial use of files on this site is allowed with attribution; all other uses are prohibited.
Accepting these restrictions is a condition of entering the website.