Translation Graphs

by Tom Veatch, 2003, 2008, 2017-18

-------------------------------------------------------------------
Motivation
-------------------------------------------------------------------

Suppose you are competent in one language (call that language L1) and you are interested in a document (whether as text, audio, or video) in another language which you don't know (call that language L2). Wouldn't it be nice if someone had done some work on that document to make it accessible to you in that other language?

Of course a normal paragraph-by-paragraph translation of the whole, such as you may find in a published translation of that work, would do something like this, but then you wouldn't be getting it in the other language at all; you'd be getting it in your own. You'd learn the translated content, but you wouldn't learn the language.

I have in mind, instead, something that opens the language itself to you, so that as you, the learner, go through the document as analysed and worked up by a linguist/translator, you gradually become able to recognize and understand the elements of the other language and ultimately to understand the original itself. This worked-up form of the document would have to provide a variety of more-accessible forms of the parts of the document: translations that are word-by-word, not just paragraph-by-paragraph, and links to audio playback of pronunciations of the symbols and words, perhaps of larger sections. Using it should teach you and enable you to get a lot of its content; the second or fifth time you encounter a word in that language, you might not have to look at its dictionary entry to begin to understand it yourself. Ultimately, with enough such documents, and enough time devoted to working through them, you would become a competent reader and even listener in that other language.

To enable this vision of language learning through a mediated, supported, but direct encounter with the original L2 document, I have had to envision a whole architecture of language data representation, markup, storage, lookup tools, editing systems, display systems, and the like, which would be needed to take that original L2 document, add the needed L1 resources, and then make them accessible to you as you read through the document. This is my draft description of that system.

The key idea, of course, is multilinearity. Consider the original document as a line of text. Maybe a very, very long line, but in the abstract, just a sequence of symbols on a single line. Then any additional representation that supports or makes accessible to you any piece of the original document can be considered a translation of a piece of that first line, written onto some additional second (or Nth) line in a way that lets you tell which part of the first line it is a translation of. Such a representation is multilinear. You might have a large number of lines: lines that show pronunciation, others that gloss vocabulary, lines with big gaps in them, lines that refer to data outside the workup (in a dictionary, an audio library, or on the internet), lines that call your attention to syntax or dialect features, lines that link to clear pronunciations from an audio dictionary, and lines that link to live, vernacular, or fast-speech recordings of whole sentences or turns, so that you can learn what fast speech sounds like in that language, not just careful pronunciations from a dictionary.
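As one way to picture multilinearity concretely, here is a minimal sketch in Python. The names, sample text, and glosses are my own illustrations, not part of TGML: each additional "line" is modeled as a set of annotations anchored to spans of the original line.

    # Illustrative sketch only: names and sample data are mine, not TGML's.
    from dataclasses import dataclass

    @dataclass
    class Annotation:
        start: int    # index into the original line where this piece begins
        end: int      # index just past the end of that piece
        content: str  # the supporting material: a gloss, a pronunciation,
                      # a link to audio, a syntax note, ...

    # Line 1: the original L2 document, abstractly one long sequence of symbols.
    original = "namo tassa bhagavato"

    # Line 2: a word-by-word gloss, each entry aligned to a span of line 1.
    gloss_line = [
        Annotation(0, 4, "homage"),
        Annotation(5, 10, "to him"),
        Annotation(11, 20, "to the blessed one"),
    ]

    # Line 3: a sparser line (big gaps are fine) pointing to external audio.
    audio_line = [Annotation(0, 20, "recordings/opening_phrase.wav")]

    def covering(lines, start, end):
        """All annotations, on any line, that overlap the span [start, end)."""
        return [a for line in lines for a in line
                if a.start < end and start < a.end]

    print(covering([gloss_line, audio_line], 0, 4))

The point is only the shape of the data: every supporting line is addressable by where it attaches to the original, so a reader (or a display tool) can ask for everything that explains a given piece of the first line.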
And the system that ultimately presents it to you could have an intelligent model of what you know as a learner of this new language, and it could display for you, at a suitable pace and with the right amount of repetition and testing, the easiest next elements for you to learn, as you gradually acquire competence with all the many things in that document. Such a trainer is one step beyond the scope of this current discussion; it is something that, after we achieve the technical requirements discussed here, we could then aim to build. The Translation Graph infrastructure enables the workup of documents, structurally and multilingually, and enables their access for learners. Mediating to an individual learner/reader is a further step: a tool that serves as an instructor, carrying on an ongoing dialog with the individual to find out what they got out of it, and learning how comfortable they are with the elements of L2 as they work with it.

Possible application examples and features:

For example, an Apple TV remote control that you can click when you don't understand something in the recent history of the current media playback. If the media is annotated with this kind of data, and the remote understands the meaning of the click as "Explain that to me", then the video can pause, an Interlinear Glossed Text (IGT) analysis can be displayed, and the user can browse until they learn what they want, then click onward to continue.

Dave Graff's idea: this could be applied to multi-lingual Twitter feeds. People who want to understand the L2 Twitter data (and authors who want to be understood) might contribute a lot of data checking and editing to such a system. Once some critical mass is achieved and the community usefulness gets going, with live dictionaries, live algorithms, and humans involved, it could become quite useful to all.

Tom continues: I want anyone to be able to go into an L2 situation and be maximally supported in learning and understanding what they don't know. This is far more ambitious than the Star Trek universal translator. It applies equally to multi-lingual Twitter, to foreign watchers of previously unsubtitled English movies, to audio concordances for learning dialect features, to archiving and study of ancient religious texts, to any form of language, whether text, audio, scans, slides, or video, that is of enough interest to be worth the work of making it accessible to speakers of another language.

Put an app on your iPad and watch the TV with it. When it recognizes a place in the film for which someone has made a TG workup, that's a language tutorial you can use: the app lets the user click a button and go through an IGT to learn -- on the iPad, if the TV isn't smart enough to show it on the TV. Or have it know you well enough to pause once in a while and give you a translation of something it thinks will be helpful to you. And you can click the "Huh?" button here or there to ask what something meant, and it can offer contextualized help. Even partly-understanding native speakers can use similar controls over the presentation of the Translation Graphs, to turn the subtitles on and off.
A mouse click or enabled pointer over a part of the text could show a reverse-video circle or speech bubble with the corresponding L1 translation of that part of the L2 content, adjusted continuously as the mouse moves around the document, so that the learner can point at what they don't know, see it in translation, point away to see the original again, and so on, until they don't have to see the translation any more because they understand it in its original L2 form.

* Presentation for learning may be computer controlled, based on a model of the reader/learner's knowledge, or manually parameterized. In any case, with TGs we now have data to support the learner.

* A map to the meanings of the grammatical tags found in the markup should be a click away.

* A map to a concordance for any morpheme should be a click away.

* A map to an IPA reference, a pronunciation guide, and a script description should be a click away.

* An audio rendering, where the text is performed in a recording with a bouncing-ball display, should be a click away.

All this may be hidden while the image/video/audio media is (dis-)played, with the display of all tiers, or of a parameterized, selected subset, a click away during playback whenever the audience is puzzled and wants to understand the part they just heard but didn't understand.

A YouTube app with a TG from the video's original L2 language to the L1 language of various learner populations would show the video with IGTs scrolling below. A reader app would show the L2 original, allowing pointer-guided selection of parts the reader wants to learn more about, and deselection to indicate they got it. With use, the original becomes understandable to the learner.

-------------------------------------------------------------------
Introduction
-------------------------------------------------------------------

TGML is a markup language for Translation Graphs. A Translation Graph (TG) represents a single language event or document in its multiple translations, forms, layers, levels, segmentations, etc. For example, a Prakrit Buddhist text with sentence-by-sentence English translations could be represented with three levels: 1) the unsegmented original, 2) the sentence-by-sentence segmented original, and 3) the sentence-by-sentence translated version. The levels share an abstract, consistent, partial ordering, but possibly no other data, though they correspond with one another. The correspondences are defined by sharing boundaries across levels in the internal segmentation of each level.

The TGML concept enables multiple-file representation of TGs such that data of any type representing segments may be aligned together as they correspond with one another sequentially in a document. Segmentation and grouping are defined by nodes and arcs in a simple DAG ("directed acyclic graph"; ignore this name if it is puzzling), lattice, or tiered structure, with arcs carrying content and nodes, optionally shared across tiers, representing alignment. By definition, the start and end nodes are required to be shared across all tiers. Each tier's arc contents are of a single type of data or translation. No tier need be more privileged than another.

TGs are typically works in progress, as there are as many different translations, renderings, commentaries, and analyses for any linguistic form as the mind of a linguist can conceive. Believe me, that is infinite. The arcs of a single tier together form a single pass through the entire document and represent a single type of data.
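To make that structure concrete, here is a small sketch in Python. The names (Node, Arc, Tier) and tier labels are my own, not the TGML element names: each tier is a single pass of arcs from the shared start node to the shared end node, and alignment between tiers is expressed simply by reusing boundary nodes, as in the three-level Prakrit example above.

    # Illustrative sketch of the node/arc/tier structure; names are mine.
    from dataclasses import dataclass, field

    @dataclass
    class Node:
        label: str     # a boundary in the document's shared partial ordering

    @dataclass
    class Arc:
        src: Node      # boundary where this segment begins
        dst: Node      # boundary where this segment ends
        content: str   # the data carried on this tier for this segment

    @dataclass
    class Tier:
        kind: str                                # the single data type of this tier
        arcs: list = field(default_factory=list)

    # Start and end nodes are shared by all tiers by definition; the sentence
    # boundary is shared only by the two sentence-segmented tiers.
    start, s12, end = Node("start"), Node("sentence 1|2"), Node("end")

    tiers = [
        # 1) the unsegmented original: one arc spanning the whole document
        Tier("original/unsegmented", [Arc(start, end, "<entire Prakrit text>")]),
        # 2) the sentence-by-sentence segmented original
        Tier("original/sentences", [Arc(start, s12, "<Prakrit sentence 1>"),
                                    Arc(s12, end, "<Prakrit sentence 2>")]),
        # 3) the sentence-by-sentence English translation, aligned to tier 2
        #    by sharing its boundary nodes
        Tier("english/sentences", [Arc(start, s12, "<English sentence 1>"),
                                   Arc(s12, end, "<English sentence 2>")]),
    ]

    def aligned_with(arc, tiers):
        """Segments on other tiers that share both boundaries with this arc."""
        return [(t.kind, a.content) for t in tiers for a in t.arcs
                if a is not arc and a.src is arc.src and a.dst is arc.dst]

    print(aligned_with(tiers[1].arcs[0], tiers))  # English rendering of sentence 1

Nothing in this sketch privileges one tier over another; adding a fourth tier (say, a word-segmented phonetic transcription) would just be another pass of arcs, reusing whatever boundary nodes it happens to share.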
Examples of the type of data within a given tier might be:

* original images of palm-leaf sheets;
* OCR-generated word hypotheses;
* manually corrected phonetic transcription of a certain actor's verbal rendering;
* a segment of a video (defined by data type, time-stamps, and perhaps player or other information);
* an audio recording of one of a thousand instances of the word "OK", by a study's many speakers in various natural contexts of occurrence.
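Continuing the sketch above (again with purely illustrative names and assumptions, not TGML itself), heterogeneous tier contents like these can be carried as typed payloads on the arcs, whether inline text or references to external media.

    # Illustrative payload types for arc contents; reuses the Arc/Tier sketch
    # above, relaxing Arc.content to hold any of these values.
    from dataclasses import dataclass

    @dataclass
    class ImageRef:
        uri: str            # e.g. a scan of one palm-leaf sheet

    @dataclass
    class WordHypotheses:
        candidates: list    # alternative OCR readings for one word position

    @dataclass
    class MediaSegment:
        uri: str            # audio or video file
        start_sec: float    # time-stamps delimiting the segment
        end_sec: float

    # Example tiers, one payload type per tier (boundary nodes n0, n1 omitted):
    #   Tier("scan/palm-leaf",  [Arc(n0, n1, ImageRef("scans/leaf_001.png"))])
    #   Tier("ocr/words",       [Arc(n0, n1, WordHypotheses(["dharma", "dhamma"]))])
    #   Tier("audio/instances", [Arc(n0, n1, MediaSegment("ok_0423.wav", 0.0, 0.4))])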