Hierarchical Translation Equivalence over Word Alignments
Khalil Sima'an, Gideon Maillette de Buy Wenniger

Abstract:
We present a theory of word alignments in machine translation (MT)
that equips every word alignment with a hierarchical representation
with exact semantics defined over the translation equivalence
relations known as hierarchical phrase pairs. The hierarchical
representation consists of a set of synchronous trees (called
Hierarchical Alignment Trees -- HATs), each specifying a
bilingual compositional build-up for a given word-aligned,
translation-equivalent sentence pair. Every HAT consists of a single
tree with nodes decorated with local transducers that conservatively
generalize the asymmetric bilingual trees of Inversion Transduction
Grammar (ITG). The HAT representation is proven to be semantically
equivalent to the word alignment it represents, and to be minimal (among
the semantically equivalent alternatives) because it densely represents
the subsumption order between pairs of (hierarchical) phrase pairs. We
present an algorithm that interprets every word alignment as a
semantically equivalent set of HATs, and contribute an empirical study
concerning the exact coverage of subclasses of HATs that are
semantically equivalent to subclasses of manual and automatic word
alignments.
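
To make the notions of translation equivalence and subsumption order concrete, the following is a minimal sketch, not the algorithm of the paper. It assumes the standard consistency criterion for contiguous phrase pairs (no alignment link crosses the boundary of the pair) and does not cover phrase pairs with gaps or any special treatment of unaligned words. The names `phrase_pairs` and `subsumes` are hypothetical and chosen only for illustration.

```python
def phrase_pairs(alignment, src_len):
    """Enumerate contiguous phrase pairs consistent with a word alignment.

    alignment: set of (source_index, target_index) links, 0-based.
    Returns a set of ((s_start, s_end), (t_start, t_end)), ends inclusive.
    Assumption: a pair is consistent if it contains at least one link and
    no link has one end inside the pair and the other end outside it.
    """
    pairs = set()
    for s_start in range(src_len):
        for s_end in range(s_start, src_len):
            # Target positions linked to words inside the source span.
            targets = [t for (s, t) in alignment if s_start <= s <= s_end]
            if not targets:
                continue
            t_start, t_end = min(targets), max(targets)
            # Every link touching the target span must stay in the source span.
            if all(s_start <= s <= s_end
                   for (s, t) in alignment if t_start <= t <= t_end):
                pairs.add(((s_start, s_end), (t_start, t_end)))
    return pairs


def subsumes(outer, inner):
    """True if `inner` lies within `outer` on both the source and target side."""
    (os, oe), (ots, ote) = outer
    (is_, ie), (its, ite) = inner
    return os <= is_ and ie <= oe and ots <= its and ite <= ote


# Toy alignment: word 0 is translated in place, words 1 and 2 are inverted.
links = {(0, 0), (1, 2), (2, 1)}
pairs = phrase_pairs(links, src_len=3)
```

On this toy alignment the source span (0, 1) yields no phrase pair, because its target span also contains a link to source word 2. The remaining pairs are partially ordered by `subsumes`, and that ordering is the kind of structure a hierarchical representation of the alignment has to capture.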