This is a tentative syllabus and course schedule. It is subject to change.

Week | Date | Topic | Reading | Assignments
1 | 3/28 | Introduction to MT | Intro readings | Hw1 out
2 | 4/04 | Word-alignment models for MT | Word alignment models readings | Hw1 due Wed 4/06; Hw2 out
3 | 4/11 | Phrase-based models | Phrase translation models reading |
4 | 4/18 | Decoding for phrase-based models and minimum error rate training | SMT Ch 6 | Hw2 due Wed 4/20; Hw3 (version 2) out
5 | 4/25 | Hierarchical Reordering, Discontinuous Phrases, Factored translation models | Factored models by Philipp Koehn; Advanced PBT readings | Hw3 due Wed 4/17
6 | 5/02 | Data extraction; Tree-based models I (Formalism & Hiero) | Tree-based models I readings | Project proposals due; Hw4 out; Hw2 solutions
7 | 5/09 | Tree-based models II (Linguistically syntax-based) | Tree-based models II readings | Hw3 solutions
8 | 5/16 | Tree-based models III (Decoding algorithms) | Tree-based models III readings | Hw4 due 5/16
9 | 5/23 | Advanced specialized techniques: Feature-rich models | An end-to-end discriminative approach to machine translation (Liang et al 2006); 11,001 new features for statistical machine translation (Chiang et al 2009); online papers TBD | Project updates due 5/23
10 | 5/30 | Holiday | |
 | Last week or finals week | Final project presentations | | Project reports due



The main text is the book Statistical Machine Translation by Philipp Koehn, abbreviated as SMT in the schedule and reading lists.


Introduction to MT

Required: SMT Ch 1, Ch 8, and Ch 2.3

Word Translation Models

Required background: SMT Ch 2 and SMT Ch 3
Required: SMT Ch 4.1 to 4.4.2

Optional: SMT Ch 4.4.3 to 4.4.5; Och & Ney 03; Germann et al. 03; A Statistical MT Tutorial Workbook (a gentle introduction to word-level IBM models by Kevin Knight)

Word Alignment I

Required: SMT Ch 4.5 and HMM alignment
Optional: TBD

Phrase Translation Models

Required: SMT Ch 5.1 to 5.3 and SMT Ch 6.1 to 6.3
Optional: SMT Ch 5.4 to 5.5 and SMT Ch 6.4
Optional: papers TBD

Tree-based Models I

Required: parts of SMT Ch 11
Optional: Chiang 2007

Tree-based Models II


Liang Huang, Kevin Knight, and Aravind Joshi (2006). Statistical Syntax-Directed Translation with Extended Domain of Locality. In Proceedings of the 7th Biennial Conference of the Association for Machine Translation in the Americas (AMTA), Boston, MA.

Arul Menezes and Chris Quirk (2008). Syntactic Models for Structural Word Insertion and Deletion during Translation. In Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, Honolulu, Hawaii.

Daniel Marcu, Wei Wang, Abdessamad Echihabi, and Kevin Knight (2006). SPMT: Statistical Machine Translation with Syntactified Target Language Phrases. ACL Anthology W06-1606.



Send questions about this workspace to Kristina Toutanova.