Benjamin Marie,
researcher in Natural Language Processing (NLP)



I am a French scientist and have been a researcher at the Advanced Translation Technology Laboratory at NICT (Kyoto, Japan) since May 2016, on the tenure track since April 2019. My research focuses on improving Machine Translation (MT) for low-resource language pairs, especially those involving languages of East and South Asia.

Before joining NICT, I was a Ph.D. student at LIMSI-CNRS (Orsay, France), supervised by Aurélien Max and Anne Vilnat. During that time, I also worked as an engineer for the company Lingua-Et-Machina and occasionally taught at Université Paris-Saclay.

Currently, I am focusing on how to make better use of monolingual data in MT. I am also working on MT evaluation and cross-lingual semantics.

Topics of interest: unsupervised MT, bilingual semantics, paraphrasing, evaluation metrics for MT

Current Grants/Funding

NICT Tenure-track funding. Ending 2021.

JSPS (Japan Society for the Promotion of Science) grant for early-career scientists: Neural Machine Translation for User-Generated Contents. Ending 2022.

Committees

Best paper committees: ACL 2018 (Demo)

Paper reviewer: ACL (2020-2017), AAAI (2021-2020), COLING (2020, 2016), EMNLP (2020, 2019*-2017), IJCAI (2020, 2019), IJCNLP (2017), LREC (2020, 2018), NAACL (2019-2016), ACM TALLIP, IEEE/ACM TASLP
*: outstanding reviewer

Publications

2020

Marie, B., Fujita, A. (2020). Synthesizing Parallel Data of User-Generated Texts with Zero-Shot Neural Machine Translation. To appear in TACL.

Marie, B., Fujita, A. (2020). Iterative Training of Unsupervised Neural and Statistical Machine Translation Systems. In TALLIP Vol. 19 issue 5 (2020).

Marie, B., Rubino, R., Fujita, A. (2020). Tagged Back-translation Revisited: Why Does It Really Work? In ACL 2020, online.

2019

Marie, B., Kaing, H., Mon, A.M., Ding, C., Fujita, A., Utiyama, M. and Sumita, E. (2019). Supervised and Unsupervised Machine Translation for Myanmar-English and Khmer-English. In WAT 2019, Hong Kong.
Ranked 1st for En->Km and Km->En.

Marie, B., Sun, H., Wang, R., Chen, K., Fujita, A., Utiyama, M. and Sumita, E. (2019). NICT’s Unsupervised Neural and Statistical Machine Translation Systems for the WMT19 News Translation Task. In WMT19, Florence, Italy.
Ranked 1st.

Marie, B., Dabre, R., and Fujita, A. (2019). NICT’s Machine Translation Systems for the WMT19 Similar Language Translation Task. In WMT19, Florence, Italy.

Dabre, R., Chen, K., Marie, B., Wang, R., Fujita, A., Utiyama, M. and Sumita, E. (2019). NICT’s Supervised Neural Machine Translation Systems for the WMT19 News Translation Task. In WMT19, Florence, Italy.

Marie, B. and Fujita, A. (2019). Unsupervised Joint Training of Bilingual Word Embeddings. In ACL 2019, Florence, Italy.

Marie, B. and Fujita, A. (2019). Unsupervised Extraction of Partial Translations for Neural Machine Translation. In NAACL-HLT 2019, Minneapolis, USA.

2018

Marie, B., Fujita, A., Sumita, E. (2018). Combination of Statistical and Neural Machine Translation for Myanmar–English. In WAT 2018, Hong Kong.
Ranked 1st (BLEU) for My-En and En-My.

Wang, R., Marie, B., Utiyama, M., Sumita, E. (2018). NICT's Corpus Filtering Systems for the WMT18 Parallel Corpus Filtering Task. In WMT18, Brussels, Belgium.

Marie, B., Wang, R., Fujita, A., Utiyama, M., Sumita, E. (2018). NICT's Neural and Statistical Machine Translation Systems for the WMT18 News Translation Task. In WMT18, Brussels, Belgium.
Ranked 1st (BLEU) for Et-En, En-Et, En-Fi, and Fi-En.

Marie, B. and Fujita, A. (2018). A Smorgasbord of Features to Combine Phrase-Based and Neural Machine Translation. In AMTA 2018, Boston, USA.

Marie, B. and Fujita, A. (2018). Phrase Table Induction Using Monolingual Data for Low-Resource Statistical Machine Translation. In TALLIP Vol. 17 issue 3 (2018).

2017

Marie, B. and Fujita, A. (2017). Phrase Table Induction Using In-Domain Monolingual Data for Domain Adaptation in Statistical Machine Translation. In TACL Vol. 5 (2017). Presented at ACL 2018.

Marie, B. and Fujita, A. (2017). Efficient Extraction of Pseudo-Parallel Sentences from Raw Monolingual Data Using Word Embeddings. In ACL 2017, Vancouver, Canada.

2015

Marie, B. and Max, A. (2015). Touch-Based Pre-Post-Editing of Machine Translation Output. In EMNLP 2015, Lisbon, Portugal.

Marie, B., Allauzen, A., Burlot, F., Do, Q. K., Ive, J., Knyazeva, E., Labeau, M., Lavergne, T., Löser, K., Pécheux, N., Yvon, F. (2015). LIMSI@WMT'15: Translation Task. In WMT'15, Lisbon, Portugal.
Ranked 1st for En-Fr and Fr-En.

Marie, B. and Apidianaki, M. (2015). Alignment-based sense selection in METEOR and the RATATOUILLE recipe. In WMT'15, Lisbon, Portugal.
Ranked 1st for En-Fr and Fr-En.

Marie, B. and Max, A. (2015). Multi-Pass Decoding With Complex Feature Guidance for Statistical Machine Translation. In ACL-IJCNLP 2015, Beijing, China.

Apidianaki, M., Marie, B. (2015). METEOR-WSD: Improved Sense Matching in MT Evaluation. In SSST-9, Denver, USA.

2014

Marie, B., Max, A. (2014). Confidence-based Rewriting of Machine Translation Output. In EMNLP 2014, Doha, Qatar.

Pécheux, N., Gong, L., Do, Q. K., Marie, B., Ivanishcheva, Y., Allauzen, A., Lavergne, T., Niehues, J., Max, A., Yvon, F. (2014). LIMSI @ WMT’14 Medical Translation Task. In WMT’14, Baltimore, USA.

2013

Marie, B. and Max, A. (2013). A Study in Greedy Oracle Improvement of Translation Hypotheses. In IWSLT 2013, Heidelberg, Germany.

Reports

2016

Ph.D. thesis: Complex Feature Guidance for Statistical Machine Translation (in French)

2013

Project ANR TRACE report, part 5.2 (in French)

2012

M.S. thesis: Improving Machine Translation Outputs by Greedy Search (in French)