Benjamin Marie's Blog

Analysis About AI, Natural Language Processing, and Machine Translation

Evaluation Scientific credibility

Do Bigger Evaluation Datasets Make Your Results More Significant?

May 13, 2023

The size of the test set shouldn’t have any impact on the evaluation, provided that the test set has been correctly created. Increasing its size shouldn’t change the p-value of…

Evaluation Machine translation Scientific credibility

Scientific Credibility in Machine Translation Research: Pitfalls and Promising Trends

May 11, 2023

Are we at a turning point? My conclusions from the annotation of 1,000+ scientific papers

Machine translation GPT LLM

AI Won’t Replace Translators

Mar 31, 2023

Machine translation has seen many breakthroughs in its 70 years of existence. The concept of machines replacing human translators has been a topic of prediction and discussion from the inception…

Evaluation Machine translation

Traditional Versus Neural Metrics for Machine Translation Evaluation

Mar 9, 2023

Since 2010, 100+ automatic metrics have been proposed to improve machine translation evaluation. In this article, I present the most popular metrics that are used as alternatives, or in addition,…

Evaluation GPT LLM Machine translation

Translate with ChatGPT

Feb 16, 2023

A very robust machine translation system.

Evaluation Machine translation

12 Critical Flaws of BLEU

Dec 12, 2022

BLEU is an extremely popular evaluation metric for AI. It was originally proposed 20 years ago for machine translation evaluation, but it is nowadays commonly used in many natural language processing (NLP)…

Evaluation LLM Machine translation

How Good Is Google PaLM at Translation?

Dec 2, 2022

But how good is PaLM at translation compared to the standard machine translation encoder-decoder approach?

Conference Framework/Tool LLM Machine translation

AACL-IJCNLP 2022 Highlights

Nov 25, 2022

AACL 2022 was held jointly with IJCNLP from the 20th to the 23rd of November. This was the second edition of AACL, the Asian chapter of the Association for…

Evaluation Machine translation Scientific credibility

BLEU: A Misunderstood Metric from Another Age

Nov 5, 2022

In this article, we will go back 20 years to expose the main reasons that brought BLEU into existence and made it a very successful metric. We will look…

Evaluation LLM Machine translation Scientific credibility

Why the Evaluation of OpenAI Whisper Is Not Entirely Credible

Oct 31, 2022

Whisper is evaluated on 6 tasks (section 3 of the research paper). I demonstrate that the conclusions drawn from 3 of these evaluation tasks are flawed ❌ or misleading ❌.

About the author:
Ph.D., research scientist in NLP/AI.
Advocate of scientific credibility.
Building next-gen AI translation systems: https://slaitor.com
