Benjamin Marie's Blog

Analysis About AI, Natural Language Processing, and Machine Translation

Evaluation Scientific credibility

Do Bigger Evaluation Datasets Make Your Results More Significant?

May 13, 2023

The size of the test set shouldn’t have any impact on the evaluation, provided that the test set has been correctly created. Increasing its size shouldn’t change the p-value of…

Evaluation Machine translation Scientific credibility

Scientific Credibility in Machine Translation Research: Pitfalls and Promising Trends

May 11, 2023

Are we at a turning point? My conclusions from the annotation of 1,000+ scientific papers

Evaluation Machine translation Scientific credibility

BLEU: A Misunderstood Metric from Another Age

Nov 5, 2022

In this article, we will go back 20 years to expose the main reasons that brought BLEU into existence and made it a very successful metric. We will look…

Evaluation LLM Machine translation Scientific credibility

Why the Evaluation of OpenAI Whisper Is Not Entirely Credible

Oct 31, 2022

Whisper is evaluated on 6 tasks (section 3 of the research paper). I demonstrate that the conclusions drawn from 3 of these evaluation tasks are flawed ❌ or misleading ❌.

About the author:
Ph.D., research scientist in NLP/AI.
Advocate of scientific credibility.
Building next-gen AI translation systems: https://slaitor.com

  • Conference
  • Evaluation
  • Framework/Tool
  • GPT
  • LLM
  • Machine translation
  • Scientific credibility

You Missed

Machine translation GPT LLM

AI Won’t Replace Translators

Evaluation Machine translation

Traditional Versus Neural Metrics for Machine Translation Evaluation
