Natural Language Inference: Machine-Translated

Natural Language Inference Annotation Task

Understanding Natural Language Inference (NLI)

Natural Language Inference (NLI) is a fundamental task in natural language understanding that evaluates the logical relationship between two statements. This task tests whether a machine can understand the semantic implications and contradictions that humans naturally recognize in language.

THIS TASK

This study investigates whether inference recognition transfers across languages after machine translation from English. Occasionally, a machine-translated sentence may sound odd or incomplete. In these rare cases, you can select Nonsense. Use this label sparingly, however: first try to interpret the sentence’s intended meaning.

Why NLI Matters: NLI is crucial for applications like fact-checking, question answering, reading comprehension, and dialogue systems. A model that understands entailment can better reason about text and make logical inferences.

The Two-Sentence Structure

Every NLI task involves two components:

📌 Premise: The foundational statement accepted as true. This provides the context or facts.
🔍 Hypothesis: The statement we're testing against the premise. We determine its relationship to the premise.

The Four Relationship Categories

✅ Entailment

The hypothesis logically follows from the premise. If the premise is true, the hypothesis must be true.

Think: "This is a logical consequence"

❌ Contradiction

The hypothesis directly conflicts with the premise. If the premise is true, the hypothesis must be false.

Think: "These cannot both be true"

➖ Neutral

The hypothesis might or might not be true. The premise doesn't provide enough information to determine truth or falsehood.

Think: "Could be true, but we don't know"

⚠️ Nonsense

The premise or hypothesis is incoherent, grammatically broken, or doesn't form a complete, understandable statement, so the pair cannot be interpreted.

Think: "This doesn't make sense"
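For anyone storing or processing annotations, the four categories can be written down as a small data structure. This is a minimal sketch only: the names `Label` and `AnnotatedPair`, the lowercase label strings, and the `language` field are illustrative assumptions, not the format used by the annotation tool.

```python
from dataclasses import dataclass
from enum import Enum


class Label(Enum):
    """The four relationship categories (string values are assumed)."""
    ENTAILMENT = "entailment"
    CONTRADICTION = "contradiction"
    NEUTRAL = "neutral"
    NONSENSE = "nonsense"


@dataclass(frozen=True)
class AnnotatedPair:
    """One premise/hypothesis pair plus the annotator's label (hypothetical layout)."""
    premise: str
    hypothesis: str
    label: Label
    language: str = "en"  # assumed language tag, not specified by the task


# A pair taken from the sample examples in this guide.
example = AnnotatedPair(
    premise="The museum closes at 6 PM every weekday.",
    hypothesis="You cannot visit the museum at 7 PM on Tuesday.",
    label=Label.ENTAILMENT,
)
print(example.label.value)  # entailment
```

Using an enum rather than free text means a misspelled label fails immediately instead of silently creating a fifth category.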

💡 Sample Examples

Example Set 1: Basic Logical Inference

Premise: The museum closes at 6 PM every weekday.

Hypothesis: You cannot visit the museum at 7 PM on Tuesday.

✅ Entailment

Why? Tuesday is a weekday, and if the museum closes at 6 PM, it's logically impossible to visit at 7 PM. This is a clear temporal entailment.

Premise: Sarah has been a vegetarian for five years.

Hypothesis: Sarah ate a steak yesterday.

❌ Contradiction

Why? Vegetarians don't eat meat by definition. If Sarah has been vegetarian for five years, eating a steak yesterday would contradict this fact.

Premise: The company hired 10 new software engineers.

Hypothesis: The company's revenue increased this quarter.

➖ Neutral

Why? Hiring engineers might lead to growth, but it doesn't guarantee increased revenue. Revenue could stay the same or even decrease. We lack sufficient information.

Example Set 2: Nuanced Reasoning

Premise: All participants in the study were between 18 and 25 years old.

Hypothesis: No minors participated in the study.

✅ Entailment

Why? Since minors are defined as those under 18, and all participants were 18+, this is a logical entailment. This tests understanding of age definitions.

Premise: The medicine should be taken twice daily with food.

Hypothesis: Taking the medicine on an empty stomach is recommended.

❌ Contradiction

Why? "With food" directly contradicts "on an empty stomach." This is a clear instructional contradiction.

Premise: The experiment was conducted in a controlled laboratory environment.

Hypothesis: The results can be generalized to real-world settings.

➖ Neutral

Why? Lab conditions don't automatically mean the results can't be generalized, but they don't guarantee that they can be, either. External validity requires additional evidence not provided in the premise.

Example Set 3: Tricky Edge Cases

Premise: Either John or Mary will attend the meeting.

Hypothesis: John will attend the meeting.

➖ Neutral

Why? "Either/or" means at least one will attend, but doesn't specify which one. John might attend, or Mary might attend, or both might attend. We cannot determine John's attendance with certainty.
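The reasoning in this example can be checked mechanically by listing every situation in which the premise is true and testing the hypothesis in each one. The sketch below does this for simple true/false statements; the function name `classify` and the encoding of sentences as Python lambdas are assumptions made for illustration, and real sentences are rarely this clean.

```python
from itertools import product


def classify(premise, hypothesis, n_vars):
    """Label a pair of true/false formulas by enumerating truth assignments.

    premise and hypothesis are functions that take n_vars booleans.
    """
    # Keep only the assignments ("worlds") in which the premise is true.
    worlds = [w for w in product([True, False], repeat=n_vars) if premise(*w)]
    if not worlds:
        raise ValueError("premise is never true; the relation is undefined here")
    outcomes = {hypothesis(*w) for w in worlds}
    if outcomes == {True}:
        return "entailment"      # hypothesis holds in every such world
    if outcomes == {False}:
        return "contradiction"   # hypothesis fails in every such world
    return "neutral"             # hypothesis holds in some worlds only


# Variables: john = "John attends", mary = "Mary attends".
either = lambda john, mary: john or mary

print(classify(either, lambda john, mary: john, 2))                   # neutral
print(classify(either, lambda john, mary: john or mary, 2))           # entailment
print(classify(either, lambda john, mary: not john and not mary, 2))  # contradiction
```

The first call mirrors the example: the premise is true when only Mary attends, so "John will attend" is true in some cases and false in others, which is exactly what Neutral means.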

Premise: The temperature dropped below freezing last night.

Hypothesis: Water in outdoor containers would have frozen.

✅ Entailment

Why? This tests physical knowledge. Water freezes at 0°C (32°F), so if the temperature dropped below freezing, water left in outdoor containers would be expected to freeze. This is a scientific entailment.

Premise: The restaurant serves Italian cuisine.

Hypothesis: You can order sushi there.

➖ Neutral

Why? While Italian restaurants typically don't serve sushi, some fusion restaurants might. The premise doesn't explicitly state it's exclusively Italian, so we can't be certain.

🌍 Multilingual NLI Examples

Deutsch (German)

Premise: Der Zug fährt jeden Morgen um 7:30 Uhr ab.

Hypothesis: Man kann um 7:45 Uhr in den Zug einsteigen.

❌ Widerspruch (Contradiction)

If the train departs at 7:30, you cannot board at 7:45. This tests temporal reasoning.

Premise: Die Bibliothek hat über 100.000 Bücher.

Hypothesis: Die Bibliothek ist sehr gut ausgestattet.

✅ Schlussfolgerung (Entailment)

A collection of more than 100,000 books supports the conclusion that the library is very well stocked. This is a subjective but reasonable inference.

Premise: Das Konzert wurde wegen Regen verschoben.

Hypothesis: Die Band war krank.

➖ Neutral

The premise gives rain as the reason for the postponement and says nothing about the band's health. The band may or may not have been sick, so the hypothesis can be neither confirmed nor ruled out.

العربية (Arabic)

الجملة الأصلية: الطبيب نصح المريض بالراحة التامة لمدة أسبوعين.

الافتراض: يجب على المريض تجنب العمل الشاق.

✅ تضمين (Entailment)

Complete rest for two weeks logically entails avoiding strenuous work.

الجملة الأصلية: جميع المتاجر مغلقة يوم الجمعة.

الافتراض: يمكنك التسوق يوم الجمعة.

❌ تناقض (Contradiction)

If all stores are closed on Friday, shopping on Friday is impossible.

الجملة الأصلية: الطالب يدرس الهندسة في الجامعة.

الافتراض: الطالب يجيد الرياضيات.

➖ محايد (Neutral)

Studying engineering doesn't automatically mean being good at mathematics, though it's common.

Español (Spanish)

Premisa: La conferencia comienza exactamente a las 9:00 AM.

Hipótesis: Si llegas a las 9:15, habrás perdido el inicio.

✅ Implicación (Entailment)

If the conference starts at 9:00 and you arrive at 9:15, you've missed the beginning by definition.

Premisa: María es alérgica a los frutos secos.

Hipótesis: María puede comer almendras sin problemas.

❌ Contradicción (Contradiction)

Almonds are nuts. If María is allergic to nuts, she cannot eat almonds without problems.

Premisa: El restaurante tiene una estrella Michelin.

Hipótesis: La comida es muy cara.

➖ Neutral

Michelin stars often correlate with higher prices, but it's not guaranteed. Some starred restaurants are affordable.

Português (Portuguese)

Premissa: Todos os alunos da turma passaram no exame.

Hipótese: Nenhum aluno reprovou.

✅ Implicação (Entailment)

"All students passed" is logically equivalent to "no student failed." This tests logical equivalence.
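The equivalence can be confirmed by brute force for small classes. The sketch below assumes every student either passes or fails, with no third outcome such as being absent; the helper names `all_passed` and `none_failed` are illustrative.

```python
from itertools import product


def all_passed(results):
    """True when every entry is a pass (True)."""
    return all(results)


def none_failed(results):
    """True when no entry is a fail (False)."""
    return not any(not passed for passed in results)


# Check every pass/fail combination for classes of 0 to 4 students.
agree = all(
    all_passed(results) == none_failed(results)
    for size in range(5)
    for results in product([True, False], repeat=size)
)
print(agree)  # True
```

Because the two statements agree in every case, each one entails the other, which is why the label is Entailment rather than Neutral.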

Premissa: O voo está programado para decolar às 14h.

Hipótese: O voo já decolou às 13h.

❌ Contradição (Contradiction)

A flight scheduled to take off at 2 PM cannot have already taken off at 1 PM, assuming it keeps to its schedule.

Premissa: A empresa lançou um novo produto no mercado.

Hipótese: As vendas da empresa vão aumentar.

➖ Neutro (Neutral)

Launching a new product doesn't guarantee increased sales. The product could fail or cannibalize existing sales.

中文 (Chinese)

前提： 这家商店每天营业到晚上10点。

假设： 你可以在晚上11点去购物。

❌ 矛盾 (Contradiction)

If the store closes at 10 PM, shopping at 11 PM is impossible.

前提： 所有参赛者都必须年满18岁。

假设： 未成年人不能参加比赛。

✅ 蕴含 (Entailment)

If all participants must be 18+, then minors (under 18) cannot participate. This is logical entailment.

前提： 这部电影获得了奥斯卡最佳影片奖。

假设： 每个人都喜欢这部电影。

➖ 中性 (Neutral)

Winning an Oscar doesn't mean everyone likes the film. Awards and universal appeal are independent.

🔑 Annotation Access

Each annotator receives a unique, language-specific access code to unlock their assigned dataset. These codes ensure you work on the correct language and track your contributions. Your access code will be provided to you directly by the project coordinator.

💬 When (and When Not) to Use “Nonsense”

The Nonsense label should be used only when a sentence pair cannot be interpreted meaningfully, for example because of a corrupted or incomplete translation. If both sentences still make sense, even if they differ in meaning, please choose one of Entailment, Contradiction, or Neutral instead.

✅ When to Annotate Normally

If a sentence is fully grammatical and interpretable, it should not be labeled as Nonsense.
Minor translation differences or tense shifts are fine; choose the logical relation instead.
Proper names or titles that are translated differently still count as valid sentences. Use Neutral or Contradiction if the meaning changes.

🌍 Multilingual Examples

English

Premise: She sang Happy Birthday.
Hypothesis: She sang the birthday song.
→ Entailment (same meaning)

Premise: He hit the ball.
Hypothesis: He hits the ball.
→ Neutral (different tense, unclear time reference)

Premise: Adam Driver starred in the film.
Hypothesis: Adam the driver starred in the film.
→ Contradiction (mistranslation changes meaning, but both sentences make sense)

🚫 When to Use “Nonsense”

Reserve this label only for pairs that are truly uninterpretable or broken, such as:

Machine translation errors that produce unreadable text
Incomplete or truncated sentences
Random strings with no coherent meaning

If the pair makes sense at all, even if it’s wrong or mismatched, it’s not Nonsense.
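The order of the two decisions can be summarized in a few lines of code: first decide whether the pair is interpretable, and only then decide the logical relation. This is a sketch of the rule of thumb, not part of any annotation tool; `choose_label`, `RELATIONS`, and the lowercase label strings are assumed names.

```python
RELATIONS = {"entailment", "contradiction", "neutral"}


def choose_label(pair_is_interpretable: bool, relation: str) -> str:
    """Return Nonsense only when the pair cannot be interpreted at all."""
    if not pair_is_interpretable:
        return "nonsense"
    if relation not in RELATIONS:
        raise ValueError(f"unknown relation: {relation!r}")
    return relation


# "Adam the driver" is a mistranslation, but both sentences are readable.
print(choose_label(True, "contradiction"))  # contradiction
# A truncated or unreadable sentence: the relation argument is ignored.
print(choose_label(False, "neutral"))       # nonsense
```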
