Key Classifications of Facial Expression Recognition Datasets: Controlled and In the Wild

2025-08-16

Haruka Asanuma

I'm Asanuma, an intern at Quixotiks since October 2024. I've been involved in research here, and I'm looking forward to sharing our survey findings in a series of articles.

In this first installment, we'll look at the "facial expression recognition datasets" used to train AI to classify human expressions into emotion labels. Datasets can also be categorized by whether they contain images or videos, but this article focuses on another distinction that defines their nature: "Controlled" versus "In the Wild."

Two Types of Facial Expression Recognition Datasets

Datasets used for AI to learn facial expressions can be broadly categorized into two types based on their creation method: data generated in a "controlled laboratory" setting and data collected from "real-world, everyday situations." These are commonly known as Controlled and In the Wild (Uncontrolled) datasets.

Controlled: Background, lighting, and poses are predetermined. The expressions to be made are also specified by the researchers.
In the Wild: Backgrounds and situations vary widely, capturing natural, spontaneous moments.
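As a small illustration, the two categories can be recorded as a tag on each dataset. The sketch below is only one way to write this down: the names `CaptureSetting`, `FerDataset`, and `names_by_setting` are hypothetical and not part of any library, and the entries are simply the datasets named later in this article.

```python
from dataclasses import dataclass
from enum import Enum


class CaptureSetting(Enum):
    """How a dataset was collected (illustrative names, not a standard)."""
    CONTROLLED = "controlled"
    IN_THE_WILD = "in_the_wild"


@dataclass(frozen=True)
class FerDataset:
    """A facial expression recognition dataset and its collection category."""
    name: str
    setting: CaptureSetting


# Datasets named in this article, tagged with the category the article gives them.
DATASETS = [
    FerDataset("CK+", CaptureSetting.CONTROLLED),
    FerDataset("AFEW", CaptureSetting.IN_THE_WILD),
    FerDataset("DFEW", CaptureSetting.IN_THE_WILD),
    FerDataset("CAER", CaptureSetting.IN_THE_WILD),
    FerDataset("FERV39K", CaptureSetting.IN_THE_WILD),
    FerDataset("MAFW", CaptureSetting.IN_THE_WILD),
]


def names_by_setting(datasets, setting):
    """Return the names of all datasets collected under the given setting."""
    return [d.name for d in datasets if d.setting is setting]
```

Calling `names_by_setting(DATASETS, CaptureSetting.CONTROLLED)` returns only the lab-recorded datasets, which is handy when a survey needs to report the two groups separately.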

Now, let's explore the characteristics of each type and their representative datasets.

The World of Controlled: Expressions in a Controlled Laboratory

Controlled datasets are captured in a uniform environment prepared by researchers, with participants asked to make specific expressions.

Characteristics:
- Backgrounds are plain white or gray.
- Subjects face the camera directly.
- Posed expressions based on instructions like "Please smile."
- Consistent lighting and other conditions, resulting in low noise.
Representative example: CK+ (The Extended Cohn-Kanade Dataset)
✅ Advantages:
- The data is very clean and suitable for basic research, such as the relationship between expressions and facial muscle movements.
- Because the expression to perform is specified in advance, the labeling accuracy is very high.
❌ Disadvantages:
- Since these are acted expressions, they diverge from the natural emotional expressions we show in daily life.
- Models trained solely on this data tend to struggle with the diverse expressions of the real world.

The "In the Wild" World: Everyday Expressions

To overcome the limitations of controlled datasets, "In the Wild" datasets emerged from efforts to collect data from the real world.

Characteristics:
- Clips are cut from a variety of videos, such as movies, TV shows, and YouTube.
- Diverse facial orientations, lighting conditions, and backgrounds.
- Rich in spontaneous, non-acted expressions.
Examples: AFEW, DFEW, CAER, FERV39K, MAFW, etc.

✅ Advantages:
- More practical, and essential for developing expression recognition models that can be used in the real world.
- Helps improve recognition accuracy for partially hidden faces, varied angles, and complex lighting.
❌ Disadvantages (Challenges):
- The data contains a lot of noise (background, occlusions, etc.), making it difficult to handle and learn from.
- Labeling emotions is very difficult. For example, it is hard to determine whether an expression was truly one of joy. Because the person labeling the data is not the person who made the expression, it is unclear whether the assigned emotion label is accurate.
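One common way to cope with this uncertainty is to have several annotators label the same clip and keep the majority label together with a measure of how strongly they agreed. The sketch below assumes that setup. It does not describe how any particular dataset above was labeled, and `aggregate_labels` is a hypothetical helper name.

```python
from collections import Counter


def aggregate_labels(votes):
    """Return (majority_label, agreement) for one clip's annotator votes.

    agreement is the fraction of annotators who chose the majority label.
    On a tie, the label that appears first in `votes` wins.
    """
    if not votes:
        raise ValueError("votes must not be empty")
    counts = Counter(votes)
    label, top_count = counts.most_common(1)[0]
    return label, top_count / len(votes)


# Example: three annotators label the same clip (made-up votes).
label, agreement = aggregate_labels(["happy", "happy", "surprise"])
```

Clips with low agreement can then be flagged for review or excluded, depending on the research objective.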

Summary: Dataset Quick Reference Chart

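Characteristic	Controlled	In the Wild (Uncontrolled)
Environment	Controlled settings such as laboratories	Everyday life, movies, web videos, etc.
Expressions	Posed expressions performed on instruction	Spontaneous expressions
Capture conditions	Frontal view, plain background, uniform lighting	Varied angles, complex backgrounds, diverse lighting
Data	Clean and easy to handle	Noisy and complex
Strengths	Basic research, analysis of facial movements	Real-world applications, robust model development
Challenges	Divergence from the real world	Difficulty of labeling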

In this article, we explained "Controlled" and "In the Wild," the two major categories of facial expression recognition datasets. Neither is superior to the other; both types of datasets play crucial roles depending on the research objective.

Next time, we will delve deeper into the history of "In the Wild" datasets.

