[Study Notes] MAML++, ANIL, and Reptile

January 08, 2025

This post continues the summary of meta-learning methods that learn a model initialization, covering MAML++ [1], ANIL [2], and Reptile [3].

| Algorithm | Problems Addressed | Improvements |
|-----------|--------------------|--------------|
| MAML++ | Gradient instability; fixed learning-rate limitations; costly second-order derivatives; optimization instability during training | Per-layer and per-step learning rates; gradient preprocessing; multi-step loss optimization; derivative-order annealing |
| ANIL | Computational overhead; unnecessary parameter updates; complex adaptation process | Restricts inner-loop updates to the final layer; keeps the feature extractor in the outer loop only; simplified adaptation mechanism; reduced computational complexity |
| Reptile | Second-order derivative complexity; high computational costs; implementation complexity | First-order approximation; simple SGD-based update rule; direct parameter-space optimization; batch-based training approach |
The main differences between these approaches lie in their computational complexity and underlying assumptions: MAML++ keeps MAML's (annealed) second-order objective but stabilizes its optimization, ANIL keeps the objective while adapting only the final layer in the inner loop, and Reptile drops second-order derivatives entirely in favor of a first-order update in parameter space. Minimal code sketches of each update rule are given below.
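To make the MAML++ row concrete, here is a minimal sketch of an inner loop with learnable per-layer, per-step learning rates and a multi-step loss (the query loss after every inner step contributes to the meta-objective). Everything here is an assumption for illustration: the tiny linear model, the synthetic regression tasks, and names such as `inner_lrs` and `msl_weights` are mine, not the paper's code; gradient preprocessing and derivative-order annealing are omitted.

```python
import torch

# Minimal MAML++-style sketch (illustrative, not the official implementation).
def forward(params, x):
    return x @ params["w"] + params["b"]

def make_params():
    return {
        "w": (0.1 * torch.randn(4, 1)).requires_grad_(),
        "b": torch.zeros(1, requires_grad=True),
    }

N_INNER = 3
params = make_params()
# Per-layer, per-step learning rates, learned jointly with the initialization.
inner_lrs = {name: torch.full((N_INNER,), 0.01, requires_grad=True) for name in params}
# Weights for the multi-step loss (often annealed toward the final step in practice).
msl_weights = torch.ones(N_INNER) / N_INNER

meta_opt = torch.optim.Adam(list(params.values()) + list(inner_lrs.values()), lr=1e-3)

def sample_task():
    # Hypothetical synthetic regression task: y = x @ w_true with a task-specific w_true.
    w_true = torch.randn(4, 1)
    xs, xq = torch.randn(10, 4), torch.randn(10, 4)
    return (xs, xs @ w_true), (xq, xq @ w_true)

for it in range(100):
    (xs, ys), (xq, yq) = sample_task()
    fast = {k: v for k, v in params.items()}  # task-specific "fast" weights
    meta_loss = 0.0
    for step in range(N_INNER):
        support_loss = ((forward(fast, xs) - ys) ** 2).mean()
        # create_graph=True keeps second-order terms (annealed in the paper).
        grads = torch.autograd.grad(support_loss, list(fast.values()), create_graph=True)
        # Per-layer, per-step learning rate applied to each parameter tensor.
        fast = {k: fast[k] - inner_lrs[k][step] * g for k, g in zip(fast, grads)}
        # Multi-step loss: the query loss after *every* inner step contributes.
        meta_loss = meta_loss + msl_weights[step] * ((forward(fast, xq) - yq) ** 2).mean()
    meta_opt.zero_grad()
    meta_loss.backward()
    meta_opt.step()
```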
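A minimal ANIL-style sketch under the same toy-task assumptions: only the head (`head_w`, `head_b`, illustrative names) is adapted in the inner loop, while the body (feature extractor) receives gradients only through the outer meta-update.

```python
import torch
import torch.nn as nn

# Minimal ANIL-style sketch (illustrative, not the paper's code).
body = nn.Sequential(nn.Linear(4, 16), nn.ReLU())   # feature extractor: outer loop only
head_w = (0.1 * torch.randn(16, 1)).requires_grad_()  # final layer: adapted in inner loop
head_b = torch.zeros(1, requires_grad=True)

meta_opt = torch.optim.Adam(list(body.parameters()) + [head_w, head_b], lr=1e-3)

def sample_task():
    w_true = torch.randn(4, 1)
    xs, xq = torch.randn(10, 4), torch.randn(10, 4)
    return (xs, xs @ w_true), (xq, xq @ w_true)

for it in range(100):
    (xs, ys), (xq, yq) = sample_task()
    fw, fb = head_w, head_b  # fast weights: head only
    for _ in range(5):
        loss = ((body(xs) @ fw + fb - ys) ** 2).mean()
        gw, gb = torch.autograd.grad(loss, (fw, fb), create_graph=True)
        fw, fb = fw - 0.01 * gw, fb - 0.01 * gb  # inner loop touches the head only
    query_loss = ((body(xq) @ fw + fb - yq) ** 2).mean()
    meta_opt.zero_grad()
    query_loss.backward()  # outer loop updates body and head initialization
    meta_opt.step()
```

Because the inner loop differentiates only with respect to the two head tensors, the adaptation graph is much smaller than full MAML's, which is where the reduced computational cost comes from.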
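Finally, a minimal Reptile sketch under the same assumptions: adapt a copy of the model with plain SGD, then move the initialization toward the adapted weights, θ ← θ + ε(φ − θ). No second-order derivatives are involved; `epsilon`, `inner_lr`, and `inner_steps` are illustrative hyperparameter names.

```python
import copy
import torch
import torch.nn as nn

# Minimal Reptile sketch (illustrative, not the official implementation).
model = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 1))
epsilon, inner_lr, inner_steps = 0.1, 0.01, 5

def sample_task():
    w_true = torch.randn(4, 1)
    x = torch.randn(20, 4)
    return x, x @ w_true

for it in range(100):
    x, y = sample_task()
    # Copy the initialization and adapt it to the task with ordinary SGD.
    task_model = copy.deepcopy(model)
    opt = torch.optim.SGD(task_model.parameters(), lr=inner_lr)
    for _ in range(inner_steps):
        loss = ((task_model(x) - y) ** 2).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    # Reptile meta-update: move the initialization toward the adapted weights.
    with torch.no_grad():
        for p, q in zip(model.parameters(), task_model.parameters()):
            p.add_(epsilon * (q - p))
```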

References

  1. Antoniou, A., et al. (2019). How to train your MAML. ICLR.
  2. Raghu, A., et al. (2020). Rapid learning or feature reuse? Towards understanding the effectiveness of MAML. ICLR.
  3. Nichol, A., et al. (2018). On first-order meta-learning algorithms. arXiv preprint arXiv:1803.02999.