[Study Notes] MAML++, ANIL, and Reptile

January 08, 2025

This post continues the summary of meta-learning methods that learn a model initialization, covering MAML++ [1], ANIL [2], and Reptile [3].

| Algorithm | Problems Addressed | Improvements |
|-----------|--------------------|--------------|
| MAML++ | Gradient instability; fixed learning-rate limitations; costly second-order derivatives; optimization instability during training | Per-layer and per-step learning rates; gradient preprocessing; multi-step loss optimization; derivative-order annealing |
| ANIL | Computational overhead; unnecessary parameter updates; complex adaptation process | Restricts inner-loop updates to the final layer; keeps the feature extractor in the outer loop only; simplified adaptation mechanism; reduced computational complexity |
| Reptile | Second-order derivative complexity; high computational costs; implementation complexity | First-order approximation; simple SGD-based update rule; direct parameter-space optimization; batch-based training approach |
The main differences between these approaches lie in their computational complexity and underlying assumptions: MAML++ keeps MAML's (annealed) second-order objective but stabilizes its optimization, ANIL keeps the objective while adapting only the final layer in the inner loop, and Reptile drops second-order derivatives entirely in favor of a first-order update in parameter space. Minimal code sketches of each update rule are given below.
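To make the MAML++ row concrete, here is a minimal sketch of an inner loop with learnable per-layer, per-step learning rates and a multi-step loss (the query loss after every inner step contributes to the meta-objective). Everything here is an assumption for illustration: the tiny linear model, the synthetic regression tasks, and names such as `inner_lrs` and `msl_weights` are mine, not the paper's code; gradient preprocessing and derivative-order annealing are omitted.

```python
import torch

# Minimal MAML++-style sketch (illustrative, not the official implementation).
def forward(params, x):
    return x @ params["w"] + params["b"]

def make_params():
    return {
        "w": (0.1 * torch.randn(4, 1)).requires_grad_(),
        "b": torch.zeros(1, requires_grad=True),
    }

N_INNER = 3
params = make_params()
# Per-layer, per-step learning rates, learned jointly with the initialization.
inner_lrs = {name: torch.full((N_INNER,), 0.01, requires_grad=True) for name in params}
# Weights for the multi-step loss (often annealed toward the final step in practice).
msl_weights = torch.ones(N_INNER) / N_INNER

meta_opt = torch.optim.Adam(list(params.values()) + list(inner_lrs.values()), lr=1e-3)

def sample_task():
    # Hypothetical synthetic regression task: y = x @ w_true with a task-specific w_true.
    w_true = torch.randn(4, 1)
    xs, xq = torch.randn(10, 4), torch.randn(10, 4)
    return (xs, xs @ w_true), (xq, xq @ w_true)

for it in range(100):
    (xs, ys), (xq, yq) = sample_task()
    fast = {k: v for k, v in params.items()}  # task-specific "fast" weights
    meta_loss = 0.0
    for step in range(N_INNER):
        support_loss = ((forward(fast, xs) - ys) ** 2).mean()
        # create_graph=True keeps second-order terms (annealed in the paper).
        grads = torch.autograd.grad(support_loss, list(fast.values()), create_graph=True)
        # Per-layer, per-step learning rate applied to each parameter tensor.
        fast = {k: fast[k] - inner_lrs[k][step] * g for k, g in zip(fast, grads)}
        # Multi-step loss: the query loss after *every* inner step contributes.
        meta_loss = meta_loss + msl_weights[step] * ((forward(fast, xq) - yq) ** 2).mean()
    meta_opt.zero_grad()
    meta_loss.backward()
    meta_opt.step()
```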
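A minimal ANIL-style sketch under the same toy-task assumptions: only the head (`head_w`, `head_b`, illustrative names) is adapted in the inner loop, while the body (feature extractor) receives gradients only through the outer meta-update.

```python
import torch
import torch.nn as nn

# Minimal ANIL-style sketch (illustrative, not the paper's code).
body = nn.Sequential(nn.Linear(4, 16), nn.ReLU())   # feature extractor: outer loop only
head_w = (0.1 * torch.randn(16, 1)).requires_grad_()  # final layer: adapted in inner loop
head_b = torch.zeros(1, requires_grad=True)

meta_opt = torch.optim.Adam(list(body.parameters()) + [head_w, head_b], lr=1e-3)

def sample_task():
    w_true = torch.randn(4, 1)
    xs, xq = torch.randn(10, 4), torch.randn(10, 4)
    return (xs, xs @ w_true), (xq, xq @ w_true)

for it in range(100):
    (xs, ys), (xq, yq) = sample_task()
    fw, fb = head_w, head_b  # fast weights: head only
    for _ in range(5):
        loss = ((body(xs) @ fw + fb - ys) ** 2).mean()
        gw, gb = torch.autograd.grad(loss, (fw, fb), create_graph=True)
        fw, fb = fw - 0.01 * gw, fb - 0.01 * gb  # inner loop touches the head only
    query_loss = ((body(xq) @ fw + fb - yq) ** 2).mean()
    meta_opt.zero_grad()
    query_loss.backward()  # outer loop updates body and head initialization
    meta_opt.step()
```

Because the inner loop differentiates only with respect to the two head tensors, the adaptation graph is much smaller than full MAML's, which is where the reduced computational cost comes from.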
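Finally, a minimal Reptile sketch under the same assumptions: adapt a copy of the model with plain SGD, then move the initialization toward the adapted weights, θ ← θ + ε(φ − θ). No second-order derivatives are involved; `epsilon`, `inner_lr`, and `inner_steps` are illustrative hyperparameter names.

```python
import copy
import torch
import torch.nn as nn

# Minimal Reptile sketch (illustrative, not the official implementation).
model = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 1))
epsilon, inner_lr, inner_steps = 0.1, 0.01, 5

def sample_task():
    w_true = torch.randn(4, 1)
    x = torch.randn(20, 4)
    return x, x @ w_true

for it in range(100):
    x, y = sample_task()
    # Copy the initialization and adapt it to the task with ordinary SGD.
    task_model = copy.deepcopy(model)
    opt = torch.optim.SGD(task_model.parameters(), lr=inner_lr)
    for _ in range(inner_steps):
        loss = ((task_model(x) - y) ** 2).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    # Reptile meta-update: move the initialization toward the adapted weights.
    with torch.no_grad():
        for p, q in zip(model.parameters(), task_model.parameters()):
            p.add_(epsilon * (q - p))
```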

References

  1. Antoniou, A., et al. (2019). How to train your MAML. ICLR.
  2. Raghu, A., et al. (2020). Rapid learning or feature reuse? Towards understanding the effectiveness of MAML. ICLR.
  3. Nichol, A., et al. (2018). On first-order meta-learning algorithms. arXiv preprint arXiv:1803.02999.