Topic and Content

Neural networks have revolutionized artificial intelligence, excelling in a multitude of application scenarios. However, as we advance toward increasingly large foundation models, such as expansive vision transformers or massive language models, the challenges of ensuring stable training become more pronounced. Issues like loss spikes, vanishing or exploding gradients, and difficulties achieving smooth convergence can significantly prolong training cycles, ultimately undermining overall performance and reliability. Establishing stable training paradigms is therefore essential to support the growing complexity and importance of next-generation neural architectures.

This workshop aims to bring together researchers and practitioners to explore strategies that enhance the stability of neural network training. We will focus on areas including data quality, advanced optimization methods, architectural innovations, and the early detection and mitigation of training instabilities. By fostering the exchange of ideas, best practices, and cutting-edge techniques, we strive to cultivate more robust and dependable models. While we particularly welcome contributions related to large foundation models, we invite insights from all domains of neural network research where stability plays a critical role.

We will cover a range of topics that contribute to the stability of neural network training, including but not limited to:

  1. Data Quality and Preprocessing for Stability

    Investigating how high-quality, well-preprocessed data can enhance training stability. Topics include data cleaning, balancing, augmentation, and ensuring data diversity to prevent overfitting and promote smooth convergence.

  2. Advanced Optimizers for Stable Training

    Exploring optimization algorithms that improve training stability, such as adaptive learning rates, momentum-based methods, and second-order optimizers. Discussion of how these approaches can mitigate loss spikes and facilitate consistent gradient flow, especially at massive scales; a brief illustrative sketch appears after the topic list.

  3. Architectural Innovations Promoting Stability

    Examining model architectures that inherently support stable training and prevent phenomena like vanishing or exploding gradients. Special focus is given to structures that can handle the complexity and depth of large-scale vision and language models; a short example of such a structure is sketched after the topic list.

  4. Spike-Awareness and Mitigation Techniques

    Developing methods to detect and respond to training instabilities in real time. Emphasis on identifying and addressing loss spikes, which can indicate issues like inappropriate learning rates, poor parameter initialization, or data anomalies; a simple detection sketch appears after the topic list.

  5. Efficient Hardware Utilization for Stability

    Leveraging hardware accelerators, mixed-precision training, gradient checkpointing, and other techniques to manage computational resources effectively. These techniques help keep training stable even as models grow to billions of parameters and beyond; a brief sketch combining two of them appears after the topic list.

  6. Case Studies and Best Practices

    Sharing experiences from successful implementations of stable neural network training. Real-world examples highlight common challenges and proven solutions for stabilizing neural network training in diverse research and industry contexts.
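
To make topic 2 concrete, the sketch below combines two widely used stabilizers, learning-rate warmup and global gradient-norm clipping, around an AdamW optimizer in PyTorch. The model, loss function, and every hyperparameter value are placeholders chosen for illustration, not recommendations.

```python
# Illustrative sketch for topic 2: warmup plus gradient clipping with AdamW.
# The model, loss, and all hyperparameters below are placeholder assumptions.
import torch
from torch.optim import AdamW
from torch.optim.lr_scheduler import LambdaLR

model = torch.nn.Linear(512, 512)  # stand-in for a real network
optimizer = AdamW(model.parameters(), lr=3e-4, betas=(0.9, 0.95), weight_decay=0.1)

# Linear warmup: ramp the learning rate from ~0 to its target over the first
# steps, which often avoids early-training loss spikes.
warmup_steps = 1000
scheduler = LambdaLR(optimizer, lambda step: min(1.0, (step + 1) / warmup_steps))

def train_step(batch_x, batch_y):
    optimizer.zero_grad(set_to_none=True)
    loss = torch.nn.functional.mse_loss(model(batch_x), batch_y)
    loss.backward()
    # Clip the global gradient norm to damp sudden spikes in gradient magnitude.
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
    optimizer.step()
    scheduler.step()
    return loss.item()
```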
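
As a small example of the kind of structure topic 3 covers, the following is a sketch of a pre-layer-norm residual block, a design commonly used to keep gradients well behaved in deep transformer-style stacks. The dimensions and the feed-forward sublayer are assumptions made for the example.

```python
# Illustrative sketch for topic 3: a pre-layer-norm residual block.
# Dimensions and sublayer choices are illustrative assumptions.
import torch
from torch import nn

class PreNormBlock(nn.Module):
    def __init__(self, dim, hidden_dim):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.ff = nn.Sequential(
            nn.Linear(dim, hidden_dim),
            nn.GELU(),
            nn.Linear(hidden_dim, dim),
        )

    def forward(self, x):
        # Normalizing before the sublayer keeps the residual path close to an
        # identity mapping, which helps gradients flow through very deep stacks.
        return x + self.ff(self.norm(x))

# Usage: stack many blocks; the residual identity path limits the gradient
# attenuation that a comparably deep post-norm design can exhibit.
deep_model = nn.Sequential(*[PreNormBlock(256, 1024) for _ in range(24)])
```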
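
For topic 4, one simple, assumed form of spike awareness is sketched below: compare the current loss against a running mean and standard deviation and skip the parameter update when it deviates too far. The window size and threshold are illustrative, and `compute_loss` in the usage comment is a hypothetical helper.

```python
# Illustrative sketch for topic 4: flag loss spikes against a running statistic
# and skip the update. Window size and threshold are illustrative assumptions.
from collections import deque
import statistics

class SpikeGuard:
    """Flags a training step whose loss sits far above the recent average."""

    def __init__(self, window=100, threshold=3.0):
        self.history = deque(maxlen=window)
        self.threshold = threshold

    def is_spike(self, loss_value):
        if len(self.history) >= 10:
            mean = statistics.fmean(self.history)
            std = statistics.pstdev(self.history) or 1e-8
            if loss_value > mean + self.threshold * std:
                return True  # outlier is not recorded, so it cannot skew the statistic
        self.history.append(loss_value)
        return False

# Hypothetical usage inside a training loop:
#   guard = SpikeGuard()
#   loss = compute_loss(batch)                  # compute_loss is a placeholder
#   if guard.is_spike(loss.item()):
#       optimizer.zero_grad(set_to_none=True)   # drop this update entirely
#   else:
#       loss.backward(); optimizer.step()
```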
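
Topic 5 can be illustrated with a sketch that combines mixed-precision training (with loss scaling) and activation checkpointing in PyTorch. A CUDA device is assumed to be available, the model is a stand-in, and the numeric settings are illustrative.

```python
# Illustrative sketch for topic 5: mixed precision with loss scaling plus
# activation (gradient) checkpointing. Assumes a CUDA device is available;
# the model and all settings are placeholders.
import torch
from torch.utils.checkpoint import checkpoint

model = torch.nn.Sequential(*[torch.nn.Linear(1024, 1024) for _ in range(8)]).cuda()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
scaler = torch.cuda.amp.GradScaler()  # rescales the loss to avoid fp16 gradient underflow

def forward_with_checkpointing(x):
    # Recompute each block's activations during backward instead of storing
    # them, trading compute for a smaller memory footprint.
    for block in model:
        x = checkpoint(block, x, use_reentrant=False)
    return x

def train_step(x, target):  # x and target are expected to already be on the GPU
    optimizer.zero_grad(set_to_none=True)
    with torch.autocast(device_type="cuda", dtype=torch.float16):
        loss = torch.nn.functional.mse_loss(forward_with_checkpointing(x), target)
    scaler.scale(loss).backward()
    scaler.unscale_(optimizer)  # unscale so clipping sees true gradient magnitudes
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
    scaler.step(optimizer)      # the step is skipped if inf/nan gradients are found
    scaler.update()
    return loss.item()
```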

Paper Submission

Please submit your papers via EasyChair.

Key Dates

Paper Guidelines

All submissions must adhere to the following formatting requirements: