Topic and Content
Neural networks have revolutionized artificial intelligence, excelling across a wide range of application scenarios. However, as we advance toward increasingly large foundation models, such as expansive vision transformers and massive language models, the challenges of ensuring stable training become more pronounced. Issues like loss spikes, vanishing or exploding gradients, and difficulty achieving smooth convergence can significantly prolong training cycles and ultimately undermine model performance and reliability. Establishing stable training paradigms is therefore essential to support the growing complexity and importance of next-generation neural architectures.
This workshop aims to bring together researchers and practitioners to explore strategies that enhance the stability of neural network training. We will focus on areas including data quality, advanced optimization methods, architectural innovations, and the early detection and mitigation of training instabilities. By fostering the exchange of ideas, best practices, and cutting-edge techniques, we strive to cultivate more robust and dependable models. While we particularly welcome contributions related to large foundation models, we invite insights from all domains of neural network research where stability plays a critical role.
We will cover a range of topics that contribute to the stability of neural network training, including but not limited to:
- Data Quality and Preprocessing for Stability: Investigating how high-quality, well-preprocessed data can enhance training stability. Topics include data cleaning, balancing, augmentation, and ensuring data diversity to prevent overfitting and promote smooth convergence.
- Advanced Optimizers for Stable Training: Exploring optimization algorithms that improve training stability, such as adaptive learning rates, momentum-based methods, and second-order optimizers, with discussion of how these approaches mitigate loss spikes and maintain consistent gradient flow, especially at massive scale.
- Architectural Innovations Promoting Stability: Examining model architectures that inherently support stable training and prevent phenomena like vanishing or exploding gradients. Special focus is given to structures that can handle the complexity and depth of large-scale vision and language models.
- Spike-Awareness and Mitigation Techniques: Developing methods to detect and respond to training instabilities in real time, with emphasis on identifying and addressing loss spikes, which can indicate issues such as inappropriate learning rates, poor parameter initialization, or data anomalies (an illustrative detector sketch follows this list).
- Efficient Hardware Utilization for Stability: Leveraging hardware accelerators, mixed-precision training, gradient checkpointing, and other techniques to manage computational resources effectively, ensuring stable training even as models grow to billions of parameters and beyond (see the mixed-precision sketch after this list).
- Case Studies and Best Practices: Sharing experiences from successful implementations of stable neural network training. Real-world examples highlight common challenges and proven solutions for stabilizing neural networks in diverse research and industry contexts.
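As a rough illustration of the spike-awareness topic above, the following Python sketch flags a training step as a potential spike when the loss exceeds a running mean by a chosen multiple of the running standard deviation. The class name, window size, and recovery hook are hypothetical choices for this example, not a prescribed method.

import math

class LossSpikeDetector:
    # Flag a loss value as a spike relative to a recent window of losses.
    def __init__(self, window: int = 200, threshold: float = 4.0):
        self.window = window        # number of recent loss values to keep
        self.threshold = threshold  # spike if loss > mean + threshold * std
        self.history = []

    def update(self, loss: float) -> bool:
        # Record the latest loss and return True if it looks like a spike.
        is_spike = False
        if len(self.history) >= self.window:
            mean = sum(self.history) / len(self.history)
            var = sum((x - mean) ** 2 for x in self.history) / len(self.history)
            is_spike = loss > mean + self.threshold * max(math.sqrt(var), 1e-8)
        self.history.append(loss)
        if len(self.history) > self.window:
            self.history.pop(0)
        return is_spike

# Inside a training loop one might write (the recovery hook is hypothetical):
#     if detector.update(loss.item()):
#         reduce_learning_rate_or_skip_step()

A detector like this is deliberately simple; submissions may of course propose more sophisticated criteria, such as gradient-norm monitoring or per-layer statistics.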
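To illustrate the hardware-utilization topic, here is a minimal mixed-precision training step in PyTorch with gradient clipping, assuming a CUDA device; the model, optimizer settings, and clipping norm are placeholder choices for the sketch, not recommendations.

import torch
from torch import nn

device = "cuda"
model = nn.Sequential(nn.Linear(512, 2048), nn.GELU(), nn.Linear(2048, 512)).to(device)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
criterion = nn.MSELoss()
scaler = torch.cuda.amp.GradScaler()  # scales the loss to avoid fp16 underflow

def train_step(inputs, targets):
    optimizer.zero_grad(set_to_none=True)
    with torch.cuda.amp.autocast():   # forward pass in mixed precision
        loss = criterion(model(inputs), targets)
    scaler.scale(loss).backward()     # backpropagate on the scaled loss
    scaler.unscale_(optimizer)        # unscale gradients before clipping
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
    scaler.step(optimizer)            # skips the update if gradients are non-finite
    scaler.update()
    return loss.item()

Gradient checkpointing (torch.utils.checkpoint) can be layered onto the same loop to trade recomputation for memory as models reach billions of parameters.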
Paper Submission
Please submit your papers via EasyChair.
Key Dates
- Paper Submission Deadline: 15 January 2025 (AoE)
- Notification of Acceptance: 1 March 2025
- Final Camera-Ready Copy Deadline: 7 March 2025
- Workshop Date: During IEEE CAI 2025
Paper Guidelines
All submissions must adhere to the following formatting requirements:
- Use the IEEE style files for conference proceedings, available at IEEE Templates.
- Reviewing is double-blind; anonymize the author information. In LaTeX, use:
\author{\IEEEauthorblockN{Anonymous Authors}}
- Only PDF format is accepted.
- Paper Size: A4 (210mm x 297mm).
- Length:
- Long papers: up to 6 pages (plus up to 2 extra pages for an additional charge).
- Abstract papers: up to 2 pages (plus up to 2 extra pages for an additional charge).
- Formatting: double column, single spaced, 10-point Times Roman font. Use the official IEEE style files.
- No page numbers (they will be inserted later).
- No new authors can be added after the submission deadline.