On the Width Scaling of Neural Optimizers Under Matrix Operator Norms I: Row/Column Normalization and Hyperparameter Transfer | Buildability Receipt | ScienceToStartup