Second-Order Optimization
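I remember this being mentioned in CS231n as an alternative way to do optimization. Instead of picking a learning rate and taking many small gradient steps, a second-order method such as Newton's method uses curvature information (the Hessian) to scale the step, so there is no learning rate to tune. On an exactly quadratic loss it reaches the minimum in a single step, although in general it still needs several iterations. However, it is extremely computationally expensive: with n parameters the Hessian has n² entries and inverting it costs roughly O(n³), which makes it infeasible in practice for deep learning models.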
https://www.cs.toronto.edu/~rgrosse/courses/csc2541_2021/readings/L04_second_order.pdf
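
To make the "single step" idea concrete for myself, here is a minimal sketch of one Newton step on a toy quadratic, x_new = x - H⁻¹ g. The matrix `A`, vector `b`, starting point `x0`, and the helper names `grad`, `hessian`, and `newton_step` are all made up for illustration and do not come from CS231n or the linked notes.

```python
import numpy as np

# Hypothetical toy objective: f(x) = 0.5 * x^T A x - b^T x,
# with A symmetric positive definite (values chosen only for illustration).
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([1.0, -1.0])


def grad(x):
    # Gradient of the quadratic: A x - b
    return A @ x - b


def hessian(x):
    # Hessian of a quadratic is constant: A
    return A


def newton_step(x):
    # Solve H d = g instead of forming the inverse explicitly,
    # then move to x - d. Note there is no learning rate here.
    d = np.linalg.solve(hessian(x), grad(x))
    return x - d


x0 = np.array([10.0, -7.0])      # arbitrary starting point
x1 = newton_step(x0)             # one Newton step
x_star = np.linalg.solve(A, b)   # exact minimizer, where A x = b
```

Because this loss is exactly quadratic, `x1` should coincide with `x_star` up to floating-point error. On a real network the loss is not quadratic, so one step would not be enough, and the Hessian would be far too large to build or solve against like this.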