Andrej Karpathy

The GOAT. A note on how I can learn to teach like him.

https://twitter.com/philduan/status/1709079931500843154

  • Inspecting large amount of data manually and visualizing the exact data fed into the network (after all the filtering, post-processing, etc.) is one of the best ML practices I’ve learned from Karpathy

There’s another note that argues for this: Communicate results to the public in a fun, down-to-earth way

Wow, I need to think about what he says: learning should not be fun. https://twitter.com/karpathy/status/1756380066580455557

There’s no doubt about it: he says that he wants to revisit his passion for teaching after working as a lead at self-driving car companies.

He taught me so much, such as:

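logits = log counts. I wasn’t sure what this meant, but I hear it quite often. → Ohh, he explains it here at 1:28:00: the logits are the input of the Softmax Function, which exponentiates them into "counts" and then normalizes those into probabilities.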
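To make "logits = log counts" concrete for myself, here is a minimal sketch using only the standard library. The counts 1, 3, 6 are made-up numbers, and `softmax` is my own helper, not code from the lecture. If the logits are the logs of some counts, then exponentiating gives the counts back, and normalizing gives the probabilities.

```python
import math

def softmax(logits):
    # Subtract the max first for numerical stability; it cancels out
    # in the division, so the result is unchanged.
    m = max(logits)
    counts = [math.exp(z - m) for z in logits]  # exp(logit) acts like a (scaled) count
    total = sum(counts)
    return [c / total for c in counts]

# Made-up counts 1, 3, 6, stored as their logs (the "logits").
logits = [math.log(1.0), math.log(3.0), math.log(6.0)]
probs = softmax(logits)
print(probs)  # roughly the counts divided by their sum
```

So the probabilities are just the counts divided by their total (1/10, 3/10, 6/10 here), which is why the raw network outputs can be read as log counts.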

This guide is sooo good:

He also says that we sometimes plot the log of the loss instead of the loss itself, to make it easier to visualize.

Mistake: while following along, I noticed big differences between my numbers and his. It was because I was printing the log of the loss instead of the real loss like he was doing…
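A small sketch of why the numbers looked so different. The loss values are made up, and base 10 is my assumption about which log was used. The point is only that the logged value and the raw value live on different scales, so they should never be compared directly.

```python
import math

# Made-up loss values, purely for illustration.
raw_losses = [2.5, 1.0, 0.1]

# Log-scaled version, handy for plotting (base 10 is an assumption here).
log_losses = [math.log10(x) for x in raw_losses]

for raw, logged in zip(raw_losses, log_losses):
    print(f"loss={raw:.4f}  log10(loss)={logged:.4f}")

# Undo the log to get back to the real loss before comparing numbers.
recovered = [10 ** v for v in log_losses]
```

Note that a loss below 1 gives a negative log, which is a quick sign that I am looking at the logged value.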

Todo: I don’t understand the embeddings from lecture 3, specifically how the word embeddings are turned into the input of the MLP.
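My current best guess, as a sketch to check against the lecture: each token index in the context looks up a row in an embedding table, the rows are concatenated into one flat vector, and that vector is the input to the first linear layer. All names and sizes below (`C`, `W1`, `vocab_size`, `emb_dim`, `block_size`, `hidden`) are hypothetical, and the weights are random rather than trained.

```python
import math
import random

random.seed(0)
vocab_size, emb_dim, block_size, hidden = 27, 2, 3, 4  # hypothetical sizes

# Embedding table: one emb_dim-sized row per token in the vocabulary.
C = [[random.gauss(0, 1) for _ in range(emb_dim)] for _ in range(vocab_size)]
# First layer weights: (block_size * emb_dim) inputs -> hidden units.
W1 = [[random.gauss(0, 1) for _ in range(hidden)]
      for _ in range(block_size * emb_dim)]
b1 = [0.0] * hidden

def mlp_input(context):
    # Look up each token's embedding and concatenate them into one flat vector.
    flat = []
    for idx in context:
        flat.extend(C[idx])
    return flat

def hidden_layer(x):
    # Plain matrix-vector product plus bias, followed by tanh.
    return [math.tanh(sum(x[i] * W1[i][j] for i in range(len(x))) + b1[j])
            for j in range(hidden)]

context = [5, 13, 13]   # three made-up token indices
x = mlp_input(context)  # length block_size * emb_dim = 6
h = hidden_layer(x)     # length hidden = 4
```

The thing to notice is that the MLP never sees the token indices themselves, only the concatenated embedding rows, so its input size is `block_size * emb_dim`.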

To watch:

Other Notes

Guy who got mad about me fanboying

I want to write this thought down because it was interesting.

He says he knows Andrej Karpathy. I said how much I look up to him, and he tried to shrug it off.

But ultimately, it comes down to you. There is still a long way to go.

Podcast

You don’t want to work on a solution to a problem that will only work in 10 years. You want to give the team small hits of dopamine so they stay motivated over the long term. Start selling that product, generate revenue, and improve it over time toward some sort of ultimate goal.

He believes it takes 10,000 hours to become an expert. Only compare yourself to who you were years ago. Comparing yourself to others is very harmful.

People get paralyzed by choice.

On spending lots of time doing the wrong things: you will accumulate scar tissue (battle scars). These mistakes are not dead work. You should just focus on working. What have you done? Focus.

From the above: Andrej Karpathy doesn’t love teaching. He loves happy humans. Teaching is actually really annoying because of the amount of time you have to spend to create something good and of value.

It takes him 10 hours to create 1 hour of good content.

It’s also not purely an altruistic act. Teaching also helps you learn.

The source of truth is in code. It’s not the slides. You get to see it in action.

Interview

What is it like working with Elon Musk?

  • I vividly remember the part where he talks about working with Elon, and how Elon doesn’t need to understand the entire system (and every single detail that comes with it) in order to make good decisions. He is able to do that through First Principles thinking. Andrej Karpathy says that he, on the other hand, feels like he needs to understand every single detail in order to make great decisions.

Transformer Lecture

Remove all the intermediate work and package it into a final product.