It’s this idea that things that are really easy for humans are actually extremely hard for robots (think walking or stacking a cup).
And conversely, things that are really easy for robots are actually really complicated for humans (think multiplication).
From the Visual SLAM book (page 11):
In the field of computer vision, a task that seems natural to a human can be very challenging for a computer. Images are nothing but numerical matrices in computers. A computer has no idea what these matrices mean (this is the problem that machine learning is also trying to solve). In visual SLAM, we can only see blocks of pixels, knowing that they are the results of projections by spatial points onto the camera’s imaging plane. In order to quantify a camera’s movement, we must first understand the geometric relationship between a camera and the spatial points.