Whereas you and I would process the stream of perceptual input, the computer vision system is basically taking the pixels and processes strings of pixels – vectorised, we would say – and as it processes these images, it's picking up patterns in the images, it's picking up patterns in the pixels.
你和我處理的是感知輸入流,而計算機視覺系統處理的基本上是像素和像素串--我們可以說是矢量化--在處理這些影像時,它在影像中捕捉模式,在像素中捕捉模式。