23
Update: I looked up how many images are in the training sets for the big AI models
I found a paper from Stanford that said one common dataset has over 5 billion pictures. It made me think about the sheer scale of what these programs learn from, and how impossible it is for a person to even see that much. Does that volume of input change how we should view the output?
4 comments
emeryj66 3mo ago
So is it learning or just really good at math?
6
gavin365 3mo ago
But does it really learn, or just copy patterns? A kid can see one cat and know what it is, but these things need billions of examples.
4
grant_knight79 3mo ago
Saw a good point about this the other day. The argument was that a kid seeing one cat has a whole lifetime of other context built in, like knowing what fur is or what eyes look like. These models start with a blank slate, so yeah, they need a ton of data to build those basic ideas from scratch. It's a different kind of learning, but calling it just copying feels a bit simple, gavin365. They are finding patterns, but so are our brains, just way more efficiently. The real trick is whether they can use those patterns in new ways like a person can.
5
foster.tessa 2mo ago
Honestly, that scale is the whole point, right? It needs all those pictures because it doesn't have a body or a life. It never felt sun or got scratched by a cat. So it builds its whole idea of a cat from a billion flat images. The output is a kind of average of all that data. Makes you wonder if it can ever really get the weird, specific stuff a person would notice.
2