30
TIL that the old advice about training AI on diverse data was actually right, and I learned it the hard way
A buddy of mine who works in data science told me last year that I needed to feed my image recognition model more varied examples of everyday objects. I thought he was being overly cautious, so I just used a bunch of photos from my own collection of stuff around the house in Portland. Long story short, my model could identify my coffee mug and my cat perfectly, but when I tried it on a friend's differently shaped mug or a random stray cat, it failed almost every time. It had basically learned my mug and my cat, not mugs and cats in general. I wasted about 3 months and 40 hours of training time on this. Now I'm going back to pull in public datasets from Flickr with way more variety. Has anyone else ignored specific advice about AI training data and regretted it?
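For anyone wondering how to catch this sooner: a minimal sketch of the kind of check that would likely have flagged the problem early, which is scoring accuracy separately for photos from home and photos from anywhere else. The function name `accuracy_by_source`, the source tags, and the sample predictions are all made up for illustration. They are not from the actual project.

```python
from collections import defaultdict


def accuracy_by_source(records):
    """Return accuracy per image source.

    records: iterable of (source, true_label, predicted_label) tuples.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for source, truth, predicted in records:
        total[source] += 1
        if truth == predicted:
            correct[source] += 1
    return {source: correct[source] / total[source] for source in total}


# Hypothetical predictions, invented purely to illustrate the gap.
records = [
    ("my_house", "mug", "mug"),
    ("my_house", "cat", "cat"),
    ("my_house", "mug", "mug"),
    ("elsewhere", "mug", "bowl"),
    ("elsewhere", "cat", "cat"),
    ("elsewhere", "cat", "dog"),
    ("elsewhere", "mug", "vase"),
]

scores = accuracy_by_source(records)
for source, acc in scores.items():
    print(f"{source}: {acc:.0%}")
```

A big gap between the two numbers is the warning sign: the model does well on the photos it resembles and poorly on everything else, even if a single overall accuracy figure looks fine.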
3 comments
foster.tessa 1mo ago
Your model couldn't even recognize a different coffee mug? That's wild.
4
sagecooper 1mo ago
Holy cow, 40 hours on a model that only knows your cat and coffee mug?
1
valc91 1mo ago
That 40 hours thing reminds me of how we spend more time organizing our stuff than actually using it, you know?
6