Around 2007, I solved a pressing problem with AI: I automated a painstaking image-labeling task. Now we see AI in everything! It’s mandatory in any consumer product! Or is it? Over time, I learned some guidelines on which problems to tackle with AI.
First blood!
I was a first-year PhD student when I landed a sweet partnership with a bunch of hardworking doctors. I was doing computer vision on ultrasound images. Doctors sent me CDs full of images that needed categorizing by various criteria. One criterion was particularly tricky – it couldn’t be automated just by reading DICOM data. I had to use my brain and manually label those images.
This was one of my first victories with machine learning! After a few thousand images, I realized I couldn’t label them at the same rate doctors were sending them in. My day job wasn’t labeling images – it was detecting diseases and other important stuff. I had to come up with a fast automatic labeling algorithm that would work in production. I trusted my instinct and coded up a mix of textbook approaches plus my half-trained radiologist intuition.
The results were awesome! So awesome that I built a data ingestion pipeline that completely removed me from viewing each image. The solution was innovative enough to get accepted as a paper! Victory across all fronts!
With enough confidence under my belt and having tasted success, I went on to solve several intricate problems with computer vision. I didn’t limit myself to healthcare but moved sideways into optical microscopy, electron microscopy, and then non-computer-vision problems (like time series).
AI must be everywhere!
Fast forward many years, and we find AI in everything: smart TVs, smart toasters, smart freezers, and even AI-powered toothbrushes. The picture is clear – today’s consumer understands the value of AI and wants it in everything! Right?
It’s a clear need that producers have to satisfy and we AI engineers have to fulfill! No marketing or overpromising here, right?
Of course, here and there you find pesky customers like myself who hate AI-powered headphones because you can’t turn off the voice enhancement. But maybe that’s just me being around machine learning and AI for too long.
AI makes everything better! You are missing out!!!!1
Let me tell you about some times when AI wasn’t quite that successful! Don’t get me wrong – I love AI and machine learning. When it works, I’m still amazed by the quality of the results it produces! Below you’ll read about several cherry-picked problems that didn’t quite make it into my CV. These are the lessons without which the successes in my CV would have been impossible.
Perfection is a must!
A few years into AI, I had developed a methodology and expertise that helped me deliver successes with some predictability. A friend of mine, passionate about trading, came to me with charts full of bars and lines. He started explaining about trends, candles, and Fibonacci. “Okay, okay,” I thought, “machine learning is excellent at detecting these patterns and spotting when they emerge! Let’s throw in some hours, and we’ll be incredibly rich!”
Spoiler alert! [Narrator’s voice]: In fact, they never got rich.
Looking back, machine learning was a great success – our best practices were vital because we didn’t end up poor! No narrator voice needed here; we both knew failure was a possible outcome. After all, ML and trading have been dancing together since the 1950s.
One of my guilty pleasures when I started with AI and machine learning was performance estimation. This field tries to guess how our model will behave in the real world based on the few samples we have available.
Long story short: as soon as a repetitive pattern emerged in the market, the market was efficient enough to account for it. Acting on it later meant losses. We discovered this because each time our machine learning showed decent performance on historical data, we tested it with fresh data. And guess what? A few weeks into the “future,” we were at a loss.
To put a cherry on top, all of this was automated. Whatever new idea we had about how markets operate, I coded it up and let it run tests on many indices and stocks across many markets. I was lucky enough to have access to (back then) non-trivial compute power, so I went crazy!
With formal training in performance estimation, each time a signal surfaced from the sea of AI torture, I added another layer of tests and validations. AI is like the wish-granting golden fish that does everything in its power to solve your pain – unfortunately, sometimes by blatantly lying! It’s our job as AI experts to catch these scenarios. In our case, a leaky model or a bit of “innocent” overfitting might have bankrupted us!
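To make that “extra layer of tests” concrete, here is a minimal sketch of walk-forward validation, the kind of guard that keeps “future” information from leaking into the performance estimate. The data, the model, and the number of splits are placeholders, not our actual trading pipeline.

```python
# Walk-forward validation sketch: every fold trains on the past and is scored
# on strictly later data. Data and model below are made-up placeholders.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import TimeSeriesSplit

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))     # placeholder features (e.g., indicators)
y = rng.integers(0, 2, size=500)   # placeholder up/down labels

scores = []
for train_idx, test_idx in TimeSeriesSplit(n_splits=5).split(X):
    model = LogisticRegression(max_iter=1000)
    model.fit(X[train_idx], y[train_idx])                  # past only
    scores.append(model.score(X[test_idx], y[test_idx]))   # strictly later data

print("out-of-sample accuracy per fold:", np.round(scores, 3))
# On pure noise this hovers around 0.5; a strategy that shines only on a
# shuffled split is exactly the kind of leak that would have bankrupted us.
```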
Looking back over the years, trust was another factor. We both trusted each other’s intentions and expertise, and we didn’t challenge the acceptance criteria. We’re not rich, but hey – we’re not broke either!
Harsh human touch versus gentle nurturing of AI!
Imagine a scenario where patients get their treatment fast and accurately without sitting in long queues! This is achievable with several levels of filtration. At each step, you cure what you’re certain you can cure, and what’s above your expertise gets forwarded to the next layer. This is exactly how a healthy medical system works! Wouldn’t it be great if we could scale up the first layer? Even better – what if a doctor was available in seconds without leaving your home?
[Narrator’s voice]: While many people rely on Google, they dislike healthcare robots!
An AI “doctor” will never scold you about eating too much salt or skipping that evening walk because, you know, talk shows! Even if a real doctor chastises your smoking habit over a telehealth call, you can always just hang up! Well, well… it turned out that no matter the advantages or assurances, no matter the technical guarantees, healthcare is a field where human-to-human interaction is vital! Just look at how many AI-powered telemedicine solutions have failed!
Being surrounded by doctors and having the technical know-how meant I saw AI use cases everywhere! Another friend of mine was excellent at what she did, and her private practice was overflowing with patients. Keeping track of the schedule was a nightmare because some patients were very young. They had their own quirks, like getting sick overnight. Of course, AI to the rescue! We had quite nice tools like chatbots, sentiment analysis, optimization algorithms, linear constraint solvers, and the power of omnichannel communication! How hard could it be to rebuild the schedule from one day to the next?
The technical solution wasn’t that straightforward – some rules had to be hard-coded. For example, a newborn had absolute priority while an adult in pain could be safely postponed. Unfortunately, these rules were impossible to detect through statistical learning. But no worries! I had a PhD in capturing and encoding human expertise! No show-stoppers there!
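To give a flavor of what “hard-coded rules” means here, below is a tiny, purely illustrative sketch. The fields, the newborn threshold, and the tie-breaking order are my assumptions for the example, not the real practice’s rules; the point is only that the hand-written rule sits above whatever the optimizer would prefer.

```python
# Hand-written priority rules on top of a soft scheduler (illustrative only).
from dataclasses import dataclass

@dataclass
class Request:
    name: str
    age_days: int        # patient age in days
    in_pain: bool
    requested_slot: int  # preferred slot; smaller = earlier

def priority(req: Request) -> tuple:
    # Hard rule from the story: a newborn has absolute priority,
    # even over an adult in pain. Everything else is a soft preference.
    is_newborn = req.age_days < 28   # threshold assumed for this sketch
    return (not is_newborn, not req.in_pain, req.requested_slot)

queue = [
    Request("routine check-up", 4000, False, 3),
    Request("newborn fever", 10, True, 5),
    Request("adult in pain", 12000, True, 1),
]

for req in sorted(queue, key=priority):
    print(req.name)   # -> newborn fever, adult in pain, routine check-up
```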
However, the tool, no matter how technically advanced, was completely rejected by both patients and my friend! Why? Because in some interactions, humans are the key, and no matter how many frictions a human interaction has, it’s still preferable to anything AI can offer! Even an annoyed, overworked receptionist beats a perfectly polite AI chatbot.
Of course, this is domain-specific. Take customer service – AI-powered robots at phone companies are fairly successful (though we all know how much we “love” them). Don’t take it as a hard and fast rule, but beware when social interactions are at play!
AI solves important problems!
Remember the theme in the section above? Pain. Let’s look at a solution that was technically solid but burned a relationship – just because there was no pain. An acquaintance wanted to solve a business case needing scalability, automatic judgment of situations, and data mining over large volumes. With my know-how and modern deep learning, I delivered a solution that worked quite decently.
My client bluntly rejected it. He said it was full of errors, too imprecise, not helpful for his business. It didn’t feel good – lots of work and resources wasted on both sides. Then he dropped the line that finally explained everything: “I thought machine learning was something good, but it seems it’s just a pile of useless voodoo!”
Turns out he just wanted to test the waters, to see what machine learning was all about. The business case was completely made up – there was no need for it in his business and no solution to compare with. He expected perfect results and was severely disappointed by what I delivered. Some businessmen think having AI is an asset that will help with sales (because all customers want AI, right?), help with valuation (because VCs are asking about AI strategy), and overall it’s just nice to have inside the business. Like fancy jewelry that sits on display – there are no drawbacks in using AI, right?
The flags
Far from claiming I have the perfect recipe for AI, here are some red and green flags I use to navigate the AI-and-business landscape.
No pain? AI is not for the catwalk!
When a friend comes to me with an AI problem, validating the pain is the first thing I do. If it’s something “nice to have because reasons,” I try to convince them there are better ways to understand AI than spending time, money, and maybe relationships on a smelly, greasy, and faulty solution that’s not meant for the catwalk or Prime Time showcasing.
The baseline
If there’s a baseline, we can compare our AI alchemy with whatever currently works in the company. And when the cold numbers are shown, any decision-maker will buy whatever voodoo magic we’re selling because they understand one thing: it helps their business driver! No PowerPoint, no mathematical demonstration, no lengthy CV beats the cold, analytical numbers found at the bottom of the P&L sheet.
No baseline is usually a smell of no pain. In the end, there are tons of metrics to measure in any business – nobody spends time measuring something they don’t care about.
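When a baseline exists, the comparison can be embarrassingly simple: score the current rule of thumb and the model on the same held-out data, then translate the gap into the business driver. Everything below, including the cost figure, is invented for illustration.

```python
# Baseline vs. model on the same data, expressed in the business driver.
from sklearn.metrics import mean_absolute_error

y_true     = [120, 80, 150, 60, 200]    # e.g., actual daily demand
baseline   = [100, 100, 100, 100, 100]  # current rule of thumb: "always 100"
model_pred = [110, 85, 140, 70, 180]    # our model's predictions

err_baseline = mean_absolute_error(y_true, baseline)    # 46.0
err_model    = mean_absolute_error(y_true, model_pred)  # 11.0

cost_per_unit_error = 7.0  # assumed cost of each unit of error
print(f"baseline MAE: {err_baseline:.1f}, model MAE: {err_model:.1f}")
print(f"estimated saving per day: {(err_baseline - err_model) * cost_per_unit_error:.0f}")
```

A number like that, sitting next to the P&L, does more persuading than any slide deck.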
Humans in the loop
AI works well when there’s a human in the decision loop. A human can correct errors, catch weird answers, override nonsensical suggestions, and ignore the AI altogether when it shouldn’t have a say. A lot of AI shortcomings and failures can be fixed with just this simple “trick.”
It doesn’t matter if the human is the final client and AI is just an advisor, or if we as a company have humans who filter and edit decisions before they reach the client. If you can, always keep a human in the loop! Careful though – the human must have the power to override the AI! Even a simple on/off switch is good enough to make AI human-compatible.
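Here’s a minimal sketch of what that looks like in code. Every function is a placeholder: the AI only proposes, the human always has the final word, and a plain boolean acts as the on/off switch.

```python
# Human-in-the-loop sketch: the model suggests, the human decides (placeholders).
from typing import Optional

def ai_suggest(ticket: str) -> str:
    # Stand-in for a real model; imagine a classifier's top label.
    return "refund" if "broken" in ticket else "no_action"

def human_review(ticket: str, suggestion: Optional[str]) -> str:
    # Stand-in for a review UI: the human may accept, change, or ignore the AI.
    if suggestion is None or suggestion == "no_action":
        return "escalate_to_agent"   # handled the plain old way
    return suggestion                # the human agrees with the AI this time

def decide(ticket: str, ai_enabled: bool = True) -> str:
    suggestion = ai_suggest(ticket) if ai_enabled else None
    final = human_review(ticket, suggestion)   # the human can always override
    print(f"{ticket!r}: ai={suggestion}, final={final}")
    return final

decide("item arrived broken")
decide("where is my invoice?", ai_enabled=False)   # the on/off switch
```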
Humans have no say!
When AI is automatically applied to humans – like automatic insurance quotations or advertising campaigns – that’s a pretty strong yellow flag. Add other flags, like errors or domain drift, and we’re in the danger zone! With automatic insurance quotes, if humans can’t challenge the process, we’re potentially hurting people. At least with wrong product ads, humans can skip the commercial (though age-inappropriate content is another story). Proctoring is another gray area, where innocent students can get kicked out because someone ignored the big red sticker: “FALSE POSITIVE RESULTS ARE ABOVE ZERO!!!”
Domain does not shift
If the domain is fixed, then our job as AI engineers is easier. We can handle slowly evolving domains – we have tools to monitor the drift. If the domain changes fast, then AI is a hard sell. And if the domain reacts to AI, well, things get very complicated very fast!
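Monitoring slow drift doesn’t need anything fancy to start with. Here is a sketch, assuming a single numeric feature and an arbitrary alert threshold: compare what arrives in production against what we trained on and raise a flag when the distributions diverge.

```python
# Drift check sketch: two-sample KS test between training data and live data.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
train_feature = rng.normal(loc=0.0, scale=1.0, size=5_000)  # what we trained on
live_feature  = rng.normal(loc=0.4, scale=1.2, size=1_000)  # what arrives now

stat, p_value = ks_2samp(train_feature, live_feature)
if p_value < 0.01:   # alert threshold is a judgment call
    print(f"drift detected (KS={stat:.3f}, p={p_value:.1e}): time to re-check the model")
else:
    print("no significant drift")
```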
Unnecessary evil!
There are domains where we have better tools than AI. For example, if we can write down equations for a process, we have way better tools than AI. Take automatic control – if you have a good model, you already have tools like model predictive control that can run a nuclear plant for you.
In this situation, AI can help only if we’re lazy – that is, we let AI learn the model, but we’ll have to accept its strong limitations.
A perfect example where an approximate physical model is good enough is generative video AI. If water doesn’t fall exactly like in a physical simulation, it’s not a problem as long as it looks natural. But if you want to land a rocket on a drone ship in the middle of the ocean, making a pretty movie with waves is completely useless, no matter how good those waves look. Sometimes machine learning and AI are just the wrong answer. We must recognize when better solutions are available!
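To illustrate the “write down the equations” point, here is a toy example with made-up constants: when the governing equation of a falling object with drag is known, a few lines of integration predict the outcome, no data or training required.

```python
# Known physics beats learned surrogates when the equation is available.
import numpy as np

g, k, dt = 9.81, 0.1, 0.01   # gravity, drag coefficient, time step (illustrative)

v = 0.0
for _ in range(int(5.0 / dt)):   # simulate 5 seconds
    v += dt * (g - k * v)        # known dynamics: dv/dt = g - k*v

v_exact = (g / k) * (1 - np.exp(-k * 5.0))   # closed-form solution
print(f"simulated: {v:.2f} m/s, exact: {v_exact:.2f} m/s")
# A learned model could only approximate this, and only inside the regime
# covered by its training data.
```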
The bitter lesson
AI and machine learning generally benefit from lots of data and lots of compute. Scaling laws and the bitter lesson are always the elephants in the room.
If an untrained person can easily solve the task, that’s a good sign the problem could be tackled by AI. License plate reading, OCR, sentiment analysis, and spotting common objects in natural scenes are tasks that can be solved quite easily by models running on a high-end laptop.
Back in the day, it was okay to start with handcrafted features. I was very good at all kinds of niche frequency image analysis. Even then, having more images meant I could rely on “brute force,” where “dumber” methods + lots of data = success! These days, there’s no excuse. Foundation models are step zero in any problem. Fine-tuning comes later, when we want to cut costs, improve results, worry about data privacy, etc. Scaling always wins. A L W A Y S!
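As a flavor of “foundation models are step zero,” here is a minimal sketch: before collecting data or training anything, try an off-the-shelf pretrained model on the task. It assumes the transformers library is installed; the pipeline downloads a generic English sentiment model, not something tuned to your domain.

```python
# Step zero: try a pretrained model before building anything custom.
from transformers import pipeline

classifier = pipeline("sentiment-analysis")   # downloads a default model
print(classifier([
    "The new firmware bricked my device.",
    "Support solved my problem in five minutes!",
]))
# Only if this baseline falls short (quality, cost, latency, privacy) do we
# move on to fine-tuning or a custom model.
```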
Perfection is a must!
Demanding 100% reliability in detection rates is a very strong red flag for us AI engineers. If your business requires perfect responses, I’ll personally keep AI very far away from you. However, if you already have failure-management processes and your business can handle some degree of failure, then by all means – look into AI to grow your business!
My take is that for AI to work, we need a “safe to fail” environment. This applies both to AI development and deployment.
Adversarial attacks? We are on the losing side!
Adversarial environments are another place where AI should be used with caution and needs other compensating factors. Take proctoring solutions. We have online, real-time surveillance and offline cheating detectors. I argue this is an adversarial setup because each time we make progress in detection rates, there are actors improving their falsification methods. Remember that any statistical tool will make two kinds of mistakes, and in proctoring, false positive mistakes can badly hurt honest students. Ruining somebody’s career is a very costly false positive error!
For the adversary, mistakes cost next to nothing: someone who cheats such a proctoring system has nothing to lose and everything to gain if they trick the AI. Moreover, if they use AI themselves, it’s a perfect tool because they can scale! The cost of failure is almost zero and the benefits are huge! Nice business to have, not gonna lie!
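To put rough numbers on that asymmetry, here is a toy calculation. Every figure is invented purely for illustration.

```python
# Toy cost asymmetry: our false positives are expensive, the adversary's
# failed attempts are nearly free. All numbers are made up.
honest_students           = 10_000
false_positive_rate       = 0.01    # 1% of honest students wrongly flagged
cost_per_false_accusation = 100.0   # appeals, reputations, careers (relative units)

cheat_attempt_cost   = 0.0          # a blocked attempt costs the cheater ~nothing
cheat_success_payoff = 100.0        # passing the exam without studying

falsely_accused = honest_students * false_positive_rate
print(f"honest students wrongly flagged: {falsely_accused:.0f}")
print(f"our cost from false positives: {falsely_accused * cost_per_false_accusation:.0f}")
print(f"adversary's downside per attempt: {cheat_attempt_cost}, upside: {cheat_success_payoff}")
# We pay heavily for every mistake; the adversary risks nothing and can simply
# retry (or automate retries) until something slips through.
```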
Because bad actors have such a huge upper hand and our cost of failure is so high, I personally avoid any tool that claims to be an automatic, unbiased, completely fail-proof evaluator! After 17 years of university teaching, I can say nothing beats human expertise when it comes to fraud detection.
Even worse is when we have a feedback loop in the market. After each AI-generated decision enters the market, the market reacts and changes its behavior in different ways. This is both scary and an opportunity for manipulating people and for subliminal influence. What’s worse, these techniques are strictly forbidden by the EU AI Act: if we can’t research them, we can’t investigate how to protect against them. We’ll learn about them when our non-friendly neighbors start applying them to us. After all, they don’t have an AI Act!
Necessary evil!
Unfortunately, there are areas where AI is the only possible solution and we have to somehow live with its drawbacks. One of the hottest fields right now is social media. There are two major use cases: hate speech detection and content recommendation. We constantly dunk on how badly all social media channels handle these tasks. But looking at our criteria above, we have an adversarial environment and automatic application to humans. We, as users, expect close to 100% reliability, and failure management is mostly non-existent!
Social media companies use AI because right now it’s the only solution that lets them scale without going out of business!
The ending
Check out my article on scoring sheets to learn how to transform this story into an actionable tool for identifying the “low-hanging fruit” ripe enough for AI harvesting.
In a later story, I’ll tell you a recipe that will boost the success rate of “good” machine learning projects. A recipe that will eventually let you approach those proverbial tar pits – machine learning business cases with some yellow flags in the mix! But for now, if you’re just starting out, stick to the green zone! Plenty of use cases to solve! If you want help spotting these opportunities in your business, reach out to an AI expert – they’ll help you find the right problems to tackle.
I’m still learning, so stay tuned for fresh painful lessons!