ML defect prediction without a data science degree

Soham Harmalkar • March 2026 • 4 min read

There is a lot of hype around AI in manufacturing. Most of it ignores a simple fact:

You don't need advanced machine learning to predict defects. You need structured thinking and usable data.

The reality on the shop floor

Most manufacturing environments already generate large amounts of data — process parameters like temperature, pressure, and cycle time; inspection results; rejection logs; supplier batch information.

The issue is not data availability. It is data fragmentation and lack of structure.

Where companies go wrong

Two common mistakes:

Waiting for perfect data infrastructure. Teams delay initiatives because data is not "clean enough." Meanwhile, defects keep occurring, scrap keeps accumulating, and the problem doesn't wait for your data lake.

Overengineering solutions. Teams jump straight to complex neural networks without first understanding the basic relationships between process parameters and defect occurrence. A simple correlation analysis often surfaces most of the insight.
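To make "simple correlation analysis" concrete, here is a minimal sketch in plain Python. It assumes each part has been logged as one record with a few process parameters and a 0/1 defect flag. The field names (melt_temp_c, pressure_bar, cycle_time_s) and all values are hypothetical, made up for illustration rather than taken from any real line.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / sqrt(var_x * var_y)


def rank_parameters(records, parameters, defect_key="defect"):
    """Rank parameters by absolute correlation with the 0/1 defect flag."""
    defects = [r[defect_key] for r in records]
    scores = {p: pearson([r[p] for r in records], defects) for p in parameters}
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)


# Hypothetical records: illustrative values only, not real shop-floor data.
records = [
    {"melt_temp_c": 690, "pressure_bar": 80, "cycle_time_s": 45, "defect": 0},
    {"melt_temp_c": 695, "pressure_bar": 82, "cycle_time_s": 44, "defect": 0},
    {"melt_temp_c": 700, "pressure_bar": 79, "cycle_time_s": 46, "defect": 0},
    {"melt_temp_c": 705, "pressure_bar": 81, "cycle_time_s": 45, "defect": 0},
    {"melt_temp_c": 720, "pressure_bar": 80, "cycle_time_s": 44, "defect": 1},
    {"melt_temp_c": 725, "pressure_bar": 82, "cycle_time_s": 46, "defect": 1},
    {"melt_temp_c": 730, "pressure_bar": 79, "cycle_time_s": 45, "defect": 1},
    {"melt_temp_c": 735, "pressure_bar": 81, "cycle_time_s": 44, "defect": 1},
]

ranking = rank_parameters(records, ["melt_temp_c", "pressure_bar", "cycle_time_s"])
for name, r in ranking:
    print(f"{name}: r = {r:+.2f}")
```

In this toy data the melt temperature comes out on top and the pressure shows no linear relationship at all. A correlation like this only points to where to look. It does not prove cause, and it misses non-linear effects, which is where process knowledge has to take over.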

A practical approach

At Bajaj Auto, I worked with the IT department to build a system that analyzed X-ray images of castings against reference sets per ASTM E2422. It used MATLAB's Image Processing Toolbox — not TensorFlow, not PyTorch. It achieved roughly 80% accuracy and reduced defect rates by 40%, saving over €50K annually in rework and scrap.

The approach was straightforward:

Map key process parameters to defect occurrence patterns
Identify conditions where rejection rates spike
Track parameter drift over time against historical baselines
Correlate supplier batches with defect trends
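Two of these steps, tracking drift against a historical baseline and comparing supplier batches, can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the system built at Bajaj Auto: the function names, the window size, the two-sigma threshold, the batch IDs, and all readings are hypothetical.

```python
from collections import defaultdict
from statistics import mean, stdev


def drift_alerts(baseline, recent, window=3, threshold=2.0):
    """Return indices in `recent` where the rolling mean sits more than
    `threshold` baseline standard deviations away from the baseline mean.

    Simplification: the rolling mean is compared against the spread of
    individual baseline readings, which makes the check conservative.
    """
    base_mean = mean(baseline)
    base_std = stdev(baseline)
    alerts = []
    for end in range(window, len(recent) + 1):
        rolling = mean(recent[end - window:end])
        if abs(rolling - base_mean) > threshold * base_std:
            alerts.append(end - 1)  # index of the last reading in the window
    return alerts


def rejection_rate_by_batch(inspections):
    """Map each supplier batch to its share of rejected parts."""
    totals = defaultdict(int)
    rejects = defaultdict(int)
    for batch, rejected in inspections:
        totals[batch] += 1
        rejects[batch] += int(rejected)
    return {batch: rejects[batch] / totals[batch] for batch in totals}


# Hypothetical melt temperatures in degrees C: a stable history, then a slow climb.
baseline = [700, 702, 698, 701, 699, 700, 703, 697]
recent = [700, 701, 703, 705, 707, 709]
alerts = drift_alerts(baseline, recent)

# Hypothetical inspection log: (supplier batch, was the part rejected?).
inspections = [
    ("B-101", False), ("B-101", False), ("B-101", True), ("B-101", False),
    ("B-102", True), ("B-102", True), ("B-102", False), ("B-102", True),
]
rates = rejection_rate_by_batch(inspections)

print("drift alerts at readings:", alerts)
print("rejection rate by batch:", rates)
```

The rolling window is there so that a single noisy reading does not trigger an alert while a sustained shift does. Sensible values for the window and threshold depend on the process and are best set by someone who knows how it behaves.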

Even this basic analysis revealed which parameters were critical, where process instability existed, and when intervention was required — before defects reached the assembly line.

Why engineers have the advantage

An engineer with process knowledge understands which variables actually matter, how processes behave under variation, and what constitutes a meaningful signal versus noise.

I knew what porosity looks like in an aluminum casting. I knew which process parameters cause which defect patterns. I knew which defects are cosmetic versus structural. That domain knowledge is what made the tool useful, not the algorithm.

A data scientist can build a classifier. But can they look at an X-ray and tell you whether that porosity cluster is in a structural zone that will fail under load, or in a cosmetic area the customer will never see? That distinction is the difference between a useful tool and an expensive experiment.

The real opportunity

AI in manufacturing is not about complexity. It is about structuring the right data, asking the right questions, and acting on insights. You don't need to become a data scientist. You need to apply engineering thinking to data.

For companies hiring in this space: the most effective approach isn't to hire a pure ML engineer for a manufacturing AI project. It's to hire an engineer who understands the process and give them the tools — or pair them with a data scientist. Domain expertise doesn't transfer from a textbook.

I built this defect prediction system at Bajaj Auto with no formal data science training — just 7 years of casting process knowledge and a willingness to learn MATLAB. I'm now completing my MBA at HHL Leipzig, exploring the intersection of manufacturing engineering and digital transformation.