As artificial intelligence moves closer to the devices people use every day, Rohit Kulkarni says the biggest shift is not just faster processing, but a complete rethink of how intelligent systems are architected.
Kulkarni, who has 18 years of experience in systems architecture and design, currently works in the semiconductor industry as an AI architect focused on Edge AI. His work centres on bringing real-time inference closer to the device, reducing latency, lowering dependence on the cloud, and supporting Small Language Models within constrained computing environments.
In this interview, he discusses why Edge AI matters, how lower latency changes user experience and business costs, and why AI architecture, semiconductor platforms, security, and Small Language Models will shape the next generation of intelligent devices.
You’ve spent 18 years in systems architecture and design. How has your view of computing changed with the shift from cloud-heavy AI to Edge AI?
For most of my career, the assumption was simple: compute lives in the cloud, and devices are dumb terminals that ship data back and forth.
Edge AI flips that. Now I’m working on systems where intelligence has to live much closer to the device, constrained by power, heat, and memory, but still expected to make real-time decisions.
It’s forced a shift from “how do we scale compute?” to “how do we make every cycle and every milliwatt count?” Architecturally, that’s a much harder and more interesting problem.
Edge AI is often described as a way to reduce latency. In practical terms, what does lower latency mean for users, devices, and businesses?
It’s the difference between a security camera that reacts to an intruder and one that reports that an intruder was there five seconds ago.
For a user, it means a voice assistant that responds instantly instead of buffering. For a device, it means decisions can happen locally without waiting on a round trip to a data center, which also means it still works if the network drops.
For a business, lower latency often means lower cloud cost, since you’re not shipping every frame of video or every sensor reading upstream for processing.
You currently work in the semiconductor industry as an AI architect focused on Edge AI. Why is system architecture so critical for real-time AI inference?
Because the model is only half the story. You can have a brilliant AI model, but if the platform around it can’t deliver the memory bandwidth, secure boot chain, and trustworthy data path fast enough, none of it matters in real time.
In our work on the architecture behind Edge AI systems, the engineering challenge sat at the security-and-trust layer: designing how post-quantum cryptographic signing could be woven into the secure boot chain without blowing the latency or power budget.
Security and performance are often treated as a trade-off. The real architectural work is making sure they don’t have to be.
Small Language Models are becoming more important for on-device intelligence. Why are SLMs better suited than large models for many Edge AI applications?
Large models assume you have a data center behind you.
SLMs are built for the world most devices live in: limited memory, no GPU farm, and a power budget measured in milliwatts, not megawatts.
A well-tuned small model that’s purpose-built for one task, such as detecting a person in a camera frame, will outperform a general-purpose large model on-device every time because it fits in the memory and thermal envelope you actually have.
What are the biggest technical challenges in bringing AI inference closer to the device, especially when power, heat, memory, and cost are limited?
A few things compound on each other.
Memory bandwidth is usually the real bottleneck, not raw compute. Thermal limits mean you can’t just run flat out. You have to manage duty cycles.
Increasingly, security has its own cost too. Cryptographic operations, including post-quantum signing, take cycles. So you’re balancing inference speed against secure boot and a trust chain that more regulations are starting to require.
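To make the memory-bandwidth point concrete, here is a rough back-of-the-envelope sketch. It assumes a language-model decoder that has to stream all of its weights from memory once per generated token, which is a common simplification and not something described in the interview. The function name, the device figures, and the duty-cycle scaling are all illustrative assumptions.

```python
def max_decode_tokens_per_second(params_billions, bytes_per_param,
                                 bandwidth_gb_per_s, duty_cycle=1.0):
    """Rough upper bound on token rate for a memory-bound decoder.

    Assumes every generated token streams all weights from memory once,
    and ignores caches, activations, and KV-cache traffic.
    """
    if params_billions <= 0 or bytes_per_param <= 0 or bandwidth_gb_per_s <= 0:
        raise ValueError("sizes and bandwidth must be positive")
    if not 0.0 < duty_cycle <= 1.0:
        raise ValueError("duty_cycle must be in (0, 1]")
    model_gb = params_billions * bytes_per_param  # 1e9 params x bytes/param = GB
    # Thermal throttling is modelled crudely as a fraction of time spent running.
    return bandwidth_gb_per_s / model_gb * duty_cycle


if __name__ == "__main__":
    # Hypothetical device: 1B-parameter model, 8 GB/s memory, 50% duty cycle.
    for label, bytes_per_param in (("int8", 1.0), ("int4", 0.5)):
        rate = max_decode_tokens_per_second(1.0, bytes_per_param, 8.0,
                                            duty_cycle=0.5)
        print(f"{label}: at most {rate:.1f} tokens/s")
```

Under these assumptions the bound depends only on bandwidth, model size in bytes, and duty cycle. Raw compute never appears in the formula, which is one way to see why quantizing weights can help more on constrained devices than adding faster arithmetic units.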
Which industries do you think will benefit most from low-latency Edge AI?
Security and surveillance is an obvious one. Real-time detection beats after-the-fact analysis every time.
Automotive is right behind it, where milliseconds matter for safety systems. Healthcare devices benefit enormously when privacy requires data to stay local. Manufacturing gets predictive maintenance without network dependency.
I’d put security, automotive, and healthcare at the top, with consumer electronics following close behind as the cost curve keeps coming down.
Many companies still rely heavily on cloud-based AI. What should decision-makers consider when deciding between cloud AI, Edge AI, or a hybrid model?
Three questions, really.
How much latency can you tolerate? How sensitive is the data, and does it legally or practically need to stay local? And what’s your actual cost model at scale?
Cloud inference cost compounds with every device and every query. Most serious deployments I see land on hybrid: edge for real-time, privacy-sensitive decisions, and cloud for heavy lifting such as model retraining and aggregate analytics.
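The three questions above can be sketched as a small decision helper. This is an illustrative simplification, not a method described in the interview: the function names, the per-1,000-query pricing model, and the rule ordering are all assumptions, and a real deployment would weigh many more factors.

```python
def monthly_cloud_cost(devices, queries_per_device_per_day,
                       cost_per_1k_queries, days=30):
    """Cloud inference spend under a simple pay-per-query pricing assumption."""
    total_queries = devices * queries_per_device_per_day * days
    return total_queries / 1000 * cost_per_1k_queries


def suggest_placement(latency_budget_ms, network_round_trip_ms,
                      data_must_stay_local, cloud_cost_per_month,
                      edge_cost_per_month):
    """Return "edge" or "cloud" for one workload, using three coarse rules."""
    # Question 2: data that legally or practically must stay local stays local.
    if data_must_stay_local:
        return "edge"
    # Question 1: if the network round trip alone uses up the latency budget,
    # a cloud call cannot meet it.
    if network_round_trip_ms >= latency_budget_ms:
        return "edge"
    # Question 3: otherwise pick whichever option is cheaper at scale.
    return "edge" if edge_cost_per_month < cloud_cost_per_month else "cloud"


if __name__ == "__main__":
    # Hypothetical fleet: cloud cost grows with every device and every query.
    for devices in (1_000, 10_000, 100_000):
        cost = monthly_cloud_cost(devices, queries_per_device_per_day=500,
                                  cost_per_1k_queries=0.5)
        print(f"{devices} devices: {cost:.0f} per month")
```

A hybrid design falls out of applying the helper per workload instead of per product: real-time, privacy-sensitive decisions tend to resolve to the edge, while batch work with loose latency budgets, such as retraining and aggregate analytics, tends to resolve to the cloud.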
Looking ahead, how do you see Edge AI architecture, semiconductor platforms, and SLMs changing the next generation of intelligent devices?
I think the next wave of devices won’t talk about “AI” as a feature anymore. It’ll just be assumed, the way Wi-Fi is assumed today.
Security becomes inseparable from the AI conversation too. Regulations such as the EU’s Cyber Resilience Act are already pushing makers of connected hardware and software toward security by design. With Edge AI and SLM integration, AI platforms will need secure-by-design principles from the start.
I’d expect post-quantum readiness to become a baseline requirement rather than a differentiator within a few years.
The platforms get smaller, the models get sharper at doing one thing well, and the device gets smarter without needing to phone home for it.
Conclusion
Rohit Kulkarni’s view of Edge AI points to a broader change in the technology industry. The future of AI will not only be defined by larger models or more powerful cloud systems. It will also depend on smaller, faster, more secure systems that can make decisions locally.
For industries where milliseconds, privacy, reliability, and cost matter, Edge AI offers a practical path forward. As Small Language Models become more efficient and semiconductor platforms become more capable, intelligence is likely to move deeper into everyday devices.
The result may be a quieter but more important AI shift: devices that no longer need to send every decision to the cloud because the intelligence is already built into the system.
Published June 29, 2026
