Friday, October 24, 2025

From Text to Pixels: How AI Models Are Learning to See and Think

Artificial intelligence keeps surprising us. Just when we thought large language models (LLMs) were all about reading and writing text, new research is showing they can also learn directly from images — even from the tiny pixels that make up a picture.

A recent study called DeepSeek-OCR takes this idea further. On the surface it reads text from images, like a super-smart version of the scanners that turn printed pages into digital files. But its deeper trick is “contexts optical compression”: rendering long stretches of text as an image and encoding that image into far fewer vision tokens than the raw text would need. That raises an exciting question: could future AI models skip words entirely and just “think in pixels”?
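To get a feel for the arithmetic behind this, here is a toy back-of-the-envelope sketch in Python. The patch size, compression factor, and characters-per-token figure are made-up illustrative assumptions, not DeepSeek-OCR's actual numbers:

```python
# Toy arithmetic for "optical" context compression: compare how many
# tokens a page costs as raw text vs. as compressed image patches.
# All constants below are illustrative assumptions, not real figures.

def text_token_count(text: str, chars_per_token: float = 4.0) -> int:
    """Rough BPE-style estimate: about 4 characters per text token."""
    return max(1, round(len(text) / chars_per_token))

def vision_token_count(width_px: int, height_px: int,
                       patch_px: int = 16, compress_factor: int = 16) -> int:
    """A ViT-style encoder splits the page image into patches, then a
    compressor module shrinks the patch sequence by a fixed factor."""
    patches = (width_px // patch_px) * (height_px // patch_px)
    return max(1, patches // compress_factor)

page_text = "lorem ipsum " * 500            # ~6,000 characters of text
t = text_token_count(page_text)             # 1500 text tokens
v = vision_token_count(1024, 1024)          # 4096 patches -> 256 vision tokens
print(t, v, round(t / v, 1))                # prints: 1500 256 5.9
```

Under these toy numbers, the same page costs roughly six times fewer tokens as a compressed image than as text, which is the kind of saving that makes “thinking in pixels” interesting.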

This idea builds on a trend known as multimodal AI, where systems can handle more than one kind of input — for example, both pictures and text. OpenAI’s GPT-4o, released back in May 2024, already did this, handling text, images, and audio in a single model, and it was much better at understanding context because of it.

But there’s another reason researchers are looking for change: cost. Training and running huge AI models takes enormous computing power. A McKinsey report in June 2024 found that AI training costs have been growing by about 20 percent each year. To keep progress affordable, scientists are exploring compression techniques — ways to make models smaller and faster without losing smarts.

One interesting example is ChunkLLM, a lightweight system that speeds up long-text processing by breaking data into small, meaningful chunks. Instead of wasting power re-reading everything, it learns when and where to focus attention — a clever shortcut that saves time and memory.
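The core idea can be sketched in a few lines of Python. This is only a toy stand-in: ChunkLLM learns its chunk scoring with trained adapters, whereas here simple word overlap plays that role, and the chunk size and top-k values are arbitrary:

```python
# Toy sketch of chunk-level selection: split a long context into
# fixed-size chunks, score each chunk against the current query, and
# keep only the top-k for full attention. ChunkLLM trains this scoring;
# plain word overlap stands in for the learned scorer here.

def make_chunks(words, chunk_size=8):
    return [words[i:i + chunk_size] for i in range(0, len(words), chunk_size)]

def overlap_score(chunk, query_words):
    return len(set(chunk) & query_words)

def select_chunks(text, query, chunk_size=8, top_k=2):
    query_words = set(query.split())
    chunks = make_chunks(text.split(), chunk_size)
    ranked = sorted(chunks, key=lambda c: overlap_score(c, query_words),
                    reverse=True)
    return [" ".join(c) for c in ranked[:top_k]]

doc = ("the chip design flow starts with RTL then synthesis then place and route "
       "meanwhile the cafeteria served soup and the weather was mild all week "
       "timing closure requires fixing setup and hold violations in the layout")

print(select_chunks(doc, "timing setup hold violations"))
# → ['week timing closure requires fixing setup and hold', 'violations in the layout']
```

The chunks about soup and the weather are never given full attention, which is exactly the kind of skipped work that saves time and memory at scale.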

It’s a pattern we’ve seen before. In the early days of the semiconductor industry, engineers used scan compression to test chips faster and cheaper while keeping performance high. Now, AI researchers are doing something similar: compressing how models learn and think.

From compressed circuits to compressed thoughts, the goal stays the same — do more with less. And maybe, just maybe, the next big leap in AI won’t come from more data, but from smarter ways of seeing and thinking.



REFERENCES

Wei H, Sun Y, Li Y. DeepSeek-OCR: Contexts Optical Compression. arXiv:2510.18234 [cs.CV]. 2025 Oct 21. https://doi.org/10.48550/arXiv.2510.18234

Ouyang H, Lv J, Ren L, Wei C, Wang X, Feng F. ChunkLLM: A Lightweight Pluggable Framework for Accelerating LLMs Inference. arXiv:2510.02361 [cs.CL]. 2025 Sep 28. https://doi.org/10.48550/arXiv.2510.02361


Friday, October 10, 2025

The Man, the Dog, and the Chip: AI Takes a Byte Out of EDA

Almost 50 years ago, someone cracked a joke that aged remarkably well:

“The factory of the future will have only two employees - a man and a dog. The man will be there to feed the dog. The dog will be there to keep the man from touching the equipment.”

In 2025 this punchline has finally made its way into Electronic Design Automation (EDA). Only now, the “equipment” in question is an AI system with more neural layers than the human brain has excuses for late tape-outs.

Will Circuits Start Writing Themselves?

Recent papers like “Large Language Models for EDA: From Assistants to Agents” (He et al., 2025) and “AutoEDA: Enabling EDA Flow Automation through Microservice-Based LLM Agents” (Lu et al., 2025) hint that AI isn’t just helping with design - it’s taking the wheel. Or perhaps, more accurately, re-routing the traces.

AI-driven verification tools (Ravikumar, 2025) and self-aware silicon (Vargas et al., 2025) now promise chips that can debug themselves faster than an engineer can find the semicolon they forgot. Researchers are even generating pseudo circuits at the RTL stage, which sounds suspiciously like AI daydreaming about better hardware.

Meanwhile, CircuitFusion (Fang et al., 2025) teaches chips to learn multimodally - combining circuit diagrams, timing data, and layout specs into one grand, caffeinated neural symphony. Think of it as ChatGPT meets circuit board karaoke.
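Stripped of the karaoke, the fusion plumbing looks something like this sketch. The encoders here are placeholder random projections, not CircuitFusion's trained models, and the feature widths are invented for illustration:

```python
import numpy as np

# Toy multimodal fusion: each modality (RTL text, timing numbers,
# layout grid) gets its own encoder, embeddings are projected to a
# shared width, then mean-pooled into one circuit vector. The random
# projections below are placeholders for learned encoders.

rng = np.random.default_rng(42)
D = 16  # shared embedding width (arbitrary choice)

def project(features, in_dim):
    """Fixed random projection standing in for a trained encoder head."""
    W = rng.normal(size=(in_dim, D)) / np.sqrt(in_dim)
    return features @ W

netlist_feat = rng.normal(size=(1, 32))  # e.g. bag-of-tokens from RTL
timing_feat = rng.normal(size=(1, 4))    # e.g. slack and delay stats
layout_feat = rng.normal(size=(1, 64))   # e.g. flattened placement grid

fused = np.mean(
    [project(netlist_feat, 32), project(timing_feat, 4), project(layout_feat, 64)],
    axis=0,
)
print(fused.shape)  # (1, 16): one shared vector per circuit
```

The payoff of a shared vector is that downstream tasks (timing prediction, similarity search, agent tooling) can all consume one representation instead of three.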

As Peter Denning observed in Communications of the ACM, AI is both “light and darkness”—a Dickensian tale told in Verilog. Sure, we might get faster chips and fewer bugs, but we might also get less human engineering intuition, replaced by a kind of silicon omniscience that never sleeps and never spills coffee on the FPGA board.

Ray Kurzweil imagines a beautiful merger of human and machine minds. Sohn imagines a utopia. The rest of us? We’re just hoping the dog keeps us from accidentally retraining the wrong model.

Forget the flashy “AI singularity.” The real risk is the automation singularity—a slow, incremental outsourcing of human judgment to the same systems we built to help us. AI systems that prioritize speed, cost-cutting, and surveillance could erode not only our autonomy but also the joy of discovery—the little “Aha!” moments that made engineering fun in the first place.

AI in EDA is neither apocalypse nor utopia - it’s a grand debugging session for humanity’s relationship with technology. We’re learning to co-design not just chips, but the very process of innovation.

So, as the man and the dog look over the humming chip factory of the future, one thing is clear: the dog may still guard the console - but now, it’s also probably wearing an AI-powered collar that runs a lightweight EDA agent.


REFERENCES

https://cacm.acm.org/opinion/three-ai-futures/

Ravikumar S. AI-driven verification: Augmenting engineers in semiconductor EDA workflows. World Journal of Advanced Engineering Technology and Sciences. 2025 May 30;15(2):223-30.

Liu S, Fang W, Lu Y, Zhang Q, Xie Z. Towards Big Data in AI for EDA Research: Generation of New Pseudo Circuits at RTL Stage. In Proceedings of the 30th Asia and South Pacific Design Automation Conference 2025 Jan 20 (pp. 527-533).

Mandadi SP. AI-Driven Engineering Productivity in the Semiconductor Industry: A Technological Paradigm Shift. Journal of Computer Science and Technology Studies. 2025 Jul 13;7(7):543-9.

He Z, Pu Y, Wu H, Qiu Y, Qiu T, Yu B. Large Language Models for EDA: From Assistants to Agents. Foundations and Trends® in Electronic Design Automation. 2025 Apr 30;14(4):295-314.

He Z, Yu B. Large Language Models for EDA: Future or Mirage? In Proceedings of the 2024 International Symposium on Physical Design 2024 Mar 12 (pp. 65-66).

Xu Z, Li B, Wang L. Rethinking LLM-Based RTL Code Optimization Via Timing Logic Metamorphosis. arXiv preprint arXiv:2507.16808. 2025 Jul 22.

Mohamed KS. The Basics of EDA Tools for IC: “A Physics-Aware Approach”. In Next Generation EDA Flow: Motivations, Opportunities, Challenges and Future Directions 2025 Apr 12 (pp. 91-129). Cham: Springer Nature Switzerland.

Vargas F, Andjelkovic M, Krstic M, Kar A, Deshwal S, Chauhan YS, Amrouch H, Tille D, Huhn S. Self-Aware Silicon: Enhancing Lifecycle Management with Intelligent Testing and Data Insights. In 2025 IEEE European Test Symposium (ETS) 2025 May 26 (pp. 1-10). IEEE.

https://www.linkedin.com/feed/update/urn:li:activity:7356043406298546176/

https://www.linkedin.com/in/sebastian-huhn-84657768/

https://www.linkedin.com/posts/sebastian-huhn-84657768_ieeeets-siliconlifecyclemanagement-testandreliability-activity-7334929170373783552-fuCk


Fang W, Liu S, Wang J, Xie Z. CircuitFusion: Multimodal Circuit Representation Learning for Agile Chip Design. arXiv preprint arXiv:2505.02168. 2025 May 4. https://arxiv.org/pdf/2505.02168

https://github.com/hkust-zhiyao/CircuitFusion

Fang W, Wang J, Lu Y, Liu S, Wu Y, Ma Y, Xie Z. A Survey of Circuit Foundation Model: Foundation AI Models for VLSI Circuit Design and EDA. arXiv preprint arXiv:2504.03711. 2025 Mar 28.

Lu Y, Au HI, Zhang J, Pan J, Wang Y, Li A, Zhang J, Chen Y. AutoEDA: Enabling EDA Flow Automation through Microservice-Based LLM Agents. arXiv preprint arXiv:2508.01012. 2025 Aug 1.

Wei A, Tan H, Suresh T, Mendoza D, Teixeira TS, Wang K, Trippel C, Aiken A. VeriCoder: Enhancing LLM-Based RTL Code Generation through Functional Correctness Validation. arXiv preprint arXiv:2504.15659. 2025 Apr 22.

Next Generation EDA Flow: https://www.google.com/books/edition/Next_Generation_EDA_Flow/

RapidGPT: https://docs.primis.ai/ - industry’s first AI-based pair-designer tailored to ASIC and FPGA engineers

OpenAI x Broadcom — The OpenAI Podcast Ep. 8: https://youtu.be/qqAbVTFnfk8?si=DSl5apccjADsM7jc
