Category: CUSTOM-CHIP

  • The Semiconductor Shift Toward Processor-In-Memory And Processing-Near-Memory



    Reliance Of AI And Data Workloads On Computer Architecture

    AI and modern data workloads have transformed how we think about computing systems. Traditional processors were designed for sequential tasks and moderate data movement. Today’s AI models work with enormous datasets and large numbers of parameters that must move constantly between memory and compute units. This movement introduces delays and consumes significant energy. As a result, memory bandwidth and the distance to the data have become major performance bottlenecks.

    Graphics processors, tensor accelerators, and custom architectures try to address these issues by increasing parallelism. Yet, parallel computing alone cannot solve the challenge if data cannot reach the compute units fast enough. The cost of moving data inside a system is now often higher than the cost of the computation itself.

    This places the spotlight on the relationship between compute location, memory hierarchy, and data flow. As models grow in size and applications demand faster responses, the gap between processor speed and memory access continues to widen.

    The computing industry often refers to this as the memory wall. When AI tasks require moving gigabytes of data per operation, each additional millimeter of distance within a chip or package matters. To break this pattern, new approaches look at placing compute engines closer to where data is stored.
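
    To make that arithmetic concrete, the short sketch below estimates compute energy versus data-movement energy for a single matrix-vector multiply. The per-operation energy figures are rough, commonly cited orders of magnitude and should be read as illustrative assumptions rather than measurements of any particular chip.

    ```python
    # Back-of-envelope comparison of arithmetic energy vs. data-movement energy
    # for one matrix-vector multiply, y = A @ x, with A streamed from DRAM.

    # Assumed per-operation energies (rough, illustrative orders of magnitude):
    ENERGY_MAC_PJ = 4.0          # one 32-bit multiply-accumulate on the processor
    ENERGY_DRAM_BYTE_PJ = 160.0  # moving one byte in from off-chip DRAM

    def mv_energy(rows, cols, bytes_per_element=4):
        macs = rows * cols                             # one MAC per matrix element
        bytes_moved = rows * cols * bytes_per_element  # matrix is read exactly once
        return macs * ENERGY_MAC_PJ, bytes_moved * ENERGY_DRAM_BYTE_PJ

    compute_pj, movement_pj = mv_energy(4096, 4096)
    print(f"arithmetic:    {compute_pj / 1e6:8.1f} uJ")
    print(f"data movement: {movement_pj / 1e6:8.1f} uJ")
    print(f"movement costs ~{movement_pj / compute_pj:.0f}x the arithmetic")
    ```

    Even with generous assumptions for the processor, the off-chip traffic dominates, which is exactly the imbalance the memory wall describes.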

    This shift has sparked interest in Processor-In-Memory and Processing-Near-Memory solutions.

    Instead of pulling data along long paths, the system reorganizes itself so that computation occurs either within the memory arrays or very close to them. This architectural change aims to reduce latency, cut energy use, and support the growing scale of AI workloads.


    What Is Processor-In-Memory And Processing-Near-Memory

    Processor-In-Memory (PIM) places simple compute units directly inside memory arrays. The idea is to perform certain operations, such as multiplication and accumulation, inside the storage cells or peripheral logic. By doing this, data does not need to travel to a separate processor. This can lead to significant improvements in throughput and reductions in energy consumption for specific AI tasks, especially those involving matrix operations.
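
    As a minimal illustration of the in-array idea, the sketch below models a resistive crossbar performing an analog dot product: weights sit in the array as conductances, inputs arrive as row voltages, and each column current accumulates the products in place. This is an idealized, hypothetical model, not any specific vendor's design.

    ```python
    import numpy as np

    # Idealized model of in-array multiply-accumulate: weights are stored as
    # cell conductances in a crossbar, inputs are applied as row voltages, and
    # each column's summed current is one dot product, computed where the
    # weights live instead of in a separate processor.

    rng = np.random.default_rng(0)
    conductances = rng.uniform(0.0, 1.0, size=(64, 16))  # 64 rows x 16 columns
    voltages = rng.uniform(0.0, 1.0, size=64)            # one input per row

    # Current summation along each bit line performs the accumulation,
    # so all 64 products per column are reduced inside the array.
    column_currents = voltages @ conductances            # shape (16,)

    # Peripheral ADCs then digitize the analog sums with limited precision.
    adc_levels = 256
    digital_out = np.round(column_currents / column_currents.max() * (adc_levels - 1))
    print(digital_out)
    ```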

    Processing-Near-Memory (PNM) keeps memory arrays unchanged but integrates compute units very close to them, usually on the same stack or interposer. These compute units are not inside the memory but sit at a minimal distance from it. This enables faster data access than traditional architectures without requiring significant changes to memory cell structures. PNM often offers a more flexible design path because memory vendors do not need to modify core-array technology.

    Here is a simple comparison of the two approaches.

    Feature | Processor-In-Memory | Processing-Near-Memory
    Compute location | Inside memory arrays or peripheral logic | Adjacent to memory through the same stack or substrate
    Memory modification | Requires changes to memory cell or array design | Uses standard memory with added compute units nearby
    Data movement | Very low due to in-array operation | Low because compute is positioned close to data
    Flexibility | Limited to specific operations built into memory | Wider range of compute tasks possible
    Technology maturity | Still emerging and specialized | More compatible with existing memory roadmaps

    Both approaches challenge the long-standing separation between computing and storage. Instead of treating memory as a passive container for data, they treat it as an active part of the computation pipeline. This helps systems scale with the rising demands of AI without relying entirely on larger, more power-hungry processors.
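
    A rough way to see the difference is to count how many bytes must cross each boundary for a reduction-style operation. The sketch below compares a conventional processor, a near-memory unit on the logic die of a stack, and in-array compute; the traffic model is a simplification made only for illustration.

    ```python
    # Rough traffic model for a row-wise reduction over an N x N matrix of
    # 32-bit values under three organizations. "Off-package" is the link to
    # the host processor; "in-package" is the hop between DRAM dies and any
    # local compute. Caching, data formats, and command traffic are ignored.

    N = 8192
    ELEM = 4  # bytes per element
    matrix_bytes = N * N * ELEM
    result_bytes = N * ELEM

    cases = {
        # Conventional: the whole matrix streams out to the processor.
        "conventional": {"off_package": matrix_bytes, "in_package": 0},
        # Near-memory: a logic die in the stack reduces each row, so only
        # per-row results leave the package.
        "near-memory":  {"off_package": result_bytes, "in_package": matrix_bytes},
        # In-memory: the reduction happens inside the arrays themselves.
        "in-memory":    {"off_package": result_bytes, "in_package": result_bytes},
    }

    for name, traffic in cases.items():
        off = traffic["off_package"] / 1e6
        local = traffic["in_package"] / 1e6
        print(f"{name:>12}: {off:9.2f} MB off-package, {local:9.2f} MB in-package")
    ```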


    Research Efforts For Processor-In-Memory And Processing-Near-Memory

    Research activity in this area has grown quickly as AI and data workloads demand new architectural ideas. Both Processor-In-Memory and Processing-Near-Memory have attracted intense attention from academic and industrial groups. PIM work often focuses on reducing data movement by performing arithmetic inside or at the edge of memory arrays, while PNM research explores programmable compute units placed near memory stacks to improve bandwidth and latency.

    The selected examples below show how each direction is pushing the boundaries of energy efficiency, scalability, and workload suitability.

    Image Credit: SparseP
    Category | Example Work | Key Focus | What It Demonstrates | Link
    Processor-In-Memory | SparseP: Efficient Sparse Matrix Vector Multiplication on Real PIM Systems (2022) | Implements SpMV on real PIM hardware | Shows strong gains for memory-bound workloads by computing inside memory arrays | Paper
    Processor-In-Memory | Neural-PIM: Efficient PIM with Neural Approximation of Peripherals (2022) | Uses RRAM crossbars and approximation circuits | Shows how analog compute in memory can accelerate neural networks while cutting conversion overhead | Paper
    Processing-Near-Memory | A Modern Primer on Processing In Memory (conceptual framework) | Defines PIM vs PNM in stacked memory systems | Clarifies architectural boundaries and highlights PNM integration paths in 3D memory | Paper
    Processing-Near-Memory | Analysis of Real Processing In Memory Hardware (2021) | Evaluates DRAM with adjacent compute cores | Provides methods widely used in PNM evaluation for bandwidth and workload behavior | Paper

    The comparison above captures both experimental implementations and architectural frameworks.

    Together, they show how PIM pushes compute directly into memory structures, while PNM enables more flexible acceleration by placing logic close to high-bandwidth memory.
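
    To give a feel for the partitioning idea behind PIM SpMV efforts such as SparseP, the sketch below splits the rows of a sparse matrix across a set of memory banks, each of which multiplies only its local slice. The one-dimensional row split and bank count are assumptions made for illustration, not the paper's exact scheme.

    ```python
    import numpy as np
    from scipy.sparse import random as sparse_random

    # Toy version of the row-partitioning idea used in PIM SpMV work: rows of
    # a CSR matrix are split across memory banks, each bank multiplies only
    # its local slice, and the host merely gathers the small per-bank results.

    def spmv_banked(A_csr, x, num_banks=4):
        n_rows = A_csr.shape[0]
        bounds = np.linspace(0, n_rows, num_banks + 1, dtype=int)
        y = np.zeros(n_rows)
        for b in range(num_banks):
            lo, hi = bounds[b], bounds[b + 1]
            # In a real PIM system this slice of A (and a copy of x) would
            # already reside next to the bank's compute units.
            y[lo:hi] = A_csr[lo:hi, :] @ x
        return y

    A = sparse_random(1000, 1000, density=0.01, format="csr", random_state=0)
    x = np.ones(1000)
    assert np.allclose(spmv_banked(A, x), A @ x)
    print("banked SpMV matches the reference result")
    ```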


    Implications And When Each Approach Can Benefit

    Processor-In-Memory is often most useful when the workload is highly repetitive and dominated by simple arithmetic on large matrices. Examples include neural network inference and certain scientific operations. Since operations occur in memory, energy savings can be substantial. However, PIM is less suitable for general-purpose tasks that require flexible instruction sets or complex branching.

    Processing-Near-Memory is a more adaptable option for systems that need performance improvements but cannot redesign memory cells. It supports tasks such as training large AI models, running recommendation engines, and accelerating analytics pipelines. Because PNM units are programmable, they can handle a broader range of workloads while still providing shorter data paths than traditional processors.

    Image Credit: Computing Landscape Review

    In real systems, both approaches may coexist. PIM might handle dense linear algebra while PNM handles control logic, preprocessing, and other mixed operations. The choice depends on workload structure, system integration limits, and power budgets. As AI becomes embedded in more devices, from data centers to edge sensors, these hybrids create new ways to deliver faster responses at lower energy.
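
    One way to picture that division of labor is a simple dispatcher that routes dense, regular kernels to in-memory units and control-heavy or preprocessing steps to near-memory cores. The kernel names and routing rule below are hypothetical, included only to make the idea concrete, and do not reflect any real runtime API.

    ```python
    # Hypothetical dispatcher for a hybrid system: dense, regular kernels are
    # routed to in-memory units, preprocessing and control-heavy steps to
    # near-memory cores, and everything else falls back to the host processor.

    PIM_KERNELS = {"gemm", "gemv", "conv2d"}                        # dense linear algebra
    PNM_KERNELS = {"decode", "filter", "sort", "embedding_lookup"}  # mixed / irregular work

    def route(kernel: str) -> str:
        if kernel in PIM_KERNELS:
            return "PIM"   # executed inside the memory arrays
        if kernel in PNM_KERNELS:
            return "PNM"   # executed on logic beside the memory stack
        return "HOST"      # general-purpose fallback

    pipeline = ["decode", "embedding_lookup", "gemv", "filter", "gemm", "softmax"]
    for step in pipeline:
        print(f"{step:>17} -> {route(step)}")
    ```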


    The Direction Forward

    The movement toward Processor-In-Memory and Processing-Near-Memory signals a larger architectural shift across the semiconductor world. Instead of treating compute and memory as separate units connected by wide interfaces, the industry is exploring tightly coupled designs that reflect the actual behavior of modern AI workloads. This shift helps push past the limits of conventional architectures and opens new opportunities for performance scaling.

    As more applications rely on real-time analytics, foundation models, and data-intensive tasks, the pressure on memory systems will continue to increase. Designs that bring compute closer to data are becoming essential to maintaining progress. Whether through in-memory operations or near-memory acceleration, these ideas point toward a future where data movement becomes a manageable cost rather than a fundamental barrier.

    The direction is clear. To support the next generation of AI and computing systems, the computing industry is rethinking distance, energy, and data flow at the chip level. Processor-In-Memory and Processing-Near-Memory represent two critical steps in that journey, reshaping how systems are built and how performance is achieved.


  • The Race For Semiconductor Custom Chip Is Only Getting Started



    What Are Semiconductor Custom Chips

    Semiconductor custom chips, also known as application-specific integrated circuits (ASICs), represent a specialized category of electronic components designed to perform specific functions or tasks within electronic devices.

    Unlike general-purpose chips that can run a wide range of applications, custom chips are engineered for a particular application or product, offering optimized performance, power efficiency, and often reduced size compared to their off-the-shelf counterparts.

    These chips are tailored to meet the unique requirements of a project, including specific computational tasks, signal processing, or control functions, making them indispensable in industries such as telecommunications, automotive, consumer electronics, and increasingly in emerging technologies like IoT (Internet of Things) and AI (Artificial Intelligence).

    The design and fabrication of custom chips involve a collaborative process between the chip designers and manufacturers, ensuring that the final product precisely matches the functional and operational specifications of the intended application.


    Integration Of AI And Semiconductor Custom Chip

    Lately, the AI industry has been recognizing the potential of these custom chips, and the main reasons are outlined below:

    ASPECT | CONNECTION
    Optimized Performance | Designed specifically for AI workloads, offering faster data processing and efficient execution.
    Energy Efficiency | Engineered for minimal energy consumption, crucial for mobile and edge computing AI applications.
    Tailored Hardware Acceleration | Incorporate accelerators like TPUs for improved speed in AI computations, enabling real-time processing.
    Flexibility And Scalability | Allows integration of various AI functionalities, adaptable to evolving computational demands.
    Cost-Effectiveness | Optimizes hardware for specific tasks, reducing unnecessary components and lowering production costs.
    Enhanced Security | Incorporates security features to protect AI data and algorithms, critical for sensitive applications.

    Picture By Chetan Arvind Patil

    Why The Race For Custom Chip Is Only Getting Started

    Several compelling reasons drive the race among AI software companies to build custom chips, and indications suggest that this competition is only gaining momentum due to the rapidly evolving landscape of artificial intelligence and machine learning.

    Here are the primary factors fueling this race:

    REASON | EXPLANATION
    Demand For Higher Computational Power | AI models’ growing complexity necessitates chips capable of efficient, high-speed data processing to enable advanced applications.
    Energy Efficiency | Custom chips are optimized for lower power consumption, essential for mobile and edge computing AI applications, to extend battery life and reduce operational costs.
    Competitive Advantage | Tailoring hardware to specific needs offers performance, capabilities, and cost benefits, providing a competitive edge in various sectors.
    Reduced Dependence On External Suppliers | Developing in-house chips reduces reliance on third-party manufacturers, offering more control over supply chains and potentially lower costs.
    Innovations In AI Require Tailored Solutions | Emerging AI algorithms and models need specific hardware features, making custom chips vital for supporting proprietary technologies.
    Latency Reduction | Custom chips enable on-site data processing in edge devices, facilitating real-time decision-making crucial for applications like autonomous driving.
    Increased AI Accessibility | By making AI solutions more affordable and energy-efficient, custom chips help democratize AI technology, fostering innovation across numerous sectors.

    The race for custom chips, particularly in artificial intelligence (AI), is intensifying at an unprecedented pace, driven by the insatiable demand for more powerful, efficient, and specialized computing solutions. This surge is not merely a trend but a fundamental shift in how technology ecosystems evolve to meet the intricate demands of modern applications and services.


    How The Semiconductor Industry Will Benefit From The AI SoC Chip Race

    As we stand on the cusp of technological innovations that demand tailored computational capabilities, the race for custom chip development is only gaining momentum. It promises to reshape industries, foster new levels of innovation, and redefine the competitive landscape, ensuring the journey toward more advanced, application-specific integrated circuits (ASICs) is just beginning.

    Below are the major benefits:

    BENEFIT | EXPLANATION
    Increased Demand For Advanced Semiconductors | Rising needs for custom AI chips boost production volumes and drive technological advancements in semiconductor manufacturing.
    Innovation And Technological Advancements | The specific requirements of AI applications incentivize the development of new chip architectures, manufacturing techniques, and materials, propelling industry-wide technological progress.
    Diversification Of Revenue Streams | Custom AI chips open up new markets, allowing semiconductor companies to cater to a diverse customer base and reduce reliance on a few large clients, enhancing financial stability.
    Partnerships And Collaborations | The complexity of AI chip production encourages collaborations between semiconductor firms and AI companies, leading to shared R&D and co-development of technologies, fostering a more integrated supply chain.
    Global Market Expansion | The worldwide spread of AI technologies necessitates investments in global supply chains and manufacturing capabilities, allowing semiconductor companies to tap into new regional markets.
    Enhanced Manufacturing Capabilities | Producing custom AI chips requires semiconductor manufacturers to adopt advanced fabrication technologies and improve production efficiencies, benefiting the broader manufacturing capabilities of the industry.
    Workforce Development | The demand for skilled personnel in R&D, manufacturing, and testing of custom AI chips encourages the industry to invest in developing a talented workforce, promoting a culture of innovation.
    Regulatory And Policy Engagement | The growing recognition of semiconductors’ importance in national security and economies opens opportunities for the industry to engage with governments on supportive policies and regulations, enhancing industry resilience.

    Furthermore, the competitive landscape of the tech industry is another catalyst propelling the race for custom chip development. Companies seek to differentiate their products and services by leveraging the unique capabilities that custom chips offer, such as reduced latency, enhanced data privacy, and the ability to perform sophisticated AI tasks at the edge of networks.

    This differentiation is crucial in industries where performance and efficiency can directly impact user experience and operational costs, such as cloud computing, consumer electronics, and automotive technologies, thus reigniting the race to build better custom chips.