Category: CUSTOM-CHIP

  • The Semiconductor Shift Toward Processor-In-Memory And Processing-Near-Memory



    Reliance Of AI And Data Workloads On Computer Architecture

    AI and modern data workloads have transformed how we think about computing systems. Traditional processors were designed for sequential tasks and moderate data movement. Today’s AI models work with enormous datasets and large numbers of parameters that must move constantly between memory and compute units. This movement introduces delays and consumes significant energy. As a result, memory bandwidth and the distance to the data have become major performance bottlenecks.

    Graphics processors, tensor accelerators, and custom architectures try to address these issues by increasing parallelism. Yet, parallel computing alone cannot solve the challenge if data cannot reach the compute units fast enough. The cost of moving data inside a system is now often higher than the cost of the computation itself.

    This places the spotlight on the relationship between compute location, memory hierarchy, and data flow. As models grow in size and applications demand faster responses, the gap between processor speed and memory access continues to widen.

    The computing industry often refers to this as the memory wall. When AI tasks require moving gigabytes of data per operation, each additional millimeter of distance within a chip or package matters. To break this pattern, new approaches look at placing compute engines closer to where data is stored.
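
    To make that arithmetic concrete, the short sketch below estimates compute energy versus data-movement energy for a single matrix-vector multiply. The per-operation energy figures are rough, commonly cited orders of magnitude and should be read as illustrative assumptions rather than measurements of any particular chip.

    ```python
    # Back-of-envelope comparison of arithmetic energy vs. data-movement energy
    # for one matrix-vector multiply, y = A @ x, with A streamed from DRAM.

    # Assumed per-operation energies (rough, illustrative orders of magnitude):
    ENERGY_MAC_PJ = 4.0          # one 32-bit multiply-accumulate on the processor
    ENERGY_DRAM_BYTE_PJ = 160.0  # moving one byte in from off-chip DRAM

    def mv_energy(rows, cols, bytes_per_element=4):
        macs = rows * cols                             # one MAC per matrix element
        bytes_moved = rows * cols * bytes_per_element  # matrix is read exactly once
        return macs * ENERGY_MAC_PJ, bytes_moved * ENERGY_DRAM_BYTE_PJ

    compute_pj, movement_pj = mv_energy(4096, 4096)
    print(f"arithmetic:    {compute_pj / 1e6:8.1f} uJ")
    print(f"data movement: {movement_pj / 1e6:8.1f} uJ")
    print(f"movement costs ~{movement_pj / compute_pj:.0f}x the arithmetic")
    ```

    Even with generous assumptions for the processor, the off-chip traffic dominates, which is exactly the imbalance the memory wall describes.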

    This shift has sparked interest in Processor-In-Memory and Processing-Near-Memory solutions.

    Instead of pulling data along long paths, the system reorganizes itself so that computation occurs either within the memory arrays or very close to them. This architectural change aims to reduce latency, cut energy use, and support the growing scale of AI workloads.


    What Is Processor-In-Memory And Processing-Near-Memory

    Processor-In-Memory (PIM) places simple compute units directly inside memory arrays. The idea is to perform certain operations, such as multiplication and accumulation, inside the storage cells or peripheral logic. By doing this, data does not need to travel to a separate processor. This can lead to significant improvements in throughput and reductions in energy consumption for specific AI tasks, especially those involving matrix operations.
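
    As a minimal illustration of the in-array idea, the sketch below models a resistive crossbar performing an analog dot product: weights sit in the array as conductances, inputs arrive as row voltages, and each column current accumulates the products in place. This is an idealized, hypothetical model, not any specific vendor's design.

    ```python
    import numpy as np

    # Idealized model of in-array multiply-accumulate: weights are stored as
    # cell conductances in a crossbar, inputs are applied as row voltages, and
    # each column's summed current is one dot product, computed where the
    # weights live instead of in a separate processor.

    rng = np.random.default_rng(0)
    conductances = rng.uniform(0.0, 1.0, size=(64, 16))  # 64 rows x 16 columns
    voltages = rng.uniform(0.0, 1.0, size=64)            # one input per row

    # Current summation along each bit line performs the accumulation,
    # so all 64 products per column are reduced inside the array.
    column_currents = voltages @ conductances            # shape (16,)

    # Peripheral ADCs then digitize the analog sums with limited precision.
    adc_levels = 256
    digital_out = np.round(column_currents / column_currents.max() * (adc_levels - 1))
    print(digital_out)
    ```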

    Processing-Near-Memory (PNM) keeps memory arrays unchanged but integrates compute units very close to them, usually on the same stack or interposer. These compute units are not inside the memory but sit at a minimal distance from it. This enables faster data access than traditional architectures without requiring significant changes to memory cell structures. PNM often offers a more flexible design path because memory vendors do not need to modify core-array technology.

    Here is a simple comparison of the two approaches.

    Feature | Processor-In-Memory | Processing-Near-Memory
    Compute location | Inside memory arrays or peripheral logic | Adjacent to memory through the same stack or substrate
    Memory modification | Requires changes to memory cell or array design | Uses standard memory with added compute units nearby
    Data movement | Very low due to in-array operation | Low because compute is positioned close to data
    Flexibility | Limited to specific operations built into memory | Wider range of compute tasks possible
    Technology maturity | Still emerging and specialized | More compatible with existing memory roadmaps

    Both approaches challenge the long-standing separation between computing and storage. Instead of treating memory as a passive container for data, they treat it as an active part of the computation pipeline. This helps systems scale with the rising demands of AI without relying entirely on larger, more power-hungry processors.
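
    A rough way to see the difference is to count how many bytes must cross each boundary for a reduction-style operation. The sketch below compares a conventional processor, a near-memory unit on the logic die of a stack, and in-array compute; the traffic model is a simplification made only for illustration.

    ```python
    # Rough traffic model for a row-wise reduction over an N x N matrix of
    # 32-bit values under three organizations. "Off-package" is the link to
    # the host processor; "in-package" is the hop between DRAM dies and any
    # local compute. Caching, data formats, and command traffic are ignored.

    N = 8192
    ELEM = 4  # bytes per element
    matrix_bytes = N * N * ELEM
    result_bytes = N * ELEM

    cases = {
        # Conventional: the whole matrix streams out to the processor.
        "conventional": {"off_package": matrix_bytes, "in_package": 0},
        # Near-memory: a logic die in the stack reduces each row, so only
        # per-row results leave the package.
        "near-memory":  {"off_package": result_bytes, "in_package": matrix_bytes},
        # In-memory: the reduction happens inside the arrays themselves.
        "in-memory":    {"off_package": result_bytes, "in_package": result_bytes},
    }

    for name, traffic in cases.items():
        off = traffic["off_package"] / 1e6
        local = traffic["in_package"] / 1e6
        print(f"{name:>12}: {off:9.2f} MB off-package, {local:9.2f} MB in-package")
    ```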


    Research Efforts For Processor-In-Memory And Processing-Near-Memory

    Research activity in this area has grown quickly as AI and data workloads demand new architectural ideas. Both Processor-In-Memory and Processing-Near-Memory have attracted intense attention from academic and industrial groups. PIM work often focuses on reducing data movement by performing arithmetic inside or at the edge of memory arrays, while PNM research explores programmable compute units placed near memory stacks to improve bandwidth and latency.

    The selected examples below show how each direction is pushing the boundaries of energy efficiency, scalability, and workload suitability.

    Image Credit: SparseP
    Category | Example Work | Key Focus | What It Demonstrates | Link
    Processor-In-Memory | SparseP: Efficient Sparse Matrix Vector Multiplication on Real PIM Systems (2022) | Implements SpMV on real PIM hardware | Shows strong gains for memory-bound workloads by computing inside memory arrays | Paper
    Processor-In-Memory | Neural-PIM: Efficient PIM with Neural Approximation of Peripherals (2022) | Uses RRAM crossbars and approximation circuits | Shows how analog compute in memory can accelerate neural networks while cutting conversion overhead | Paper
    Processing-Near-Memory | A Modern Primer on Processing In Memory (conceptual framework) | Defines PIM vs PNM in stacked memory systems | Clarifies architectural boundaries and highlights PNM integration paths in 3D memory | Paper
    Processing-Near-Memory | Analysis of Real Processing In Memory Hardware (2021) | Evaluates DRAM with adjacent compute cores | Provides methods widely used in PNM evaluation for bandwidth and workload behavior | Paper

    The comparison above captures both experimental implementations and architectural frameworks.

    Together, they show how PIM pushes compute directly into memory structures, while PNM enables more flexible acceleration by placing logic close to high-bandwidth memory.
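
    To give a feel for the partitioning idea behind PIM SpMV efforts such as SparseP, the sketch below splits the rows of a sparse matrix across a set of memory banks, each of which multiplies only its local slice. The one-dimensional row split and bank count are assumptions made for illustration, not the paper's exact scheme.

    ```python
    import numpy as np
    from scipy.sparse import random as sparse_random

    # Toy version of the row-partitioning idea used in PIM SpMV work: rows of
    # a CSR matrix are split across memory banks, each bank multiplies only
    # its local slice, and the host merely gathers the small per-bank results.

    def spmv_banked(A_csr, x, num_banks=4):
        n_rows = A_csr.shape[0]
        bounds = np.linspace(0, n_rows, num_banks + 1, dtype=int)
        y = np.zeros(n_rows)
        for b in range(num_banks):
            lo, hi = bounds[b], bounds[b + 1]
            # In a real PIM system this slice of A (and a copy of x) would
            # already reside next to the bank's compute units.
            y[lo:hi] = A_csr[lo:hi, :] @ x
        return y

    A = sparse_random(1000, 1000, density=0.01, format="csr", random_state=0)
    x = np.ones(1000)
    assert np.allclose(spmv_banked(A, x), A @ x)
    print("banked SpMV matches the reference result")
    ```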


    Implications And When Each Approach Can Benefit

    Processor-In-Memory is often most useful when the workload is highly repetitive and dominated by simple arithmetic on large matrices. Examples include neural network inference and certain scientific operations. Since operations occur in memory, energy savings can be substantial. However, PIM is less suitable for general-purpose tasks that require flexible instruction sets or complex branching.

    Processing-Near-Memory is a more adaptable option for systems that need performance improvements but cannot redesign memory cells. It supports tasks such as training large AI models, running recommendation engines, and accelerating analytics pipelines. Because PNM units are programmable, they can handle a broader range of workloads while still providing shorter data paths than traditional processors.

    Image Credit: Computing Landscape Review

    In real systems, both approaches may coexist. PIM might handle dense linear algebra while PNM handles control logic, preprocessing, and other mixed operations. The choice depends on workload structure, system integration limits, and power budgets. As AI becomes embedded in more devices, from data centers to edge sensors, these hybrids create new ways to deliver faster responses at lower energy.
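
    One way to picture that division of labor is a simple dispatcher that routes dense, regular kernels to in-memory units and control-heavy or preprocessing steps to near-memory cores. The kernel names and routing rule below are hypothetical, included only to make the idea concrete, and do not reflect any real runtime API.

    ```python
    # Hypothetical dispatcher for a hybrid system: dense, regular kernels are
    # routed to in-memory units, preprocessing and control-heavy steps to
    # near-memory cores, and everything else falls back to the host processor.

    PIM_KERNELS = {"gemm", "gemv", "conv2d"}                        # dense linear algebra
    PNM_KERNELS = {"decode", "filter", "sort", "embedding_lookup"}  # mixed / irregular work

    def route(kernel: str) -> str:
        if kernel in PIM_KERNELS:
            return "PIM"   # executed inside the memory arrays
        if kernel in PNM_KERNELS:
            return "PNM"   # executed on logic beside the memory stack
        return "HOST"      # general-purpose fallback

    pipeline = ["decode", "embedding_lookup", "gemv", "filter", "gemm", "softmax"]
    for step in pipeline:
        print(f"{step:>17} -> {route(step)}")
    ```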


    The Direction Forward

    The movement toward Processor-In-Memory and Processing-Near-Memory signals a larger architectural shift across the semiconductor world. Instead of treating compute and memory as separate units connected by wide interfaces, the industry is exploring tightly coupled designs that reflect the actual behavior of modern AI workloads. This shift helps push past the limits of conventional architectures and opens new opportunities for performance scaling.

    As more applications rely on real-time analytics, foundation models, and data-intensive tasks, the pressure on memory systems will continue to increase. Designs that bring compute closer to data are becoming essential to maintaining progress. Whether through in-memory operations or near-memory acceleration, these ideas point toward a future where data movement becomes a manageable cost rather than a fundamental barrier.

    The direction is clear. To support the next generation of AI and computing systems, the computing industry is rethinking distance, energy, and data flow at the chip level. Processor-In-Memory and Processing-Near-Memory represent two critical steps in that journey, reshaping how systems are built and how performance is achieved.


  • The Race For Semiconductor Custom Chip Is Only Getting Started



    What Are Semiconductor Custom Chips

    Semiconductor custom chips, also known as application-specific integrated circuits (ASICs), represent a specialized category of electronic components designed to perform specific functions or tasks within electronic devices.

    Unlike general-purpose chips that can run a wide range of applications, custom chips are engineered for a particular application or product, offering optimized performance, power efficiency, and often reduced size compared to their off-the-shelf counterparts.

    These chips are tailored to meet the unique requirements of a project, including specific computational tasks, signal processing, or control functions, making them indispensable in industries such as telecommunications, automotive, consumer electronics, and increasingly in emerging technologies like IoT (Internet of Things) and AI (Artificial Intelligence).

    The design and fabrication of custom chips involve a collaborative process between the chip designers and manufacturers, ensuring that the final product precisely matches the functional and operational specifications of the intended application.


    Integration Of AI And Semiconductor Custom Chip

    Lately, the AI industry has been recognizing the potential of these custom chips, and the main reasons are outlined below:

    ASPECT | CONNECTION
    Optimized Performance | Designed specifically for AI workloads, offering faster data processing and efficient execution.
    Energy Efficiency | Engineered for minimal energy consumption, crucial for mobile and edge computing AI applications.
    Tailored Hardware Acceleration | Incorporate accelerators like TPUs for improved speed in AI computations, enabling real-time processing.
    Flexibility And Scalability | Allows integration of various AI functionalities, adaptable to evolving computational demands.
    Cost-Effectiveness | Optimizes hardware for specific tasks, reducing unnecessary components and lowering production costs.
    Enhanced Security | Incorporates security features to protect AI data and algorithms, critical for sensitive applications.

    Picture By Chetan Arvind Patil

    Why The Race For Custom Chip Is Only Getting Started

    Several compelling reasons drive the race among AI software companies to build custom chips, and indications suggest that this competition is only gaining momentum due to the rapidly evolving landscape of artificial intelligence and machine learning.

    Here are the primary factors fueling this race:

    REASON | EXPLANATION
    Demand For Higher Computational Power | AI models’ growing complexity necessitates chips capable of efficient, high-speed data processing to enable advanced applications.
    Energy Efficiency | Custom chips are optimized for lower power consumption, essential for mobile and edge computing AI applications, to extend battery life and reduce operational costs.
    Competitive Advantage | Tailoring hardware to specific needs offers performance, capabilities, and cost benefits, providing a competitive edge in various sectors.
    Reduced Dependence On External Suppliers | Developing in-house chips reduces reliance on third-party manufacturers, offering more control over supply chains and potentially lower costs.
    Innovations In AI Require Tailored Solutions | Emerging AI algorithms and models need specific hardware features, making custom chips vital for supporting proprietary technologies.
    Latency Reduction | Custom chips enable on-site data processing in edge devices, facilitating real-time decision-making crucial for applications like autonomous driving.
    Increased AI Accessibility | By making AI solutions more affordable and energy-efficient, custom chips help democratize AI technology, fostering innovation across numerous sectors.

    The race for custom chips, particularly in artificial intelligence (AI), is intensifying at an unprecedented pace, driven by the insatiable demand for more powerful, efficient, and specialized computing solutions. This surge is not merely a trend but a fundamental shift in how technology ecosystems evolve to meet the intricate demands of modern applications and services.


    How The Semiconductor Industry Will Benefit From The AI SoC Chip Race

    As we stand on the cusp of technological innovations that demand tailored computational capabilities, the race for custom chip development is only gaining momentum. It promises to reshape industries, foster new levels of innovation, and redefine the competitive landscape, ensuring the journey toward more advanced, application-specific integrated circuits (ASICs) is just beginning.

    Below are the major benefits:

    BENEFIT | EXPLANATION
    Increased Demand For Advanced Semiconductors | Rising needs for custom AI chips boost production volumes and drive technological advancements in semiconductor manufacturing.
    Innovation And Technological Advancements | The specific requirements of AI applications incentivize the development of new chip architectures, manufacturing techniques, and materials, propelling industry-wide technological progress.
    Diversification Of Revenue Streams | Custom AI chips open up new markets, allowing semiconductor companies to cater to a diverse customer base and reduce reliance on a few large clients, enhancing financial stability.
    Partnerships And Collaborations | The complexity of AI chip production encourages collaborations between semiconductor firms and AI companies, leading to shared R&D and co-development of technologies, fostering a more integrated supply chain.
    Global Market Expansion | The worldwide spread of AI technologies necessitates investments in global supply chains and manufacturing capabilities, allowing semiconductor companies to tap into new regional markets.
    Enhanced Manufacturing Capabilities | Producing custom AI chips requires semiconductor manufacturers to adopt advanced fabrication technologies and improve production efficiencies, benefiting the broader manufacturing capabilities of the industry.
    Workforce Development | The demand for skilled personnel in R&D, manufacturing, and testing of custom AI chips encourages the industry to invest in developing a talented workforce, promoting a culture of innovation.
    Regulatory And Policy Engagement | The growing recognition of semiconductors’ importance in national security and economies opens opportunities for the industry to engage with governments on supportive policies and regulations, enhancing industry resilience.

    Furthermore, the competitive landscape of the tech industry is another catalyst propelling the race for custom chip development. Companies seek to differentiate their products and services by leveraging the unique capabilities that custom chips offer, such as reduced latency, enhanced data privacy, and the ability to perform sophisticated AI tasks at the edge of networks.

    This differentiation is crucial in industries where performance and efficiency can directly impact user experience and operational costs, such as cloud computing, consumer electronics, and automotive technologies, thus reigniting the race to build better custom chips.