Domain-Specific Architecture
Advanced Computer Architecture
Presented by:
Vamsi Krishna Raju (030709040), Venkata Sai Karthikeya (030690801), Yamini (030239558), Vinay Kumar Reddy (030738290)
Domain-Specific Architecture:
- A Domain-Specific Architecture (DSA) is an assemblage of software components
- Specialized for a particular domain
- Generalized for effective use across that domain
- Composed in a standardized structure (topology) effective for building successful applications.
Introduction
- Domain-specific architecture refers to a type of computer architecture that is specifically designed for a particular application domain or set of related applications.
- Unlike general-purpose computing systems, which are designed to handle a wide range of tasks, domain-specific architectures are optimized for specific tasks, allowing for increased performance and efficiency and reduced power consumption.
- Designing domain-specific architectures requires a deep understanding of the domain and the specific application requirements, as well as a thorough knowledge of hardware design and optimization techniques.
Why Domain-Specific Architecture?
- Gordon Moore predicted that the number of transistors per chip would double every year or two. The ending of Moore's Law leaves domain-specific architectures as the future of computing.
Example: Google's Tensor Processing Unit; the eight-billion-transistor DRAM chip introduced in 2014 illustrates how much transistor scaling has slowed.
- The slowdown of Moore's Law has left energy-efficient general-purpose architectures with little headroom: implementation technology and parallelism have largely reached their potential for performance improvement.
- Domain-specific architectures with domain-specific hardware acceleration were introduced to address this issue and deliver performance that cannot be achieved by improving general-purpose computing.
Difference Between General-Purpose Architecture and Domain-Specific Architecture
The main difference between general-purpose architecture and domain-specific architecture is their level of specialization. General-purpose architecture is designed to handle a wide range of applications and tasks, while domain-specific architecture is optimized for a specific application domain or set of related applications. The two differ along the following dimensions:
- Functionality
- Performance
- Power efficiency
- Cost effectiveness
- Flexibility
Examples of DSA
- Tensor Processing Unit (TPU), Google
- Catapult, Microsoft
- Crest, Intel
- Pixel Visual Core, Google
Guidelines for DSA
- Use dedicated memories (sized to minimize data transfers)
- Use simpler, lower-performance implementations and spend the area saved on more ALUs and bigger memories
- Use parallelism that matches the target domain
- Use the simplest/smallest data types the domain needs (illustrated in the sketch after this list)
- Use domain-specific languages
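As a rough illustration of the data-type guideline above, the Python sketch below quantizes float32 matrices to int8 and accumulates products in int32, the style of narrow-datatype arithmetic inference accelerators rely on. The tensor shapes and the simple per-tensor scaling scheme are assumptions made for the example, not details taken from the slides.

```python
import numpy as np

def quantize_int8(x):
    """Map a float32 tensor to int8 with a single per-tensor scale."""
    scale = np.max(np.abs(x)) / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

# Hypothetical activation and weight matrices standing in for one DNN layer.
acts = np.random.randn(64, 256).astype(np.float32)
wts = np.random.randn(256, 128).astype(np.float32)

qa, sa = quantize_int8(acts)
qw, sw = quantize_int8(wts)

# Integer multiply-accumulate, as a narrow-datatype accelerator would do:
# int8 operands, int32 accumulation, one rescale back to float at the end.
acc = qa.astype(np.int32) @ qw.astype(np.int32)
approx = acc.astype(np.float32) * (sa * sw)

# Compare against the float32 reference to see the quantization error.
print(np.max(np.abs(approx - acts @ wts)))
```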
Research Focus
- DSA using deep neural networks
- Mobile and embedded applications using CNNs
- ECG-based authentication
- DSA in a scalable distributed sequence alignment system
- 3D integration for medical image processing
1. DSA Using Deep Neural Networks
- Moore’s law: the number of transistors in a dense integrated circuit (IC) doubles about every two years.
- Dennard scaling: As the dimensions of a device go down, so does power consumption.
- The DRAM chip introduced in 2014 contained eight billion transistors, and a 16-billion-transistor DRAM chip was not expected to reach mass production until 2019, a sign of how much scaling has slowed.
- A trailblazing example is Google's Tensor Processing Unit (TPU), first deployed in 2015, which today provides services to more than one billion people. It runs deep neural networks (DNNs) 15 to 30 times faster, with 30 to 80 times better energy efficiency, than contemporary CPUs and GPUs in similar technologies.
Tensor Processing Unit (TPU)
- Google's Tensor Processing Unit (TPU), introduced in 2015 and currently serving more than one billion people, is a pioneering example. Compared with contemporary CPUs and GPUs in comparable technologies, it runs deep neural networks (DNNs) 15 to 30 times faster with 30 to 80 times better energy efficiency.
- The TPU chip, an application-specific integrated circuit, is programmed through the TensorFlow framework for neural networks and powers many crucial applications in Google data centers, including image recognition, language translation, search, and gaming.
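Since the slide notes that the TPU is programmed through TensorFlow, here is a minimal, hedged sketch of how a small multilayer perceptron, one of the DNN classes evaluated in the TPU paper, is expressed in TensorFlow/Keras. The layer widths and 784-feature input are placeholder choices, and nothing here is TPU-specific code; it only illustrates the programming model the slide refers to.

```python
import tensorflow as tf

# A small multilayer perceptron (MLP). Layer widths and the 784-feature
# input are illustrative placeholders, not Google's production models.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(784,)),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```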
TPU Block Diagram
2. Mobile and Embedded Applications Using CNNs
- A Domain-Specific Architecture (DSA) is an efficient and practical solution for CNN deployment and implementation.
- Feature extraction from the input is useful for tasks such as object classification and face authentication. Traditional methods of face feature extraction use one or more fully connected (FC) layers of the CNN; the FC layers apply inner-product computation to reduce the very high-dimensional output of the convolutional layers to a lower-dimensional feature vector.
- The CNN-DSA accelerator is used as the feature extractor in the basic application mode, so the power efficient accelerator handles the majority of the heavy-lifting convolutional computations. The CNN-DSA accelerator can be reconfigured to support CNN model coefficients with varying layer sizes and types.
CNN-DSA Hardware Design
- CNN processing engines extract features from an input image by performing multiple layers of 3x3 convolutions with rectification and 2x2 pooling operations (a minimal sketch of one such stage follows this list).
- The hardware includes: a CNN processing block; a first set of memory buffers for storing image data; and a second set of memory buffers for storing filter coefficients.
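A minimal NumPy sketch of the single stage described above: a 3x3 convolution, rectification (ReLU), then 2x2 max pooling. Single-channel input, 'valid' borders, stride 1, and float32 arithmetic are simplifying assumptions for illustration only.

```python
import numpy as np

def conv3x3_relu_pool(image, kernel):
    """One CNN stage: 3x3 convolution, ReLU, then 2x2 max pooling."""
    H, W = image.shape
    out = np.zeros((H - 2, W - 2), dtype=np.float32)
    for y in range(H - 2):
        for x in range(W - 2):
            out[y, x] = np.sum(image[y:y+3, x:x+3] * kernel)
    out = np.maximum(out, 0.0)                      # rectification
    H2, W2 = out.shape[0] // 2, out.shape[1] // 2   # 2x2 max pooling
    pooled = out[:H2*2, :W2*2].reshape(H2, 2, W2, 2).max(axis=(1, 3))
    return pooled

image = np.random.rand(28, 28).astype(np.float32)   # placeholder input
kernel = np.random.randn(3, 3).astype(np.float32)   # placeholder filter
print(conv3x3_relu_pool(image, kernel).shape)       # (13, 13)
```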
Object Classification
- A face recognition system is expected to identify faces present in images and videos automatically. It can operate in either or both of two modes:
1. Face verification (or authentication): a one-to-one match that compares a query face image against a template face image whose identity is being claimed.
2. Face identification (or recognition): a one-to-many match that compares a query face image against all the template images in the database to determine the identity of the query face.
Face Recognition
Face recognition is a visual pattern recognition problem.
- A face is a three-dimensional object subject to varying illumination, pose, and expression, yet it must be identified from its two-dimensional image (or from three-dimensional images obtained by laser scan).
- A face recognition system generally consists of 4 modules – detection, alignment, feature extraction, and matching.
- Localization and normalization (face detection and alignment) are processing steps before face recognition (facial feature extraction and matching) is performed.
- After a face is normalized, feature extraction is performed to provide effective information that is useful for distinguishing between faces of different persons and stable with respect to the geometrical and photometrical variations.
- For face matching, the extracted feature vector of the input face is matched against those of enrolled faces in the database; it outputs the identity of the face when a match is found with sufficient confidence or indicates an unknown face otherwise.
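To make the matching step concrete, here is a toy Python sketch that compares an extracted feature vector against enrolled templates and reports either the best-matching identity or an unknown face. The cosine-similarity metric, 128-dimensional features, and 0.6 threshold are illustrative assumptions, not values from the slides.

```python
import numpy as np

def match_face(query_feat, gallery, threshold=0.6):
    """Return the best-matching identity, or None for an unknown face."""
    def cosine(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    best_id, best_score = None, -1.0
    for identity, template in gallery.items():
        score = cosine(query_feat, template)
        if score > best_score:
            best_id, best_score = identity, score
    return best_id if best_score >= threshold else None

# Hypothetical enrolled database of per-person feature vectors.
gallery = {"alice": np.random.randn(128), "bob": np.random.randn(128)}
query = gallery["alice"] + 0.05 * np.random.randn(128)   # noisy probe
print(match_face(query, gallery))                        # typically "alice"
```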
Pipeline for Face Recognition
Methodologies
- Nonlinear classification for face detection may be performed using neural networks or kernel-based methods.
- Neural methods: a classifier may be trained directly using preprocessed and normalized face and nonface training subwindows.
- Convolutional Neural Networks are designed to recognize visual patterns directly from pixel images with minimal preprocessing.
- The 24-dimensional feature vector provides a good representation for classifying face and nonface patterns.
- Convolutional neural networks are trained by back-propagation algorithms.
- Kernel SVM classifiers perform nonlinear classification for face detection, trained on face and nonface examples.
- Although such methods can learn nonlinear boundaries, a large number of support vectors may be needed to capture a highly nonlinear boundary; for this reason, fast real-time performance has so far been difficult for SVM classifiers trained this way.
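A small, hedged sketch of kernel-SVM face/nonface classification using scikit-learn's RBF-kernel SVC. The synthetic 24-dimensional vectors stand in for the preprocessed subwindow features mentioned above, and the C/gamma settings are defaults rather than tuned values.

```python
import numpy as np
from sklearn.svm import SVC

# Stand-in data: 24-dimensional feature vectors for face / nonface
# subwindows (the slides mention a 24-dimensional representation).
rng = np.random.default_rng(0)
faces = rng.normal(loc=1.0, size=(200, 24))
nonfaces = rng.normal(loc=-1.0, size=(200, 24))
X = np.vstack([faces, nonfaces])
y = np.array([1] * 200 + [0] * 200)

# RBF kernel lets the classifier learn a nonlinear face/nonface boundary.
clf = SVC(kernel="rbf", C=1.0, gamma="scale")
clf.fit(X, y)
print(clf.predict(rng.normal(loc=1.0, size=(1, 24))))  # likely [1] (face)
```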
Proposed System
3. DSA for ECG-Based Authentication
- The electrocardiogram (ECG) is an increasingly used biometric for user authentication. A domain-specific architecture (DSA) for ECG-based biometric authentication (EBA) is an important step toward efficient and secure EBA systems.
- ECG signals, which represent the electrical activity of the human heart, are easy to acquire, individually identifiable, long-lasting, and information-rich, making them a strong option for user authentication in resource-constrained systems. As a result, prior studies have explored fast and energy-efficient EBA techniques at both the algorithm and hardware-implementation levels.
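As a toy illustration of ECG-based authentication (not the timing-aware DSA itself), the sketch below compares one R-peak-aligned heartbeat segment against a user's enrolled template using normalized correlation. The stylized waveform, 200-sample segment length, and 0.9 threshold are assumptions for the example; real EBA pipelines use richer features.

```python
import numpy as np

def authenticate(ecg_beat, enrolled_template, threshold=0.9):
    """Accept the user if the beat correlates strongly with the template."""
    a = (ecg_beat - ecg_beat.mean()) / ecg_beat.std()
    b = (enrolled_template - enrolled_template.mean()) / enrolled_template.std()
    similarity = float(np.dot(a, b) / len(a))   # Pearson correlation
    return similarity >= threshold

# Hypothetical enrolled template and a fresh beat from the same user.
t = np.linspace(0, 1, 200)
template = np.exp(-((t - 0.5) ** 2) / 0.002)    # stylized QRS-like pulse
new_beat = template + 0.05 * np.random.randn(200)
print(authenticate(new_beat, template))          # typically True
```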
Architecture Diagram for EBA-DSA
EBA-DSA Overview
- The architecture comprises a base out-of-order core similar to modern ARM-based high-performance processors for consumer devices such as smartphones. The functional (execution) units include two ALUs, one load unit, and one store unit.
- For energy savings, the design uses spin-transfer torque RAM (STT-RAM) caches, given their low leakage power and normally-off computing capabilities: a 4-way set-associative 16 KB cache with 64 B blocks. The 32 KB cache common in smartphones would be over-provisioned for this algorithm.
- To further limit the energy overheads, the design uses reduced-retention STT-RAM caches that retain data only for a limited period of time, after which the data is invalidated.
4. DSA in a Scalable Distributed Sequence Alignment System
- Sequence alignment algorithms are a basic and critical component of many bioinformatics fields. With the rapid development of sequencing technology, fast-growing reference databases and longer query sequences pose new challenges for sequence alignment, and the algorithms have prohibitively high time and space complexity.
- DSA is a scalable distributed sequence alignment system that uses Apache Spark to process sequence data in a horizontally scalable distributed environment, and leverages a data-parallel strategy based on Single Instruction Multiple Data (SIMD) instructions to parallelize the algorithms on each core of a worker node.
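The kind of kernel such a system accelerates is the dynamic-programming core of pairwise alignment. Below is a plain, scalar Python sketch of Smith-Waterman local alignment scoring; the cited system distributes work with Spark and vectorizes it with SIMD, neither of which is shown here, and the scoring parameters are illustrative.

```python
import numpy as np

def smith_waterman(query, ref, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between query and ref."""
    n, m = len(query), len(ref)
    H = np.zeros((n + 1, m + 1), dtype=np.int32)
    best = 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if query[i - 1] == ref[j - 1] else mismatch
            H[i, j] = max(0,
                          H[i - 1, j - 1] + s,   # match / mismatch
                          H[i - 1, j] + gap,     # gap in reference
                          H[i, j - 1] + gap)     # gap in query
            best = max(best, H[i, j])
    return best

print(smith_waterman("ACACACTA", "AGCACACA"))    # best local alignment score
```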
Overview of the Sequence Alignment System
- Research in areas like personalized medicine and agritech is being accelerated by the proliferation of genomic data.
- GPUs and FPGAs deliver major performance gains in certain applications, but GPUs have a noticeable power consumption problem and FPGAs are hard to program.
- This work presents the design and evaluation of SALSA, a Domain-Specific Architecture (DSA) for sequence alignment that is highly programmable, customizable, and extensible.
Top-Level SALSA Architecture
5. 3D Integration for Medical Image Processing
- Medical image processing is chosen as the domain to accelerate in this work because of its growing demand for real-time processing and its inadequate performance on conventional computing architectures.
- A design flow is proposed for the 3D multiprocessor-accelerator platform, and a number of methods are applied to optimize the average performance of all applications in the targeted domain under area and bandwidth constraints.
- Experiments show that applications in this domain gain a 7.4× speed-up and 18.8× energy savings on average when running on this platform with CMP cores and domain-specific accelerators, compared with CPU-only implementations.
- 3D interconnect technology, which allows vertical stacking of layers of active electronic components, has emerged as a promising technology to boost the device bandwidth
Architecture for 3D-CMP-FPGA Computing
- The 3D multiprocessor-accelerator architecture is constructed by stacking a programmable fabric layer above the CMP layer; the required communication between these two layers is provided by through-silicon vias (TSVs).
Conclusion
In conclusion, domain-specific architectures are hardware architectures that are designed and optimized for particular applications or domains. By tailoring the hardware to the characteristics of the workload, such as its data types, parallelism, and memory-access patterns, DSAs can deliver efficiency and performance that general-purpose processors can no longer provide as Moore's Law slows. From the TPU for deep neural networks to accelerators for sequence alignment and medical image processing, domain-specific architectures are key to unlocking the full potential of computing for a wide range of real-world applications.
References
- [1] N. P. Jouppi, C. Young, N. Patil, and D. Patterson, "A Domain-Specific Architecture for Deep Neural Networks," Communications of the ACM, Sept. 2018.
- [2] B. Sun, L. Yang, P. Dong, W. Zhang, J. Dong, and C. Young (Gyrfalcon Technology Inc.), "Ultra Power-Efficient CNN Domain Specific Accelerator with 9.3 TOPS/Watt for Mobile and Embedded Applications."
- [3] R. Cordeiro, D. Gajaria, A. Limaye, T. Adegbija, N. Karimian, and F. Tehranipoor, "ECG-based Authentication using Timing-Aware Domain-Specific Architecture."
- https://ieeexplore.ieee.org/abstract/document/7973775
- https://link.springer.com/article/10.1007/s42979-021-00815-1
- https://ieeexplore.ieee.org/document/6043279
- https://passlab.github.io/CSCE513/notes/lecture26_DSA_DomainSpecificArchitectures.pdf
Thanks for your attention!
Any questions?