What is a TPU (Tensor Processing Unit)?
A tensor processing unit (TPU) is an AI-accelerator application-specific integrated circuit (ASIC) developed by Google specifically for neural network machine learning, particularly using Google's TensorFlow software. Google began using TPUs internally in 2015 and in 2018 made them available for third-party use, both as part of its cloud infrastructure and by offering a smaller version of the chip for sale.
ASICs are optimized at the hardware level to run one computationally intensive application very efficiently, particularly with regard to energy consumption. A cluster of GPUs can accomplish many tasks quickly, but a cluster of ASICs will usually be faster at the one task it was designed for and will certainly consume much less power.
Bitcoin mining (computing SHA-256 cryptographic hashes) is probably the most notable application of ASICs today, although TPUs could overtake it.
TPUs are designed for machine-learning calculations that have no need for extreme precision: the first TPU performed 8-bit integer arithmetic, and later versions use 16-bit floating-point (bfloat16) values.
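To see why low precision is acceptable, consider what happens when float weights are squeezed into 8 bits. The sketch below is a hypothetical illustration of linear quantization (an assumption for teaching purposes, not Google's actual scheme): weights are mapped onto 8-bit integer codes and back, and the round-trip error stays within half a quantization step.

```python
# Hypothetical sketch of 8-bit quantization (illustrative only, not
# Google's actual scheme): linearly map float weights onto the signed
# 8-bit range [-127, 127] and back, then measure the error.
def quantize_int8(weights):
    scale = max(abs(w) for w in weights) / 127.0   # largest weight -> +/-127
    q = [round(w / scale) for w in weights]        # 8-bit integer codes
    return q, scale

def dequantize(q, scale):
    return [v * scale for v in q]

weights = [0.31, -0.92, 0.05, 0.77]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Each weight is recovered to within half a quantization step, which is
# close enough for neural-network inference in many cases.
worst = max(abs(a - b) for a, b in zip(weights, restored))
print(worst <= scale / 2 + 1e-9)  # → True
```

An 8-bit multiplier is far smaller and cheaper in silicon than a 32-bit floating-point unit, which is how a TPU packs so many of them onto one chip.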
TPUs, unlike GPUs, were custom-designed to handle operations such as the matrix multiplications in neural network training. Google TPUs can be accessed in two forms: Cloud TPU and Edge TPU. Cloud TPUs can be accessed from a Google Colab notebook, which gives users access to TPU pods that sit in Google's data centers. The Edge TPU, by contrast, is a custom-built development kit that can be used to build specific applications.
The tensor processing unit was designed specifically for Google's TensorFlow framework, a symbolic math library used for neural networks.
Google announced the TPU to the world in 2016, while also stating that it had already been used inside Google's data centers for over a year. At first, Google's TPUs were proprietary; later, in 2018, Google allowed other companies to buy access to them, and some models became commercially available.
The working fashion and architecture of a TPU are somewhat different from those of a GPU.
Since announcing the TPU in 2016, Google has upgraded it several times:
- The first TPU, launched in 2016, was named TPUv1 (Tensor Processing Unit version one).
- The second TPU, launched in 2017, was named TPUv2 (Tensor Processing Unit version two).
- The third TPU, launched in 2018, was named TPUv3 (Tensor Processing Unit version three).
- In the same year, 2018, Google also introduced the Edge TPU, a much smaller chip for on-device inference.
How does the Tensor Processing Unit work?
The TPU is a domain-specific architecture: it is designed as a matrix processor specialized for neural network workloads. TPUs can't run word processors, control rocket engines, or execute bank transactions, but they can handle the massive multiplications and additions for neural networks at blazingly fast speeds, while consuming much less power and fitting inside a smaller physical footprint.
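A minimal sketch makes the workload concrete: a dense neural-network layer is nothing but a matrix multiplication followed by an element-wise addition, which is exactly the pattern a TPU accelerates. (Pure Python; the function name is illustrative, not a TPU API.)

```python
# A dense layer reduces to multiply-accumulate operations: every output
# is a weighted sum of the inputs plus a bias. This is the core pattern
# a TPU is built to accelerate. (Illustrative sketch, pure Python.)
def dense_layer(x, weights, biases):
    outputs = []
    for j in range(len(weights[0])):        # one output per weight column
        acc = biases[j]
        for i in range(len(x)):             # multiply-accumulate loop
            acc += x[i] * weights[i][j]
        outputs.append(acc)
    return outputs

x = [1.0, 2.0]                              # input activations
w = [[0.5, 1.0],                            # 2x2 weight matrix
     [1.5, 2.0]]
b = [1.0, 2.0]                              # bias vector
print(dense_layer(x, w, b))  # → [4.5, 7.0]
```

A CPU executes those multiply-accumulates a few at a time; the TPU's trick, described next, is to lay thousands of them out in hardware.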
Because the primary task for this processor is matrix processing, the TPU's hardware designers knew every calculation step needed to perform that operation. They were therefore able to place thousands of multipliers and adders and connect them to each other directly, forming a large physical matrix of those operators. This is called a systolic array architecture. In the case of Cloud TPU v2, there are two systolic arrays of 128 × 128, aggregating 32,768 ALUs for 16-bit floating-point values in a single processor.
Let's see how a systolic array executes the neural network calculations. First, the TPU loads the parameters from memory into the matrix of multipliers and adders.
Then the TPU loads data from memory. As each multiplication is executed, its result is passed to the next multiplier while a running sum is taken at the same time. The output is therefore the sum of all multiplication results between the data and the parameters. During this whole process of massive calculation and data passing, no memory access is required at all.
This is why the TPU can achieve high computational throughput on neural network calculations with much less power consumption and a smaller footprint.
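The data flow described above can be sketched in a few lines. This is a toy, cycle-free simulation (an assumption for illustration, not the real TPU microarchitecture): the weights stay fixed in a grid of cells, activations are fed in, and each cell multiplies its stationary weight by the incoming activation and adds the partial sum arriving from its neighbor, so the finished result emerges at the edge of the array with no intermediate memory accesses.

```python
# Toy weight-stationary systolic pass (illustrative simulation, not the
# real TPU microarchitecture): each column of cells holds fixed weights,
# and a partial sum flows through the column, accumulating one
# multiply-add per cell, with no memory access in between.
def systolic_matvec(weights, activations):
    rows, cols = len(weights), len(weights[0])
    out = [0.0] * cols
    for j in range(cols):              # each column of cells
        partial = 0.0                  # partial sum flowing through the column
        for i in range(rows):          # each cell: multiply, then accumulate
            partial += weights[i][j] * activations[i]
        out[j] = partial               # result emerges at the array's edge
    return out

w = [[1.0, 2.0],
     [3.0, 4.0]]
x = [10.0, 100.0]
print(systolic_matvec(w, x))  # → [310.0, 420.0]
```

In real hardware all columns operate in parallel and new activations are pumped in every cycle, so a 128 × 128 array completes 16,384 multiply-adds per clock tick.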
How different is a TPU from a GPU?
Architecturally? Very different. A GPU is a processor in its own right, just one optimized for vectorized numerical code; GPUs are the spiritual successors of the classic Cray supercomputers. A TPU is a coprocessor: it cannot execute code in its own right. All code execution takes place on the CPU, which feeds a stream of micro-operations to the TPU.
The main practical difference is that TPUs are cheaper to run and use a lot less power, so they can complete very large prediction jobs more cheaply than GPUs, or make it simpler to serve predictions in a low-latency service.
GPU: the Graphics Processing Unit is a specialized electronic circuit designed to render 2D and 3D graphics together with a CPU. In gaming culture, the GPU is also known as a graphics card. GPUs are now harnessed more broadly to accelerate computational workloads in areas such as financial modeling, cutting-edge scientific research, deep learning, analytics, and oil and gas exploration.
TPU: the Tensor Processing Unit is a custom-built integrated circuit developed specifically for machine learning and tailored for TensorFlow, Google's open-source machine-learning framework. It has been powering Google data centers since 2015; however, Google still uses CPUs and GPUs for other types of machine learning.
This article has explained the term TPU, how it works, and some basic background, keeping things simple enough for a beginner to follow. It should be useful to anyone who wants to build a career in the field of artificial intelligence.
If you are really interested and want to understand TPUs better, visit the Google Cloud documentation: https://cloud.google.com/tpu/docs/resources