AI Architecture 10. Padding and Pooling Hardware Issues
In the previous post, 3 Mappings of Conv Operations, we explored the massive trade-off made by techniques like Im2Col: exchanging memory ...
In the previous post, we learned that hardware loves CNNs (Convolutional Neural Networks) because of Locality and Data Reuse. Theoretically, CNNs seem like the perfect hardware-friendly algorithm.
In the previous post, we confirmed how inefficient the MLP (Fully Connected Layer) is from a hardware perspective, due to its structure of fetching each weight only once ...
In previous posts, we learned about Quantization techniques that shave down data size to reduce hardware costs. So, why do we try so desperately to reduce data size?
In the previous post, we examined how differences in Number Formats affect hardware area and power consumption. We established that FP32 ...
In the previous post, we explored the difference between Training and Inference, seeing how inference-only NPUs lighten the hardware structure. One of the key keywords for this optimization was 'Reduction of Precision.'