A Study into The Limitations of CNN Recognition on Isolated Bengali Compound Characters
Date
2023-07
Author
Zia, Tasnim
Datta, Ankur
Noor, Mohammad Raghib
Amin, M Ashraful
Ali, Amin Ahsan
Rahman, A K M Mahbubur
Abstract
There are over 265 million native and non-native Bangla speakers; however, progress in Bangla Optical Character Recognition lags behind that of other languages because of a larger set of complex characters, varied handwriting styles, and a lack of datasets. Convolutional Neural Network (CNN) models have been highly successful at recognizing handwritten alphabetic scripts. However, we found that two-stage detectors, such as CNN-RNNs, encoder-decoders, and Vision Transformers, now perform much better than pure CNNs in pattern recognition and in Bengali compound character recognition. To understand why, we chose five commonly used pretrained CNN models from PyTorch (VGG-16, ResNet-50, ResNet-101, Wide ResNet-50-2, and ResNeXt-50-32x4d), fine-tuned them to classify the characters, and compared their performance. Grad-CAM and Grad-CAM++ were used to generate heatmaps showing the key regions the models focused on while classifying. We identified problematic patterns in Bangla compound characters, along with problematic perceptions in our fine-tuned CNNs, and list them with detailed analysis.
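For readers unfamiliar with the heatmaps mentioned in the abstract, the following is a minimal sketch of the Grad-CAM combination step only, not the authors' code. It assumes the activations of one convolutional layer and the gradients of the class score with respect to them have already been extracted (in practice via framework hooks); the function name `grad_cam` and the toy values are hypothetical and purely illustrative. Grad-CAM++ weights the gradients differently and is not covered here.

```python
from typing import List

Grid = List[List[float]]  # one H x W feature map


def grad_cam(activations: List[Grid], gradients: List[Grid]) -> Grid:
    """Combine K feature maps into one Grad-CAM heatmap.

    activations[k][i][j]: output of channel k of the chosen conv layer.
    gradients[k][i][j]:   d(class score) / d(activations[k][i][j]).
    Both inputs are assumed to be precomputed; this sketch does not
    run a network.
    """
    height, width = len(activations[0]), len(activations[0][0])
    heatmap = [[0.0] * width for _ in range(height)]

    for act, grad in zip(activations, gradients):
        # Channel weight: global average of the gradients over all positions.
        alpha = sum(sum(row) for row in grad) / (height * width)
        for i in range(height):
            for j in range(width):
                heatmap[i][j] += alpha * act[i][j]

    # ReLU keeps only regions that push the class score up.
    heatmap = [[max(0.0, v) for v in row] for row in heatmap]

    # Scale to [0, 1] for display; an all-zero map is left unchanged.
    peak = max(max(row) for row in heatmap)
    if peak > 0:
        heatmap = [[v / peak for v in row] for row in heatmap]
    return heatmap


# Toy input: two 2x2 channels with hand-picked gradients (illustrative only).
toy_activations = [
    [[1.0, 0.0], [0.0, 2.0]],
    [[0.0, 3.0], [1.0, 0.0]],
]
toy_gradients = [
    [[0.5, 0.5], [0.5, 0.5]],      # channel weight 0.5
    [[-1.0, -1.0], [-1.0, -1.0]],  # channel weight -1.0
]
toy_heatmap = grad_cam(toy_activations, toy_gradients)
```

In the toy input, the second channel has a negative weight, so the positions where it is strongly active are suppressed by the ReLU, while positions driven by the first channel remain highlighted.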