A Study into The Limitations of CNN Recognition on Isolated Bengali Compound Characters

Zia, Tasnim; Datta, Ankur; Noor, Mohammad Raghib; Amin, M Ashraful; Ali, Amin Ahsan; Rahman, A K M Mahbubur

Date

2023-07

Abstract

There are over 265 million native and non-native Bangla speakers; however, advances in Bangla Optical Character Recognition are falling behind those for other languages because of a broader set of complex characters, multiple handwriting styles, and a lack of datasets. Convolutional Neural Network (CNN) models have been highly successful in detecting handwritten alphabet scripts. However, we found that two-stage detectors, such as CNN-RNN models, Encoder-Decoders, and Vision Transformers, now perform much better than pure CNNs in pattern recognition and Bengali Compound Character Recognition. To understand why, we chose five commonly used pretrained CNN models from PyTorch (VGG-16, ResNet-50, ResNet-101, Wide ResNet-50-2, and ResNeXt-50-32x4d) to classify the characters and compared their performance. Grad-CAM and Grad-CAM++ were used to generate heatmaps showing the key areas that the models focused on while classifying. We found pattern problems in Bangla compound characters, along with problematic perceptions in our fine-tuned CNNs, and we list both with detailed analysis.

URI

https://ar.iub.edu.bd/handle/123456789/620
