In the era of Big Data analytics, disseminating information, preserving data integrity, and identifying unique records in a large pool of data pose a major challenge for analysts in entity matching and linking scenarios. Data ingested from multiple sources about the same real-world entity exhibits several data quality issues, such as redundancy, incorrectness, and variations, along with data input errors such as typographical or spelling mistakes and missing fields. Deduplication addresses these issues: it achieves entity resolution and uniqueness, eliminates redundancy, and improves data quality. Because India is a multilingual and multicultural country with vast demographic variations, there is a need for an India-centric model for deduplicating the structured data held by various Indian authorities. This research proposes a novel approach that caters to India-centric demographic variations, region-specific naming conventions, and address standardization. It uses a highly customizable and scalable deep learning approach based on a customized DeepMatcher algorithm, together with a synthetic data generation tool that accounts for Indian variations of names and addresses in a region-specific manner. © 2022, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.