Sindhi has become the first language from Pakistan to be selected for digitisation by Universal Dependencies (UD), a project of Stanford University and Google.
Developed in 2005, UD is an ongoing project that converts languages into machine-readable formats. To date, it has selected 100 languages, including Sindhi, out of an estimated 6,000 languages spoken worldwide.
The project has also included Urdu, but as a language from India, because contributors from the neighbouring country proposed it.
“It is a major breakthrough for Sindhi to be included in the UD digital platform,” said Dr Mazhar Dootio, a computational linguist. “Including Sindhi in the framework means it will now become a universal language.”
He was part of the team that applied for Sindhi’s registration.
Speaking to Geo.tv, Dootio said digitisation is depleting world languages faster than ever in human history. “English is already digitised therefore it is secure.”
The United Nations says 43% of the estimated 6,000 languages spoken in the world are “endangered” and the United Nations Educational, Scientific and Cultural Organization (UNESCO) has warned that half of these will be extinct by the end of this century.
One of the oldest languages in the world, Sindhi is written in a right-to-left Perso-Arabic script. It is linked to the Indus Civilisation, with the first recorded script samples found during excavations at Mohenjo Daro in Sindh.
“Historians are working to decode the script found from Mohenjo Daro,” said Sindhi Language Authority chairman Professor Dr Muhammad Ali Manjhi. He added that Sindhi was declared the official language of Sindh during the British era. “Following which, the region saw prosperity. We see lots of books and literature published in Sindhi during British Raj and onwards.”
According to Britain-based World Mapper, Sindhi is spoken by approximately 24 million people in at least nine territories.
Dootio said the language will now be accessible for online translation through more than 150 treebanks. “It can be digitised in the next 10 years. If the government helps, which it is not doing at the moment, the process can be completed in six to seven years. Once Sindhi is digitised, it will be among the very few languages that have been digitised completely.”
Zulfiqar Kunbhar is a Sindh-based freelance journalist. He tweets @zulfiqarkunbhar