Open Data Editor in Action: Advancing Data Quality in Academic Library Services in India

11 December 2025
General

Published in: Open Data Editor in Action: Advancing Data Quality in Academic Library Services in India

This text shows a real case of how the Open Data Editor (ODE) impacted the workflow of an organisation working to serve the public good.

This is a corridor in the Central Library at IIT Delhi, which has a catalogue of over 300,000 documents, including books, theses, journals and e-resources. Photo: Priyanshu Sharma.

Organisation: Indian Institute of Technology (IIT) Delhi
Location: Delhi, India 🇮🇳
Knowledge Area: Academic Research
Type of Data: Bibliographic Data, Library Catalogues

At IIT Delhi, the library is a hub of knowledge, managing vast datasets related to academic publications and library management. The library team integrated the Open Data Editor into its workflow to address persistent data quality issues, using it not only for internal data cleaning but also as a core tool to educate the wider academic community on the importance of clean, reliable data.

The Challenge

The library team handles complex datasets, including bibliographic information from major indexes like Scopus and Web of Science, as well as internal library management system reports. These datasets are crucial for analysing publication trends, identifying prolific authors, and understanding citation patterns. However, they were often marred by inconsistencies that hindered analysis.

Inconsistent Formatting: Key fields like journal volume numbers contained varied entries (e.g., ‘Vol. 1’, ‘Volume 1’, ‘V1’), making it difficult to standardise and analyse.
Missing Information and Blank Cells: Bibliographic data was often incomplete for some articles, leading to gaps in the dataset.
Manual, Time-Consuming Checks: The existing process involved manually filtering through spreadsheets to identify errors – a slow and imperfect method that was impractical for large datasets.
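
To make the formatting problem concrete, here is a minimal Python sketch of how varied volume labels such as 'Vol. 1', 'Volume 1' and 'V1' could be collapsed into one form. It is an illustration only, not the library's actual process or anything ODE does internally; the function name `normalise_volume` and the sample values are hypothetical.

```python
import re

# Optional volume prefix ("Volume", "Vol.", "V") followed by the number itself.
_VOLUME_RE = re.compile(r"^\s*(?:volume|vol\.?|v\.?)?\s*(\d+)\s*$", re.IGNORECASE)


def normalise_volume(raw):
    """Return the volume number as a string, or None if it cannot be parsed."""
    if raw is None:
        return None
    match = _VOLUME_RE.match(raw)
    return match.group(1) if match else None


# Hypothetical sample values, including a blank and an unparseable entry.
samples = ["Vol. 1", "Volume 1", "V1", "12", "", "n/a"]
normalised = [normalise_volume(value) for value in samples]
print(normalised)
```

Entries that do not match the pattern come back as `None` rather than being guessed at, so they can be reviewed by hand.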

*The library needs to manage large amounts of data as shown in the above massive spreadsheet*

The Solution

The team began using the Open Data Editor to systematically clean their datasets before analysis. The process involved importing their CSV and Excel files into ODE to leverage its automated profiling capabilities.

Key features that supported their work were:

Automated Error Detection: ODE automatically flagged a range of issues, including blank cells, type mismatches, and extra labels, providing a clear and immediate visual assessment of data quality.
Data Standardisation: By identifying formatting inconsistencies, ODE provided a foundation for the team to clean and standardise fields, creating more reliable datasets for their reports and website.
A Pedagogical Tool: Beyond internal use, ODE became the centrepiece of training sessions that Mohit Garg’s team conducted for over 200 researchers and academics, visually demonstrating the concepts of data quality and the importance of metadata.
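
For readers curious about what this kind of automated check involves, the sketch below scans a small CSV for blank cells and type mismatches using only the Python standard library. It is a simplified illustration under stated assumptions, not ODE's implementation: the function `find_issues`, the issue labels and the sample data are all hypothetical.

```python
import csv
import io


def find_issues(csv_text, expected_types):
    """Return (row_number, column, issue) tuples for a CSV string.

    expected_types maps a column name to a callable such as int or float.
    Columns not listed there are only checked for blank cells.
    """
    issues = []
    reader = csv.DictReader(io.StringIO(csv_text))
    # Row 1 is the header, so data rows are numbered from 2.
    for row_number, row in enumerate(reader, start=2):
        for column, value in row.items():
            if column is None:
                # DictReader stores cells beyond the header under the key None.
                issues.append((row_number, None, "extra-cell"))
                continue
            if value is None or value.strip() == "":
                issues.append((row_number, column, "blank-cell"))
                continue
            caster = expected_types.get(column)
            if caster is not None:
                try:
                    caster(value)
                except ValueError:
                    issues.append((row_number, column, "type-error"))
    return issues


# Hypothetical bibliographic sample: year and volume are expected to be integers.
sample = "\n".join([
    "title,year,volume",
    "Paper A,2021,12",
    "Paper B,,Vol. 3",
    "Paper C,20x2,7",
])
for issue in find_issues(sample, {"year": int, "volume": int}):
    print(issue)
```

In this sample, the blank year and the 'Vol. 3' entry in the second data row and the malformed year in the third are reported, while the clean first row produces nothing.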

*Open Data Editor helps automate error detection; above, problems with blank cells and the data types expected for a given column.*

The Results

Adopting ODE has reinforced the library’s mission and improved its operational efficiency.

Improved Data Rigour: The team can now quickly assess which variables are reliable enough for analysis, ensuring their internal reports and public-facing data are built on a solid foundation.
Community-Wide Data Literacy: Through workshops, ODE has helped shift the mindset of librarians and researchers, teaching them that data quality is as important as image quality and that managing data is an ongoing process requiring regular effort.
Stronger Advocacy for Open Data: The tool and its accompanying MOOC have been instrumental in promoting the principles of open data and the importance of often-ignored metadata, contributing to a broader understanding of data as a public good.

Training session on Open Data Editor with Library Team of Ashoka University and IIT Delhi. More details of another session are here.

Quote

Mohit Garg, Assistant Librarian (SS)

“The idea of cleaning data and why it is important has been appreciated in the community. We have learned that data requires continuous effort and we need to regularly know about different quality aspects.”

About the Open Data Editor

The Open Data Editor (ODE) is Open Knowledge’s free, open-source desktop application that helps nonprofits, data journalists, activists, and public servants detect errors in their datasets. It is designed for people who work with tabular data (Excel, Google Sheets, CSV) but do not know how to code or lack the programming skills to automate data exploration.

Simple, lightweight, privacy-friendly, and built for real-world challenges like offline work and low-resource settings, ODE is part of Open Knowledge’s initiative The Tech We Want — our ambitious effort to reimagine how technology is built and used. In October 2025, ODE was recognised as a digital public good by the Digital Public Goods Alliance.

And there’s more! ODE comes with a free online course that can help you improve the quality of your datasets and make your work easier.

↪ Take the course: Learn how to use ODE

All of Open Knowledge’s work with the Open Data Editor is made possible thanks to a charitable grant from the Patrick J. McGovern Foundation. Learn more about its funding programmes here.

Source: Open Data Editor in Action: Advancing Data Quality in Academic Library Services in India
Feed: Open Knowledge Blog
URL: blog.okfn.org
