Improving Access to Swiss Open Government Data through Fuzzy Human-Centered Information Retrieval
Supervisors: Janick Spycher, Edy Portmann
Contact person: Janick Spycher
Student: Deep Shukla
Project status: Ongoing
Year: 2026
Abstract:
This master’s thesis investigates how access to Swiss Open Government Data (OGD) can be improved through an explainable and human-centered information retrieval approach. Switzerland provides a national OGD portal, opendata.swiss, which aggregates thousands of datasets published by federal, cantonal, and municipal institutions. Although the portal offers extensive access to public datasets, users often struggle to find relevant information because of inconsistent metadata quality, multilingual descriptions, and vague or exploratory search queries. The project examines fuzzy logic as a method for modeling vague concepts commonly expressed in dataset search queries, such as “recent,” “complete,” or “relevant.” Instead of relying solely on traditional keyword-based search techniques, the proposed approach introduces a fuzzy metadata-based ranking mechanism that evaluates multiple dataset attributes simultaneously. These attributes may include temporal recency, thematic similarity, and metadata completeness. A prototype retrieval system will be developed that operates on metadata obtained from the opendata.swiss portal through its CKAN API. The system will incorporate query processing, fuzzy inference rules, and explainable ranking outputs that allow users to understand why specific datasets are prioritized in search results. To assess the effectiveness of the proposed approach, the fuzzy ranking mechanism will be compared with traditional keyword-based retrieval methods and, where feasible, with semantic retrieval techniques based on vector embeddings. The evaluation will consider quantitative performance metrics such as precision, recall, and F1-score, as well as qualitative user-centered assessments focusing on usability, transparency, and user trust.
The project aims to demonstrate that fuzzy logic-based retrieval can improve dataset discoverability while maintaining interpretability and transparency, which are essential requirements for public-sector digital infrastructures.
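As an illustration of the kind of fuzzy metadata-based ranking described in the abstract, the following Python sketch scores a dataset on "recent," "relevant," and "complete" and reports which criterion limits its rank. It is a minimal sketch under stated assumptions, not the thesis implementation: the metadata field names, the membership thresholds (full membership up to 365 days, none after 1825 days), the keyword-overlap similarity, and the single min-based rule are all hypothetical choices made for this example and do not reflect the opendata.swiss schema.

```python
from datetime import date


def recency(modified: date, today: date,
            full_days: int = 365, zero_days: int = 1825) -> float:
    """Membership in the fuzzy set 'recent' (thresholds are assumptions).

    1.0 for datasets up to full_days old, falling linearly to 0.0 at zero_days.
    """
    age = (today - modified).days
    if age <= full_days:
        return 1.0
    if age >= zero_days:
        return 0.0
    return (zero_days - age) / (zero_days - full_days)


def completeness(meta: dict,
                 fields=("title", "description", "keywords",
                         "license", "modified")) -> float:
    """Share of (hypothetical) metadata fields that are present and non-empty."""
    return sum(1 for f in fields if meta.get(f)) / len(fields)


def thematic_similarity(query: str, meta: dict) -> float:
    """Jaccard overlap of query terms and keywords; a simple stand-in
    for whatever similarity measure the final system uses."""
    q = set(query.lower().split())
    k = {w.lower() for w in meta.get("keywords", [])}
    if not q or not k:
        return 0.0
    return len(q & k) / len(q | k)


def score(query: str, meta: dict, today: date):
    """Evaluate one rule: IF relevant AND recent AND complete THEN rank high.

    AND is modeled with min, a common fuzzy conjunction.
    Returns the rule strength and the individual membership degrees.
    """
    modified = meta.get("modified")
    degrees = {
        "recent": recency(modified, today) if modified else 0.0,
        "relevant": thematic_similarity(query, meta),
        "complete": completeness(meta),
    }
    return min(degrees.values()), degrees


def explain(degrees: dict) -> str:
    """Name the criterion that limits the score, as a basic explanation."""
    name = min(degrees, key=degrees.get)
    return f"limited by '{name}' ({degrees[name]:.2f})"


if __name__ == "__main__":
    # Made-up example record; not real portal data.
    example = {
        "title": "Population by canton",
        "keywords": ["population"],
        "modified": date(2023, 1, 2),
    }
    strength, degrees = score("population statistics", example, date(2026, 1, 1))
    print(strength, degrees, explain(degrees))
```

Because each membership degree is kept alongside the final score, the same values that drive the ranking can be shown to the user, which is one simple way to support the transparency goal stated above.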
Required Skills: Python, JavaScript, Django framework, HTML, CSS, React.js, theoretical modeling, conceptual framework development, methodological design, data analysis and data processing, information retrieval fundamentals, basic knowledge of machine learning or artificial intelligence, familiarity with APIs and web data sources, basic understanding of fuzzy logic or rule-based systems. Experience with libraries such as pandas, scikit-learn, or scikit-fuzzy is beneficial but not strictly required.
Keywords: Fuzzy Logic, Information Retrieval, Human-Centered Information Retrieval, Explainable AI, Open Government Data, Dataset Discoverability, Metadata Ranking, CKAN, Data Portals
Document: Not yet available