Projects
Current Projects
InterVisions – Participatory AI for Intersectional Bias Auditing (2025–2026)
Funding: EU – CERV | Grant ID: 101214711 | Budget: €245,417.34
Coordinated by: ALIA – Associació Cultural de Dones per a la Recerca i l’Acció
Participants:
– Centre de Visió per Computador (CVC-UAB), Research Organisation
– Diputació de Barcelona, Associated Partner
Goal:
InterVisions aims to build a participatory bias audit tool for vision and language foundation models. It integrates intersectional feminist theory, deep learning, and participatory AI practices to identify and mitigate social biases in large-scale multimodal AI systems.
Activities:
– Community-driven workshops to audit foundation models
– Co-creation of a technical fairness benchmark
– Development of intersectional impact assessment guidelines
– Promotion of ethical AI practices in line with the EU Charter of Fundamental Rights
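The audit workshops and fairness benchmark listed above revolve around disaggregated, intersectional evaluation: measuring model performance separately for each combination of protected attributes rather than for each attribute in isolation. As a minimal illustration (not the project's actual tooling; the function name and attribute labels are hypothetical), the core computation can be sketched as:

```python
import numpy as np

def intersectional_disparity(correct, group_a, group_b):
    """Worst-case accuracy gap across intersectional subgroups.

    correct  : boolean array, per-sample model correctness
    group_a  : categorical labels for one protected attribute (e.g. gender)
    group_b  : categorical labels for a second attribute (e.g. age band)
    Returns max(subgroup accuracy) - min(subgroup accuracy) over all
    non-empty intersections of the two attributes.
    """
    accs = []
    for a in np.unique(group_a):
        for b in np.unique(group_b):
            mask = (group_a == a) & (group_b == b)
            if mask.any():  # skip empty intersections
                accs.append(correct[mask].mean())
    return max(accs) - min(accs)
```

A disparity of 0 means all intersectional subgroups are served equally well; auditing per intersection can surface gaps that single-attribute breakdowns hide, which is the methodological point of the project.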
Keywords: Bias in AI, Ethical AI, Participatory AI, Intersectionality, Fairness Benchmark, Vision & Language Models
Project Website: TBD
FairCLIP – Training a Fair CLIP Model with Hybrid Real and Synthetic Data (2024–2025)
Funding: EuroHPC AI & Data-Intensive Applications Access Call | Grant ID: EHPC-AI-2024A02-040 | Resources: 32,000 node hours on MareNostrum5
Coordinated by: Universitat Autònoma de Barcelona / Computer Vision Center (Spain)
Team:
– Dr. Lluis Gomez (PI) • Dr. Lei Kang • Dr. Mohamed Ali Souibgui • Mr. Francesc Net • Mr. Joan Masoliver • Dr. Sonia Ruiz • Prof. Yuki M. Asano (University of Amsterdam)
Objective:
The FairCLIP project aims to mitigate bias in large-scale vision-language models by training a new CLIP model on a hybrid dataset combining real and synthetic data, ensuring balanced demographic representation. The project contributes to fairness in AI with both technical and ethical innovations.
Key Methods:
– Synthetic data generation via state-of-the-art diffusion models
– Real data from the CommonPool dataset
– OpenCLIP framework for scalable training
– Contrastive learning with demographic control
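The last two methods combine CLIP's standard symmetric contrastive objective with demographically balanced batch composition. As a minimal sketch (not the project's implementation; the balanced-sampler grouping and function names are assumptions for illustration), the idea looks like:

```python
import numpy as np

def clip_contrastive_loss(img_emb, txt_emb, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired embeddings.

    Matched image/text pairs sit on the diagonal of the similarity
    matrix; the loss is the mean cross-entropy in both directions.
    """
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = img @ txt.T / temperature
    n = logits.shape[0]

    def xent(l):
        l = l - l.max(axis=1, keepdims=True)          # numerical stability
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -logp[np.arange(n), np.arange(n)].mean()

    return 0.5 * (xent(logits) + xent(logits.T))      # image->text + text->image

def balanced_batch(indices_by_group, batch_size, rng):
    """Draw equally many samples from each demographic group
    (hypothetical grouping), so every batch is balanced by construction."""
    groups = list(indices_by_group)
    per_group = batch_size // len(groups)
    batch = []
    for g in groups:
        batch.extend(rng.choice(indices_by_group[g], per_group, replace=False))
    return np.array(batch)
```

Balancing at the batch level controls the demographic mix seen by the contrastive objective at every step, which is one common way to operationalize "contrastive learning with demographic control" without altering the loss itself.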
Milestones:
– Small-scale (12M samples), medium-scale (128M), and large-scale (400M) experiments
– Total: 32,000 node hours over 12 months (Aug 2024–Jul 2025)
Expected Outcomes:
– A fairness-optimized CLIP model
– A reusable hybrid dataset
– Open-source technical deliverables
Keywords: Fair AI, CLIP, Synthetic Data, Bias Mitigation, Diffusion Models, Vision-Language Models, HPC
Code: FairCLIP GitHub Repository
COELI-IA – From Text to Media: A Paradigm Shift in Cultural Heritage Management (2023–2025)
Funding: INNOTEC R+D Grant (Catalonia) | Grant ID: RDECR20/EMT/1791/2021 | Budget: €195,530.02
Coordinated by: Nubilum SL (SME)
Research Partner: Centre de Visió per Computador (CVC), Universitat Autònoma de Barcelona
Objective:
COELI-IA aims to revolutionize the management and dissemination of cultural heritage content by leveraging AI techniques. The project explores automatic classification, indexing, and enhanced accessibility for digital archives through multimodal models that can understand and connect text and media data.
Key Innovations:
– Development of AI-driven cultural heritage content engines
– New interfaces and recommendation systems based on content relevance
– Fine-tuning of AI models for domain-specific archives
Funding Structure:
– Total accepted budget: €195,530.02
– CVC share: €84,446.05 (43.19%)
– Nubilum SL share: €111,083.98 (56.81%)
Team:
– Dr. Lluís Gómez (CVC Lead) • Pep Casals Pug (Nubilum Lead) • Marc Folia Campos (Nubilum) • Francesc Net Barnes (CVC research staff)
Keywords: Cultural Heritage, AI for Archives, Multimodal Indexing, Recommendation Systems, Computer Vision, NLP
More Info: Video • coeli.cat • cvc.uab.cat
Past Projects
ReadQA – Reading systems for Visual Question Answering
Funded by: Ministerio de Ciencia e Innovación (PID2020-116298GB-I00) • €89,419
Period: Jan 2021 – Dec 2023 • PIs: Lluis Gomez & Dimosthenis Karatzas
Aimed to improve scene-text-based VQA systems using advanced multimodal models.
BeARS – Beyond Automatic Reading Systems
Funded by: AGAUR (Catalan University and Research Agency) • €97,000
Period: 2020–2021 • PIs: M. Russiñol & Lluis Gomez
Focused on broadening the capabilities of reading systems beyond OCR.
DeepPhotoArchive
Funded by: TECNIOspring PLUS / H2020 MSCA / ACCIÓ • €113,339
Period: 2018–2020 • PI: Lluis Gomez
Applied deep learning to build semantic search engines for photo archives.
READS – Reading the Scene
Funded by: Ministerio de Economía, Industria y Competitividad (TIN2017-89779P) • €81,554
Period: 2018–2020 • PIs: D. Karatzas & E. Valveny
Core research on text-in-scene interpretation and representation.
Semantic Search in Digital Newspaper Libraries
Funded by: Fundación BBVA • €74,526
Period: 2018–2019 • PI: M. Russiñol • Role: Core Researcher
Developed multimodal search tools for historical digital newspapers.
RAW – Reading in the Wild
Funded by: Ministerio de Economía y Competitividad (TIN2014-52072P) • €109,021
Period: 2015–2017 • PI: D. Karatzas • Role: Core Researcher
Addressed robust scene text understanding in unconstrained environments.
Text and the City – Human-Centred Scene Text Understanding
Funded by: Ministerio de Ciencia e Innovación (TIN2011-24631) • €78,045
Period: 2012–2014 • PI: D. Karatzas • Role: Core Researcher
Explored user-centric models for text interpretation in urban imagery.
Knowledge Extraction from Document Images with Heterogeneous Contents
Funded by: Ministerio de Ciencia e Innovación (TIN2009-14633-C03-03) • €195,000
Period: Jan 2010 – Aug 2013 • PI: J. Lladós • Role: Core Researcher
Investigated document image understanding for structured and unstructured content.
HuPerText – Human Perception Inspired Text Technologies
Funded by: Ministerio de Ciencia e Innovación (TIN2008-04998) • €49,610
Period: 2009–2011 • PI: D. Karatzas • Role: Core Researcher
Focused on perceptually motivated scene text modeling and reading.