About

I’m a Senior Research Fellow (Ramón y Cajal RYC2020-030777-I) at the Universitat Autonoma de Barcelona. My research interests span a variety of topics in machine learning and computer vision. Currently I work on embedding models, multimodal self-supervised learning, joint modelling of textual and visual information, and single shot CNN architectures for scene text understanding.

Between 2018 and 2020 I had a TECNIOspring Research Fellow position (H2020 Marie Skłodowska-Curie actions of the European Union) at the Computer Vision Center (CVC). I have done research stays at the Media Integration and Communication Center (MICC) - University of Florence, and the Intelligent Media Processing Group - Osaka Prefecture University, Japan. I have also collaborated with other prominent research groups in the organization of the ICDAR Robust Reading Competitions.

Curriculum vitae

email: lgomez {AT} cvc.uab.es
tel: +34 93 581 18 28
address: Edifici O, Campus UAB - 08193 Bellaterra (Cerdanyola). Barcelona, Spain.

Research




Single shot CNN architectures for scene text understanding

End-to-end scene text recognition pipelines are commonly based on a two-stage approach: first a text localization algorithm is applied to the input image, and then the text present in the cropped bounding boxes provided by the detector is recognized.

In this project we study the use of single shot CNN architectures for scene text understanding tasks. For example, a Fully Convolutional Network that, given a scene image, is able to predict at the same time bounding boxes and a compact text representation of the words within them; or a CNN architecture for text spotting that is able to directly output readings without any explicit text localization step, and can be trained in an end-to-end manner.
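As a rough illustration of what a compact text representation of a word can look like, the following minimal sketch builds a simplified Pyramidal Histogram of Characters (PHOC). The choice of PHOC, the alphabet, the pyramid levels, and the function name `phoc` are assumptions made for this example; they are not details stated in the description above.

```python
from fractions import Fraction

# Hypothetical alphabet for the sketch: lowercase letters and digits.
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789"


def phoc(word, levels=(1, 2, 3)):
    """Return a simplified binary PHOC vector for `word`.

    At pyramid level L the word is split into L equal regions. A character
    is assigned to a region when at least half of the character's extent
    falls inside that region. Each region contributes one binary
    bag-of-characters histogram over ALPHABET.
    """
    word = word.lower()
    n = len(word)
    vector = []
    for level in levels:
        for region in range(level):
            r_start = Fraction(region, level)
            r_end = Fraction(region + 1, level)
            bins = [0] * len(ALPHABET)
            for i, ch in enumerate(word):
                idx = ALPHABET.find(ch)
                if idx < 0:
                    continue  # characters outside the alphabet are ignored
                c_start = Fraction(i, n)
                c_end = Fraction(i + 1, n)
                overlap = min(c_end, r_end) - max(c_start, r_start)
                if overlap * 2 >= (c_end - c_start):
                    bins[idx] = 1
            vector.extend(bins)
    return vector
```

A fixed-length vector of this kind can be regressed by a network for every predicted bounding box and compared against the vector of a query word, which is one way to retrieve text without an explicit recognition step.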

  • Andrés Mafla, Rubèn Tito, Sounak Dey, Lluís Gómez, Marçal Rusiñol, Ernest Valveny, and Dimosthenis Karatzas (2021). "Real-time lexicon-free scene text retrieval." Pattern Recognition. [PDF]
  • Lluis Gomez, Andres Mafla, Marçal Rusiñol, & Dimosthenis Karatzas. (2018). "Single Shot Scene Text Retrieval." European Conference on Computer Vision (ECCV). [PDF] [CODE]
  • Lluis Gomez, Marçal Rusiñol & Dimosthenis Karatzas. (2018). "Cutting Sayre's Knot: Reading Scene Text without Segmentation. Application to Utility Meters". In 13th IAPR Workshop on Document Analysis Systems (DAS). [PDF]



  • Deep-embeddings of images into text topic spaces

    Topic modeling frameworks, such as the Latent Dirichlet Allocation (LDA) algorithm, are statistical models for discovering the latent topics that occur in a corpus of textual documents. This way, each individual text document can be represented as a probability distribution over the set of discovered topics, and thus can be projected to a point in a topic space.

    Our research puts forward the idea of embedding images into text topic spaces by mining a large-scale collection of multi-modal (text and image) documents. To do so we first learn a topic model on the text corpus of a dataset composed of pairs of correlated texts and images. Then, we train a deep CNN model to predict text representations (topic probabilities) directly from the image pixels. In other words, the learned topic model teaches the CNN to predict the semantic context of images.

    This deep-embedding framework can be used to perform different tasks, such as self-supervised learning of visual features, multi-modal image retrieval, or even to generate contextualized lexicons for scene text recognition.
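The training signal described above can be sketched as a loss between the topic distribution of an article's text and the distribution the CNN predicts from the paired image. This is a minimal sketch only: the softmax output, the cross-entropy loss, and the names `softmax` and `soft_cross_entropy` are assumptions for illustration, and the actual loss used in the papers may differ.

```python
import math


def softmax(logits):
    """Convert raw network outputs into a probability distribution."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soft_cross_entropy(logits, target_topics):
    """Cross-entropy between a target topic distribution and the prediction.

    `target_topics` stands for the topic probabilities given by the topic
    model to the text paired with the image; `logits` stands for the raw
    CNN outputs for that image (one value per topic).
    """
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(target_topics, probs) if t > 0)
```

The loss is lowest when the predicted distribution matches the topic distribution of the text, so no manual image labels are needed: the text acts as the supervisory signal.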

  • Lluis Gomez, Yash Patel, Marçal Rusiñol, Dimosthenis Karatzas, & C.V. Jawahar. (2017). "Self-supervised learning of visual features through embedding images into text topic spaces." In Proc. International Conference on Computer Vision and Pattern Recognition, CVPR 2017. [PDF] [CODE]
  • Yash Patel, Lluis Gomez, Marçal Rusiñol, & Dimosthenis Karatzas. (2016). "Dynamic Lexicon Generation for Natural Scene Images." In Proc. 2nd International Workshop on Robust Reading, ECCV Workshops 2016. [PDF]
  • Raul Gomez, Lluis Gomez, Jaume Gibert, & Dimosthenis Karatzas. (2018). "Learning to Learn from Web Data through Deep Semantic Embeddings". In 1st Multimodal Learning and Applications Workshop, ECCV Workshops 2018.
  • Raul Gomez, Lluis Gomez, Jaume Gibert, & Dimosthenis Karatzas. (2018). "Learning from #Barcelona Instagram data". In 1st Multimodal Learning and Applications Workshop, ECCV Workshops 2018.

  • Scene text detection with Fully Convolutional Networks

    Text Proposals have emerged as a class-dependent version of object proposals: efficient approaches to reduce the search space of possible text object locations and extents in an image. Combined with strong word classifiers, text proposals currently yield state-of-the-art results in end-to-end scene text recognition. In this paper we propose an improvement over the original Text Proposals algorithm (Gomez and Karatzas 2015), combining it with Fully Convolutional Networks to improve the ranking of proposals. Results on the ICDAR RRC and the COCO-Text datasets show superior performance over the current state of the art.
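One simple way to picture FCN-guided ranking is to score each proposal by the average text probability that a pixel-level heatmap assigns inside its box. The sketch below assumes exactly that scoring rule; the rule itself, the box format `(x0, y0, x1, y1)` with exclusive right and bottom edges, and the function names are hypothetical and are not taken from the papers.

```python
def mean_text_probability(heatmap, box):
    """Average per-pixel text probability inside `box`.

    `heatmap` is a list of rows with values in [0, 1], standing in for the
    output of a fully convolutional text/non-text network.
    """
    x0, y0, x1, y1 = box
    values = [heatmap[y][x] for y in range(y0, y1) for x in range(x0, x1)]
    return sum(values) / len(values) if values else 0.0


def rerank(proposals, heatmap):
    """Sort proposals so that boxes covering likely text come first."""
    scored = [(mean_text_probability(heatmap, box), box) for box in proposals]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [box for _, box in scored]
```

With a better ranking, fewer proposals need to be passed to the word classifier to reach the same recall, and low-scoring boxes can be pruned.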

  • Dena Bazazian, Raul Gomez, Anguelos Nicolaou, Lluis Gomez, Dimosthenis Karatzas, & Andrew D. Bagdanov. (2017). "FAST: Facilitated and Accurate Scene Text Proposals through FCN Guided Pruning." Pattern Recognition Letters.
  • Dena Bazazian, Raul Gomez, Anguelos Nicolaou, Lluis Gomez, Dimosthenis Karatzas, & Andrew D. Bagdanov. (2016). "Improving Text Proposals for Scene Images with Fully Convolutional Networks." In Proc. 1st International Workshop on Deep Learning for Pattern Recognition, ICPR Workshops 2016. [PDF]

  • Patch-based scene text script identification

    This work focuses on the problem of script identification in scene text images. Facing this problem with state-of-the-art CNN classifiers is not straightforward, as they fail to address a key characteristic of scene text instances: their extremely variable aspect ratio. Instead of resizing input images to a fixed aspect ratio as in the typical use of holistic CNN classifiers, we propose here a patch-based classification framework in order to preserve discriminative parts of the image that are characteristic of its class.

    We describe a novel method based on the use of ensembles of conjoined networks to jointly learn discriminative stroke-part representations and their relative importance in a patch-based classification scheme. Our experiments with this learning procedure demonstrate state-of-the-art results on two public script identification datasets.
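The patch-based scheme can be sketched in two steps: cut a wide word image into square patches without changing its aspect ratio, then combine per-patch class scores using per-patch importance weights. Everything below is an illustrative assumption (the stride, the square patch shape, the weighted-sum rule, and the function names); in the actual method both the patch representations and their importance are learned by the network ensemble.

```python
def extract_patches(width, height, stride_ratio=0.5):
    """Return (x_start, x_end) column ranges of square patches.

    Patches have side `height` and slide horizontally, so the image is
    never squeezed into a fixed aspect ratio.
    """
    side = height
    if width <= side:
        return [(0, width)]
    stride = max(1, int(side * stride_ratio))
    starts = list(range(0, width - side + 1, stride))
    if starts[-1] != width - side:
        starts.append(width - side)  # make sure the right edge is covered
    return [(x, x + side) for x in starts]


def classify_image(patch_scores, patch_weights):
    """Pick the class with the highest weighted sum of patch scores."""
    n_classes = len(patch_scores[0])
    totals = [0.0] * n_classes
    for scores, weight in zip(patch_scores, patch_weights):
        for c, s in enumerate(scores):
            totals[c] += weight * s
    return max(range(n_classes), key=totals.__getitem__)
```

The weights let a single highly discriminative patch outvote several ambiguous ones, which is the intuition behind learning the relative importance of stroke parts.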

  • Lluis Gomez, Anguelos Nicolaou, & Dimosthenis Karatzas. (2017). "Improving Patch-based Scene Text Script Identification with Ensembles of Conjoined Networks." Pattern Recognition. [PDF | CODE]
  • Lluis Gomez, & Dimosthenis Karatzas. (2016). "A Fine Grained Classification Approach to Scene Text Script Identification." In Proc. Document Analysis Systems (DAS), 2016 12th IAPR International Workshop on. IEEE, 2016. [PDF | CODE]

  • Exploiting Similarity Hierarchies for Multi-script Scene Text Understanding

    Optical Character Recognition (OCR) is nowadays considered a solved problem when a clean, binarized and well-formatted input image, with text in a standard font and language, is provided. In contrast, the automated localization, extraction and recognition of "scene text" in uncontrolled environments is still an open Computer Vision problem. At the core of the problem lies the extensive variability of scene text in terms of its location, rotation, physical appearance and design.

    Scene text extraction methodologies have traditionally been based on the classification of individual regions or patches, using a priori knowledge for a given script or language. Human perception of text, on the other hand, is based on perceptual organisation, through which text emerges as a perceptually significant group of atomic objects. Therefore humans are able to detect text even in languages and scripts never seen before. My research revolves around these ideas and poses the text extraction problem as the detection of meaningful groups of regions. I'm working on a text detection method built around a perceptual organisation framework that exploits the collaboration of proximity and similarity laws to create text-group hypotheses.
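A toy version of grouping regions by proximity and similarity is sketched below. It links two regions when a combined distance is small and returns the connected groups (single-linkage grouping). The region format `(cx, cy, height, intensity)`, the distance formula, the threshold, and the function names are all hypothetical choices for this example and do not reproduce the published method.

```python
import math


def region_distance(a, b):
    """Combine a proximity term and a similarity term for two regions.

    Each region is a tuple (cx, cy, height, intensity) with intensity
    in the 0-255 range.
    """
    mean_height = (a[2] + b[2]) / 2.0
    proximity = math.hypot(a[0] - b[0], a[1] - b[1]) / mean_height
    size_diff = abs(a[2] - b[2]) / max(a[2], b[2])
    intensity_diff = abs(a[3] - b[3]) / 255.0
    return proximity + size_diff + intensity_diff


def group_regions(regions, max_distance):
    """Return groups of region indices linked by small pairwise distances."""
    parent = list(range(len(regions)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(regions)):
        for j in range(i + 1, len(regions)):
            if region_distance(regions[i], regions[j]) <= max_distance:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(regions)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

Because the grouping relies only on geometric and appearance cues, and not on a character classifier, the same procedure applies to any script.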

  • Lluis Gomez, & Dimosthenis Karatzas. (2017). "TextProposals: a Text-specific Selective Search Algorithm for Word Spotting in the Wild." Pattern Recognition. [PDF | CODE]
  • Lluis Gomez, & Dimosthenis Karatzas. (2016). "A Fast Hierarchical Method for Multi-script and Arbitrary Oriented Scene Text Extraction." International Journal on Document Analysis and Recognition. [PDF]
  • Lluis Gomez, & Dimosthenis Karatzas. (2015). "Object Proposals for Text Extraction in the Wild." In Proc. 13th International Conference on Document Analysis and Recognition. [PDF | CODE]
  • Lluis Gomez, & Dimosthenis Karatzas. (2014). "MSER-based Real-Time Text Detection and Tracking." In Proc. 22nd International Conference on Pattern Recognition (pp. 3110–3115). [PDF]
  • Lluis Gomez, & Dimosthenis Karatzas. (2014). "Scene Text Recognition: No Country for Old Men?" In Proc. 1st International Workshop on Robust Reading, ACCV Workshops. [CODE]
  • Lluis Gomez, & Dimosthenis Karatzas. (2013). "Multi-script Text Extraction from Natural Scenes." In Proc. 12th International Conference on Document Analysis and Recognition (pp. 467–471). [PDF | CODE]
  • Publications

    2023
    Khanh Nguyen, Ali Furkan Biten, Andres Mafla, Lluis Gomez and Dimosthenis Karatzas (2023). "Show, Interpret and Tell: Entity-aware Contextualised Image Captioning in Wikipedia" 37th AAAI Conference on Artificial Intelligence (AAAI-23).
    Mohamed Ali Souibgui, Sanket Biswas, Andres Mafla, Ali Furkan Biten, Alicia Fornés, Yousri Kessentini, Josep Lladós, Lluis Gomez and Dimosthenis Karatzas (2023). "Text-DIAE: A Self-Supervised Degradation Invariant Autoencoders for Text Recognition and Document Enhancement" 37th AAAI Conference on Artificial Intelligence (AAAI-23).
    Francesc Net, Marc Folia, Pep Casals-Puig and Lluis Gomez (2023). "Transductive Learning for Near-Duplicate Image Detection in Scanned Photo Collections." International Conference on Document Analysis and Recognition (ICDAR 2023).
    2022
    Adria Molina, Lluis Gomez, Oriol Ramos Terrades and Josep Lladós (2022). "A Generic Image Retrieval Method for Date Estimation of Historical Document Collections." IAPR Workshop on Document Analysis Systems (DAS).
    Emanuele Vivoli, Ali Furkan Biten, Andres Mafla, Dimosthenis Karatzas and Lluis Gomez (2022). "MUST-VQA: MUltilingual Scene-Text VQA." European Conference on Computer Vision (ECCV 2022) Workshops.
    Josep Brugués i Pujolràs, Lluis Gomez and Dimosthenis Karatzas (2022). "A Multilingual Approach to Scene Text Visual Question Answering." IAPR Workshop on Document Analysis Systems (DAS).
    Ali Furkan Biten, Ruben Tito, Lluis Gomez, Ernest Valveny and Dimosthenis Karatzas (2022). "OCR-IDL: OCR Annotations for Industry Document Library Dataset." European Conference on Computer Vision (ECCV 2022) Workshops.
    Ali Furkan Biten, Andres Mafla, Lluis Gómez and Dimosthenis Karatzas (2022). "Is An Image Worth Five Sentences? A New Look into Semantics for Image-Text Matching." Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV).
    Ali Furkan Biten, Lluis Gómez and Dimosthenis Karatzas (2022). "Let there be a clock on the beach: Reducing Object Hallucination in Image Captioning." Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV).
    Mohamed Ali Souibgui, Ali Furkan Biten, Sounak Dey, Alicia Fornés, Yousri Kessentini, Lluis Gomez, Dimosthenis Karatzas and Josep Lladós (2022). "One-shot Compositional Data Generation for Low Resource Handwritten Text Recognition." Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV).
    2021
    Lluis Gómez, Ali F. Biten, Ruben Tito, Andres Mafla, Marçal Rusiñol, Ernest Valveny, and Dimosthenis Karatzas (2021). "Multimodal grid features and cell pointers for scene text visual question answering." Pattern Recognition Letters (2021).
    Minesh Mathew, Lluis Gomez, Dimosthenis Karatzas, C V Jawahar (2021). "Asking Questions on Handwritten Document Collections." Int. Journal on Document Analysis and Recognition (IJDAR 2021).
    Adrià Molina, Pau Riba, Lluis Gomez, Oriol Ramos-Terrades, and Josep Lladós (2021). "Date Estimation in the Wild of Scanned Historical Photos: An Image Retrieval Approach." International Conference on Document Analysis and Recognition (ICDAR 2021).
    Pau Riba, Adrià Molina, Lluis Gomez, Oriol Ramos-Terrades, and Josep Lladós (2021). "Learning to Rank Words: Optimizing Ranking Metrics for Word Spotting." International Conference on Document Analysis and Recognition (ICDAR 2021).
    Andrés Mafla, Rubèn Tito, Sounak Dey, Lluís Gómez, Marçal Rusiñol, Ernest Valveny, and Dimosthenis Karatzas (2021). "Real-time lexicon-free scene text retrieval." Pattern Recognition 110 (2021).
    Andrés Mafla, Rafael S. Rezende, Lluís Gómez, Diane Larlus, and Dimosthenis Karatzas (2021). "StacMR: Scene-Text Aware Cross-Modal Retrieval" Winter Conference on Applications of Computer Vision (WACV 2021).
    Andrés Mafla, Sounak Dey, Ali Furkan Biten, Lluis Gomez, and Dimosthenis Karatzas (2021). "Multi-modal reasoning graph for scene-text based fine-grained image classification and retrieval." Winter Conference on Applications of Computer Vision (WACV 2021).
    2020
    Klara Janouskova, Jiri Matas, Lluis Gomez, and Dimosthenis Karatzas (2020). "Text recognition-real world data and where to find them." International Conference on Pattern Recognition (ICPR 2020).
    Raul Gomez, Jaume Gibert, Lluis Gomez and Dimosthenis Karatzas (2020). "Location Sensitive Image Retrieval and Tagging." European Conference on Computer Vision (ECCV 2020).
    Sangeeth Reddy, Minesh Mathew, Lluis Gomez, Marçal Rusiñol, Dimosthenis Karatzas and C. V. Jawahar (2020). "RoadText-1K: Text Detection & Recognition Dataset for Driving Videos." International Conference on Robotics and Automation (ICRA 2020).
    Andrés Mafla, Sounak Dey, Ali Furkan Biten, Lluis Gomez, Dimosthenis Karatzas (2020). "Fine-grained Image Classification and Retrieval by Combining Visual and Locally Pooled Textual Features." IEEE Winter Conference on Applications of Computer Vision (WACV 2020).
    Raul Gomez, Jaume Gibert, Lluis Gomez, Dimosthenis Karatzas (2020). "Exploring Hate Speech Detection in Multimodal Publications." IEEE Winter Conference on Applications of Computer Vision (WACV 2020).
    2019
    Ali Furkan Biten, Ruben Tito, Andres Mafla, Lluis Gomez, Marçal Rusiñol, Ernest Valveny, C.V. Jawahar, Dimosthenis Karatzas. (2019). "Scene Text Visual Question Answering." In Proc. International Conference on Computer Vision (ICCV).
    Marçal Rusiñol, Lluis Gomez, Adriaan Landman, Miguel Silva-Constenla, Dimosthenis Karatzas. (2019). "Automatic Structured Text Reading for License Plates and Utility Meters" In Proc. BMVC 2019 - Workshop on Visual AI and Entrepreneurship (VAIE).
    Yash Patel, Lluis Gomez, Marçal Rusiñol, Dimosthenis Karatzas, CV Jawahar. (2019). "Self-Supervised Visual Representations for Cross-Modal Retrieval." In Proc. of the 2019 on International Conference on Multimedia Retrieval (ICMR).
    Raul Gomez, Ali Furkan Biten, Lluis Gomez, Jaume Gibert, Marçal Rusiñol, Dimosthenis Karatzas. (2019). "Selective Style Transfer for Text." In Proc. of the 15th International Conference on Document Analysis and Recognition (ICDAR).
    Ali Furkan Biten, Rubèn Tito, Andres Mafla, Lluis Gomez, Marçal Rusiñol, Minesh Mathew, CV Jawahar, Ernest Valveny, Dimosthenis Karatzas. (2019). "ICDAR 2019 Competition on Scene Text Visual Question Answering." In Proc. of the 15th International Conference on Document Analysis and Recognition (ICDAR).
    Ali Furkan Biten, Lluis Gomez, Marçal Rusiñol, Dimosthenis Karatzas. (2019). "Good News, Everyone! Context driven entity-aware captioning for news images." In Proc. of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
    Raul Gomez, Lluis Gomez, Jaume Gibert, Dimosthenis Karatzas. (2019). "Self-Supervised Learning from Web Data for Multimodal Retrieval" Book Chapter. In Multimodal Scene Understanding 1st Edition: Algorithms, Applications and Deep Learning (2019). Eds: M. Y. Yang, B. Rosenhahn, V. Murino. ISBN: 9780128173596. Elsevier / Academic Press.
    2018
    Lluis Gomez, Andres Mafla, Marçal Rusiñol, & Dimosthenis Karatzas. (2018). "Single Shot Scene Text Retrieval." European Conference on Computer Vision (ECCV).
    Raul Gomez, Lluis Gomez, Jaume Gibert, & Dimosthenis Karatzas. (2018). "Learning to Learn from Web Data through Deep Semantic Embeddings". In 1st Multimodal Learning and Applications Workshop, ECCV Workshops 2018.
    Raul Gomez, Lluis Gomez, Jaume Gibert, & Dimosthenis Karatzas. (2018). "Learning from #Barcelona Instagram data". In 1st Multimodal Learning and Applications Workshop, ECCV Workshops 2018.
    Lluis Gomez, Marçal Rusiñol & Dimosthenis Karatzas. (2018). "Cutting Sayre's Knot: Reading Scene Text without Segmentation. Application to Utility Meters". In 13th IAPR Workshop on Document Analysis Systems (DAS).
    Dimosthenis Karatzas, Lluis Gomez, Marçal Rusiñol & Anguelos Nicolaou. (2018). "The Robust Reading Competition Annotation and Evaluation Platform". In 13th IAPR Workshop on Document Analysis Systems (DAS).
    2017
    Lluis Gomez, Marçal Rusiñol, & Dimosthenis Karatzas. (2017). "LSDE: Levenshtein Space Deep Embedding for Query-by-string Word Spotting." In 14th International Conference on Document Analysis and Recognition (ICDAR).
    Raul Gomez, Baoguang Shi, Lluis Gomez, Lukas Neumann, Andreas Veit, Jiri Matas, Serge Belongie, & Dimosthenis Karatzas. (2017). "ICDAR2017 Robust Reading Challenge on COCO-Text." In 14th International Conference on Document Analysis and Recognition (ICDAR).
    Masakazu Iwamura, Naoyuki Morimoto, Keishi Tainaka, Dena Bazazian, Lluis Gomez, & Dimosthenis Karatzas. (2017). "ICDAR2017 Robust Reading Challenge on Omnidirectional Video." In 14th International Conference on Document Analysis and Recognition (ICDAR).
    Dimosthenis Karatzas, Lluis Gomez, & Marçal Rusiñol. (2017). "The Robust Reading Competition Annotation and Evaluation Platform." In 1st International Workshop on Open Services and Tools for Document Analysis.
    Dena Bazazian, Raul Gomez, Anguelos Nicolaou, Lluis Gomez, Dimosthenis Karatzas, & Andrew D. Bagdanov. (2017). "FAST: Facilitated and Accurate Scene Text Proposals through FCN Guided Pruning." Pattern Recognition Letters.
    Lluis Gomez, Yash Patel, Marçal Rusiñol, Dimosthenis Karatzas, & C.V. Jawahar. (2017). "Self-supervised learning of visual features through embedding images into text topic spaces." IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2017.
    Lluis Gomez, & Dimosthenis Karatzas. (2017). "TextProposals: a Text-specific Selective Search Algorithm for Word Spotting in the Wild." Pattern Recognition.
    Lluis Gomez, Anguelos Nicolaou, & Dimosthenis Karatzas. (2017). "Improving Patch-based Scene Text Script Identification with Ensembles of Conjoined Networks." Pattern Recognition.
    2016
    Lluis Gomez, & Dimosthenis Karatzas. (2016). "A Fast Hierarchical Method for Multi-script and Arbitrary Oriented Scene Text Extraction." International Journal on Document Analysis and Recognition.
    Dena Bazazian, Raul Gomez, Anguelos Nicolaou, Lluis Gomez, Dimosthenis Karatzas, & Andrew D. Bagdanov. (2016). "Improving Text Proposals for Scene Images with Fully Convolutional Networks." In Proc. 1st International Workshop on Deep Learning for Pattern Recognition, ICPR Workshops 2016.
    Yash Patel, Lluis Gomez, Marçal Rusiñol, & Dimosthenis Karatzas. (2016). "Dynamic Lexicon Generation for Natural Scene Images." In Proc. 2nd International Workshop on Robust Reading, ECCV Workshops 2016.
    Lluis Gomez, & Dimosthenis Karatzas. (2016). "A Fine Grained Classification Approach to Scene Text Script Identification." In Proc. Document Analysis Systems (DAS), 2016 12th IAPR International Workshop on. IEEE, 2016.
    Anguelos Nicolaou, Lluis Gomez, & Dimosthenis Karatzas. (2016). "Visual Script and Language Identification." In Proc. Document Analysis Systems (DAS), 2016 12th IAPR International Workshop on. IEEE, 2016.
    2015
    Lluis Gomez, & Dimosthenis Karatzas. (2015). "Object Proposals for Text Extraction in the Wild." In Proc. 13th International Conference on Document Analysis and Recognition.
    Dimosthenis Karatzas, Lluis Gomez, A. Nicolaou, Suman Ghosh, Andrew Bagdanov, Masakazu Iwamura, et al. (2015). "ICDAR 2015 Competition on Robust Reading." In Proc. 13th International Conference on Document Analysis and Recognition (pp. 1156–1160).
    Suman Ghosh, Lluis Gomez, Dimosthenis Karatzas, & Ernest Valveny. (2015). "Efficient indexing for Query By String text retrieval." In Proc. 6th IAPR International Workshop on Camera Based Document Analysis and Recognition (pp. 1236–1240).
    2014
    Dimosthenis Karatzas, Sergi Robles, & Lluis Gomez. (2014). "An on-line platform for ground truthing and performance evaluation of text extraction systems." In Proc. 11th IAPR International Workshop on Document Analysis and Systems (pp. 242–246).
    Lluis Gomez, & Dimosthenis Karatzas. (2014). "MSER-based Real-Time Text Detection and Tracking." In Proc. 22nd International Conference on Pattern Recognition (pp. 3110–3115).
    Lluis Gomez, & Dimosthenis Karatzas. (2014). "Scene Text Recognition: No Country for Old Men?" In Proc. 1st International Workshop on Robust Reading, ACCV Workshops.
    2013
    Dimosthenis Karatzas, Faisal Shafait, Seiichi Uchida, Masakazu Iwamura, Lluis Gomez, Sergi Robles, et al. (2013). "ICDAR 2013 Robust Reading Competition." In Proc. 12th International Conference on Document Analysis and Recognition (pp. 1484–1493).
    Lluis Gomez, & Dimosthenis Karatzas. (2013). "Multi-script Text Extraction from Natural Scenes." In Proc. 12th International Conference on Document Analysis and Recognition (pp. 467–471).

    Code

    If you are looking for code implementations of my research papers, you can find them in my GitHub repositories.

    I am an enthusiast of Free Software in general and an advocate of the GNU/Linux project. I have contributed to several Open Source projects, e.g. the Pure Data visual programming language, the OpenCV Computer Vision library, the Giss open broadcast platform, and the FreeJ video mixer.

    I have been selected by the OpenCV Foundation for participation in the Google Summer of Code program as a developer (2013 and 2014 editions) and as a mentor (2016).

    Teaching

    2018/19: 104389 - Object-Oriented Programming (B.Sc. Computational Mathematics and Data Analytics), Assistant Professor at Universitat Autonoma de Barcelona (UAB).

    2018/19: 104338 - Advanced Programming (B.Sc. Data Engineering), Assistant Professor at Universitat Autonoma de Barcelona (UAB).

    2018/19: M3 - Machine Learning for Computer Vision (MSc. Computer Vision), Invited Lecturer at Universitat Autonoma de Barcelona (UAB).

    2018/19: M5 - Visual Recognition (MSc. Computer Vision), Assistant Professor at Universitat Autonoma de Barcelona (UAB).

    2018/19: Artificial Intelligence (B.S. Interactive Digital Content), Lecturer at ENTI (School of New Interactive Technologies) - Universitat de Barcelona (UB).

    2017/18: M3 - Machine Learning for Computer Vision (MSc. Computer Vision), Invited Lecturer at Universitat Autonoma de Barcelona (UAB).

    2017/18: M5 - Visual Recognition (MSc. Computer Vision), Assistant Professor at Universitat Autonoma de Barcelona (UAB).

    2017/18: Artificial Intelligence (B.S. Interactive Digital Content), Lecturer at ENTI (School of New Interactive Technologies) - Universitat de Barcelona (UB).

    2016/17: M5 - Visual Recognition (MSc. Computer Vision), Assistant Professor at Universitat Autonoma de Barcelona (UAB).

    2014/15: 43340 - Pattern Recognition (MSc. Computer Engineering), Assistant Professor at Universitat Autonoma de Barcelona (UAB).

    2011/12: Computer Vision (MSc. Visual Arts), Invited Lecturer at Universidad Politécnica de Valencia (UPV).

    Others

    Reviewer for CVPR, ICCV, ECCV, ICLR, International Journal on Computer Vision, Transactions on Pattern Analysis and Machine Intelligence, Computer Vision and Image Understanding, International Journal on Document Analysis and Recognition, Neurocomputing, Transactions on Image Processing, Packt Publishing.
    Workshop Chair, International Workshop on Camera Based Document Analysis (2019).
    Workshop Chair, International Workshop on Robust Reading (2018).
    Programme Committee member, Document Analysis Systems (2018).
    Programme Committee member, First Workshop on Computer Vision for Fashion, Art and Design (2018).
    Area Chair, International Conference on Document Analysis and Recognition (2017).
    Workshop Chair, International Workshop on Camera Based Document Analysis (2017).
    Programme Committee member, International Workshop on Robust Reading (2014, 2016).
    Programme Committee member, Workshop on Camera Based Document Analysis (2013, 2015).
    Member, Computer Vision Foundation, 2017-Present.
    Member, International Association of Pattern Recognition, 2013–Present.

    I have developed part of my career at the crossroads between Arts and Computer Science. Between 2005 and 2010 I held an engineer position at Hangar, a Visual Arts Production and Research Center in Barcelona, where I was responsible for the area of Software Development in the MediaLab, directly involved in free software development for art projects. During that time I had the opportunity to work with amazingly creative people like Antoni Abad, Daniel G. Andújar, Ricardo Iglesias, Shu Lea Cheang, Salud Lopez, Straddle3, Simona Levi, Xavi Manzanares, Hackitectura, Denis Roio (a.k.a. Jaromil), minipimer, Oscar Martin, PlayModes, Ramiro Cosentino, Pedro Soler, Sergi Lario, Telenoika, Yves Degoyon, among many others.

    I love music. I play the guitar and recently I started to learn piano in my spare time.

    I love the mountains and nature in general; when I have time I like to visit Montserrat, the Pyrenees, or the Costa Brava.