OPUS 4 | Search

The DECIMER (Deep lEarning for Chemical IMagE Recognition) project (2022)

Rajan, Kohulan ; Brinkhaus, Henning Otto ; Zielesny, Achim ; Steinbeck, Christoph

Advancements in Hand-Drawn Chemical Structure Recognition through an Enhanced DECIMER Architecture (2024)

Rajan, Kohulan ; Brinkhaus, Henning Otto ; Zielesny, Achim ; Steinbeck, Christoph

Accurate recognition of hand-drawn chemical structures is crucial for digitising hand-written chemical information found in traditional laboratory notebooks or for facilitating stylus-based structure entry on tablets or smartphones. However, the inherent variability in hand-drawn structures poses challenges for existing Optical Chemical Structure Recognition (OCSR) software. To address this, we present an enhanced Deep lEarning for Chemical ImagE Recognition (DECIMER) architecture that leverages a combination of Convolutional Neural Networks (CNNs) and Transformers to improve the recognition of hand-drawn chemical structures. The model incorporates an EfficientNetV2 CNN encoder that extracts features from hand-drawn images, followed by a Transformer decoder that converts the extracted features into Simplified Molecular Input Line Entry System (SMILES) strings. Our models were trained using synthetic hand-drawn images generated by RanDepict, a tool for depicting chemical structures with different style elements. To evaluate the model's performance, a benchmark was performed using a real-world dataset of hand-drawn chemical structures. The results indicate that our improved DECIMER architecture exhibits a significantly enhanced recognition accuracy compared to other approaches.

Molecule Set Comparator (MSC): a CDK-based open rich‐client tool for molecule set similarity evaluations (2021)

Rajan, Kohulan ; Hein, Jan-Mathis ; Steinbeck, Christoph ; Zielesny, Achim

Performance of chemical structure string representations for chemical image recognition using transformers (2021)

Rajan, Kohulan ; Steinbeck, Christoph ; Zielesny, Achim

Performance of chemical structure string representations for chemical image recognition using transformers (2022)

Rajan, Kohulan ; Steinbeck, Christoph ; Zielesny, Achim

The use of molecular string representations for deep learning in chemistry has been steadily increasing in recent years. The complexity of existing string representations, and the difficulty in creating meaningful tokens from them, have led to the development of new string representations for chemical structures. In this study, the translation of chemical structure depictions in the form of bitmap images to corresponding molecular string representations was examined. An analysis of the recently developed DeepSMILES and SELFIES representations in comparison with the most commonly used SMILES representation is presented, in which the ability to translate image features into string representations with transformer models was specifically tested. The SMILES representation exhibits the best overall performance, whereas SELFIES guarantees valid chemical structures. DeepSMILES performs in between SMILES and SELFIES, and InChIs are not appropriate for the learning task. All investigations were performed using publicly available datasets, and the code used to train and evaluate the models has been made available to the public.
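To illustrate what "creating meaningful tokens" from a string representation can involve, the following is a minimal sketch of a regex-based SMILES tokenizer. It is an assumption for illustration only: the abstract does not say how the authors tokenized their strings, and the names `SMILES_TOKEN_PATTERN` and `tokenize_smiles` are hypothetical. The pattern follows a scheme commonly used in the cheminformatics literature, in which bracket atoms, two-letter halogens, and two-digit ring closures are each kept as a single token.

```python
import re

# Hypothetical tokenizer, not taken from the paper.
# Alternation order matters: bracket atoms first, then "Br"/"Cl" before
# the single-letter "B"/"C", then the remaining atoms, bonds, and digits.
SMILES_TOKEN_PATTERN = re.compile(
    r"(\[[^\]]+\]|Br?|Cl?|N|O|S|P|F|I|b|c|n|o|s|p"
    r"|\(|\)|\.|=|#|-|\+|\\|/|:|~|@|\?|>|\*|\$|%[0-9]{2}|[0-9])"
)


def tokenize_smiles(smiles: str) -> list[str]:
    """Split a SMILES string into tokens; fail if any character is skipped."""
    tokens = SMILES_TOKEN_PATTERN.findall(smiles)
    # findall silently skips unmatched characters, so verify full coverage.
    if "".join(tokens) != smiles:
        raise ValueError(f"Unrecognised characters in SMILES: {smiles!r}")
    return tokens


if __name__ == "__main__":
    print(tokenize_smiles("CC(=O)Cl"))   # acetyl chloride
    print(tokenize_smiles("c1ccccc1"))   # benzene
    print(tokenize_smiles("[NH4+]"))     # ammonium
```

A character-level split would break `Cl` into `C` and `l` and scatter a bracket atom such as `[NH4+]` across six tokens, which is one reason token design influences how well a sequence model can learn a representation.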

DECIMER: towards deep learning for chemical image recognition (2020)

Rajan, Kohulan ; Zielesny, Achim ; Steinbeck, Christoph

DECIMER 1.0: deep learning for chemical image recognition using transformers (2021)

Rajan, Kohulan ; Zielesny, Achim ; Steinbeck, Christoph

DECIMER 1.0: Deep Learning for Chemical Image Recognition using Transformers (2021)

Rajan, Kohulan ; Zielesny, Achim ; Steinbeck, Christoph

STOUT: SMILES to IUPAC names using neural machine translation (2021)

Rajan, Kohulan ; Zielesny, Achim ; Steinbeck, Christoph


22 results