MOESM1 of Database fingerprint (DFP): an approach to represent molecular databases

Fernández-de Gortari, Eli; García-Jacas, César; Martinez-Mayorga, Karina; Medina-Franco, José

doi:10.6084/m9.figshare.c.3697138_D1.v1

13321_2017_195_MOESM1_ESM.pdf (1008.46 kB)

journal contribution

Posted on 2017-02-06, 05:00; authored by Eli Fernández-de Gortari, César García-Jacas, Karina Martinez-Mayorga, José Medina-Franco

Additional file 1: Table S1. DFPs of representative data sets used in this work. Table S2. Inter-set relationships computed with the newly developed database fingerprint using the DFP/Tanimoto coefficient. Fig. S1. Distributions of MACCS keys (166 bits) of selected data sets studied in this work (others are shown in the main text). Fig. S2. Visual representation of the distance matrix comparing inter-set relationships of the compound data sets computed with the database fingerprint (DFP) and city block distance. Fig. S3. Relationship between inverse normalized city block distance and Tanimoto similarity using the DFP. Fig. S4. Inter-set relationships of the compound data sets computed with MACCS keys and the Tanimoto coefficient. Fig. S5. Relationship between mean similarities computed with MACCS keys and DFP. Fig. S6. Relationship between Shannon entropy and DFP/Tanimoto similarity and k-means Euclidean clustering for the ten compound data sets in Table 2 at a threshold of 0.6. Fig. S7. Probability distribution of the 198 significant bit positions recovered from the original databases represented by the PubChem fingerprint at a threshold of 0.6. Fig. S8. Relationship between Shannon entropy and DFP/Tanimoto similarity and k-means Euclidean clustering for the ten compound data sets in Table 2 at a threshold of 0.7. Fig. S9. Probability distribution of the 198 significant bit positions recovered from the original databases represented by the PubChem fingerprint at a threshold of 0.7.

Funding

Universidad Nacional Autónoma de México

Keywords

Diversity; Information content; Molecular fingerprints; Similarity; Shannon entropy

Licence

CC BY + CC0
