Springer Nature
13321_2017_195_MOESM1_ESM.pdf (1008.46 kB)

MOESM1 of Database fingerprint (DFP): an approach to represent molecular databases

journal contribution
posted on 2017-02-06, 05:00 authored by Eli Fernández-de Gortari, César García-Jacas, Karina Martinez-Mayorga, José Medina-Franco
Additional file 1:
Table S1. DFPs of representative data sets used in this work.
Table S2. Inter-set relationships computed with the newly developed database fingerprint (DFP) and the Tanimoto coefficient.
Fig. S1. Distributions of MACCS keys (166 bits) for selected data sets studied in this work (others are shown in the main text).
Fig. S2. Visual representation of the distance matrix comparing inter-set relationships of the compound data sets, computed with the database fingerprint (DFP) and city block distance.
Fig. S3. Relationship between inverse normalized city block distance and Tanimoto similarity using the DFP.
Fig. S4. Inter-set relationships of the compound data sets computed with MACCS keys and the Tanimoto coefficient.
Fig. S5. Relationship between mean similarities computed with MACCS keys and DFP.
Fig. S6. Relationship between Shannon entropy and DFP/Tanimoto similarity, and k-means Euclidean clustering, for the ten compound data sets in Table 2 at a threshold of 0.6.
Fig. S7. Probability distribution of the 198 significant bit positions recovered from the original databases represented by the PubChem fingerprint at a threshold of 0.6.
Fig. S8. Relationship between Shannon entropy and DFP/Tanimoto similarity, and k-means Euclidean clustering, for the ten compound data sets in Table 2 at a threshold of 0.7.
Fig. S9. Probability distribution of the 198 significant bit positions recovered from the original databases represented by the PubChem fingerprint at a threshold of 0.7.

Funding

Universidad Nacional Autónoma de México
