BrCa: Results Summary (Target Selection Process)

Target Selection Process

This section focuses on the molecular targets on which molecules active against BrCa models in the ChEMBL database have a significant impact. We approach the question from two angles: empirically (i.e., the impact recorded in ChEMBL) and predictively, where predictions from machine learning algorithms are used for target identification. The two sets of results are then compared to determine whether predictive analysis over the whole dataset adds any novelty to the empirical annotations. Targets involved in killing particular BrCa cell lines are also determined by hierarchical clustering.

BRCA2 protein. By Filip em – self created from PDB entry with KiNG tool http://www.rcsb.org/pdb/cgi/explore.cgi?pdbId=1n0w, CC BY 2.0, https://commons.wikimedia.org/w/index.php?curid=3237307

Global Identification Process

The target identification process starts with compounds screened in breast cancer phenotypic assays. The results for these compounds in molecular target-based assays are retrieved, and those with an active ChEMBL score are selected and pivoted to yield one row of raw data per target. Each row contains the ratio between the number of events for that target that are also active in BrCa assays and the total number of events for that target (countRatio). CountRatio is therefore the proportion of times that a molecule with a positive hit on a particular protein is also active in BrCa assays: 1 if always, 0 if never. In the chart below, this parameter is plotted against the total number of events per target (count(chemblActivityScore)) as a measure of confidence. Average potencies per target in non-BrCa molecular assays (Avg(chemblActivityScore)) and in BrCa assays (Avg(BrCaScore)) are depicted as color and size gradients, respectively. As an example, targets with countRatio > 0.65 and total events > 1 are marked in the chart, that is, targets where more than 65% of molecular events with a positive chemblActivityScore also have active events in BrCa phenotypic assays. This results in the identification of 61 putative BrCa targets.
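As a rough illustration of the countRatio calculation described above, the following Python sketch pivots events by target and applies the example thresholds. The event tuples, the boolean activity flags, the target labels and the function names are assumptions made for this example; they do not reflect the actual pipeline or the ChEMBL schema.

```python
from collections import defaultdict

def count_ratio_by_target(events):
    """Pivot events by target.

    events: iterable of (target, active_on_target, active_in_brca) tuples,
    a hypothetical layout where both flags are booleans.
    Returns {target: (countRatio, total_events)}.
    """
    totals = defaultdict(int)
    brca_active = defaultdict(int)
    for target, active_on_target, active_in_brca in events:
        if not active_on_target:
            continue  # only positive hits on the molecular target are counted
        totals[target] += 1
        if active_in_brca:
            brca_active[target] += 1
    return {t: (brca_active[t] / n, n) for t, n in totals.items()}

def select_targets(ratios, min_ratio=0.65, min_events=1):
    """Keep targets with countRatio > min_ratio and more than min_events events."""
    return sorted(t for t, (ratio, n) in ratios.items()
                  if ratio > min_ratio and n > min_events)

# Made-up events, for illustration only.
events = [
    ("target_A", True, True), ("target_A", True, True), ("target_A", True, False),
    ("target_B", True, True), ("target_B", True, False),
    ("target_B", True, False), ("target_B", True, False),
    ("target_C", True, True),
    ("target_A", False, True),  # inactive on the target, so it is ignored
]
ratios = count_ratio_by_target(events)
selected = select_targets(ratios)
print(selected)
```

In this toy data, target_C has a countRatio of 1 but only a single event, so the confidence filter on total events excludes it.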

 

This selection is carried out on a database of 170k events in molecular assays. However, the predictions from the BrCa prediction-of-activity section can be used to look for potential targets among 13M events, the entire database content. In this case, the random forest regression output is used as an indicator of BrCaScore. Using predictions with this larger target dataset increases the chances of identifying unforeseen targets.

 

It appears that some of the targets identified from the actual data are present in the plot from predicted data, but few of the targets from the prediction are in the BrCa compound dataset. The results are plotted below in bar charts, sorted by potency.

From actual data…

 

From predicted data…

 

In this small dashboard, targets marked on the left graph (actual) are displayed on the right one (prediction). Although most of the targets are also identified in the right chart, there is an evident shift to lower countRatio values.

In this second dashboard, targets identified from the predicted BrCa score are on the left, and marking them displays them in the right plot. Only the two most frequent appear in the plot of actual targets. Should we trust the value of the prediction?
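The comparison made in the two dashboards can also be expressed as a set overlap. The sketch below is only an illustration: the function name is hypothetical, and the short target lists are picked for the example rather than being the full selections.

```python
def compare_target_sets(actual, predicted):
    """Summarize the overlap between two target selections."""
    actual, predicted = set(actual), set(predicted)
    shared = actual & predicted
    union = actual | predicted
    return {
        "shared": sorted(shared),
        "actual_only": sorted(actual - predicted),
        "predicted_only": sorted(predicted - actual),
        # Jaccard index: size of the intersection over size of the union
        "jaccard": len(shared) / len(union) if union else 0.0,
    }

# Illustrative subsets only, not the complete selections.
actual_targets = ["Tubulin", "Histone deacetylase 1", "26S proteosome"]
predicted_targets = ["26S proteosome", "PI3-kinase class I", "Arginase-1"]

overlap = compare_target_sets(actual_targets, predicted_targets)
print(overlap["shared"], overlap["jaccard"])
```

A low Jaccard index between the two selections would be the numerical counterpart of the observation that few predicted targets appear among the actual ones.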

Breast cancer is one of the research subjects with the largest amount of published information, so the table below shows how the predicted targets with an average score > 6 fare in the literature.
proteinName | Avg(chemblScore) | countRatio | Organism(s) | Events | Ref
Serine/threonine protein phosphatase 2A- 56 kDa regulatory subunit- alpha isoform | 10.013333 | 1 | Homo sapiens | 3 | https://www.ncbi.nlm.nih.gov/pubmed/14534748
Pituitary adenylate cyclase-activating polypeptide | 9.76 | 1 | Homo sapiens | 2 | http://cancerres.aacrjournals.org/content/canres/56/15/3486.full.pdf
Biotin--acetyl-CoA-carboxylase ligase | 9.329 | 0.76923076923077 | Mycobacterium tuberculosis (strain ATCC 25618 / H37Rv) | 13 | https://www.nature.com/articles/1205915.pdf?origin=ppub
HIV-1 protease | 8.975 | 0.86363636363636 | Human immunodeficiency virus | 44 | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2682221/
Peptidyl-prolyl cis-trans isomerase F- mitochondrial | 8.43 | 1 | Homo sapiens | 5 | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2912272/
Serotonin receptor 2a and 2b (5HT2A and 5HT2B) | 8.29 | 0.66666666666667 | Rattus norvegicus | 12 | http://ar.iiarjournals.org/content/33/2/363.full
Arginyl-tRNA synthetase | 8.235 | 1 | Homo sapiens | 2 | http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0040960
Hypoxia-inducible factor prolyl hydroxylase 1 | 8.07 | 0.56626506024096 | Homo sapiens | 83 | https://www.ncbi.nlm.nih.gov/pubmed/28038470
Kininogen-1 | 8.06 | 0.62096774193548 | Homo sapiens | 124 | https://www.ncbi.nlm.nih.gov/pubmed/27324523
Casein kinase II alpha/beta | 8 | 0.75 | Homo sapiens | 16 | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2916697/
Tubulin beta-3 chain | 7.90 | 1 | Homo sapiens | 9 | https://www.ncbi.nlm.nih.gov/pubmed/17285590
Alpha-1A adrenergic receptor | 7.87 | 0.71428571428571 | Cavia porcellus, Sus scrofa | 28 | https://www.nature.com/articles/4500561
Fibronectin | 7.8 | 0.75 | Homo sapiens | 4 | https://www.ncbi.nlm.nih.gov/pubmed/24117661
Ileal sodium/bile acid cotransporter | 7.85 | 0.64285714285714 | Mus musculus, Rattus norvegicus | 14 | http://cebp.aacrjournals.org/content/10/9/931
Proteasome subunit beta type-7 | 7.69 | 1 | Homo sapiens | 8 | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2816652/
Niemann-Pick C1-like 1 protein | 7.6 | 1 | Canis lupus familiaris, Macaca mulatta | 4 | http://www.sciencedirect.com/science/article/pii/S0002944010609601
Cytochrome b-245 light chain | 7.568 | 0.6 | Bos taurus | 5 |
Fe(3+)-pyochelin receptor | 7.498 | 0.83333333333333 | Pseudomonas aeruginosa (strain ATCC 15692 / PAO1 / 1C / PRS 101 / LMG12228) | 6 |
Integrin alpha1/beta1 complex | 7.48 | 1 | Homo sapiens | 4 | https://www.nature.com/articles/1204554
Lysophosphatidic acid receptor 1/lysophosphatidic acid receptor 3 | 7.48 | 1 | Rattus norvegicus | 8 | http://cancerres.aacrjournals.org/content/69/13/5441.short
Bis(5'-adenosyl)-triphosphatase (FHIT) | 7.46 | 0.6 | Homo sapiens | 5 | http://cancerres.aacrjournals.org/content/63/6/1183.short
Mu-opioid receptor | 7.44 | 0.7 | Cavia porcellus | 20 | http://onlinelibrary.wiley.com/doi/10.1002/(SICI)1097-4644(19990501)73:2%3C204::AID-JCB6%3E3.0.CO;2-V/full
Serine/threonine protein phosphatase PP1-gamma catalytic subunit | 7.296 | 0.74 | Homo sapiens | 50 | http://cancerres.aacrjournals.org/content/63/22/7777.short
Subtilisin/kexin type 6 | 7.28 | 0.62790697674419 | Homo sapiens | 43 | https://link.springer.com/article/10.1007/s13277-012-0630-x
CDK9/Cyclin K | 7.127 | 0.66666666666667 | Homo sapiens | 12 | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5636177/
Appetite-regulating hormone | 7.094 | 0.7027027027027 | Homo sapiens | 74 | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5220482/
Prostate specific antigen | 7.050 | 0.59493670886076 | Homo sapiens | 79 | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2033864/
Toxin ParE | 7.041 | 0.56140350877193 | Escherichia coli | 57 |
Placenta growth factor | 6.95 | 0.82539682539683 | Homo sapiens, Mus musculus | 63 | http://www.sciencedirect.com/science/article/pii/S0959804905007938
Glucose-6-phosphate translocase | 6.75 | 0.57142857142857 | Homo sapiens | 7 | https://www.ncbi.nlm.nih.gov/pubmed/19894109/
Arachidonate 5-lipoxygenase-activating protein | 6.748 | 0.83333333333333 | Mus musculus, Rattus norvegicus | 6 | https://academic.oup.com/abbs/article/45/9/709/1146
Prostasin | 6.70782 | 0.60869565217391 | Homo sapiens | 23 | http://onlinelibrary.wiley.com/doi/10.1002/ijc.1601/full
Thymidylate synthase (EC 2.1.1.45) (TS) (TSase) | 6.701 | 0.75 | Escherichia coli | 8 | https://www.researchgate.net/publication/313662677_TYMS_Dysregulation_is_a_Molecular_Switch_and_Tumorigenic_Driver_in_Early_Breast_Cancer
Secreted frizzled-related protein 1 | 6.64 | 0.59162303664922 | Homo sapiens, Mus musculus, Rattus norvegicus | 191 | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5316967/
1-3-beta-glucan synthase component GLS2 | 6.635 | 0.6551724137931 | Saccharomyces cerevisiae S288c | 29 | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3977804/
GP41 | 6.562 | 0.575 | Human immunodeficiency virus 1 | 40 |
Plasmepsin IV | 6.36 | 0.66666666666667 | Plasmodium falciparum 3D7 | 3 |
PI3-kinase class I | 6.3 | 0.65675990675991 | Homo sapiens | 27456 | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5237710/
Esterase D | 6.332 | 0.57142857142857 | Sus scrofa | 7 | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC386498/
26S proteosome | 6.32 | 0.61596298438404 | Homo sapiens | 32851 | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3242858/
Myelin transcription factor 1 | 6.28 | 0.57142857142857 | Homo sapiens | 7 | https://academic.oup.com/jnen/article/56/7/772/2610765
Arginase-2- mitochondrial | 6.210 | 1 | Homo sapiens | 49 | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3818427/
Arginase-1 | 6.1 | 0.84375 | Bos taurus, Homo sapiens, Rattus norvegicus | 64 | http://www.fasebj.org/content/31/1_Supplement/lb529.short
RocR | 6.121 | 0.66666666666667 | Pseudomonas aeruginosa (strain ATCC 15692 / PAO1 / 1C / PRS 101 / LMG12228) | 6 | Arginine utilization protein, as human arginases
Adenylate kinase 3 alpha like 1 | 6.083 | 0.6 | Rattus norvegicus | 5 | https://www.nature.com/articles/ncomms15308
Splicing factor 3B subunit 3 | 6.0 | 0.691811320754717 | Homo sapiens | 53 | https://www.ncbi.nlm.nih.gov/pubmed/25431237
Tyrosine aminotransferase | 6.03 | 0.7037037037037 | Homo sapiens, Rattus norvegicus | 27 | http://onlinelibrary.wiley.com/doi/10.1002/hep.23540/pdf
S-adenosylmethionine synthetase alpha and beta forms | 6.018 | 0.54545454545454 | Rattus norvegicus | 11 | https://link.springer.com/article/10.1007/BF01806157

Most of the predicted targets have a literature reference that relates them to breast cancer biomarkers or therapies. The ones without a reference belong to non-human species, such as E. coli, Pseudomonas, HIV or Plasmodium. The final target selection therefore includes proteins selected by both their actual and predicted activity scores. The plot below shows the selection criterion for targets (countRatio > 0.5 and a minimum number of activity events, predicted or actual).

 

Here, the corresponding bar charts for the average potencies and countRatios with the best values:

 

 

 

And here, the table with the corresponding values:

Targets identified from predicted and actual BrCa scores.

proteinName | Avg(chemblScoreC) | Count(chemblScoreC) | countRatio | Organism(s) | proteinClassDescription | BrCaActivities
CDK6/cyclin D24.30102999566441Homo sapienscytosolic other, enzyme kinase protein kinase cmgcPrediction
DNA topoisomerase type IB small subunit4.3931Leishmania majorenzymeActual
DNA topoisomerase4.394823996531250.6Bos taurus, Yersinia pestisenzymeActual
Orexin receptor 24.39794000867221Homo sapiensmembrane receptor 7tm1 peptide short peptide orexin receptorActual
Histone deacetylase 54.409852857762370.57142857142857Homo sapiensepigenetic regulator eraser hdac hdac class iiaActual
Heat shock protein HSP904.43178612971692080.94230769230769Homo sapienscytosolic otherActual
Folylpoly-gamma-glutamate synthetase4.5006826451167190.68421052631579Homo sapiens, Mus musculusenzymeActual
MAP kinase p384.57059087075231600.6Homo sapiensenzyme kinase protein kinase cmgc mapk p38Actual
Type 1 fimbiral adhesin FimH4.613100.6Escherichia coli (strain UTI89 / UPEC)unclassifiedPrediction
Cytochrome P450 165B34.61521Amycolatopsis orientalisenzymePrediction
Histone deacetylase 104.62521Homo sapiensepigenetic regulator eraser hdac hdac class iibActual
Phosphate system positive regulatory protein PHO814.66541Saccharomyces cerevisiae S288cunclassifiedPrediction
Hexokinase type III4.679400378061331Rattus norvegicusenzymePrediction
Hexokinase type II4.685391699421450.6Rattus norvegicusenzymePrediction
Protection of telomeres protein 14.693333333333331Homo sapiensunclassifiedPrediction
Polypyrimidine tract-binding protein 14.721Homo sapiensunclassifiedActual
Glutathione synthetase4.717003614566690.77777777777778Loa loaunclassifiedPrediction
Plasminogen4.725164039065921Homo sapiensenzyme protease serine pas s1aActual
Protein farnesyltransferase beta subunit4.7415593203602240.79166666666667Homo sapiensenzymePrediction
Coagulation factor X/antithrombin III4.7996460756131960.58333333333333Homo sapiensenzyme protease serine pas s1a, secretedPrediction
Alpha trans-inducing protein (VP16)4.8371659980844340.58823529411765Herpes simplex virus (type 1 / strain 17)nuclear otherActual
Protein-arginine deiminase type-24.8631Homo sapiensenzymeActual
DNA topoisomerase 14.89448130412651Chlorocebus aethiops, Saccharomyces cerevisiae S288cenzymeActual
Histone deacetylase 84.9265463679975130.76923076923077Homo sapiensepigenetic regulator eraser hdac hdac class iActual
Pho80/Pho85/Pho81 complex4.94176470588241531Saccharomyces cerevisiae S288cenzyme kinase protein kinase cmgc cdk cdk5, unclassifiedPrediction
Multidrug resistance-associated protein 74.953556593886421Homo sapienstransporter ntpase atp binding cassette mrpActual
Adenosine deaminase4.982650343142370.57142857142857Bos taurus, Homo sapiensenzyme, enzyme hydrolaseActual
DNA topoisomerase I4.98614200709152260.85398230088496Homo sapiens, Leishmania donovani donovani, Mus musculusenzyme, enzyme isomerase, unclassifiedActual
DNA topoisomerase II4.99036303907791010.8019801980198Drosophila melanogaster, Homo sapiensenzyme, enzyme isomeraseActual
Uroporphyrinogen-III synthase5110.90909090909091Homo sapiensenzymePrediction
Candidapepsin-1530.66666666666667Candida albicansenzyme protease aspartic aa a1aPrediction
V-type proton ATPase subunit B- brain isoform521Homo sapiensenzyme hydrolase, transporter ntpase f-type and v-type v-type atpasePrediction
Complex of retinoic acid binding (CRABPII) and inhibitor of apoptosis (cIAP1) proteins5360.77777777777778Homo sapiensauxiliary transport protein fabp, enzymePrediction
Kallikrein 8521Homo sapiensenzyme protease serine pas s1aPrediction
Thioredoxin reductase 2521Homo sapiensenzymePrediction
Pho80/Pho85541Saccharomyces cerevisiae S288cenzyme kinase protein kinase cmgc cdk cdk5, unclassifiedPrediction
Chitinase-3-like protein 3541Mus musculusunclassifiedPrediction
Alpha-1-antiproteinase521Mus caroliunclassifiedPrediction
Pro-cathepsin H521Mus musculusenzymePrediction
Lipase590.55555555555556Thermomyces lanuginosusenzymePrediction
Toll-like receptor 4/MD-251201Homo sapiensmembrane receptor, surface antigenPrediction
Toll-like receptor 4/MD-2/CD14591Homo sapiensmembrane receptor, surface antigenPrediction
Proteasome subunit beta type-2521Mus musculusenzymePrediction
JNK1/JNK2541Mus musculusenzymePrediction
Heme oxygenase 1531Mus musculusenzymeActual
Mitogen-activated protein kinase; ERK1/ERK2580.5Homo sapiensenzyme kinase protein kinase cmgc mapk erkActual
ORAI 1/2/35.00761904761915670.76190476190476Homo sapiension channel other misc crac-cPrediction
Latent membrane protein 15.0106697555004160.5625Human herpesvirus 4 (strain B95-8)unclassifiedActual
Histone deacetylase5.0258640869132103640.59108452335006Homo sapiens, Plasmodium falciparum, Rattus norvegicusenzyme, epigenetic regulator eraser hdac hdac class i, epigenetic regulator eraser hdac hdac class iia, epigenetic regulator eraser hdac hdac class iib, epigenetic regulator eraser hdac hdac class ivActual
Serine protease SplB5.033333333333330.66666666666667Staphylococcus aureusenzymePrediction
von Willebrand factor5.039188911175220.63636363636364Homo sapiensunclassifiedPrediction
Integrin alpha M5.0441385047014120.75Homo sapiens, Mus musculusmembrane receptorPrediction
Heat shock factor protein 15.0550526765924300.6Homo sapiens, Mus musculusunclassifiedActual
Glyceraldehyde-3-phosphate dehydrogenase cytosolic5.0567857142857280.57142857142857Leishmania mexicanaenzymePrediction
Scavenger receptor type A5.0694024163614250.52Mus musculusmembrane receptorPrediction
Glucan synthase5.0835447546725191Candida albicansenzymePrediction
Nuclear receptor subfamily 0 group B member 15.1003520615228160.625Homo sapienstranscription factor nuclear receptor nr0 nr0b nr0b1Actual
Uridine-cytidine kinase 15.106246066231150.6Mus musculusenzymeActual
Interleukin-1 beta5.107142857142970.71428571428571Homo sapienssecretedPrediction
3-beta-hydroxysteroid dehydrogenase/delta 5-->4-isomerase type II5.1130.66666666666667Homo sapiensenzyme reductasePrediction
Glutamate dehydrogenase5.1192896051331110.63636363636364Bos taurusenzymePrediction
Leucyl-tRNA synthetase- cytoplasmic5.147541Saccharomyces cerevisiae S288cenzymePrediction
Protein disulfide-isomerase5.164163820031790.77777777777778Homo sapiensenzymeActual
Galectin-75.1875160.625Homo sapiensunclassifiedPrediction
Hepatitis C virus NS3 protease/helicase5.189167587983340.75Hepatitis C virusenzyme protease serine pas s29Actual
Mucosa-associated lymphoid tissue lymphoma translocation protein 15.213333333333330.66666666666667Homo sapiensenzyme hydrolaseActual
Tubulin beta chain5.22603696419051030.55339805825243Bos taurus, Sus scrofastructuralActual
Protein Wnt-3a5.266695489603690.55555555555556Mus musculusunclassifiedActual
Galectin-15.2686148439979370.64864864864865Homo sapienscytosolic otherPrediction
Heat shock protein HSP 90 (HSP82)5.28068666377631Saccharomyces cerevisiae S288ccytosolic otherActual
Hypoxanthine-guanine phosphoribosyltransferase5.32070949509550.6Homo sapiensenzymeActual
Ribonucleoside-diphosphate reductase M2 chain5.375164097171360.66666666666667Homo sapiens, Mus musculusenzymeActual
Serum amyloid P-component5.414645637304321Homo sapienssecretedPrediction
Caspase-55.4219904501308460.5Homo sapiensenzyme protease cysteine cd c14Prediction
Tubulin beta-1 chain5.448571428571470.85714285714286Homo sapiensstructuralActual
Nuclear factor NF-kappa-B p65 subunit5.4488205633045120.58333333333333Homo sapienstranscription factorActual
Tubulin5.4564145465435614090.88350241821231Bos taurus, Homo sapiens, Sus scrofastructuralActual
Heat shock protein HSP 90-beta5.4607941389923380.94736842105263Homo sapiens, Oryctolagus cuniculuscytosolic other, unclassifiedActual
NADP-dependent leukotriene B4 12-hydroxydehydrogenase5.465810198280621Rattus norvegicusenzymeActual
Heterogeneous nuclear ribonucleoprotein A15.5241Homo sapiensunclassifiedActual
Tubulin polymerization-promoting protein5.5621Bos taurusunclassifiedActual
Tubulin alpha chain5.58068428444311150.80869565217391Sus scrofastructuralActual
Prolactin receptor5.6121Homo sapiensunclassifiedPrediction
Proteasome Macropain subunit5.653333333333331Homo sapiensenzyme protease threonine pbt t1aActual
Histone deacetylase (HDAC1 and HDAC2)5.741Homo sapiensepigenetic regulator eraser hdac hdac class iActual
Sphingosine 1-phosphate receptor Edg-55.7421Homo sapiensmembrane receptor 7tm1 smallmol lipid-like ligand receptor edg receptorActual
Histone deacetylase 35.7411602165932160.75Homo sapiensepigenetic regulator eraser hdac hdac class iActual
Beta tubulin5.75551499783241Leishmania donovanistructuralActual
G-protein coupled receptor 555.783333333333330.66666666666667Homo sapiensmembrane receptor 7tm1 smallmol lipid-like ligand receptor lysophosphatidylinositol receptorActual
Ras-related protein Rap-1A5.7950.6Homo sapiensunclassifiedPrediction
Solute carrier organic anion transporter family member 1A35.807502733387631Rattus norvegicusunclassifiedActual
CREB-binding protein5.81875160.625Homo sapiensepigenetic regulator reader brd, epigenetic regulator writer hat p300 cbpActual
Small conductance calcium-activated potassium channel protein 35.836666666666760.66666666666667Homo sapiens, Rattus norvegicusion channel vgc k ca act kActual
Orexin receptor 15.84521Homo sapiensmembrane receptor 7tm1 peptide short peptide orexin receptorActual
Histone deacetylase 25.8451682435651190.78947368421053Homo sapiensepigenetic regulator eraser hdac hdac class iActual
ATP-binding cassette sub-family A member 15.8495454545455220.72727272727273Mus musculusunclassifiedPrediction
Major capsid protein L15.892857142857170.57142857142857Human papillomavirus type 16, Human papillomavirus type 58unclassifiedPrediction
Chymase5.892917866984330.66666666666667Homo sapiensenzyme protease serine pas s1aActual
Small conductance calcium-activated potassium channel5.898450.8Rattus norvegicusion channel vgc k ca act kActual
Adenylate kinase 25.9039390478148130.61538461538461Rattus norvegicusenzymePrediction
Heat shock protein HSP 90-alpha5.9333613952336800.5625Homo sapienscytosolic otherActual
Cannabinoid CB1 receptor/orexin receptor 1 complex5.9335384862971720.72222222222222Homo sapiensmembrane receptor 7tm1 peptide short peptide orexin receptor, membrane receptor 7tm1 smallmol lipid-like ligand receptor cannabinoid receptorPrediction
Muscarinic acetylcholine receptor M1/M5 chimeric protein5.93787559876361040.5Homo sapiensmembrane receptor 7tm1 smallmol monoamine receptor acetylcholine receptorPrediction
Nociceptin/mu opioid receptor5.95200.8Rattus norvegicusmembrane receptor 7tm1 peptide short peptide opioid receptorPrediction
Nuclear receptor coactivator 35.9765180062302110.63636363636364Homo sapiensepigenetic regulator writer hat srcActual
Tyrosine aminotransferase6.0339980979378270.7037037037037Homo sapiens, Rattus norvegicusenzymePrediction
Paired box protein Pax-86.0378473568484670.71641791044776Homo sapiensunclassifiedActual
Splicing factor 3B subunit 36.0816268533896530.69811320754717Homo sapiensunclassifiedPrediction
Adenylate kinase 3 alpha like 16.083057029074750.6Rattus norvegicusenzymePrediction
Proto-oncogene protein Wnt-36.08846413948311420.51408450704225Homo sapiensunclassifiedPrediction
Tubulin beta-2 chain6.095399215298921Homo sapiensstructuralActual
RocR6.121666666666760.66666666666667Pseudomonas aeruginosa (strain ATCC 15692 / PAO1 / 1C / PRS 101 / LMG12228)unclassifiedPrediction
Arginase-16.126875640.84375Bos taurus, Homo sapiens, Rattus norvegicusenzymePrediction
Histone deacetylase 16.1430918371049470.85106382978723Homo sapiens, Mus musculus, Rattus norvegicusenzyme, epigenetic regulator eraser hdac hdac class iActual
Lactoylglutathione lyase6.166100.5Saccharomyces cerevisiae S288cenzymePrediction
Arginase-2- mitochondrial6.2108163265306491Homo sapiensenzymePrediction
26S proteosome6.25172201Homo sapiensenzyme, enzyme protease threonine pbt t1a, unclassifiedActual
COUP transcription factor 26.289952038267760.83333333333333Homo sapienstranscription factor nuclear receptor nr2 nr2f nr2f2Actual
Esterase D6.332694453516270.57142857142857Sus scrofaenzymePrediction
PI3-kinase class I6.3521212121211274560.65675990675991Homo sapiensenzyme, unclassifiedPrediction
Plasmepsin IV6.3630.66666666666667Plasmodium falciparum 3D7unclassifiedPrediction
CDK2/Cyclin A16.3750775913315240.5Homo sapienscytosolic other, enzyme kinase protein kinase cmgc cdk cdc2Prediction
Leukocyte adhesion glycoprotein LFA-1 alpha6.3841Homo sapiensadhesion, membrane receptorActual
Proteasome component C56.3931Homo sapiensenzyme protease threonine pbt t1aActual
1-3-beta-glucan synthase6.5116666666667300.53333333333333Aspergillus, Candida, Candida glabrataenzyme, enzyme transferasePrediction
Proteasome subunit beta type-86.573605303151141Homo sapiensenzyme protease threonine pbt t1aActual
Lysine-specific demethylase 5A6.621Homo sapiensepigenetic regulator eraser kdm jumonji, epigenetic regulator reader phdActual
1-3-beta-glucan synthase component GLS26.6358975860574290.6551724137931Saccharomyces cerevisiae S288cenzymePrediction
Secreted frizzled-related protein 16.64352642094371910.59162303664922Homo sapiens, Mus musculus, Rattus norvegicusunclassifiedPrediction
Solute carrier organic anion transporter family member 4C16.670108201691621Homo sapienstransporter electrochemical slc slc21Actual
ATP-sensitive inward rectifier potassium channel 16.67508503506284540.53303964757709Homo sapiens, Rattus norvegicusion channel vgc k kir, unclassifiedPrediction
Hemagglutinin6.684444444444490.55555555555556Influenza A virus (A/Puerto Rico/8/1934(H1N1)), Influenza A virus (strain A/Aichi/2/1968 H3N2)unclassifiedPrediction
Dihydroorotate dehydrogenase6.702516485419761Homo sapiens, Plasmodium falciparum, Saccharomyces cerevisiae S288cenzyme, enzyme reductaseActual
Prostasin6.7078260869565230.60869565217391Homo sapiensenzyme protease serine pas s1aPrediction
Arachidonate 5-lipoxygenase-activating protein6.748333333333360.83333333333333Mus musculus, Rattus norvegicusunclassifiedPrediction
Glucose-6-phosphate translocase6.7570.57142857142857Homo sapienstransporter electrochemical slc slc37Prediction
Inositol phosphorylceramide synthase catalytic subunit AUR16.839100.5Saccharomyces cerevisiae S288cenzymePrediction
Sodium/potassium-transporting ATPase alpha-1 chain6.961Canis lupus familiarisenzyme hydrolase, transporter ntpase p-type atpase na k atpaseActual
Placenta growth factor6.9579365079365630.82539682539683Homo sapiens, Mus musculusunclassifiedPrediction
Hepatic lipase6.98749466476813580.5586592178771Homo sapiensenzymePrediction
Toxin ParE7.0414035087719570.56140350877193Escherichia coliunclassifiedPrediction
Dihydrofolate reductase7.04326027016424880.5922131147541Bacillus anthracis, Bos taurus, Candida albicans, Enterococcus faecium, Escherichia coli, Gallus gallus, Homo sapiens, Lactobacillus casei, Leishmania major, Mus musculus, Plasmodium falciparum K1, Pneumocystis carinii, Rattus norvegicus, Staphylococcus aureus, Toxoplasma gondiienzyme, enzyme reductase, unclassifiedActual
Prostate specific antigen7.0506517627305790.59493670886076Homo sapiensenzyme protease serine pas s1aPrediction
Appetite-regulating hormone7.0941068748776740.7027027027027Homo sapiensunclassifiedPrediction
Potassium-transporting ATPase alpha chain 27.121Homo sapiensenzyme hydrolase, transporter ntpase p-type atpase h k atpaseActual
CDK9/Cyclin K7.1273533304427120.66666666666667Homo sapiensenzyme kinase protein kinase cmgc cdk cdk9, unclassifiedPrediction
Heat shock protein 90 beta7.2121Canis lupus familiarismembrane otherActual
Baculoviral IAP repeat-containing protein 37.22051181714212270.55066079295154Homo sapiensenzymePrediction
Sodium/potassium-transporting ATPase7.2659601Homo sapiensenzyme hydrolase, ion channel other plm, transporter ntpase p-type atpase na k atpaseActual
Subtilisin/kexin type 67.2809302325581430.62790697674419Homo sapiensenzyme protease serine sb s8bPrediction
Mu-opioid receptor7.44200.7Cavia porcellusmembrane receptor 7tm1 peptide short peptide opioid receptorPrediction
Bis(5'-adenosyl)-triphosphatase7.4650.6Homo sapiensenzymePrediction
Integrin alpha-V/beta-37.4741Homo sapiensmembrane receptorActual
Endothelial lipase7.47053858557495770.49046793760832Homo sapiens, Mus musculus, Rattus norvegicusenzyme, enzyme hydrolasePrediction
Integrin alpha1/beta1 complex7.4841Homo sapiensmembrane receptorPrediction
Lysophosphatidic acid receptor 1/lysophosphatidic acid receptor 37.4881Rattus norvegicusmembrane receptor 7tm1 smallmol lipid-like ligand receptor edg receptorPrediction
Fe(3+)-pyochelin receptor7.498333333333360.83333333333333Pseudomonas aeruginosa (strain ATCC 15692 / PAO1 / 1C / PRS 101 / LMG12228)unclassifiedPrediction
Niemann-Pick C1-like 1 protein7.63541Canis lupus familiaris, Macaca mulattaunclassifiedPrediction
Proteasome subunit beta type-77.69897000433681Homo sapiensunclassifiedPrediction
Proteasome Macropain subunit MB17.7030826128131111Homo sapiensenzyme protease threonine pbt t1aActual
Proteinase-activated receptor 27.74541Homo sapiensmembrane receptor 7tm1 peptide protease-activated receptor, membrane receptor 7tm1 peptide protease-activated receptor protease-activated receptorActual
Ileal sodium/bile acid cotransporter7.8514732027989140.64285714285714Mus musculus, Rattus norvegicusunclassifiedPrediction
Fibronectin7.862540.75Homo sapiensadhesionPrediction
Alpha-1A adrenergic receptor7.8765613696552280.71428571428571Cavia porcellus, Sus scrofamembrane receptor 7tm1 smallmol monoamine receptor adrenergic receptorPrediction
NADH-ubiquinone oxidoreductase chain 17.97521Bos taurusenzymeActual
Casein kinase II alpha/beta8160.75Homo sapiensenzyme kinase protein kinase other ck2, enzyme kinase regPrediction
Kininogen-18.06629032258071240.62096774193548Homo sapiensunclassifiedPrediction
Tubulin beta-3 chain8.167541Homo sapiensstructuralActual
Arginyl-tRNA synthetase8.23521Homo sapiensenzymePrediction
Opioid receptors; delta & kappa8.26356613665931760.52272727272727Homo sapiensmembrane receptor 7tm1 peptide short peptide opioid receptorPrediction
Mitochondrial complex I; NADH oxidoreductase8.2941Bos taurusenzyme, unclassifiedActual
Serotonin receptor 2a and 2b (5HT2A and 5HT2B)8.2966666666667120.66666666666667Rattus norvegicusmembrane receptor 7tm1 smallmol monoamine receptor serotonin receptorPrediction
Peptidyl-prolyl cis-trans isomerase F- mitochondrial8.432369749923351Homo sapiensunclassifiedPrediction
Somatostatin receptor8.66251Homo sapiensmembrane receptor 7tm1 peptide short peptide somatostatin receptorActual
HIV-1 protease8.975440.86363636363636Human immunodeficiency virusunclassifiedPrediction
5-hydroxytryptamine receptor9.0969811320755530.54716981132076Cricetulus griseusmembrane receptor 7tm1 smallmol monoamine receptor serotonin receptorPrediction
Biotin--acetyl-CoA-carboxylase ligase9.3292307692308130.76923076923077Mycobacterium tuberculosis (strain ATCC 25618 / H37Rv)unclassifiedPrediction
Pituitary adenylate cyclase-activating polypeptide9.7621Homo sapiensunclassifiedPrediction
Serine/threonine protein phosphatase 2A- 56 kDa regulatory subunit- alpha isoform10.01333333333331Homo sapiensenzyme phosphatase protein phosphatase regPrediction
All targets for which more than 50% of the events with a positive ChEMBL score also inhibit the growth of BrCa cell lines, in either real or virtual experiments.

 

 

 

Target Selection Upon Specific Cell Lines.

So far we have used the average BrCa scores from experiments carried out with 8 BrCa cell lines, but it is well known that genotypic and phenotypic differences among them translate into differences in pharmacology. This may be relevant for the differential treatment of the tumors that the cell lines represent, and the procedure is applicable to clinical databases with extensive and up-to-date tumor treatment outcomes.

For this purpose, we do a simple classification of the tumors via hierarchical clustering performed on the global results of the BrCa compounds in the whole ChEMBL database, after removing all results related to tumors or phenotypic (non-molecular-target) assays. The chart below shows the results of the clustering with a focus on three cell lines (MCF7, MDA-MB-435, and MDA-MB-231). The cell line specific clusters are areas where the activity on one cell line is greatest (red) and minimal for the rest (yellow). Experiments with specific areas of activity are marked in red on the chart.
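To make the clustering step more concrete, here is a minimal agglomerative clustering sketch in plain Python. The text does not state which distance or linkage was used, so Euclidean distance and average linkage are assumptions, as are the function names and the made-up activity profiles (one score per cell line, in the order MCF7, MDA-MB-435, MDA-MB-231).

```python
import math

def euclidean(a, b):
    """Euclidean distance between two activity profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def average_linkage(c1, c2, profiles):
    """Mean pairwise distance between the members of two clusters."""
    dists = [euclidean(profiles[i], profiles[j]) for i in c1 for j in c2]
    return sum(dists) / len(dists)

def agglomerative(profiles, n_clusters):
    """Merge the two closest clusters until n_clusters remain.

    Returns a list of clusters, each a sorted list of row indices.
    """
    clusters = [[i] for i in range(len(profiles))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = average_linkage(clusters[a], clusters[b], profiles)
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]

# Hypothetical profiles: (MCF7, MDA-MB-435, MDA-MB-231) scores per experiment.
profiles = [
    [8.0, 3.0, 3.0],   # mostly active on MCF7
    [7.5, 3.2, 2.9],   # mostly active on MCF7
    [3.0, 8.0, 3.0],   # mostly active on MDA-MB-435
    [3.1, 7.8, 3.2],   # mostly active on MDA-MB-435
    [3.0, 3.0, 8.0],   # mostly active on MDA-MB-231
]
clusters = agglomerative(profiles, n_clusters=3)
print(clusters)
```

With these toy profiles, each resulting cluster groups the experiments that are most active on a single cell line, which is the pattern the red areas in the chart are meant to highlight.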

 

Once the results are collected, we process the cell line specific results in a way similar to the global target identification procedure. Results are pivoted by target, and the average potency in the specific cell line is calculated alongside the number of experimental data points (countChemblScore). This is compared to the average BrCaScore of the same experimental events, calculated from all BrCa cell lines. To facilitate selection and interpretation, a selectivity index between the specific cell line and the average BrCaScore is added to the plots.
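The text does not spell out a formula for the selectivity index, but the tabulated MCF7 values are consistent with a simple difference between the cell line score and the average BrCa score. The sketch below assumes that definition; the function names are hypothetical, and the three rows are taken from the MCF7 table.

```python
def selectivity_index(cell_line_score, avg_brca_score):
    """Assumed definition: cell line score minus the average BrCa score."""
    return cell_line_score - avg_brca_score

def rank_by_selectivity(rows):
    """rows: iterable of (target, avg_brca_score, cell_line_score) tuples.

    Returns (target, selectivity index) pairs, most selective first.
    """
    scored = [(name, selectivity_index(cell, avg)) for name, avg, cell in rows]
    return sorted(scored, key=lambda item: item[1], reverse=True)

# (target, Avg(BrCaScore), AvgMCF7Score) as listed in the MCF7 table
rows = [
    ("Dihydrofolate reductase", 3.9242332834135, 7.833),
    ("Monoamine oxidase A", 2.0, 6.146),
    ("Lysine-specific demethylase 4A", 4.3549999847922, 6.1392),
]
ranking = rank_by_selectivity(rows)
for name, index in ranking:
    print(f"{name}: {index:.4f}")
```

Because the scores are on a logarithmic potency scale, an index of 1 would correspond to roughly a tenfold difference in potency between the cell line and the BrCa average, assuming the scores are pIC50-like values.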

The charts below show the dashboards used for selection. On the left, the selectivity index is compared to the number of events in the database. Entries with the highest indexes are then marked, which plots them in the right chart, where the potency in each particular cell line is compared to the average BrCaScore alongside the y=x, y=x+1 and y=x-1 lines.

 

 

And the corresponding tables containing the selected targets:

| PROTEIN_NAME | Avg(BrCaScore) | Count(ChemblActivityScore) | AvgMCF7Score | selectivityIndex |
| --- | --- | --- | --- | --- |
| Dihydrofolate reductase | 3.9242332834135 | 7 | 7.833 | 3.9087667165865 |
| Dihydrofolate reductase | 5.8513333333333 | 15 | 7.833 | 1.9816666666667 |
| Dihydrofolate reductase | 5.6658823529412 | 17 | 7.833 | 2.1671176470588 |
| Thymidylate synthase | 4.2058770028771 | 13 | 7.833 | 3.6271229971229 |
| Thymidylate synthase | 5.0226639688564 | 22 | 7.833 | 2.8103360311436 |
| Thymidylate synthase | 4.4166666666667 | 6 | 7.2349166666667 | 2.81825 |
| GAR transformylase | 4.6822333069284 | 4 | 7.833 | 3.1507666930716 |
| Dihydrofolate reductase | 4.1387041229868 | 4 | 7.833 | 3.6942958770132 |
| 6-O-methylguanine-DNA methyltransferase | 3 | 1 | 6.019 | 3.019 |
| Thymidylate synthase | 4.7 | 1 | 7.833 | 3.133 |
| Dihydrofolate reductase | 3.6382721639824 | 1 | 7.833 | 4.1947278360176 |
| Acetylcholinesterase | 3.09087893653 | 11 | 5.8754615384615 | 2.7845826019316 |
| GAR transformylase | 6.6521464575467 | 6 | 7.833 | 1.1808535424533 |
| Folate transporter 1 | 5.7792182518111 | 5 | 7.833 | 2.0537817481889 |
| Arachidonate 5-lipoxygenase | 3.9958037874723 | 2 | 6.251 | 2.2551962125276 |
| Monoamine oxidase A | 2 | 1 | 6.146 | 4.146 |
| Monoamine oxidase B | 2 | 2 | 4.2445 | 2.2445 |
| Cytochrome P450 3A4 | 2.978947368421 | 19 | 6.14765 | 3.168702631579 |
| DNA topoisomerase II alpha | 4.61 | 2 | 6.436 | 1.826 |
| Indoleamine 2,3-dioxygenase | 3.8624693683042 | 2 | 5.6645 | 1.8020306316959 |
| Proton-coupled folate transporter | 6.0120986299462 | 12 | 7.833 | 1.8209013700538 |
| Folate receptor alpha | 5.034 | 5 | 7.833 | 2.799 |
| Folate receptor beta | 5.2733333333333 | 3 | 7.833 | 2.5596666666667 |
| Cytochrome P450 1A2 | 3.9766666666667 | 12 | 6.04671875 | 2.0700520833333 |
| Quinone reductase 2 | 3.77 | 2 | 6.48 | 2.71 |
| Menin/Histone-lysine N-methyltransferase MLL | 4.3625 | 16 | 6.85 | 2.4875 |
| Thyroid hormone receptor beta-1 | 4.2374999514771 | 4 | 7.2075 | 2.9700000485229 |
| Endoplasmic reticulum-associated amyloid beta-peptide-binding protein | 4.95 | 2 | 6.8235 | 1.8735 |
| Lysine-specific demethylase 4D-like | 4.5 | 2 | 6.8235 | 2.3235 |
| Signal transducer and activator of transcription 6 | 5.4666666666667 | 3 | 7.617 | 2.1503333333333 |
| Ras-related protein Rab-9A | 4.25 | 2 | 6.8185 | 2.5685 |
| Survival motor neuron protein | 2 | 8 | 5.997 | 3.997 |
| Pyruvate kinase | 3.525 | 2 | 7.617 | 4.092 |
| Pyruvate kinase isozymes M1/M2 | 3.425 | 2 | 7.617 | 4.192 |
| Aldehyde dehydrogenase 1A1 | 4.75 | 4 | 6.65875 | 1.90875 |
| Nonstructural protein 1 | 2.4664795276306 | 5 | 6.7508 | 4.2843204723694 |
| MAP kinase ERK2 | 2.2181818181818 | 11 | 5.8644545454546 | 3.6462727272727 |
| Caspase-1 | 2.3181818181818 | 11 | 5.8600909090909 | 3.5419090909091 |
| Neuropeptide S receptor | 5 | 2 | 6.8185 | 1.8185 |
| Cytochrome P450 2D6 | 2.8666666666667 | 15 | 5.9415666666667 | 3.0749 |
| Cytochrome P450 2C9 | 3.2 | 16 | 6.04628125 | 2.84628125 |
| Nitric oxide synthase, inducible | 2.1763001963315 | 11 | 5.7262727272727 | 3.5499725309412 |
| Matrix metalloproteinase 9 | 2.3226974474426 | 11 | 5.7262727272727 | 3.4035752798302 |
| Matrix metalloproteinase-1 | 2.3204329899232 | 11 | 5.7262727272727 | 3.4058397373496 |
| Beta-glucocerebrosidase | 4.4000001860566 | 1 | 7.617 | 3.2169998139434 |
| Lysine-specific demethylase 4A | 4.3549999847922 | 30 | 6.1392 | 1.7842000152078 |
| Beta-2 adrenergic receptor | 2.2227273160387 | 11 | 5.8644545454546 | 3.6417272294159 |
| Putative hexokinase HKDC1 | 5.45 | 2 | 8 | 2.55 |
| Tumor susceptibility gene 101 protein | 4.6933316382564 | 2 | 7.08 | 2.3866683617436 |
| Cytochrome P450 2C19 | 2.4285714285714 | 14 | 5.9393928571429 | 3.5108214285714 |
| Serine hydroxymethyltransferase, cytosolic | 4.4253555070293 | 4 | 7.833 | 3.4076444929707 |
| Thymidylate synthase (EC 2.1.1.45) (TS) (TSase) | 4.12 | 1 | 7.833 | 3.713 |
| Chromobox protein homolog 1 | 4.3000000202531 | 7 | 6.5681428571429 | 2.2681428368898 |
| Delta opioid receptor | 2.1849346390687 | 11 | 6.0444545454545 | 3.8595199063858 |
| DNA polymerase iota | 4.5881903578905 | 26 | 6.4524230769231 | 1.8642327190326 |
| Glucose-6-phosphate 1-dehydrogenase | 2 | 1 | 6.232 | 4.232 |
| mRNA interferase MazF | 4.2882736040518 | 4 | 8 | 3.7117263959482 |
| Thioredoxin reductase 1, cytoplasmic | 4.7692325381619 | 13 | 6.4742307692308 | 1.7049982310688 |
| Regulator of G-protein signaling 4 | 4.3026465986627 | 9 | 6.2887777777778 | 1.9861311791151 |
| Serotonin 5a (5-HT5a) receptor | 4.0342810297558 | 1 | 8 | 3.9657189702442 |
| Opioid receptors; mu & delta | 5.01 | 2 | 8 | 2.99 |
| DNA polymerase kappa | 4.6227151804016 | 13 | 6.5556923076923 | 1.9329771272907 |
| Luciferin 4-monooxygenase | 4.9459800515598 | 2 | 6.818 | 1.8720199484402 |
| Adenosine A1 receptor | 2 | 10 | 5.8489 | 3.8489 |
| Adenosine A2a receptor | 2 | 10 | 5.8489 | 3.8489 |
| Adenosine A3 receptor | 2.676 | 10 | 5.8489 | 3.1729 |
| Alpha-1a adrenergic receptor | 2 | 10 | 5.8489 | 3.8489 |
| Alpha-1b adrenergic receptor | 2 | 10 | 5.8489 | 3.8489 |
| Alpha-1d adrenergic receptor | 2 | 10 | 5.8489 | 3.8489 |
| Alpha-2a adrenergic receptor | 2 | 10 | 5.8489 | 3.8489 |
| Alpha-2b adrenergic receptor | 2 | 10 | 5.8489 | 3.8489 |
| Alpha-2c adrenergic receptor | 2 | 10 | 5.8489 | 3.8489 |
| Beta-1 adrenergic receptor | 2 | 10 | 5.8489 | 3.8489 |
| Beta-3 adrenergic receptor | 2 | 10 | 5.8489 | 3.8489 |
| Norepinephrine transporter | 2.506 | 10 | 5.8489 | 3.3429 |
| Aldose reductase | 2 | 10 | 5.8489 | 3.8489 |
| Angiotensin II type 2 (AT-2) receptor | 2 | 10 | 5.8489 | 3.8489 |
| Bradykinin B2 receptor | 2 | 10 | 5.8489 | 3.8489 |
| Calcitonin receptor | 2 | 10 | 5.8489 | 3.8489 |
| Cannabinoid CB1 receptor | 2 | 10 | 5.8489 | 3.8489 |
| Carbonic anhydrase II | 2 | 10 | 5.8489 | 3.8489 |
| C-C chemokine receptor type 2 | 2 | 10 | 5.8489 | 3.8489 |
| C-C chemokine receptor type 4 | 2 | 10 | 5.8489 | 3.8489 |
| C-C chemokine receptor type 5 | 2 | 10 | 5.8489 | 3.8489 |
| Interleukin-8 receptor A | 2 | 10 | 5.8489 | 3.8489 |
| Interleukin-8 receptor B | 2 | 10 | 5.8489 | 3.8489 |
| Cholecystokinin A receptor | 2 | 10 | 5.8489 | 3.8489 |
| Cyclooxygenase-1 | 2 | 10 | 5.8489 | 3.8489 |
| Cyclooxygenase-2 | 2 | 10 | 5.8489 | 3.8489 |
| Cytochrome P450 2A6 | 2 | 10 | 5.8489 | 3.8489 |
| Cytochrome P450 2E1 | 2 | 10 | 5.8489 | 3.8489 |
| Dopamine D1 receptor | 2 | 10 | 5.8489 | 3.8489 |
| Dopamine D2 receptor | 2 | 10 | 5.8489 | 3.8489 |
| Dopamine D3 receptor | 2 | 10 | 5.8489 | 3.8489 |
| Dopamine D4 receptor | 2 | 10 | 5.8489 | 3.8489 |
| Dopamine transporter | 2 | 10 | 5.8489 | 3.8489 |
| Endothelin receptor ET-A | 2 | 10 | 5.8489 | 3.8489 |
| Estrogen receptor alpha | 2 | 10 | 5.8489 | 3.8489 |
| Estrogen receptor beta | 2 | 10 | 5.8489 | 3.8489 |
| Glucocorticoid receptor | 2 | 10 | 5.8489 | 3.8489 |
| Glycine receptor | 2 | 40 | 5.8489 | 3.8489 |
| Histamine H1 receptor | 2 | 10 | 5.8489 | 3.8489 |
| Histamine H2 receptor | 2 | 10 | 5.8489 | 3.8489 |
| HMG-CoA reductase | 2 | 10 | 5.8489 | 3.8489 |
| Insulin receptor | 2 | 10 | 5.8489 | 3.8489 |
| Leukotriene C4 synthase | 2 | 10 | 5.8489 | 3.8489 |
| Cysteinyl leukotriene receptor 1 | 2 | 10 | 5.8489 | 3.8489 |
| Arachidonate 15-lipoxygenase | 2 | 10 | 5.8489 | 3.8489 |
| Melanocortin receptor 3 | 2 | 10 | 5.8489 | 3.8489 |
| Melanocortin receptor 4 | 2 | 10 | 5.8489 | 3.8489 |
| Melanocortin receptor 5 | 2 | 10 | 5.8489 | 3.8489 |
| Monoamine oxidase A | 2.3577777777778 | 9 | 5.8489 | 3.4911222222222 |
| Muscarinic acetylcholine receptor M1 | 2 | 10 | 5.8489 | 3.8489 |
| Muscarinic acetylcholine receptor M2 | 2 | 10 | 5.8489 | 3.8489 |
| Muscarinic acetylcholine receptor M3 | 2 | 10 | 5.8489 | 3.8489 |
| Muscarinic acetylcholine receptor M4 | 2 | 10 | 5.8489 | 3.8489 |
| Muscarinic acetylcholine receptor M5 | 2 | 10 | 5.8489 | 3.8489 |
| Neuropeptide Y receptor type 1 | 2 | 10 | 5.8489 | 3.8489 |
| Neuropeptide Y receptor type 2 | 2 | 10 | 5.8489 | 3.8489 |
| Nitric-oxide synthase, brain | 2 | 10 | 5.8489 | 3.8489 |
| Kappa opioid receptor | 2 | 10 | 5.8489 | 3.8489 |
| Mu opioid receptor | 2.1849346390687 | 11 | 6.0444545454545 | 3.8595199063858 |
| Phosphodiesterase 5A | 2 | 10 | 5.8489 | 3.8489 |
| Platelet activating factor receptor | 2 | 10 | 5.8489 | 3.8489 |
| HERG | 2 | 10 | 5.8489 | 3.8489 |
| Progesterone receptor | 2 | 10 | 5.8489 | 3.8489 |
| Angiotensin-converting enzyme | 2 | 10 | 5.8489 | 3.8489 |
| Cathepsin G | 2 | 10 | 5.8489 | 3.8489 |
| Leukocyte elastase | 2 | 10 | 5.8489 | 3.8489 |
| Protein kinase C alpha | 2 | 10 | 5.8489 | 3.8489 |
| MAP kinase ERK1 | 2.535 | 12 | 5.87725 | 3.34225 |
| MAP kinase p38 alpha | 2 | 10 | 5.8489 | 3.8489 |
| Serine/threonine protein phosphatase 2B catalytic subunit, alpha isoform | 2 | 10 | 5.8489 | 3.8489 |
| Epidermal growth factor receptor erbB1 | 2 | 10 | 5.8489 | 3.8489 |
| Tyrosine-protein kinase FYN | 2.6475 | 8 | 5.8489 | 3.2014 |
| Receptor protein-tyrosine kinase erbB-2 | 2.2644444444444 | 9 | 5.8489 | 3.5844555555556 |
| Tyrosine-protein kinase LCK | 2.2511111111111 | 9 | 5.8489 | 3.5977888888889 |
| Leukocyte common antigen | 2 | 10 | 5.8489 | 3.8489 |
| Serotonin 1a (5-HT1a) receptor | 2 | 10 | 5.8489 | 3.8489 |
| Serotonin 1b (5-HT1b) receptor | 2 | 10 | 5.8489 | 3.8489 |
| Serotonin 2a (5-HT2a) receptor | 2 | 10 | 5.8489 | 3.8489 |
| Serotonin 2b (5-HT2b) receptor | 2 | 10 | 5.8489 | 3.8489 |
| Serotonin 2c (5-HT2c) receptor | 2 | 10 | 5.8489 | 3.8489 |
| Serotonin 4 (5-HT4) receptor | 2 | 10 | 5.8489 | 3.8489 |
| Serotonin 6 (5-HT6) receptor | 2 | 10 | 5.8489 | 3.8489 |
| Serotonin transporter | 2 | 10 | 5.8489 | 3.8489 |
| Sigma opioid receptor | 2 | 10 | 5.8489 | 3.8489 |
| Neurokinin 1 receptor | 2 | 10 | 5.8489 | 3.8489 |
| Neurokinin 2 receptor | 2 | 10 | 5.8489 | 3.8489 |
| Androgen Receptor | 2 | 10 | 5.8489 | 3.8489 |
| Thromboxane-A synthase | 2 | 10 | 5.8489 | 3.8489 |
| Vascular endothelial growth factor receptor 1 | 2 | 10 | 5.8489 | 3.8489 |
| Vasoactive intestinal polypeptide receptor 1 | 2 | 20 | 5.8489 | 3.8489 |
| Vasopressin V1a receptor | 2 | 10 | 5.8489 | 3.8489 |
| Serine/threonine-protein kinase Chk1 | 4 | 3 | 6.226 | 2.226 |
| Serine/threonine-protein kinase Chk2 | 4.18 | 3 | 6.226 | 2.046 |
| DNA topoisomerase I | 2 | 1 | 6.696 | 4.696 |
| Polyadenylate-binding protein 1 | 3.0902280322291 | 2 | 6.177 | 3.0867719677709 |
| Rac GTPase-activating protein 1 | 4.4667722029103 | 4 | 6.36825 | 1.9014777970897 |
| Envelope polyprotein GP160 | 4.1730121946452 | 3 | 6.397 | 2.2239878053548 |
| Sphingomyelin phosphodiesterase | 4.4666664213754 | 3 | 7.0173333333333 | 2.550666911958 |
| Alpha-galactosidase A | 4.3499996034632 | 1 | 7.617 | 3.2670003965368 |
| Werner syndrome ATP-dependent helicase | 4.6208026697196 | 10 | 6.4915 | 1.8706973302804 |
| Peptidyl-prolyl cis-trans isomerase NIMA-interacting 1 | 4.2469870996598 | 8 | 6.51225 | 2.2652629003402 |
| Aberrant vpr protein | 4.7399990978754 | 5 | 6.3124 | 1.5724009021246 |
| Acetylcholinesterase | 3.4751188082446 | 2 | 6.4995 | 3.0243811917555 |
| Cholinesterase | 3.7565778940569 | 2 | 6.4995 | 2.7429221059431 |
| Probable global transcription activator SNF2L2 | 3.4972381536634 | 2 | 6.5155 | 3.0182618463366 |
| Protein skinhead-1 | 2 | 2 | 6.177 | 4.177 |
| AICAR transformylase | 5 | 2 | 7.833 | 2.833 |
| Ubiquitin carboxyl-terminal hydrolase 1 | 4.7416651891198 | 6 | 6.6995 | 1.9578348108802 |
| Rap guanine nucleotide exchange factor 4 | 4.0499999698534 | 1 | 8 | 3.9500000301466 |
| Parathyroid hormone receptor | 4.7214280892991 | 14 | 6.6178571428571 | 1.8964290535581 |
| Rap guanine nucleotide exchange factor 3 | 4.3142858259337 | 7 | 6.8132857142857 | 2.4989998883521 |
| Serine/threonine-protein kinase PLK1 | 5.0363219416878 | 4 | 6.95475 | 1.9184280583122 |
| Bloom syndrome protein | 4.8004372182185 | 3 | 6.7203333333333 | 1.9198961151148 |

| PROTEIN_NAME | Count(ChemblActivityScore) | Avg(BrCaScore) | Avg(MDA-MB-435)Score | selectivityIndex |
| --- | --- | --- | --- | --- |
| Thyroid stimulating hormone receptor | 4 | 3.75 | 6.07 | 2.32 |
| Cytochrome P450 3A4 | 3 | 4.0666666666667 | 6.07 | 2.0033333333333 |
| Lysine-specific demethylase 4A | 9 | 4.3833330105114 | 6.393 | 2.0096669894886 |
| Tumor susceptibility gene 101 protein | 1 | 4.2500002511436 | 7.012 | 2.7619997488564 |
| Cytochrome P450 2C9 | 1 | 2 | 6.07 | 4.07 |
| Cytochrome P450 2C19 | 1 | 2 | 6.07 | 4.07 |
| Vitamin D receptor | 1 | 4 | 7.012 | 3.012 |
| Chromobox protein homolog 1 | 2 | 4.2750002267945 | 6.541 | 2.2659997732055 |
| DNA polymerase iota | 2 | 4.525000203465 | 6.5705 | 2.045499796535 |
| Flap endonuclease 1 | 3 | 4.4069944795629 | 6.707 | 2.3000055204371 |
| DNA polymerase eta | 2 | 4.7250005478383 | 6.5705 | 1.8454994521617 |
| Regulator of G-protein signaling 4 | 1 | 4.1000001283342 | 7.012 | 2.9119998716658 |
| Rac GTPase-activating protein 1 | 1 | 4.3719177390093 | 7.012 | 2.6400822609907 |
| Sphingomyelin phosphodiesterase | 1 | 3.9000001420576 | 7.012 | 3.1119998579424 |
| Werner syndrome ATP-dependent helicase | 3 | 4.7499998343016 | 6.4036666666667 | 1.6536668323651 |
| Peptidyl-prolyl cis-trans isomerase NIMA-interacting 1 | 3 | 4.2709828777203 | 6.4036666666667 | 2.1326837889463 |
| Fatty acid synthase | 2 | 4.9820490991005 | 7.012 | 2.0299509008995 |
| Aberrant vpr protein | 2 | 4.5249999297245 | 6.5545 | 2.0295000702755 |
| Isocitrate dehydrogenase [NADP] cytoplasmic | 3 | 4.8528810668585 | 6.4126666666667 | 1.5597855998082 |
| Ubiquitin carboxyl-terminal hydrolase 1 | 2 | 4.7000001289134 | 6.5545 | 1.8544998710866 |
| Tyrosyl-DNA phosphodiesterase 2 | 1 | 3.8919561716744 | 7.012 | 3.1200438283256 |
| Parathyroid hormone receptor | 2 | 5.0999990348465 | 7.012 | 1.9120009651535 |

| PROTEIN_NAME | CountChemblActivityScore | AvgMDA-MB-231Score | AvgBreastCancerScore | selectivityIndex |
| --- | --- | --- | --- | --- |
| Taq polymerase 1 | 1 | 7.853 | 2.9755 | 2.613 |
| Glutamate NMDA receptor | 7 | 7.172 | 2.7388571428571 | 3.6491212547197 |
| Receptor protein-tyrosine kinase erbB-2 | 2 | 7.857 | 3.1714 | 2.087 |
| DNA polymerase iota | 1 | 6.564 | 2.5705 | 2.5140000301466 |
| DNA dC->dU-editing enzyme APOBEC-3F | 1 | 6.564 | 2.5705 | 2.1139995235747 |
| Rac GTPase-activating protein 1 | 1 | 6.564 | 2.5705 | 2.3602967400518 |
| Ataxin-2 | 1 | 6.564 | 2.5705 | 2.2140003965368 |
| Glucagon-like peptide 1 receptor | 2 | 7.282 | 2.7138214285714 | 1.8820045838536 |
| Guanine nucleotide-binding protein G(s), subunit alpha | 2 | 7.282 | 2.7138214285714 | 2.4570002567322 |
| Isocitrate dehydrogenase [NADP] cytoplasmic | 1 | 6.564 | 2.5705 | 1.9277867905357 |
| Parathyroid hormone receptor | 2 | 8 | 2.8571428571429 | 2.3346145711794 |

 

 

This sort of analysis is expanded in the BrCa pathways analysis section.