OpenSourceMalaria:Story so far

From OpenWetWare

(Difference between revisions)
(Why Take Part?: final bits plus licence)
(updated summary)
(45 intermediate revisions not shown.)
Line 1: Line 1:
{{OSDDMalaria}}
{{OSDDMalaria}}
-
''This is a human-readable summary of the first open source drug discovery for malaria project. A less readable collection of all the relevant data can be found [[OSDDMalaria:GSK_Arylpyrrole_Series|here]].''
+
''This is a human-readable summary of the open source drug discovery for malaria project. A less readable collection of all the relevant data can be found [[OSDDMalaria:GSK_Arylpyrrole_Series|here]].''
-
 
+
-
'''Note: This page is currently (May 2 2012) being built, so some links are missing.'''
+
=The story so far in the open source drug discovery for malaria project=
=The story so far in the open source drug discovery for malaria project=
Line 9: Line 7:
==History==
==History==
-
Last year [http://openwetware.org/wiki/Todd my lab] received seed funding for a pilot project in open source drug discovery from the Medicines for Malaria Venture (MMV). Our project champion at the outset was [http://www.mmv.org/about-us/our-team/tim-wells Tim Wells], who had clearly been thinking along similar lines to me - rather than debate the idea of open source drug discovery any more, let's just try it. [http://www.mmv.org/about-us/our-team/jeremy-burrows Jeremy Burrows] quickly came on board and led the suggestion to go after a few of the actives that had been [http://www.nature.com/nature/journal/v465/n7296/abs/nature09107.html placed in the public domain in 2010] by GlaxoSmithKline and others. We got [http://malaria.ourexperiment.org/tcmdc_ap/1/PaalKnorr_Synthesis_of_1aryl25dimethyl_Pyrrole_Core_PMY_11.html started in the lab] in August 2011. We were [http://sydney.edu.au/research_support/funding/arc/linkage_2012.shtml successful] in securing further funding from an ARC Linkage grant - a scheme where funds from an external agency are matched by the Australian Government. This has given us funding from May 2012 for three years - so we have enough money to drive this project for the moment to see if it works.
+
Last year [http://openwetware.org/wiki/Todd the Todd lab] at The University of Sydney received funding for a pilot project in open source drug discovery from the Medicines for Malaria Venture (MMV). The project champion at the outset was [http://www.mmv.org/about-us/our-team/tim-wells Tim Wells]. [http://www.mmv.org/about-us/our-team/jeremy-burrows Jeremy Burrows] came on board and led the suggestion to go after a few of the actives that had been [http://www.nature.com/nature/journal/v465/n7296/abs/nature09107.html placed in the public domain in 2010] by GlaxoSmithKline and others. Work got [http://malaria.ourexperiment.org/tcmdc_ap/1/PaalKnorr_Synthesis_of_1aryl25dimethyl_Pyrrole_Core_PMY_11.html underway in the lab] in August 2011. The team were [http://sydney.edu.au/research_support/funding/arc/linkage_2012.shtml successful] in securing further funding from an Australian Research Council Linkage grant - a scheme where funds from an external agency are matched by the Australian Government. This has generated funding from May 2012 for three years.
-
Naturally we're all really excited about this. The scientific idea behind the project is familiar medicinal chemistry territory - we need to find a small molecule that is effective for the treatment of malaria, and we will do that by making molecules (my lab's primary responsibility) and evaluating them (with collaborators). Based on those results, we make analogs, or ditch the series and pick another. We started with the [http://openwetware.org/wiki/OSDDMalaria:GSK_Arylpyrrole_Series arylpyrrole series] that was one of the most attractive sets in the [https://www.ebi.ac.uk/chemblntd/#tcams_dataset original GSK dataset], but there are [http://pubs.acs.org/doi/abs/10.1021/ml200135p plenty of other series that are also very attractive] from a medchem perspective.
+
The scientific idea behind the project is familiar medicinal chemistry methodology - the aim is to find a small molecule that is effective for the treatment of malaria, and that involves generating molecules (the Todd lab's primary responsibility) and evaluating them (with other members of the project). Based on the biological results, analogs are made, or the series might be ditched and another one selected. The first series to be tried is based on an [http://openwetware.org/wiki/OSDDMalaria:GSK_Arylpyrrole_Series arylpyrrole] that was one of the most attractive hits in the [https://www.ebi.ac.uk/chemblntd/#tcams_dataset original GSK dataset], but there are [http://pubs.acs.org/doi/abs/10.1021/ml200135p plenty of other series that are also very attractive] from a medchem perspective.
-
The difference with this project though (as we previously described in the [http://www.thesynapticleap.org/node/343 6 Laws]) is that everything is open, meaning all the experiments go on the web (including the ones that did not turn out well). All the data are available. Anyone can do anything they wish with the compounds, with the proviso we are cited - the licence for the project is CC-BY-3.0, though this is sometimes not yet clear on all the various websites we use. The main difference is that anyone can take part - that people may make molecules, offer guidance and input in other ways that change the direction of the project as it is happening. i.e. rather than releasing all our data at the end of the project we release as the project is happening so that people can really become involved in the research. Thus the iterative cycle of analog synthesis in response to biological data that is normally guided by a kind of medchem intuition is now guided by the intuition of the collective. Similarly, since the biological data are all open too, it should be easier to form an objective assessment of a molecule's performance divorced from the judgement of those closest to the compounds. In the same way that in software development "[http://en.wikipedia.org/wiki/Linus%27_Law with enough eyeballs all bugs are shallow]" we hope that the open nature of the research makes the science better and faster. As it did with our [http://www.nature.com/nchem/journal/v3/n10/full/nchem.1149.html previous synthetic project with praziquantel].
+
The difference with this project though (as described in the [http://www.thesynapticleap.org/node/343 6 Laws]) is that everything is open, meaning all the experiments go on the web (including the ones that did not turn out well). All the data are available. Anyone can do anything they wish with the compounds, with the proviso the project is cited (see licence conditions below). The main difference is that anyone can take part - people may make molecules, offer guidance and input in other ways that ''change the direction of the project as it is happening'', i.e. rather than the release of all data at the end of the project, data are released as the project is happening so that people can become genuinely involved in the research. Thus the iterative cycle of analog synthesis in response to biological data that is normally guided by luck and medchem intuition is now guided by the intuition of the collective. Similarly, since the biological data are all open too, it should be easier to form an objective assessment of a molecule's performance divorced from the judgement of those closest to the compounds. In the same way that in software development "[http://en.wikipedia.org/wiki/Linus%27_Law with enough eyeballs all bugs are shallow]" the open nature of the research makes the science better and faster. This was found to be the case in a [http://www.nature.com/nchem/journal/v3/n10/full/nchem.1149.html previous synthetic chemistry project involving the drug praziquantel].
-
==Two Rounds of Synthesis and Evaluation==
+
==Rounds of Synthesis and Evaluation Completed to Date==
-
Paul Ylioja started by resynthesising the two known active compounds from the GSK set (OSM-S-5 and OSM-S-6), plus a few simple derivatives, and [http://www.thesynapticleap.org/node/367 confirming that they were active]. The current list of all the compounds made thus far in this part of the project is kept in [http://bit.ly/OSDDcompounds this spreadsheet]. The [http://malaria.ourexperiment.org/biological_data/month/1325376000 biological evaluation] was carried out by three separate labs to ensure we were on a solid footing. The original compounds contained an ester which was thought likely to hydrolyze ''in vivo'', so various versions of the "lower half" of these leads were also evaluated to check whether the original hits were prodrugs, but all these compounds were found to be inactive. Our MMV project champion [http://www.mmv.org/about-us/our-team/paul-willis Paul Willis], who has been working closely with us ever since, [http://www.thesynapticleap.org/node/349 recommended] a few "near neighbor" compounds that also looked interesting, and we made a number of these too, which were evaluated in this first round, and one, OSM-S-9, was found to be more active than the original compounds. [https://plus.google.com/u/0/111817031902595048944/posts Sanjay Batra] came on board the project and his student [https://plus.google.com/u/0/113151089809892205923/posts Soumya] made [http://malaria.ourexperiment.org/cdriarylpyrroles some analogs] varying in the position of the fluorine, though the activity of those tested to date is low. (Sanjay works at the [http://www.cdriindia.org/analytical.htm CDRI] in Lucknow, India, where [https://plus.google.com/u/0/116616719379353298385/posts Saman Habib] also works - Saman is leading the [http://malaria.osdd.net/ Indian OSDDm project] which will shortly get started). 
The outcome was that the original hits remained interesting (because of their reasonable potency and logP values) but that we were clearly also generating highly potent novel antimalarials in this class. Thus a [http://www.thesynapticleap.org/node/381 second round of compounds were synthesized and evaluated], and this gave rise to several new highly potent compounds, one of which (OSM-S-39) displayed a picomolar IC50 value. This is quite impressive given the small number of compounds made to date, and is perhaps testament to the quality of the hits contained in the original GSK set.
+
[https://plus.google.com/u/0/b/114702323662314783325/115627447826173336765/posts Paul Ylioja] started by resynthesising the two known active compounds from the GSK set (OSM-S-5 and OSM-S-6 - structures below), plus a few simple derivatives, and [http://www.thesynapticleap.org/node/367 confirming that they were active]. The current list of all the compounds made thus far in this part of the project is kept in [http://bit.ly/OSDDcompounds this spreadsheet]. The [http://malaria.ourexperiment.org/biological_data/month/1325376000 biological evaluation] was carried out by three separate labs ([http://www.discoverybiology.org/team/vicky-avery Vicky Avery], [http://www.bio21.org/group-leaders/bio-chemistry/stuart-ralph Stuart Ralph] and the [http://www.gsk.com/collaborations/tres-cantos.htm original GSK Tres Cantos Lab led by Javier Gamo]) to ensure a solid footing of repeatability. The original compounds contained an ester which was thought likely to hydrolyze ''in vivo'', so various versions of the "lower half" of these leads were also evaluated to check whether the original hits were prodrugs, but all these compounds were found to be inactive. The project champion from MMV, [http://www.mmv.org/about-us/our-team/paul-willis Paul Willis], [http://www.thesynapticleap.org/node/349 recommended] a few "near neighbor" compounds that also looked interesting, and a number of these were made too and evaluated in this first round. One compound, OSM-S-9, was found to be more active than the original GSK hits. [https://plus.google.com/u/0/111817031902595048944/posts Sanjay Batra] came on board the project and his student [https://plus.google.com/u/0/113151089809892205923/posts Soumya] made (and is making) [http://malaria.ourexperiment.org/cdriarylpyrroles some analogs] varying in the position of the fluorine atom, though the activity of those tested to date is low. 
(Sanjay works at the [http://www.cdriindia.org/analytical.htm CDRI] in Lucknow, India, where [https://plus.google.com/u/0/116616719379353298385/posts Saman Habib] also works - Saman is leading the [http://malaria.osdd.net/ Indian OSDDm project] which will hopefully get started soon). The outcome was that the original hits remained interesting (because of their reasonable potency and logP values) but that highly potent novel antimalarials were also being generated in this class. Thus a [http://www.thesynapticleap.org/node/381 second set of compounds was synthesized and evaluated], giving rise to several new highly potent compounds, one of which (OSM-S-39) displayed a picomolar IC50 value. This is impressive given the small number of compounds made to date, and is testament to the quality of the hits contained in the original GSK set.
[[Image:Compounds of Interest in Update.png|thumb|center|500px| '''Original and Representative Potent Compounds in the Arylpyrrole Series''']]
[[Image:Compounds of Interest in Update.png|thumb|center|500px| '''Original and Representative Potent Compounds in the Arylpyrrole Series''']]
-
At this point the decision was taken to take the most promising compounds on to advanced biological evaluation, to see what the promise of this class really is (rather than continue to increase potency through analog synthesis). To date the evaluations have involved:
+
At this point the decision was taken to take the most promising compounds on to advanced biological evaluation, to see what the promise of this class really is (rather than continue to increase potency through analog synthesis). Evaluations to date are as follows.
-
* Metabolic assays on the two original GSK compounds and six other compounds made in this project were performed by [http://www.pharm.monash.edu.au/staff/sacharman.html Sue Charman's lab at Monash]. The raw data are [http://malaria.ourexperiment.org/biological_data/3101 here] and can be discussed [http://www.thesynapticleap.org/node/401 here]. The originals displayed good solubility but moderate degradation rates. The other compounds were degraded more slowly but at a cost of low solubility.
+
* '''Metabolic and solubility assays''': The two original GSK compounds (OSM-S-5 and OSM-S-6) and six other compounds made in this project were evaluated by [http://www.pharm.monash.edu.au/staff/sacharman.html Sue Charman's lab at Monash] for their stability in phosphate buffer. The raw data are [http://malaria.ourexperiment.org/biological_data/3101 here] and can be discussed [http://www.thesynapticleap.org/node/401 here]. The GSK originals displayed good solubility but moderate degradation rates. The other compounds were degraded more slowly but at a cost of low solubility. Subsequently the original GSK compound OSM-S-5 was evaluated for stability in human and mouse plasma. The compound was stable in human plasma but susceptible to hydrolysis in mouse plasma. Esterase activity is known to be higher in rodents than in other species, which was confirmed using a control compound in this assay (p-nitrophenol acetate), so the results are not too surprising. Lab book page [http://malaria.ourexperiment.org/biological_data/3598/Human_and_Mouse_Plasma_Stability_of_OSMS5PMY_106.html here].
-
* One of the original GSK compounds (OSM-S-5) plus one of the most potent novel compounds identified to date (OSM-S-35) were subjected to the [http://malaria.ourexperiment.org/biological_data/2999 hERG assay and passed], perhaps implying that this class of compounds should not exhibit undesirable cardiac side effects. Discussion page [http://www.thesynapticleap.org/node/402 here].
+
* '''hERG''': One of the original GSK compounds (OSM-S-5) plus one of the most potent novel compounds identified to date (OSM-S-35) were subjected to the [http://malaria.ourexperiment.org/biological_data/2999 hERG assay and passed], perhaps implying that this class of compounds should not exhibit undesirable cardiac side effects. Discussion page [http://www.thesynapticleap.org/node/402 here].
-
* Four of the compounds have also been [http://malaria.ourexperiment.org/biological_data/3066 evaluated in a late stage gametocyte assay] with very interesting results indicating unusually high activity in blocking the transmission of the parasite. The original GSK compound OSM-S-5 was inactive. Discussion page [http://www.thesynapticleap.org/node/403 here].
+
* '''Late Stage Gametocyte Assay''': Four of the compounds have also been [http://malaria.ourexperiment.org/biological_data/3066 evaluated in a late stage gametocyte assay] with very interesting results indicating unusually high activity in blocking the transmission of the parasite. The original GSK compound OSM-S-5 was inactive. Discussion page [http://www.thesynapticleap.org/node/403 here].
-
* However, the two original GSK compounds as well as one of the most promising novel compounds have been evaluated in mice and found to possess zero oral efficacy (spreadsheet coming as soon as cleared by creator).
+
* '''In vivo''': However, the two original GSK compounds as well as one of the most promising near neighbor compounds (OSM-S-35) were evaluated in mice and found to possess zero oral efficacy (Results available [http://malaria.ourexperiment.org/uri/ed here]). Subsequent analysis of the plasma samples from the trial with one of the GSK compounds (OSM-S-5) showed that the compound was indeed orally available, but levels in the blood were not being maintained. Raw data [http://malaria.ourexperiment.org/biological_data/3825 here]. 
-
==What Are the Compounds Doing?==
+
Other discussions of the biological results described above can be found [http://www.thesynapticleap.org/node/405 here]. The biological data require careful consideration about whether to change the focus of the project to another series or whether to continue to alter the structures of the best compounds to overcome the ''in vivo'' roadblock.
-
What might this series of compounds be doing to the parasite to kill it so effectively? We're not sure yet. The original screens were whole-cell assays, so while we know the compounds are effective, we don't know what they're doing in any detail. Iain Wallace from ChEMBL has done [http://www.thesynapticleap.org/node/387 a very neat prediction] of the biological role of these compounds (as well as predictions for the whole "[http://www.mmv.org/malariabox Malaria Box]", which is a set of compounds MMV are providing to people for antimalarial screening and which are the focus of a [http://www.grandchallenges.org/Explorations/Topics/Pages/AntimalarialCompoundsRound9.aspx current round] of Gates requests for proposals). Iain clustered the compounds as a similarity map, which is a neat way of visualizing the correlation between structure and predicted activity. Discussed also [https://plus.google.com/u/0/b/114702323662314783325/114702323662314783325/posts/F5B9nA2sJLr here]. One of the predictions was that the compounds hit an enzyme known as DHODH, and GSK are at the time of writing screening some of the compounds against this enzyme. These predictions are made using informatics - a comparison of the structures of our compounds with other known compounds that have known activities. It's an argument based on extrapolation. We're seeking to examine whether the prediction is correct by sending a subset of compounds to [http://chemogenomics.med.utoronto.ca/hiplab/index.php Corey Nislow] at the University of Toronto for examination in a yeast-based assay he's developed which does not identify for sure what the compounds are doing (which is very difficult) but provides harder biological evidence for a role.
+
A decision was taken to carry out a third round of analog synthesis and evaluation on the arylpyrrole series, with an emphasis on analogs a) with low logP and b) that lack the thiazolidinone heterocycle. A consultation [http://www.thesynapticleap.org/node/412 occurred] asking for suggestions for the ten most appropriate compounds to make, and the ten most interesting for commercial procurement. The final stage of the consultation took place [http://www.youtube.com/watch?v=ooM8kuo14Bg live on the web]. As a result the lists were [http://www.thesynapticleap.org/node/416 finalised]; commercial compounds were ordered and synthesis commenced. In early November, the team received results from the [http://www.thesynapticleap.org/node/430 biological evaluation] of the commercial compounds and the synthetic compounds that had been completed, along with some analogs. This set of compounds was found to possess low to negligible levels of activity. This surprised the team but also provided some interesting insights. For example, a forked series, the [http://openwetware.org/wiki/OSDDMalaria:Arylpyrazole_Series pyrazoles], looked attractive but so far all examples tested (e.g., OSM-S-92) have been found to be inactive. The next batch of 'third round' compounds was synthesised and evaluated in December 2012. As with the first batch, all of the compounds were essentially inactive (OSM-S-103 showed mild activity).
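The two selection criteria for the third round (low logP, no thiazolidinone) can be pictured as a simple filter over a candidate list. The following is only a minimal sketch under stated assumptions: the <code>Candidate</code> record, the compound names, the logP values and the cut-off of 3.0 are all hypothetical and are not taken from the project's data or from the consultation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A proposed analog; every field value used below is made up."""
    name: str
    logp: float                # calculated logP, assumed to be precomputed
    has_thiazolidinone: bool   # True if the heterocycle is present


def shortlist(candidates, max_logp=3.0, size=10):
    """Keep analogs that lack the thiazolidinone and sit at or below
    max_logp (an assumed cut-off), then return the `size` lowest-logP ones."""
    eligible = [
        c for c in candidates
        if c.logp <= max_logp and not c.has_thiazolidinone
    ]
    return sorted(eligible, key=lambda c: c.logp)[:size]


# Toy input: names and numbers are illustrative only.
proposals = [
    Candidate("analog-1", 4.2, False),  # rejected: logP too high
    Candidate("analog-2", 2.1, True),   # rejected: contains thiazolidinone
    Candidate("analog-3", 2.8, False),
    Candidate("analog-4", 1.9, False),
]
selected = shortlist(proposals)
```

In practice the final lists were settled by open consultation rather than by a fixed numerical rule, so a filter like this would at most pre-sort suggestions for discussion.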
-
==How do we Obtain Other Compounds?==
+
[[Image:tssfar.png|thumb|center|700px| '''Representative Examples from the Third Round''']]
-
The original compounds from the GSK assay were commercially available, arising from libraries that are provided to larger companies by smaller specialised companies. In a medicinal chemistry project like this one starts with a set of compounds, then one sources further compounds that are similar. Novel compounds need to be made. Other compounds may be available by other means, however. Typically in academia we just make all the compounds in-house because we worked as a closed system, which is inefficient if relevant compounds exist elsewhere on the Earth and can be sourced by other means more quickly. In this project organic synthesis of ''novel compounds'' is currently being performed in Sydney and Lucknow. But for known compounds we need to do the following:
+
Currently (Dec 2012) there remain a [http://www.thesynapticleap.org/node/429 few compounds] which need synthetic input from any labs interested in contributing. The project has to date had no luck in securing donations of the compounds from commercial suppliers. On considering some structurally 'similar' active compounds from the TCAMS set, the team have also decided to synthesise a few [https://plus.google.com/b/114702323662314783325/114702323662314783325/posts/LzxZxukacpn extra compounds] plus hybrids, in order to assess their biological activity. An open consultation is set to take place on Monday 17th December in order to decide next steps.
-
1) '''Identify Commercial Compounds:''' What if some compounds we require are already commercially available? How can these be found? Iain Wallace [http://www.thesynapticleap.org/node/399 was able to do a search] of databases such as eMolecules for relevant compounds above a certain threshold of similarity and filter compounds by supplier, generating the "hitlist" quickly and with no manual human input. These can be converted into spreadhseets for quick visualisation - see [https://plus.google.com/u/0/b/114702323662314783325/115975655197247500095/posts/X2Djh2szx9F here] for examples on Jimmy Cronshaw's series (see below for these).
+
==What Are the Compounds Doing?==
-
2) '''Get Commercial Compounds:''' With the compounds identified, the relevant suppliers need to be contacted to ask for donations. That will probably not be trivial since it needs a human interaction. Failing that the compounds can just be bought.
+
What might this series of compounds be doing to the parasite to kill it so effectively? It's not clear. The original screens were whole-cell assays, so while it is known that the compounds are effective, it's not known what they are doing in any detail. Iain Wallace from ChEMBL has performed [http://www.thesynapticleap.org/node/387 a prediction] of the biological role of these compounds (as well as predictions for the whole "[http://www.mmv.org/malariabox Malaria Box]", which is a set of compounds MMV are providing to people for antimalarial screening and which are the focus of a [http://www.grandchallenges.org/Explorations/Topics/Pages/AntimalarialCompoundsRound9.aspx current round] of Gates requests for proposals). Iain clustered the compounds as a similarity map, allowing visualization of the correlation between structure and predicted activity. Discussed also [https://plus.google.com/u/0/b/114702323662314783325/114702323662314783325/posts/F5B9nA2sJLr here]. One of the predictions was that the compounds hit an enzyme known as DHODH, and GSK are at the time of writing screening some of the project's compounds against this enzyme. These predictions are made using informatics - a comparison of the structures of our compounds with other known compounds that have known activities. The argument is based on extrapolation. To evaluate whether the prediction is correct, a subset of compounds has been sent to [http://chemogenomics.med.utoronto.ca/hiplab/index.php Corey Nislow] at the University of Toronto for screening in a yeast-based assay that does not identify for sure what the compounds are doing (which is very difficult) but provides harder biological evidence for a role.
-
3) '''Identify Other/academic Compounds:''' What about compounds that could be useful to the project but which are not commercially available, e.g. compounds sitting in academic lab fridges. Some can be found using resources like SciFinder, though these require expensive subscriptions. Many compounds that might be perfect for the project may not even be in the published literature, which is another argument for openness in science.
+
==How to Obtain Other Compounds==
-
4) '''Get Other/academic Compounds:''' This will be a case of manual contact with interested groups. [https://plus.google.com/u/0/b/114702323662314783325/114702323662314783325/posts/VmPupxxrWEr One such enquiry] has already been submitted, to see how it goes. Naturally contributions are rewarded by possible authorship on resulting papers.
+
The original compounds from the GSK assay were commercially available, arising from libraries that are provided to larger companies by smaller specialised companies. In a medicinal chemistry project like this, one starts with a set of compounds and then sources further compounds that are similar. Novel compounds need to be made. Other compounds may be available by other means, however. Typically in academia required compounds are made in-house because academia works as a closed system, which is inefficient if relevant compounds exist elsewhere on the planet and can be sourced by other means more quickly. In this project organic synthesis of ''novel compounds'' is currently being performed in Sydney and Lucknow. But for ''known'' compounds the following needs to be done.
-
Ultimately, as a species, we are not very efficient at sharing valuable chemical resources. A user-driven list would be helpful - "I need compound X. I can buy it, or you can give it to me, or you can make it for me or with me, but I need the compound in timeframe Y." This is like a [http://intermolecular.wordpress.com/2012/04/20/molecular-craigslist/ Molecular Craigslist], and would reduce some of the supply-demand barriers.
+
1) '''Identification of Commercial Compounds:''' What if some compounds we require are already commercially available? How can these be found? Iain Wallace [http://www.thesynapticleap.org/node/399 was able to do a search] of databases such as eMolecules for relevant compounds above a certain threshold of similarity and filter compounds by supplier, generating the "hitlist" quickly and with no manual human input. These can be converted into spreadsheets for quick visualisation - see [https://plus.google.com/u/0/b/114702323662314783325/115975655197247500095/posts/X2Djh2szx9F here] for examples on Jimmy Cronshaw's series (see below for these).
-
==Related Series==
+
2) '''Obtaining Commercial Compounds:''' With the compounds identified, the relevant suppliers need to be contacted to ask for donations. That will probably not be trivial since it needs a human interaction. Failing that the compounds can just be bought.
-
The arypyrrole series is the first to be examined. A forked series, the [http://openwetware.org/wiki/OSDDMalaria:Arylpyrazole_Series pyrazoles], looks attractive but has not yet received a great deal of input. Two other interesting-looking starting points from the GSK set, based on a [http://openwetware.org/wiki/OSDDMalaria:GSK_Triazolourea_Singleton triazolourea] and a [http://openwetware.org/wiki/OSDDMalaria:GSK_Amino-thienopyrimidine_Series thienopyrimidine], are currently being resynthesised by [https://plus.google.com/u/0/b/114702323662314783325/115975655197247500095/posts James Cronshaw] in Sydney to confirm their activity, upon which time it will be necessary to decide which, if either, are to be taken on further. In general a strategy for the project is to decide as early as possible which series are looking attractive biologically, because there are still a lot of leads to pursue. If any of these structures look synthetically attractive to you based on your previous experience, please consider joining the project.
+
3) '''Identification of Other/academic Compounds:''' What about compounds that could be useful to the project but which are not commercially available, e.g. compounds sitting in academic lab fridges? Some can be found using resources like SciFinder, though these require expensive subscriptions. Many compounds that might be perfect for the project may not even be in the published literature, which is another argument for openness in science.
-
==Open Source Drug Discovery More Generally==
+
4) '''Obtaining Other/academic Compounds:''' This will be a case of manual contact with interested groups. [https://plus.google.com/u/0/b/114702323662314783325/114702323662314783325/posts/VmPupxxrWEr One such enquiry] has already been submitted, to see what happens. Naturally contributions are rewarded by possible authorship on resulting papers.
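The automated search in step 1 (similarity above a threshold, filtered by supplier) can be illustrated with a short sketch. This is not the method Iain Wallace actually used, which is not described here. It assumes each compound has already been reduced to a set of structural feature indices (a fingerprint) and uses the Tanimoto coefficient as the similarity measure. The <code>CatalogueEntry</code> record, the supplier names, the fingerprints and the 0.7 threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogueEntry:
    """One purchasable compound; all field values below are made up."""
    compound_id: str
    supplier: str
    fingerprint: frozenset  # indices of the structural features present


def tanimoto(a, b):
    """Tanimoto coefficient: shared features / features present in either."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def build_hitlist(query_fp, catalogue, threshold=0.7, suppliers=None):
    """Return (entry, score) pairs at or above the threshold, best first.

    If `suppliers` is given, entries from any other supplier are skipped.
    """
    hits = []
    for entry in catalogue:
        if suppliers is not None and entry.supplier not in suppliers:
            continue
        score = tanimoto(query_fp, entry.fingerprint)
        if score >= threshold:
            hits.append((entry, score))
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Toy data: feature indices and supplier names are illustrative only.
query = frozenset({1, 2, 3, 4})
catalogue = [
    CatalogueEntry("cmpd-A", "SupplierX", frozenset({1, 2, 3, 4, 5})),
    CatalogueEntry("cmpd-B", "SupplierY", frozenset({1, 2, 3, 4})),
    CatalogueEntry("cmpd-C", "SupplierX", frozenset({1, 9})),
]
hitlist = build_hitlist(query, catalogue, threshold=0.7, suppliers={"SupplierX"})
all_hits = build_hitlist(query, catalogue, threshold=0.7)
```

A real search would compute fingerprints from chemical structures with a cheminformatics toolkit and run against a full supplier database; the sorted hitlist could then be exported to a spreadsheet for visual inspection.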
-
A [http://openwetware.org/wiki/OSDDMalaria:OSDD_Malaria_Meeting_Sydney_2012 one-day meeting on open source drug discovery for malaria] was held in February 2012. General issues surrounding the feasibility of open source drug discovery were discussed, followed by more specific malaria-related ideas. These talks are gradually [http://www.youtube.com/playlist?list=PL84A4E62C3C72863D going up on YouTube] with [http://www.thesynapticleap.org/node/390 annotations], and they frame many of the relevant issues, for example the landscape of drug discovery in neglected diseases, and whether patents are necessary in drug discovery. An important message is that open ''source'' drug discovery is where anyone may participate in driving the research, which is different from a more general use of the word "open" where data are made freely available, but perhaps after a delay which essentially prevents participation by others.
+
Ultimately, as a species, we are not very efficient at sharing valuable chemical resources. A user-driven list would be helpful - "I need compound X. I can buy it from you, or you can give it to me, or you can make it for me or with me, but I need the compound in timeframe Y." This is like a [http://intermolecular.wordpress.com/2012/04/20/molecular-craigslist/ Molecular Craigslist], and would reduce some of the supply-demand barriers in chemical research.
-
==How We Run the Project==
+
==Related Series==
-
 
+
-
The way the project is run is one of the novelties, though as with everything in this project nothing is static and advice is always welcome on improvements. Raw experimental data are recorded in an online, openly-readable [http://malaria.ourexperiment.org/ electronic lab notebook]. [http://www.thesynapticleap.org/malaria/community The Synaptic Leap] is being used to discuss ideas and results, as well as plan future work. The project's [https://plus.google.com/u/0/b/114702323662314783325/114702323662314783325/posts Google+ page] is a light way to keep up with developments and discuss. The project's [https://twitter.com/#!/OSDDMalaria Twitter feed] is a broadcast mechanism for updates. LinkedIn as used in the past on another project as a way of connecting with relevant experts, but has not been used much so far in this project. A [http://openwetware.org/wiki/Open_Source_Drug_Discovery_-_Malaria wiki] (that includes this page) is used to host the current overall project status. If you wish to participate in this project, you can sign up to all these sites, and you would then be sent the Twitter/G+ passwords so you can used the same accounts. A Facebook page is needed next but doesn't exist yet.
+
-
 
+
-
==Why Take Part?==
+
-
 
+
-
What of motivations? Why would people want to contribute to this project? Partly to solve a problem. Partly to be involved with quality science that is open, and hence subject to the most brutal form of ongoing peer-review. Partly for academic credentials since regular peer-reviewed papers will come from the project. Partly to demonstrate competence publicly. Perhaps a mixture of all these things.
+
-
A competition is possible in the future, i.e. with a cash prize. Progress towards a very promising lead compound series has been rapid, but there is a long road to a compound that looks sufficiently promising that it moves towards clinical trials. There's a lot of tweaking, and perhaps even the move to another series. It's not obvious what will happen. It's likely the project will need a lot more input than it has received to date. A prize may increase traffic and input. The competition would be teamless, however, awarded based on performance of individuals within a group where everything is shared. This is difficult to judge, difficult to award, and hence worth doing. More about this is [http://intermolecular.wordpress.com/2011/11/13/open-science-funding-government-grants-and-cash-incentives/ here].
+
The arylpyrrole series was the first to be examined. Two other interesting-looking starting points from the GSK set, based on a [http://openwetware.org/wiki/OSDDMalaria:GSK_Triazolourea_Singleton triazolourea] and a [http://openwetware.org/wiki/OSDDMalaria:GSK_Amino-thienopyrimidine_Series thienopyrimidine], were the focus of [http://openwetware.org/wiki/User:Jim_Cronshaw James Cronshaw's] honours [http://figshare.com/articles/ThesisforUpload.pdf/102049 thesis] at Sydney University. The triazolourea (OSM-S-56) was found to be [https://plus.google.com/b/114702323662314783325/114702323662314783325/posts/QfynoBafbXt even more active] than the original GSK hit (TCMDC 134395). James has synthesised the [http://malaria.ourexperiment.org/aminotpseries/5827/Suzuki_coupling_using_the_same_conditions_that_are_used_to_generate_boronate_esters.html thienopyrimidine] series lead, and we are awaiting confirmation of the compound's antimalarial activity. If the lead compound is confirmed as active then it will be necessary to decide which, if either, of the series is to be taken on further. In general a strategy for the project is to decide as early as possible which series are looking attractive biologically, because there are still a lot of leads to pursue. If any of these structures look synthetically attractive to you based on your previous experience, please consider joining the project. Some of these molecules are quite straightforward to make and would be suitable for undergraduate lab classes.
-
==Ownership==
+
==Comments==
-
A final point - the project is open. Nobody owns it. Those people most active in the project lead it while they are active. If you wish to contribute, in any capacity, please do so. There is no need to "clear" anything with existing project members by email first. It's often the case that current participants will receive questions/suggestions by email. In the development of Linux, the need for Linus Torvalds to approve everything caused a bottleneck, and the observation that "Linus doesn't scale". Nobody scales, but the team does. So it's more efficient if all the project discussions are held publicly. Many people do not like this idea. In science the idea of "beta testing" something is alien. When data are released in science there is an expectation that the data are correct, and usually accompanied by an explanation. This project eschews this view. All data are released immediately, all discussions are public, anyone can participate.
+
Comments and input can be by [http://twitter.com/#!/osddmalaria Twitter], [https://plus.google.com/u/0/b/114702323662314783325/114702323662314783325/posts Google+], [http://www.thesynapticleap.org/node/342 The Synaptic Leap] or directly on the relevant [http://malaria.ourexperiment.org/ Electronic Lab Notebooks]. This page also has a "talk" tab for input. Or please write your own input somewhere (''e.g.'', your own blog) and link to the project. Anonymity is perfectly acceptable, just less useful. Please avoid email.
==Licence==
The project's licence, unless otherwise stated, is [http://creativecommons.org/licenses/by/3.0/ CC-BY-3.0], meaning you can use whatever you want for whatever purpose, provided you cite the project.

Revision as of 06:15, 19 January 2013



This is a human-readable summary of the open source drug discovery for malaria project. A less readable collection of all the relevant data can be found here.


The story so far in the open source drug discovery for malaria project

History

Last year the Todd lab at The University of Sydney received funding for a pilot project in open source drug discovery from the Medicines for Malaria Venture (MMV). The project champion at the outset was Tim Wells. Jeremy Burrows came on board and led the suggestion to go after a few of the actives that had been placed in the public domain in 2010 by GlaxoSmithKline and others. Work got underway in the lab in August 2011. The team were successful in securing further funding from an Australian Research Council Linkage grant - a scheme where funds from an external agency are matched by the Australian Government. This has generated funding from May 2012 for three years.

The scientific idea behind the project is familiar medicinal chemistry methodology - the aim is to find a small molecule that is effective for the treatment of malaria, and that involves generating molecules (the Todd lab's primary responsibility) and evaluating them (with other members of the project). Based on the biological results, analogs are made, or the series might be ditched and another one selected. The first series to be tried is based on an arylpyrrole that was one of the most attractive hits in the original GSK dataset, but there are plenty of other series that are also very attractive from a medchem perspective.

The difference with this project though (as described in the 6 Laws) is that everything is open, meaning all the experiments go on the web (including the ones that did not turn out well). All the data are available. Anyone can do anything they wish with the compounds, with the proviso that the project is cited (see licence conditions below). The main difference is that anyone can take part: people may make molecules, or offer guidance and input in other ways that change the direction of the project as it is happening. Rather than all data being released at the end of the project, data are released as the project proceeds, so that people can become genuinely involved in the research. Thus the iterative cycle of analog synthesis in response to biological data, which is normally guided by luck and medchem intuition, is now guided by the intuition of the collective. Similarly, since the biological data are all open too, it should be easier to form an objective assessment of a molecule's performance, divorced from the judgement of those closest to the compounds. In the same way that in software development "with enough eyeballs all bugs are shallow", the open nature of the research makes the science better and faster. This was found to be the case in a previous synthetic chemistry project involving the drug praziquantel.

Rounds of Synthesis and Evaluation Completed to Date

Paul Ylioja started by resynthesising the two known active compounds from the GSK set (OSM-S-5 and OSM-S-6 - structures below), plus a few simple derivatives, and confirming that they were active. The current list of all the compounds made thus far in this part of the project is kept in this spreadsheet. The biological evaluation was carried out by three separate labs (Vicky Avery, Stuart Ralph and the original GSK Tres Cantos Lab led by Javier Gamo) to ensure a solid footing of repeatability. The original compounds contained an ester which was thought likely to hydrolyze in vivo, so various versions of the "lower half" of these leads were also evaluated to check whether the original hits were prodrugs, but all these compounds were found to be inactive. The project champion from MMV, Paul Willis, recommended a few "near neighbor" compounds that also looked interesting, and a number of these were made too and evaluated in this first round. One compound, OSM-S-9, was found to be more active than the original GSK hits. Sanjay Batra came on board the project and his student Soumya made (and is making) some analogs varying in the position of the fluorine atom, though the activity of those tested to date is low. (Sanjay works at the CDRI in Lucknow, India, where Saman Habib also works - Saman is leading the Indian OSDDm project which will hopefully get started soon). The outcome was that the original hits remained interesting (because of their reasonable potency and logP values) but that highly potent novel antimalarials were also being generated in this class. Thus a second set of compounds was synthesized and evaluated, giving rise to several new highly potent compounds, one of which (OSM-S-39) displayed a picomolar IC50 value. This is impressive given the small number of compounds made to date, and is testament to the quality of the hits contained in the original GSK set.

Original and Representative Potent Compounds in the Arylpyrrole Series

At this point the decision was made to take the most promising compounds on to advanced biological evaluation, to see what the promise of this class really is (rather than continue to increase potency through analog synthesis). Evaluations to date are as follows.

  • Metabolic and solubility assays: The two original GSK compounds (OSM-S-5 and OSM-S-6) and six other compounds made in this project were evaluated by Sue Charman's lab at Monash for their stability in phosphate buffer. The raw data are here and can be discussed here. The GSK originals displayed good solubility but moderate degradation rates. The other compounds were degraded more slowly but at a cost of low solubility. Subsequently the original GSK compound OSM-S-5 was evaluated for stability in human and mouse plasma. The compound was stable in human plasma but susceptible to hydrolysis in mouse plasma. Esterase activity is known to be higher in rodents than in other species, which was confirmed using a control compound in this assay (p-nitrophenol acetate), so the results are not too surprising. Lab book page here.
  • hERG: One of the original GSK compounds (OSM-S-5) plus one of the most potent novel compounds identified to date (OSM-S-35) were subjected to the hERG assay and passed, perhaps implying that this class of compounds should not exhibit undesirable cardiac side effects. Discussion page here.
  • Late Stage Gametocyte Assay: Four of the compounds have also been evaluated in a late stage gametocyte assay with very interesting results indicating unusually high activity in blocking the transmission of the parasite. The original GSK compound OSM-S-5 was inactive. Discussion page here.
  • In vivo: However, the two original GSK compounds as well as one of the most promising near neighbor compounds (OSM-S-35) were evaluated in mice and found to possess zero oral efficacy (results available here). Subsequent analysis of the plasma samples from the trial with one of the GSK compounds (OSM-S-5) showed that the compound was indeed orally available, but levels in the blood were not being maintained. Raw data here.

Other discussions of the biological results described above can be found here. The biological data require careful consideration of whether to change the focus of the project to another series or to continue altering the structures of the best compounds to overcome the in vivo roadblock.

A decision was taken to carry out a third round of analog synthesis and evaluation on the arylpyrrole series, with an emphasis on analogs a) with low logP and b) that lack the thiazolidinone heterocycle. A consultation was held asking for suggestions for the ten most appropriate compounds to make, and the ten most interesting for commercial procurement. The final stage of the consultation took place live on the web. As a result the lists were finalised; commercial compounds were ordered and synthesis commenced. In early November, the team received results from the biological evaluation of the commercial compounds and the synthetic compounds that had been completed, along with some analogues. This set of compounds was found to possess low to negligible levels of activity. This surprised the team but also provided some interesting insights. For example, a forked series, the pyrazoles, looked attractive, but so far all examples tested (e.g., OSM-S-92) have been found to be inactive. The next batch of third-round compounds was synthesised and evaluated in December 2012. As with the first batch, all of the compounds were essentially inactive (OSM-S-103 showed mild activity).

Representative Examples from the Third Round

Currently (Dec 2012) there remain a few compounds which need synthetic input from any labs interested in contributing. The project has to date had no luck in securing donations of the compounds from commercial suppliers. On considering some structurally 'similar' active compounds from the TCAMS set, the team have also decided to synthesise a few extra compounds plus hybrids, in order to assess their biological activity. An open consultation is set to take place on Monday 17th December in order to decide next steps.

What Are the Compounds Doing?

What might this series of compounds be doing to the parasite to kill it so effectively? It's not clear. The original screens were whole-cell assays, so while it is known that the compounds are effective, it's not known what they are doing in any detail. Iain Wallace from ChEMBL has performed a prediction of the biological role of these compounds (as well as predictions for the whole "Malaria Box", which is a set of compounds MMV are providing to people for antimalarial screening and which are the focus of a current round of Gates requests for proposals). Iain clustered the compounds as a similarity map, allowing visualization of the correlation between structure and predicted activity. Discussed also here. One of the predictions was that the compounds hit an enzyme known as DHODH, and GSK are at the time of writing screening some of the project's compounds against this enzyme. These predictions are made using informatics - a comparison of the structures of our compounds with other known compounds that have known activities. The argument is based on extrapolation. To evaluate whether the prediction is correct, a subset of compounds has been sent to Corey Nislow at the University of Toronto for screening in a yeast-based assay that does not identify for sure what the compounds are doing (which is very difficult) but provides harder biological evidence for a role.

How to Obtain Other Compounds

The original compounds from the GSK assay were commercially available, arising from libraries that are provided to larger companies by smaller specialised companies. In a medicinal chemistry project like this, one starts with a set of compounds, then sources further compounds that are similar. Novel compounds need to be made. Other compounds may be available by other means, however. Typically in academia required compounds are made in-house because academia works as a closed system, which is inefficient if relevant compounds exist elsewhere on the planet and can be sourced by other means more quickly. In this project organic synthesis of novel compounds is currently being performed in Sydney and Lucknow. But for known compounds the following needs to be done.

1) Identification of Commercial Compounds: What if some compounds we require are already commercially available? How can these be found? Iain Wallace was able to do a search of databases such as eMolecules for relevant compounds above a certain threshold of similarity and filter compounds by supplier, generating the "hitlist" quickly and with no manual human input. These can be converted into spreadsheets for quick visualisation - see here for examples on Jimmy Cronshaw's series (see below for these).

2) Obtaining Commercial Compounds: With the compounds identified, the relevant suppliers need to be contacted to ask for donations. That will probably not be trivial since it needs a human interaction. Failing that the compounds can just be bought.

3) Identification of Other/academic Compounds: What about compounds that could be useful to the project but which are not commercially available, e.g. compounds sitting in academic lab fridges? Some can be found using resources like SciFinder, though these require expensive subscriptions. Many compounds that might be perfect for the project may not even be in the published literature, which is another argument for openness in science.

4) Get Other/academic Compounds: This will be a case of manual contact with interested groups. One such enquiry has already been submitted, to see what happens. Naturally contributions are rewarded by possible authorship on resulting papers.
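The search in step 1 (keep compounds above a similarity threshold, then filter by supplier) can be sketched in a few lines. This is only an illustration under stated assumptions, not the method Iain Wallace used: it assumes each compound is already reduced to a set of structural feature labels and that similarity is scored with the Tanimoto coefficient, a common choice in cheminformatics. The names `tanimoto`, `build_hitlist`, `QUERY` and `CATALOGUE`, along with the catalogue IDs, suppliers and feature labels, are all hypothetical and are not real eMolecules data.

```python
from typing import Dict, FrozenSet, List, Optional, Set, Tuple

# ASSUMPTION: a compound is represented by a set of feature labels.
# Real searches would use computed molecular fingerprints instead.
Fingerprint = FrozenSet[str]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto coefficient: shared features / all distinct features."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def build_hitlist(
    query: Fingerprint,
    catalogue: List[Dict],
    threshold: float,
    suppliers: Optional[Set[str]] = None,
) -> List[Tuple[str, str, float]]:
    """Return (id, supplier, score) for entries scoring >= threshold,
    optionally restricted to the given suppliers, best match first."""
    hits = []
    for entry in catalogue:
        if suppliers is not None and entry["supplier"] not in suppliers:
            continue
        score = tanimoto(query, entry["features"])
        if score >= threshold:
            hits.append((entry["id"], entry["supplier"], score))
    hits.sort(key=lambda hit: hit[2], reverse=True)
    return hits


# Hypothetical query and catalogue, for illustration only.
QUERY: Fingerprint = frozenset({"pyrrole", "aryl", "ester", "fluoro"})
CATALOGUE: List[Dict] = [
    {"id": "CAT-001", "supplier": "Supplier A",
     "features": frozenset({"pyrrole", "aryl", "ester", "fluoro"})},
    {"id": "CAT-002", "supplier": "Supplier B",
     "features": frozenset({"pyrrole", "aryl", "amide"})},
    {"id": "CAT-003", "supplier": "Supplier A",
     "features": frozenset({"pyrrole", "aryl", "ester"})},
    {"id": "CAT-004", "supplier": "Supplier B",
     "features": frozenset({"thiophene", "pyrimidine"})},
]

if __name__ == "__main__":
    for compound_id, supplier, score in build_hitlist(QUERY, CATALOGUE, 0.7):
        print(compound_id, supplier, round(score, 2))
```

Lowering the threshold widens the hitlist at the cost of less similar compounds, and passing `suppliers` restricts it to vendors worth contacting in step 2.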

Ultimately, as a species, we are not very efficient at sharing valuable chemical resources. A user-driven list would be helpful - "I need compound X. I can buy it from you, or you can give it to me, or you can make it for me or with me, but I need the compound in timeframe Y." This is like a Molecular Craigslist, and would reduce some of the supply-demand barriers in chemical research.

Related Series

The arylpyrrole series was the first to be examined. Two other interesting-looking starting points from the GSK set, based on a triazolourea and a thienopyrimidine, were the focus of James Cronshaw's honours thesis at Sydney University. The triazolourea (OSM-S-56) was found to be even more active than the original GSK hit (TCMDC 134395). James has synthesised the thienopyrimidine series lead, and we are awaiting confirmation of the compound's antimalarial activity. If the lead compound is confirmed as active then it will be necessary to decide which, if either, of the series is to be taken on further. In general a strategy for the project is to decide as early as possible which series are looking attractive biologically, because there are still a lot of leads to pursue. If any of these structures look synthetically attractive to you based on your previous experience, please consider joining the project. Some of these molecules are quite straightforward to make and would be suitable for undergraduate lab classes.

Comments

Comments and input can be by Twitter, Google+, The Synaptic Leap or directly on the relevant Electronic Lab Notebooks. This page also has a "talk" tab for input. Or please write your own input somewhere (e.g., your own blog) and link to the project. Anonymity is perfectly acceptable, just less useful. Please avoid email.

Licence

The project's licence, unless otherwise stated, is CC-BY-3.0, meaning you can use whatever you want for whatever purpose, provided you cite the project.
