As a drug discovery research scientist tasked with finding new ideas for drug programs in a particular therapy area, one of your prime responsibilities is to be on top of all the science and literature, all the time. Innovation comes from knowing what everyone else knows, and thinking differently to exploit that knowledge for a proprietary edge. But how do you keep on top of what everyone else knows with the ever-increasing rate of publications, data releases, patent filings, and competitor and market developments? How do you ensure you don’t miss something obvious? The fear of missing out on vital information is a significant concern.

Drug discovery is a long and complex process, and the single most important step is right at the start: choosing the right target. How do you gather and consider the data and evidence that are vital in setting out the plan, and then turn that idea into a drug program? Drug discoverers engaged in this work have more than one thing to do. Finding ideas is only the start; they also have to work out how to test those ideas, map a path to therapeutic design and testing, and get them to candidates, to the clinic, and beyond. All of this will need to be presented to internal governance meetings for milestone and investment decisions, and ultimately to the regulators. These steps all take time and add complexity and potential distraction, but they are essential.

It takes more than having a good idea to find a drug, so the more time a drug discoverer can save in comprehensively identifying all the relevant information, evidence and data, the better.

Semantics in Biomedical Data

Most scientists in this early drug discovery role face a range of public, proprietary, and other sources of information that they need to navigate and query successfully. The reality of this is a browser window full of tabs, where the researcher enters similar queries into each one, gathers the results, and tries to interpret them and build out the evidence that they’re looking for. This is already complex enough, but the platforms a researcher has access to do not all work in the same way: there is little transparency around gaps in coverage, methods of searching can vary, and naming conventions and data structures can be significantly different between platforms.

This latter point is important. The semantics of naming in biomedical data is complex, confusing, conflicting, evolving, and handled in different ways by different evidence and data resources. As an example, the liver condition MASH (metabolic dysfunction-associated steatohepatitis) was until very recently known as NASH (non-alcoholic steatohepatitis). MASH also sits within the broader condition MASLD (metabolic dysfunction-associated steatotic liver disease), itself previously known as NAFLD, and the narrower and broader terms are sometimes used loosely in place of one another. In disease indications where research is rapidly advancing, the naming and relationships of the processes involved change too. For researchers in the field this may be obvious, but querying different data and evidence sources with all the possible combinations and permutations of synonyms, so that all the relevant data and content is returned, becomes very complex indeed.
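To make the synonym problem concrete, here is a minimal Python sketch of how a single concept can be expanded into a Boolean query. It is an illustration only: the `SYNONYMS` table, the `expand` and `build_query` helpers, and the AND/OR query syntax are all assumptions made for this example, and they do not represent how any particular platform or vocabulary service works.

```python
# Hypothetical, hand-written synonym table for illustration only.
# A real system would draw on a curated, maintained vocabulary.
SYNONYMS = {
    "MASH": [
        "MASH",
        "metabolic dysfunction-associated steatohepatitis",
        "NASH",
        "non-alcoholic steatohepatitis",
    ],
    "fibrosis": ["fibrosis", "fibrotic"],
}


def expand(term):
    """Return the known synonyms for a term, or the term itself if unknown."""
    return SYNONYMS.get(term, [term])


def quote(phrase):
    """Wrap multi-word phrases in double quotes so they are searched as a unit."""
    return f'"{phrase}"' if " " in phrase else phrase


def build_query(concepts):
    """OR together the synonyms within each concept, then AND the concepts."""
    groups = []
    for concept in concepts:
        alternatives = " OR ".join(quote(s) for s in expand(concept))
        groups.append(f"({alternatives})")
    return " AND ".join(groups)


if __name__ == "__main__":
    print(build_query(["MASH", "fibrosis"]))
```

Even with two concepts and a handful of synonyms the query is already long, and each platform may expect a different syntax for it, which is why maintaining these expansions by hand across many tabs quickly becomes impractical.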

Building out what a possible new drug program looks like

For most users, gathering the data and evidence for early drug discovery is a very time-consuming and frustrating process. How do you know that you’ve found all the evidence that is out there? How do you know whether you’ve formulated your query correctly for that platform? How can you be sure that you haven’t missed anything ‘obvious’? You had some training on this platform six months ago, but how confident are you that you have made the best of what is available? Why can’t you do this in one place that covers all the literature and evidence you can access?

Well, there is one place, and it’s RightFind Navigate. By bringing together publicly available resources (e.g., ClinicalTrials.gov), licensed content sources, and any internally created proprietary information a researcher has access to, and by utilizing semantics and synonym management from SciBite, it is possible to systematically search multiple sources of evidence with comprehensive queries in one place. There’s no longer a need for ten other tabs, or for reconstructing complex queries and hoping that the right words have been used in the right order.

A researcher can have more confidence that everything that can be found has been found, and can switch from finding evidence to understanding it and building out what a possible new drug program looks like. RightFind Navigate helps save time and frustration, allows researchers to get on with drug program building with confidence, and helps maximize company ROI on proprietary data by putting it to work exactly where it is needed.

Author: Bryn Williams-Jones

Bryn Williams-Jones is a drug discoverer, and COO of Connected Discovery. He has over 30 years of experience in biochemistry, bioinformatics, applied AI, machine learning and data science. Previously SVP of Drug Discovery at Benevolent AI, he was also CEO of the OpenPHACTS Foundation providing open semantic data services to drug discovery, and spent 20 years at Pfizer Global R&D.