In October, an NIH team, led by Ian Hutchins, published their exciting new work, “Predicting Translational Progress in Biomedical Research.” The team built a machine learning system that detects whether a paper and its underlying research are likely to be clinically successful, i.e., lead to commercial products for human use. Specifically, the team created an AI-based, big-data model that evaluates research papers based on multiple metrics to determine if that particular study or innovation is likely to lead to “transformative clinical impact.” That likelihood of success is expressed as “Approximate Potential to Translate (APT).”
This technology has important implications for how IP is valued, because early publications often serve as the primary evidence supporting patent protection. Accordingly, if this machine learning system can reliably predict how successful a publication will be, the corresponding APT scores may prove equally valuable in predicting how successful the resulting patent will be.
There is an outstanding need in the pharmaceutical industry for a reliable means of identifying which recent scientific discoveries, e.g., target proteins and mechanisms, have the greatest potential. Current approaches are speculative and uncertain; even in the best circumstances, tracking the value of a new discovery has “required manual curation by subject matter experts.” Unfortunately, those determinations are further “complicated by the fact that it can take decades for a fundamental discovery to translate” into downstream commercial success.
To address that problem, Hutchins’ team developed and trained machine learning models to predict whether publications, and the inventive developments contained therein, will ultimately translate into successful therapeutics. The model draws on both the substance of a publication and its citation metrics. The substantive aspects include categorization and identification of the publication’s content, e.g., the Drug or Disease at issue. The citation metrics include the rate of citation of the publication (a “field- and time-normalized measurement of influence, [termed] the Relative Citation Ratio (RCR)”) and the substantive categorization of the citing publications. The resulting APT score represents the likelihood that a clinical product will result from the underlying research, with scores ranging from a >95% to a <5% chance of success.
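The structure described above, content features plus citation metrics fed into a model that outputs a probability of translation, can be sketched roughly as follows. This is an illustrative toy, not the NIH system: the feature names (RCR, share of clinical citations, a human-subject content flag), the synthetic data, and the labeling rule are all assumptions made for the example.

```python
# Hypothetical sketch of an APT-style predictor. Feature names, data, and
# labels are illustrative assumptions, NOT the NIH team's actual model.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n = 500

# Assumed features: Relative Citation Ratio, fraction of citing papers
# that are clinical, and a binary flag for human-subject content.
rcr = rng.gamma(shape=2.0, scale=1.0, size=n)
clinical_citation_frac = rng.beta(2, 5, size=n)
human_content = rng.integers(0, 2, size=n)
X = np.column_stack([rcr, clinical_citation_frac, human_content])

# Synthetic label: papers with high RCR and many clinical citations
# are marked as having "translated" into clinical use.
y = (rcr * (0.5 + clinical_citation_frac) + 0.3 * human_content
     + rng.normal(0, 0.5, size=n)) > 2.0

model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# An APT-like score: predicted probability of eventual clinical citation.
apt_scores = model.predict_proba(X)[:, 1]
```

In this sketch, the APT analogue is simply the forest’s predicted probability for the “will translate” class, which is how a classifier’s output can be read as a likelihood of success.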
But how reliable is this automated valuation of potential clinical, and thus market, impact? The authors focused on older papers for which historical information about downstream clinical success was available, but confined the machine learning models to only the information available within the first two years after publication. The predictions, based on that constrained dataset, were then tested against the actual impact of each paper. The results evidenced “surprisingly strong predictive power.”
Moreover, to validate the approach more generally, Hutchins and colleagues compared those predictions to post-publication peer reviews, to see if their predictions agreed with academic predictions on the same publications. Again, the machine learning method was validated, actually outperforming manual expert predictions.
The researchers, however, cautioned against over-interpreting such results: “the experts and the machine learning system are not answering exactly the same question, experts might identify translational progress not reflected in clinical citations. Nevertheless, by these limited measures, machine learning performs at least as well as expert peer-review ratings and has the advantage of being scalable to the entire PubMed database.” But that cautionary note also reflects an advantage of the associational reasoning of machine learning: it may pick up on indicators and trends of success that are counter-intuitive or non-obvious. And while machine learning is typically a black box, where those counter-intuitive factors “are not immediately obvious[,]” the specific system used here – Random Forest machine learning – allows those hidden features indicative of predicted success to be identified.
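The interpretability point above can be illustrated concretely: a trained Random Forest reports per-feature importances, so a non-obvious predictor can be surfaced and examined rather than left hidden. Again, this is a toy sketch on synthetic data; the feature names are assumptions for illustration, not the NIH model’s inputs.

```python
# Hypothetical illustration of Random Forest interpretability: the forest's
# feature_importances_ reveal which inputs drive its predictions.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)
X = rng.normal(size=(400, 3))      # assumed columns: RCR, clinical-citation share, pure noise
y = X[:, 0] + 0.5 * X[:, 1] > 0    # label depends only on the first two columns

model = RandomForestClassifier(n_estimators=100, random_state=1).fit(X, y)
importances = model.feature_importances_

# The informative features should rank well above the noise column,
# exposing which signals the model actually relies on.
for name, imp in zip(["RCR", "clinical_citation_share", "noise"], importances):
    print(f"{name}: {imp:.2f}")
```

Running a check like this against a fitted model is how counter-intuitive but predictive signals could, in principle, be pulled out of the forest and scrutinized by human experts.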
Impacts
Drug Development: Hutchins’ system, and more broadly the class of automated, big data-driven predictive models soon to follow, has strong potential to impact the arena of drug discovery. “Few signals of success have been available at early stages of [the research] process, when attention from researchers and/or intervention by funding agencies could have the greatest impact. By integrating information about the number and type of citations a paper receives, we demonstrate here that a machine learning system can reliably predict the successful transfer of knowledge to clinical applications.”
In short, this new technology provides a new metric for, and the first robust quantitative evaluation of, recently discovered drugs, proteins, and target mechanisms. New drug candidates often pose the greatest risk because the large requisite R&D investment is coupled with highly uncertain future commercial success. This system addresses that problem: the APT scores “serve as a particularly valuable means of assessing the potential outcomes of fundamental research, which takes longer to be cited by a clinical article.” Companies may be expected to use this data to target IP and expand their portfolios with nascent but promising developments. Alternatively, pharmaceutical companies may use it to guide their own research and development programs.
Transactional: IP valuation traditionally relies on economic analysis paired with expert opinion. But such methods, especially when directed at recent developments, can be somewhat speculative. The machine learning approach acts as a sword, cutting down overly “hyped” IP, and as a shield, protecting a valuable contribution even if fundamental in nature (i.e., early in research development). That implies far-reaching effects in transactional IP practice and, by the same reasoning, in the evaluation of IPOs. Beyond that added value, such automated practices may prove significantly less costly than identifying, retaining, and soliciting opinions from experts.
Litigation: This type of quantitative assessment provides a novel metric that is ripe for use in litigation. It offers additional evidence for an offensive expert report, e.g., in determining a reasonable royalty. It also provides a concrete, statistical means to challenge an opposing expert. Because robust predictive accuracy can be achieved with only two years of post-publication data, even litigation that arises just a few years after the initial discovery (or patent protection) can take advantage of this resource. APT scores may become a standard part of the expert toolkit.
What to look forward to: A machine learning system geared specifically toward “first-in-class” drugs, those innovations that upset the status quo, is the likely next step. By re-training the model on the “more granular properties of translation,” blockbuster drugs, or more precisely, those early mechanisms or compounds with the potential to become blockbuster drugs, can be quickly targeted.
Predicting exactly how this model will evolve in the next few years is difficult, but its uses will grow, and its reliability along with them. Because the “machine learning framework was designed to be flexible enough to tailor data profiles that enable prediction for a wide variety of biomedical research outputs and outcomes,” it can be tuned to whatever a company’s specific needs may be, discovery or defense.
Lastly, while the system in Hutchins’ work does not grapple with the likelihood of invalidity, it nonetheless serves as a strong quantitative tool for valuing a patent’s content. But future systems may do just that, incorporating indicators of a patent’s likely success. In fact, the paper itself contemplated training a system on “alternative data profiles” to make predictions specifically for “the development of intellectual property.”