Enhancing the Application of Real-World Evidence In Regulatory Decision-Making DAY 1



– Good morning everyone, we're
going to try to get started in just a minute, so
I'd like to ask you all to head to your seats please. This group got very quiet
very quickly (chuckles). Alright, well good morning, as I said. Welcome to today's conference
on enhancing the application of real world evidence in
regulatory decision making. I'm Mark McClellan, I'm the director of the Duke Margolis
Center for Health Policy and we appreciate all of you
making the effort to be here in person, I know many people
are also joining us online, we thank you all for joining us as well. The topic that we're dealing with today is a critical one. It's about advancing the nation's health
information infrastructure in a way that supports better
evidence in a wide range of clinical and health
policy decisions, today focusing on the use
of real world evidence in regulatory decision making. For some time there has
been growing interest and growing capabilities for
using the electronic data that are increasingly part of
our healthcare delivery system as well as data that are
increasingly generated by patients and others touching
on our healthcare system to improve our understanding of what works and what doesn't work
in healthcare delivery, and in terms of medical technologies. Evidence generated from these data, often called real world
evidence, is helping to bridge the knowledge gap
between clinical research and clinical practice to
provide a much richer basis, potentially a much richer evidence base for healthcare decision making. And a range of groups have
developed a growing experience with applying real world evidence in the post-market setting for
a number of different uses, such as payment for clinical care and quality improvement
activities, but the application of real world evidence in
the development of drugs and devices and in
regulatory review is an area that hasn't been as much of a
focus, and where there appears to be a considerable
opportunity for innovation. Real world evidence has been used by medical product regulators
at FDA, mainly in the context of safety surveillance
activities, but more efforts are needed to explore
how real world evidence could be incorporated into
the regulatory framework and become a stronger source of evidence for regulatory decisions. The application of existing
and emerging approaches to real world evidence
generation consequently may offer new pathways to
address regulatory requirements, and as well, important
means of meeting the needs of post-market regulatory decision makers through better evidence. The conference that we're going
to hold today and tomorrow to discuss these issues
builds on an expert workshop that we convened earlier this
year, focused specifically on how real world evidence
can be used to approve a new indication for an
already marketed drug. This is certainly not the
only regulatory application of real world evidence, many
activities are underway, as I mentioned earlier in
supporting safety surveillance, and potentially other
insights about drug labels. And there are a lot of activities
underway at the Center for Devices and Radiological Health as well, but this is a good focus
area for illustrating some of the challenges
and the opportunities to build on existing systems
and knowledge and opportunities to make real world evidence more relevant to regulatory decision making. During today's discussion,
and in follow on meetings, we intend to cover not only this example, but other examples where
real world evidence could be applied in
regulatory decision making to promote the safe and effective
use of medical products. We expect that coming out of this effort will be additional
papers, further development of case-studies for
applying real world evidence in regulatory decision making,
and other activities as well, so we appreciate you
joining us this morning for this important step
on what we think will be a very productive and extensive
path towards better use of real world evidence within
FDA's regulatory framework, and the exploration of policy options to enable further progress. There is a lot going on in
this space, as I mentioned, a lot of potential to improve
the lives of patients, and having you all with
us today is going to be an important step in
making all that happen. Before we get started I'd
like to briefly go over our agenda, and then a
few housekeeping items, we're going to have four sessions today. Session one is going to
provide a broad overview of the current landscape
of real world evidence and identify some potential applications in regulatory decision making. It's gonna consist of a
couple of presentations and then some time for
discussion with a range of expert perspectives and all of you. We're gonna use that
opportunity to explore the different views that stakeholders have on real world evidence as well as areas of common understanding
around terminology, what we mean by real world evidence, the evidentiary gaps
that real world evidence can help address, and
ways that it can be used more routinely to support
evidence development in the regulatory context. Then in session two, we'll
introduce a use case, a regulatory use case for
approving a new indication of an already marketed
therapeutic, along with some other potential
use cases, all focusing on when randomized designs, as opposed to other methodologic
approaches, should be employed to generate real world evidence
for regulatory requirements. Session three is gonna
build on session two, with a focus on the data, study design and other methodologic considerations for research protocol
development and implementation. And again, session two is
going to look at the question of when randomized approaches can be used with real world evidence. Session three will focus more on how to carry out these studies. Then in the last session
today, session four, we'll identify some
areas of drug development and regulatory science that
could potentially be enhanced through better, more routine application of non-randomized real world data. Altogether the goal of these
sessions is to highlight key opportunities and practical challenges for our policy development, and
that's going to be the focus of tomorrow's discussion,
on policies related to real world evidence use. Before we start, as well,
a few housekeeping items, this is a public meeting,
the event is being broadcast online right now, so everything that we discuss here today
is part of a public record. As you'll note in the
agenda, there are a number of panelists who are going to help us lead each of the sessions with
some opening comments. We are going to set aside
time for broader discussion with these participants in each
session, so for those of you who are here in attendance
there will be a couple of microphones set up, I
think towards the front, actually I guess we've got
microphones on the tables, so let's be sure to use
those when you have a comment to make sure that everybody in the room and online can hear your comment. For those of you who are up here speaking, one of our staff members
Pranav Auror, Pranav, say hi, is going to help us keep track of time during the presentations. And I think that covers all
of our logistical issues to get started, so now I'd
like to actually get going and introduce my colleague
Greg Daniel who's going to start session one with
an overview presentation on what real world evidence
is, some definitional issues and opportunities for
applying this kind of evidence in regulatory decision making. Thanks Greg. – Thank you Mark, and let
me echo Mark's sentiment on welcoming all of you
here to what will be an important discussion
over the next two days. I'd like to kick things off with some, just basically, what
is real world evidence? And what do we mean by that term? As it will inform a lot of the discussion that we'll be having today. So first of all, so why are
we talking about it now? The concept of utilizing
observational research data from across the continuum of healthcare to inform decision making
is not new, but recently there's been a growing importance of using real world evidence
and growing opportunities to do so, so what tends to be
driving this is that we have the availability of
digital health data across the healthcare setting, from claims to electronic medical records, data coming from patients themselves,
are increasingly growing, in parallel to that we're
also struggling with costs across our healthcare
system, so drug discovery and development is longer and more costly, which is resulting in a
growing public attention on the cost of drug development
but also on prices as well. Providers and payers are
struggling with figuring out how to best design the
healthcare system from a payment and care perspective on moving
towards value and outcomes and quality, versus volume of care. And then patients themselves
are becoming increasingly more sophisticated, they're
becoming more educated about their own conditions
and they're wanting more information, and so all
of these things are driving the need for more
evidence, better evidence that can support decision making. And so learning from the
real world experiences of patients as they pass
through the healthcare system can help begin to fill some
of these gaps that oftentimes are left after clinical trials are done, and there are questions
about how products perform in terms of quality and safety in a range of patient populations. So real world evidence has
value for stakeholders. And, opp– Oh this is a (chuckles). Opportunities to obtain
more precise information on which products perform
better in which patients. Real world evidence can
also help bring forth important evidence on the
costs, value and safety across the healthcare system,
and generally developing real world evidence can come
at a lower cost potentially, but I think you'll hear some
discussion around that today. And each stakeholder group
has an opportunity to benefit from some of these, from
real world evidence, for example on the
industry side, we'll hear some of that today, but generally
learning new information about how available therapies
that are already on the market might perform, and might benefit
other patient populations, as well as identifying gaps
where perhaps innovation is needed, real world evidence
can get a great assessment of how current available
therapies are addressing the needs out there, and where there are opportunities
for new innovation. Payers are also leveraging,
already, this data to help inform coverage and
reimbursement decisions, to help guide programs that are
aimed at quality improvement in support of increased value for payers and providers. As we move more towards payment models that align payment with value, data in the real world setting can be critically important
to help understand how these products are
delivering on value. And then patients themselves,
and the interaction between patients and their
providers, can be informed with better evidence on
what patients can expect from available therapies. And then regulators, you
know we'll hear a lot today about the opportunities
for real world evidence to benefit FDA and the
potential uses of that. But FDA's already using
real world evidence as well, safety, surveillance, through
the Sentinel Initiative is leveraging large
amounts of data from across the healthcare system
to more rapidly identify safety issues, and
increasingly there's interest in looking at how these
data can inform other kinds of decisions that FDA engages in. So there's generally consensus about what real world evidence means; groups like ISPOR and NEHI and others have put forth definitions,
have begun to define terms like real world data
versus real world evidence. And the baseline definition essentially is evidence generated from data collected outside of conventional
randomized controlled trials through appropriate
real world study designs and methodologies. And generally most agree that
that's a reasonable definition for real world evidence,
but the challenge is that this definition is broad but limited, and doesn't necessarily
capture all of the complexity of evidence development,
the users of the data and the methodologic considerations
that go into developing robust reliable real world evidence. So that's where I'd like to
start, to perhaps unpack, what we mean by real world
evidence and go through with a cleverly developed figure here on how best to look at it. So real world evidence is
not data in themselves, you need data from the real world setting to begin developing evidence, and so where's this data coming from? In this figure we can see we sort of have outpatient clinics,
health plans, hospitals and increasingly patient
networks, so these exist across the United States in various
pockets, but these data are routinely generated
and are made available for developing evidence. So the users of these data include, individual stakeholder activities, so health plans themselves,
patient networks themselves and hospitals themselves
can use their own data to develop evidence and
they're increasingly doing so. What we're also seeing are
large networks or systems that are beginning to, and
have been for some time, linking data from multiple health plans to hospitals bringing in
patient data, in the examples I have here, NIH Collaboratory,
PCORnet, Sentinel and also on the far right
IMEDS, are using these data to develop very specific applications for generating evidence and
developing infrastructure with common data models and
the ability to more routinely tap into these data systems
and generate valuable evidence.
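To make the common data model idea concrete, here is a minimal sketch, with invented table layouts and counts, of how a distributed query can work once every data partner has mapped its local data to the same column names: a small query definition runs locally at each site and only aggregate counts are returned to the coordinating center. This is an illustration of the general pattern, not the actual Sentinel, PCORnet or IMEDS implementation.

```python
# Illustrative sketch only: a "common data model" means every site exposes the
# same table layout, so one query definition can run unchanged at each site.
import pandas as pd

# Hypothetical dispensing tables, already mapped to shared column names at each site.
site_a = pd.DataFrame({
    "patient_id": [1, 2, 3, 4],
    "drug_class": ["anticoagulant", "statin", "anticoagulant", "statin"],
    "event_flag": [1, 0, 0, 0],   # e.g., a bleeding event recorded after dispensing
})
site_b = pd.DataFrame({
    "patient_id": [10, 11, 12],
    "drug_class": ["anticoagulant", "anticoagulant", "statin"],
    "event_flag": [0, 1, 0],
})

def local_query(dispensings: pd.DataFrame, drug_class: str) -> dict:
    """Runs at each data partner; only aggregate counts ever leave the site."""
    exposed = dispensings[dispensings["drug_class"] == drug_class]
    return {"n_exposed": int(len(exposed)), "n_events": int(exposed["event_flag"].sum())}

# The coordinating center pools the aggregates without seeing patient-level records.
results = [local_query(df, "anticoagulant") for df in (site_a, site_b)]
totals = {key: sum(r[key] for r in results) for key in ("n_exposed", "n_events")}
print(totals)  # {'n_exposed': 4, 'n_events': 2}
```

The design point worth noticing is that the query travels to the data rather than the data traveling to the query, which is a large part of why such networks can be used routinely.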
So one opportunity for real world data themselves to inform evidence development is actually by making traditional clinical
trials more efficient, and so what I depict here on
the bottom is the spectrum of evidence that many
of us have seen before, but ranging from traditional
adaptive clinical trials all the way through to
retrospective database analysis and case reports, and all of these studies lie on a spectrum of
evidence that these data can be useful in supporting,
so as I've mentioned, these data can help traditional
randomized controlled trials by becoming more efficient,
but new applications of these data, using large simple trials, pragmatic clinical trials, randomization in the clinical setting,
prospective observational studies, retrospective database
studies, are being generated by these large networks,
and so, on Sentinel, don't get hung up on these cones, they're my attempt at illustrating where these applications sit, or the general types of study designs that these applications are utilizing. So Sentinel routinely leverages databases doing retrospective
analysis on safety concerns, but even Sentinel, as you
look at sequential analysis over time, we're getting
more into prospective looks at these data, so you can
see, I'm sort of trying to overlap here a little bit, and PCORnet and NIH Collaboratory, while they can also do retrospective database analysis, my point here is just that the focus is on doing more prospective
observational studies as well as randomized designs. But over on the far
right I want to emphasize that you don't need these large networks in order to do these kinds of studies and to bring viable evidence
forth, but the growing interest in leveraging multiple data
systems and developing this, quote unquote, learning healthcare system that can have multiple
users and multiple uses to develop evidence is growing. I think you can perhaps see, okay, so data, users, methods,
all of these encompass what we mean by real world
evidence, and I'm not sure if this is shown on the screen,
actually the most important part of this figure can't
be seen on the screen. But I can see it.
(audience chuckles) (laughs) 'Cos it's way at the bottom. And so what it shows is that
what's real world evidence and what's not, okay, and that's the critical important piece here. And so what we have is that
generally, what we mean by real world evidence is
inclusive of, so it's basically a bar that goes under everything from the large simple trials
and pragmatic clinical trials all the way through
retrospective database analysis and case reports. Real world evidence does not
generally include traditional and adaptive randomized controlled trials, even if real world data elements
can support those trials, those trials themselves generally are still tightly controlled,
have more narrow populations and don't necessarily reflect what happens in the real world setting,
whereas pragmatic trials, observational studies and
retrospective database analysis and case reports generally
do a better job of reflecting what's really happening
across the healthcare system in the real world setting. So as I mentioned, even
within traditional randomized controlled trials, real
world data elements can provide faster, perhaps more targeted, more efficient recruitment of participants in the study, potentially reduce the burden of data collection, depending on what data are being collected and whether there are valid and reliable methods for collecting data that would be sufficient for randomized controlled trials, and potentially reduce the time and cost of trials because of those efficiencies. But data can also be used
to generate new evidence on products as I depicted
in the previous figure. So as I leave this presentation
I wanna touch briefly on, so when we talk about
the FDA or regulators having increased opportunities
to leverage real world data, we're not necessarily just
saying that FDA can approve a brand new drug on the market
utilizing real world data, there are a range of decisions
that regulators make, that are decisions on
products that are even already on the market, so for example, a product that's already available on the market, and has evidence supporting its use, might have new populations
and so a new indication might be one kind of
decision that the FDA makes. Changes to the label even,
including more information or removing information, a
range of post-market commitments and phase four confirmatory evidence, and also safety surveillance. So there's lots of opportunities
that FDA can look at, where, if and when real world evidence might support those decisions. But it's important to go
through, and we'll do that today and tomorrow, on what
are the considerations with regard to the data
themselves, the validity and reliability of those data
and how they're collected. The outcomes or endpoints
that are being sought, if there are endpoints
that can be measured with these data sources,
or perhaps they can't. It's those kinds of considerations
that need to be addressed when thinking about
potential opportunities. And also randomization versus
not, when is that important? When should we do it? Do we always have to do it? Those are the very
important considerations and worthy of, certainly, the discussion that we'll have today. But I also wanna foreshadow into tomorrow, that even as we discuss
opportunities for FDA, they're not the only stakeholder
utilizing, or potentially utilizing, these data as I've mentioned, and making sure that we're building a
sustainable infrastructure to support its use will be important. And so I have four points here
that came from a recent paper by Rich Platt and Jeff Brown
that talk through what are the fundamental components
of an ideal infrastructure, it needs to be trusted and
valued by all stakeholders, it does need an appropriate governance and sustainable business
model for its use, and it also needs to be
adaptive and evolving so it can be responsive to the
needs of the decision makers. And also capable of
engendering what they've called a virtuous cycle of health improvement, so it's one thing to
generate the evidence, and even my figure kind
of stops at the evidence, but the most important thing
is how do we use that evidence? How do we use that evidence
to inform decisions and improve care? So as we go through that,
there are lots of questions that need to be tackled just on that, what are the appropriate business models? How can we improve the data
systems to bring more value back to the stakeholders
that are participating? And are there specific
policy levers or incentives that can encourage improved
and increased development of such an infrastructure? Thank you. (audience applause) – I wanna thank Greg for helping us set the context and
provide some definitions. For those of you who could not
see the whole slides earlier, I think we've got the slides
re-sized now on the screens, and the ones that Greg
presented will be available as part of the resources
for this meeting afterwards. As Greg noted, it is
possible through so called real world evidence, which
he thoughtfully defined, to obtain data, real world
data, on a broader range of patients and care
settings and outcomes, including long term outcomes,
than might be possible in traditional, tightly controlled trial settings. That leads, though, to a
spectrum of design issues, and the opportunities
for learning about issues that cannot be addressed as easily in tightly controlled
trial settings, where those are not feasible, but there are some very important methodologic considerations and that's what we're
gonna focus on today: how to develop and use this evidence as effectively as possible. Right now I'd like to ask Janet Woodcock, the director of the Center for Drug Evaluation and Research at the FDA, to provide some context-setting remarks
as well, thanks Janet. – Thank you Mark, and
good morning everyone. I'm gonna talk about FDA's
use of real world evidence, both sort of current, and then future. And Greg gave you some nice definitions, the definition that was put
up there is more or less the classic definition,
where real world evidence is gathered from the data
generated in healthcare. But I think we have to realize
we're in an evolving situation with regard to data
gathered from healthcare. And data gathered from
healthcare has always had one characteristic,
it's not been very good, okay (chuckles). It's been confounded because
of its use in billing, it's had all sorts of other problems, it's had a lot of different
problems with the data standards or what people are calling things. But we're moving away from
that, with the implementation of electronic health records and so forth. We actually have an opportunity to sort of redefine real world evidence
and make it more inclusive as these other types of
data become available. Now real world evidence,
as classically defined, has long been used by the
FDA in regulatory decisions, of course, and the most obvious
one being adverse event reports and case reports that are coming out of, or reported out of, healthcare, about things that happened in healthcare. But also, we've used natural
history data for a long time, including as controls,
and that may have come from registries or just
from data gathered out of the charts and experience of patients who had a rare disease, for example. There we'd use observational studies, both retrospective types
and prospective types of observational studies, where those data are collected, deliberately collected, but from health records. And we've used registries extensively, especially the Device Center. So we are very familiar
with these data, both the, you know, warts and all shall we say. More recently we've had
the Sentinel system, which is an attempt to institutionalize and use emerging digital data that exists, primarily the claims data, but with some electronic health
records put in there. And we're closing in on
up to about 200 million lives of that kind of data,
alright, claims data. And that is both
retrospective type of analysis and prospective analysis,
and we're trying to more institutionalize the
prospective analysis out of that. But now FDA is trying to think
about, okay, well we built this infrastructure with
Sentinel common data model, PCORI's built some as well, we have the Collaboratory. Can we use the infrastructure
built by all this to answer, or investigate at least, other problems, maybe answer other questions. So we are going to step into that as well, it won't be Sentinel,
but it will be utilizing the infrastructure that we
have that links these data, the common data model and so
forth, to experiment on how this type of evidence performs
in answering questions other than safety questions. And also because we'd
like to extend this more to the electronic health
data and what's in there, we're investigating can we link that with PCORnet, and with other information that's around, and on the devices side, of course, they're really exploring linking it to registries, other sorts of data; there's also death records, there's all sorts of things that aren't captured in the healthcare system always, but that we'd like to know as outcomes that could be linked. So an important new area
that we're just starting to explore I think, and
it has been explored with the Collaboratory more than anywhere, is really, and this is sort of the hybrid,
it's on the far right, or left, depending on how you're
looking at it (chuckles), of Greg's diagram, and okay it's not classic randomized controlled
trials that have their own case records, over here
there are case report forms and their own protocols,
and they pull the people out of the healthcare system basically, they have investigators doing these things to them and so forth. This is, can we randomize people
within the healthcare system and do a trial inside
the healthcare system, utilizing the data collection methods of the healthcare system,
and then put that information together, randomized information,
where we have a better way to link the causal inference
because we've randomized the people, and learn that way.
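As a rough illustration of the design being described, randomizing within the healthcare system and then reading outcomes back out of routine records, here is a minimal sketch with made-up patient identifiers and outcome flags; the function names and the EHR lookup are assumptions for illustration, not any agency's or network's actual tooling.

```python
# Minimal sketch: randomize at the point of care, then ascertain outcomes from
# routinely collected records instead of dedicated case report forms.
import random

random.seed(42)  # fixed seed so the illustration is reproducible

def randomize_at_point_of_care(patient_ids, arms=("drug_A", "drug_B")):
    """Assign each consented patient to an arm with equal probability at the encounter."""
    return {pid: random.choice(arms) for pid in patient_ids}

def ascertain_outcomes(assignments, ehr_outcome_flags):
    """Look up each randomized patient's outcome in routinely collected data."""
    counts = {arm: {"n": 0, "events": 0} for arm in set(assignments.values())}
    for pid, arm in assignments.items():
        counts[arm]["n"] += 1
        counts[arm]["events"] += ehr_outcome_flags.get(pid, 0)  # 1 = event recorded
    return counts

# Hypothetical data: eight consented patients and an outcome flag drawn from the EHR.
patients = list(range(1, 9))
ehr_outcome_flags = {1: 0, 2: 1, 3: 0, 4: 0, 5: 1, 6: 0, 7: 1, 8: 0}

assignments = randomize_at_point_of_care(patients)
print(ascertain_outcomes(assignments, ehr_outcome_flags))
```

Because the comparison groups are formed by the random assignment rather than by clinical choice, the usual confounding-by-indication problem of purely observational data is avoided even though the outcome data come from ordinary care.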
And so that's what we're really interested in, and I think we're also interested
in talking about here today. It's terrible if drugs have really terrible side effects, but those are usually easy to find, because they're very dramatic, and it's wonderful if drugs have really huge treatment effects, where the people are cured, but that really isn't the problem with most drugs in determining whether they work; most of them only have an incremental effect, not everyone who's exposed gets the benefit of them, and so that's where you need randomization, because you don't just have something that hits you in the face. So it's really clear though,
when drugs are developed, and other medical products
as well, and then once they're approved and get on the market, despite the huge investment
pre-market in these trials, and learning as much as
possible, there's a huge number of questions that are still not answered, about how to use the products, and this is very
disappointing to all of us. And of course the answer
can't be, well we'll answer all of this pre-market,
because the market, the healthcare system is
changing, and where new drugs are coming on, other new interventions, and that's the system that
you put the new drug into. So you really need to be
able to answer questions in a robust way, after
drugs are on the market, how to use the drug better,
the place in the armamentarium, and that's who it works
in, who should start on a different drug, what
combination should be used, often we don't know all that,
any of that information. And then, particularly for this
conference, additional uses, so what additional uses can it be put to, typically we approve a drug out
there, people start using it for all, sort of related,
but not exactly the people who were in the trial kind of conditions, right, and not indicated on the label. Now how can we quickly
and efficiently find out is it really worthwhile for
those people to take that drug? Or should they be doing something else? And that's where we think
that perhaps randomized trials within the healthcare
system, if we can get that up and working, could efficiently
answer such questions without costing, you
know, 500 million dollars and taking a very long
time, and actually give us the answers that we want,
instead of having a huge number of people exposed to a drug before we ever find out,
after ten years or so, whether the juice is worth the squeeze. So that's different
really than what, quote, real world evidence has
been used for before. But here we're talking about a new kind of real world evidence that
we haven't been talking about before, where we perform
randomization or other activities that allow us to trace
back the drug effect and clearly isolate it. But it doesn't have all
the apparatus and trimmings of a conventional
randomized clinical trial, but it also has enough rigor that we know the data actually has some validity, and I'll talk about
that a little bit more. So currently for drugs,
have we ever done this for an indication? Yes, (chuckles) but it's very rare, usually for rare diseases, but
we may get convincing reports out of healthcare about
a response that is well enough documented, that we would say, add an indication, something like that. With devices it happens more
often, there's a lot more, for some of the devices
there's a lot more face validity (chuckles)
to the effect, right, and devices also have a
lot of rapid turnaround in their lifecycle, and they're evolving, and so this is very pertinent
for the medical devices. So, in the future, maybe the
second and third et cetera indication, if it's
related or it's smaller or it's something that's
generated out of healthcare that the drug might be
used in, you could do a randomized controlled trial, but do it, but do it in really utilizing the tools of the healthcare system,
that's the thought. And what has to happen in
order for that to be acceptable for all of us, as acceptable evidence? For some things, as the Collaboratory has shown, cluster randomization is a helpful tool, where you're doing a comparative study, and you wanna know, should we use this intervention in these people or should we use that intervention in these people? Both of them are out there, they're both considered acceptable, as
I said at the beginning, often we just don't know
which one might work better. So different types of
cluster randomization often work very well in comparing two different standards of care. Although there's always
controversy about this when one turns out better than the other, people then say, well,
they all should have been on the better one (chuckles). But of course you didn't
know that at the start; that's the precondition for equipoise, that you would actually do this study.
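For readers less familiar with the design, here is a minimal sketch, using invented clinic names and counts, of what cluster randomization looks like in this kind of comparison: whole sites rather than individual patients are randomized between two accepted standards of care, and results are then pooled by arm. It illustrates the mechanics only, not any Collaboratory protocol.

```python
# Illustrative sketch: cluster randomization assigns whole sites (clinics,
# hospitals, wards) to one of two accepted standards of care.
import random

random.seed(7)  # fixed seed so the illustration is reproducible

clinics = ["clinic_A", "clinic_B", "clinic_C", "clinic_D", "clinic_E", "clinic_F"]
random.shuffle(clinics)
half = len(clinics) // 2
assignment = {c: "standard_of_care_1" for c in clinics[:half]}
assignment.update({c: "standard_of_care_2" for c in clinics[half:]})

# Hypothetical routinely collected results per clinic: (patients treated, events).
observed = {
    "clinic_A": (120, 9), "clinic_B": (95, 11), "clinic_C": (150, 10),
    "clinic_D": (110, 8), "clinic_E": (130, 14), "clinic_F": (105, 7),
}

def summarize(arm_name):
    """Pool counts across the clinics randomized to one standard of care."""
    sites = [c for c, arm in assignment.items() if arm == arm_name]
    n = sum(observed[c][0] for c in sites)
    events = sum(observed[c][1] for c in sites)
    return n, events, events / n

for arm_name in ("standard_of_care_1", "standard_of_care_2"):
    n, events, rate = summarize(arm_name)
    print(arm_name, n, events, round(rate, 3))
```

A real analysis would also have to account for patients within the same clinic being more alike than patients across clinics, which is one reason these designs need their own methods.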
So it's also, I think, possible, in some settings where the natural history is pretty clear, to do prospective open-label
studies for new indications, that's where we don't need
that control necessarily, you can use what you know
about the natural history, so these are typically
done in the same settings pre-market, so we have for oncology, we have pre-market uncontrolled trials, and it's perfectly feasible
that you could do the same for new indications post-market. The only caveat there is, as I said at the very beginning of this, the problem with real world evidence is that it's not been very reliable,
and in cancer for example, Laura Esserman told me
that, who's a cancer doctor who's working in this area,
that often in the charts of breast cancer patients,
in their medical records, there's the wrong stage
of disease, alright, and so that's a really really
important thing for whether you're gonna get this
treatment or that treatment, but it's wrong in the
record, and so she's trying to pioneer methods where you
get it right the first time, where you document and agree
upon, maybe with another party's agreement, before you enter
that diagnosis in the chart. And I think that's a really
good idea, 'cos number one it will improve quality of
healthcare, right, (chuckles) and number two, it will make
that data reliable and valid, you know, so that we can use it as we do this kind of study work. So that's critically
important, that those data, and all key data points
that are used in the use of real world evidence,
be valid and reliable. And it hasn't been so much,
it's been a terrible problem with claims data, because they're used for another purpose primarily,
and so the relationship to the patient may be more
tenuous then (chuckles), and that's why we've used
it mainly for safety issues, right, 'cos the safety problem
is maybe described in there, and there's been a huge
amount of methodologic work to kind of work around this
problem of claims data, but that's gonna have to
happen, we're gonna have to have good documentation
of the key data points and really focus on that. And also I think, more use of registries that are embedded in
healthcare rather than having to be pulled out somewhere
and do a separate registry and go to those registry visits, and all that may enable us, a lot more, to understand
the natural history of disease better, these would
be like virtual registries where we pull the information out. But again, if the measures are not somewhat standardized and carefully documented in the records, whatever the records are, we're still going to have a lot of noise in the
signals that we're trying to discern, and that I think is one of the fundamental problems. So data reliability,
number one I would say, and informed consent, I think we can lick this one, whether you're gonna be talking about it the next two days or not, but people do need to
consent, and there's been a lot of uproar about
consent, for example, with those recent studies published on randomizing residents' hours, to see whether, because people were having to do the hour restrictions on residents, residents were having to walk out in the middle of surgery and hand it over to another resident, okay,
they weren't the primary person doing the surgery, (chuckles)
they were the person learning the surgery,
but they'd have to leave because their time was up. So they randomized to a more flexible schedule, and well, there's a big uproar over that, about whether the patient should have been consented that the residents might be on one schedule of hours versus another schedule
of hours (chuckles), so yeah, so we can't
underestimate this challenge. Methodologies, traditional
methodologies can be applied if we can get these other things right, and also, additional
methodologies are being developed. And then I think the
final problem that we face is linking all these different
types of data collection, sites and systems together. For example, whether somebody died or not, and that's kept here,
and then the registries are kept there, and the
claims data is kept here, and as Greg said, the
clinics keep their records, and these things are not interoperable nor are they linked right now. Now this is a more straightforward methodologic problem, although it does raise privacy issues that are gonna have to be dealt with
as we work through this.
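To make the linkage problem concrete, here is a minimal sketch with invented names and identifiers: three separately held sources, a clinic record, a claims feed and a death file, are joined on a shared hashed key. Real linkage is far harder, identifiers are inconsistent, probabilistic matching is often needed, and the privacy controls mentioned above sit around every step, so treat this purely as an illustration of the idea.

```python
# Illustrative sketch only: joining three separately held sources on a hashed key.
import hashlib
import pandas as pd

def hashed_key(name: str, dob: str) -> str:
    """Stand-in for a privacy-preserving linkage token (real systems do much more)."""
    return hashlib.sha256(f"{name.lower()}|{dob}".encode()).hexdigest()[:12]

clinic = pd.DataFrame({
    "key": [hashed_key("Ann Lee", "1950-02-01"), hashed_key("Bo Ruiz", "1962-07-14")],
    "diagnosis": ["heart failure", "diabetes"],
})
claims = pd.DataFrame({
    "key": [hashed_key("Ann Lee", "1950-02-01"), hashed_key("Bo Ruiz", "1962-07-14")],
    "drug_dispensed": ["beta blocker", "metformin"],
})
deaths = pd.DataFrame({
    "key": [hashed_key("Ann Lee", "1950-02-01")],
    "death_date": ["2016-03-30"],
})

# Left joins keep every clinic patient and attach whatever the other sources hold.
linked = clinic.merge(claims, on="key", how="left").merge(deaths, on="key", how="left")
print(linked[["diagnosis", "drug_dispensed", "death_date"]])
```

The mechanics of the join are the easy part; agreeing on who may hold the keys, and under what consent and governance, is where most of the work lies.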
So because we're all really motivated to do this, because we still don't know what we need to know about medical products, and also about all healthcare interventions, that we should know in order to use these things. Now the good news is, I think
we know more about the use of medical products than we do about many other interventions
(chuckles) that are done in healthcare, right, so it's
not like medical products are falling behind,
but the goal of many of us is to try to improve, as
people call it, learning or knowledge accumulation in healthcare, and this effort will
probably lead the way, because it's ripe, there's
a lot of motivated people coming together, we can
figure out how to do this, and this could lead to
greater understanding, so that when you or I or
anyone else goes into the hospital, or goes to the doctor's office, you know there's a better body of evidence there about how to treat us and
what not to treat us with. So thanks very much and I look
forward to the conference. (audience applause) – Alright, as I mentioned,
all of our sessions are going to have a good
deal of discussion in them, and I'd like to go into that
discussion format today, with Cliff Goodman, Bill Chin
and Greg I think joining us onstage as well as Janet, so
if you all could make your way back up, you've already
been introduced this morning to Greg Daniel and Janet Woodcock. I'd like to just briefly introduce two other distinguished leaders
that we have with us today, who've been thinking about these issues of real world evidence extensively. Cliff Goodman is senior
Vice President and Director of the Center for Comparative
Effectiveness Research at the Lewin Group. And Bill Chin is the Chief Medical Officer and Executive Vice President at the Pharmaceutical Research
and Manufacturers of America. So what we wanna do now is
spend some more time addressing the key real world evidence concepts, and some of the issues that Janet and Greg have already pointed out
this morning, in terms of how to identify and use
effectively real world evidence, and addressing patient need,
and again keeping in mind that we're gonna have a lot of focus today on potential for using randomization in the real world evidence
context for developing evidence for regulatory decision making. We'll just start out with
perhaps some initial thoughts from those of you haven't spoken yet, so Cliff maybe I can turn to you first. – Thanks Mark, glad to start. Real world evidence is
kind of a wake-up call, and I'm gonna tell you in just a minute about the wake-up that occurred. First, the real world
evidence does support the basic concept of
effectiveness, how well does something work in
the real world as opposed to the idealized world which is efficacy. And the best story about this
that I've heard was one told by Hans-Georg Eichler, who is at the European Medicines Agency, about which he wrote a beautiful article (muffled) in 2010 with regulators, scientists and so forth, and he realized that the
EMA was putting things, allowing things to go on market. Oh it's on.
(sound interference buzzing) – [Mark] It'll make it even better Cliff. – I guess so, oh gosh. That's fine man, good. He was realizing that as an official at the European Medicines
Agency, they were basically approving for market all
these swell new drugs and biologics and so forth,
and then after a while he was there he starts
looking round and says, it was a wait a minute moment for him, he said "We're approving
all this stuff, but it's not "been taken up in the
market, what's going on?" Well in Europe it was health
technology assessment agencies and national payment authorities
were saying "Hold on, "glad that this thing got
market approval, that tells us "it provides efficacy in
those ideal conditions, "we got a little bit
of data on short term, "more common adverse events, but not rare "longer term ones and so forth." So his wait a minute was,
what's going on here? And what he started to realize
is that there's no longer this border, this boundary,
between pre-market data collection and
post-market data collection, and he started to learn
about efficacy, effectiveness and then relative efficacy also versus relative effectiveness,
those four concepts. So that realization occurred
because the market was changing on the downstream side, not just one side; the downstream decision making, those were changing as well. So an important thing that RWE
does, and this is, I think, my essential point, is
that it re-balances, it re-balances the relative
importance of evidence sources in support of decision making. Think about it, if you're
a pharma-bio company, you designed and you ran and analyzed those RCTs for a few hundred, maybe a
couple of thousand patients, and that comprised the
main body of evidence upon which decisions were made thereafter. Well that's not the case
anymore, that may be necessary but it's not a sufficient
source of evidence. Now, what's tipping the scales? What's tipping the
scales is all this stuff from these other sources and saying, well, what are decision makers
using now to make decisions about a guideline, a
pathway, a coverage decision, a purchasing decision,
they don't often go back to the trials done in a dossier for regulatory agency approval, they're looking at other sources. Furthermore, think about how
size of data sources matters. Janet's absolutely right
that a lot of methodologists are right about, well these
are larger data sources but they're kind of weak. Well, if you talk now to
somebody who's let's say a chief medical officer of a large PBM, well I guess they're all
large now, (chuckles) there are few of them and
they're getting bigger, or you talk to the chief medical officer of a major payer, they're
saying "Well yeah, "we used to depend on the
regulatory submission data, "that was very important,
but now, I got my own data, "and I'm not talking
about tens of patients, "or hundreds of patients, or
maybe thousands of patients, "I got data on millions of patients." And if you run a PBM, and
you're watching scripts, you got billions of
prescriptions, and refill data. So I know that there's a
problem with the quality of data and so forth, but when you start getting to very large numbers, representing highly heterogeneous patients, there's a lot of power in the data. Now ask yourself, where is the scale now? Where is the main body
of evidence that matters to a decision maker? That's a big change. Okay, few more points about the sources. Yes, pragmatic or practical
clinical trials are a source, payment claims, pharmacy
prescriptions and fills, registries, EHRs, EMRs,
laboratory test results, radiographic images, bio
banks, specimens, tissues and so forth, molecular genomic
data, vital statistics data, and quite important lately,
and I know that some folks in the room that are experts on this, patient generated data,
patient source data, patient streaming data, and whether it's from personal phones,
apps and all those other things that I don't understand. It's not just that though,
it's credit card purchases, it's how far do you live
from the whole foods or whatever grocery
store you happen to like that might have fresh
stuff, who lives with you, do you own a pet, do you take
a walk, so we've got ways to now be collecting those
kinds of data that augment not just the claims and the EHRs, but all those others; we have better ways of linking those data and telling much better stories. Now, it's not just those data
sources, the RWE sources, it's what you can do with them,
we've talked about linkages and so forth, and high powered
computing can do wonders, but it's also kind of the analytic tools, and so it's machine learning, it's natural language processing, you know with machine learning, if you
think about those data sources I just rattled off,
including how far you are from a grocery store
and some other things, machine learning can start
finding patient clusters that maybe non-traditional
sets of patients that aren't the kinds that
you'd identify in a traditional RCT upfront with certain
baseline characteristics, yeah it's somebody with
diabetes, but it's somebody with diabetes who exercises
a couple of times a week, does live near a grocery store,
is living with a caregiver or a family member who is
ambulatory, and with whom he or she goes to the
movies with once a week, you know what I'm saying,
so patient clusters using machine learning can
identify groups of patients that will behave much differently,
respond much differently than other people with diabetes, or congestive heart failure,
or rheumatoid arthritis, that's an additional bit of power there.
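As a toy illustration of the patient-clustering idea, the sketch below groups hypothetical patients on a few of the nontraditional features just listed, exercise frequency, distance to a grocery store, living with a caregiver. The feature values and cluster labels are invented; the point is only that an off-the-shelf algorithm can surface subgroups that a trial's baseline characteristics would never define.

```python
# Toy illustration: clustering patients on nontraditional, real-world features.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical features per patient:
# [exercise sessions/week, miles to nearest grocery store, lives with caregiver (0/1)]
patients = np.array([
    [3, 0.5, 1],
    [4, 0.8, 1],
    [0, 6.0, 0],
    [1, 5.5, 0],
    [2, 1.0, 1],
    [0, 7.2, 0],
])

scaled = StandardScaler().fit_transform(patients)  # put features on a common scale
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(scaled)
print(labels)  # e.g., [0 0 1 1 0 1]: active, supported patients vs. more isolated ones
```

Whether such clusters are clinically meaningful then has to be tested against outcomes, but the mechanics of finding them are this simple.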
So I just wanna close, in so far as some of the kinds of conversations that we have
with pharma-bio companies, if we get into their C-suites, which is enlightened on their part, a nice admission. We say stuff like this,
when you're thinking about what products to put in your
pipeline, are you thinking ahead already about who
the gatekeepers are, and the key question
is, who is going to want what evidence when, who's
gonna want what evidence when, and those decision points,
those locks, those gateways are gonna determine the
development, uptake and diffusion, payment and so forth, of
all those swell molecules that you think that
you're developing, okay, and if you're not thinking about that now, you're gonna be behind,
thinking about that now necessarily embraces data sources far more than the ones that are required
for regulatory approval, far far more; therein lies that balance I told you about before. If they're just focusing, if they're traditional R&D
people who might have developed a swell beta-blocker 20
years ago and got wealthy and famous doing that, if
they're just thinking that way, they're missing this market entirely. Last point, patient
willingness, I've talked about patient generated,
patient source data, think about the role of
much more active, informed and engaged patients in all of this stuff: in research and development agenda setting, on medical needs, what trials to conduct, trial design, what are the patient groups, what are the interventions, what are the comparators, what are the outcome measures,
over what period of time, how will data be collected,
okay, patient, even recruitment into clinical trials, alright,
then what is done with the information that comes
out of clinical trials, post-market data collection,
patients are much more engaged in this, that is another
key development here, so if you kinda look
back over the 7.1 minutes I've just had,
(audience chuckles) you say, how does that decision making and data generation environment
that I just described, and we're about there now,
differ from when the laws that Janet carries out so effectively, and those regulations
were initially laid down, it's much much different,
don't forget the thing, a re-balancing of the
evidence that matters. – [Mark] Cliff, thanks very much, 7.2 minutes,
(audience chuckles) a lot of ground to cover, I'd
like to turn to Bill next. – Thank you very much Mark,
and also Greg for inviting me. So as you've heard already, these are indeed exciting times,
Cliff has articulated that. These are particularly exciting
times for patients, that's your last point. I think the definition
of real world evidence in a broad way, that is
evidence that's beyond the evidence that's derived from randomized controlled trials, is still a very useful one; it does, of course, however, put forward the perspective that we're dealing with RCTs as a solar system, and then
the rest of the universe is really what RWE represents. So there is a great deal
of opportunity in that, but there's also, I'm sure,
a fair amount of anxiety in terms of, ultimately,
where we're gonna go with it. So I think with this
discussion today and tomorrow, there's no question that
intuition tells us that it's not a matter of whether we're
gonna use this, it's really a question of what,
when and ultimately how, and obviously all the
sessions will deal with that. We agree that a broad definition
is probably very useful, because of all the things
that we still don't know about RWE, and so as a result
we don't think we should limit our current discussion only
to specific types of evidence. Well clearly we now use, and continue to use, and this is both the sponsors
and FDA, RWE for safety, and we've heard certainly about that. But we also understand that
it's useful for understanding treatment patterns, how to
find target populations, how to benchmark adverse events observed in clinical trials, and monitoring safety. It's also clear that we
already have some examples, and so maybe the focus has been on safety, but we have some examples where
the agency has utilized RWE for efficacy reads, and very
briefly, there's the example of (muffled), it's easy
enough for me to say, focused on relapsed and refractory ALL, acute lymphoblastic leukemia,
and that whole process went through a breakthrough,
Janet, really excellently, very fast, of course there
are a lot of good things, there's a good effect size,
a really terrible disease, I think there was a very
good clinical trial design, but most importantly it utilized
RWE as part of the control, in the single-arm study. The second deals, very quickly, with recombinant human factor VIIa, where in fact for this particular disease, Glanzmann's thrombasthenia,
the use of registry data was helpful in adding on to the use here. And then finally, tocilizumab, another mouthful, which
is something used in the refractory rheumatoid
arthritis patient, who was refractory to
DMARDs, and this was also, the indication was
expanded based on efficacy and safety information collected
from post-market studies, so there at least three,
I'm sure there are many other examples, or at
least some other examples, right Bob?
(laughs) But the point is that,
I guess FDA regulations provide flexibility in the
requirement for determining or demonstrating effectiveness, and I think the FDA has used its flexibility in this. So, a couple of points, one,
I think we as an industry applaud the agency and Duke
Margolis for having this session because I think it's about
multi-stakeholder discussions like these that allow us
to translate our intuition of the use of RWE into
reality, into usefulness. Collaboration is key, and clearly having everybody here is also important. But there is great opportunity
here as well, and that is how can we think about RWE
in benefit risk assessment, and how can we think
about it in a broader way, and focusing on context of
use, something that, Janet, you've talked a lot about,
with respect to biomarkers, the taxonomy, as what you've talked about with biomarkers. So the point is that, it covers,
arguably, it covers a lot, how can we be much more
deliberate and focused in terms of understanding
what that context is, then asking what are the kinds of data, the quality of the data, the methods that could be applied to each context. One thing that gives us great hope is that with the current PDUFA
VI technical negotiations, which are heading toward
completion, we do have an element of it, of
enhancing the use of RWE in regulatory decision making,
we'll hear more about that. So this whole idea of
combinations, so context of use, really so the way to think about it is, and my colleague, Mike Levy,
has brought this up Mark, in previous sessions, is
really thinking about the types of regulatory decisions
you're talking about, the characteristics of
the medicine or disease, and we can even add another element, which is types of RWE,
as Greg has articulated. So how can we sort of
develop, if you only have two dimensions, a Latin square
of these efforts and try to then identify specific contexts. So just to give you a good example, and Janet you talked about it, and Greg you listed a
few of these as well. So for instance on
regulatory decision making, approvals based on
historical controlled data, in a single-arm pragmatic clinical trial, those based on RWE derived
from medically accepted alternate use of medicine,
RWE from another country with quality data, on promising
early clinical trial results with monitoring of safety and efficacy of the drug through the use of RWE. Sub-population analysis,
so approval based on that. And then finally, fulfillment
of post-market commitments, you already talked about
replacing phase four trials with analysis based on RWE. So these are just meant as
illustration, and obviously not to be complete, but
on the drug or medicine, disease characteristic and
a few of these could be the availability of other
therapeutic options, the urgency of the
disease being addressed, the size of the patient population
and the drug effect size. So in the end of the day, we
really want a better definition and clarity on regulatory
framework, and we should build, and learn from, the examples
that we already have, but also think about pilots, and we talked about prospective pilots,
and we're gonna hear about a case, or cases, but I also
have thought that we should be thinking about how we
can use retrospective data in a sort of learning way,
okay, we're not gonna learn new information, but how can we learn from retrospective
studies, how do you perhaps do this work and where to apply it better. So I'm gonna conclude by just
noting a couple of things. So we believe that RWE
represents a valuable source of information, we said that okay. Two, that enables sponsors
to submit applications based on appropriate and
adequate data gathered through the real world use of a product by medical professionals
could actually help, and will help, patients and society. So the benefits include
reducing the time necessary to reach approval, accelerating the availability
of new approved therapies to patients, driving sponsors to improve the healthcare system's
ability to draw from and integrate disparate clinical data and improving the amount
and quality of information available to all physicians
regarding therapeutic benefits and risks, furthermore,
FDA-approved labeling, we talked about the possibility that labeling could be affected by these data, and that can often then ultimately affect patients' ability to obtain reimbursement for therapies and access to potential benefit. And lastly, really again,
back to the patient, okay, just where we began, it's in patients' interests for all of us to think really carefully
about how we can utilize RWE in the benefit risk assessment,
and so we encourage, again coming back to the
theme of bringing stakeholders together in collaboration,
how do we work together to, perhaps not solve the whole
problem of what RWE can do, but how could we solve specific ones, and so we can learn from
those and grow from there. Thank you very much. – Great thank you. (audience applause) I really appreciate
all the context setting that's gone on this morning,
clearly real world data presents a potentially much
richer source of information that can help improve patient decisions as Bill was just emphasizing,
but it is more complex and it does raise some issues
in terms of non-randomly missing data and in terms
of potential confounding that has been highlighted
in one way or another in all of your remarks. I wanna go back and start this discussion with some more comments that both Cliff and Bill made, Cliff I think you said
"A key issue is who wants "what evidence when, and determining "how to use real world data
to develop that evidence." And Bill I think you
highlighted the context of use, similar, I think, similar point. And Cliff especially highlighted
how real world evidence seems to be used much more
widely now in decisions by payers, I remember the study about the increasing disconnect between regulatory approvals in Europe and coverage decisions; that may be because of the different nature
of the context of use, where, as you pointed out,
the questions really are for payment, increasingly
about comparative effectiveness and about evidence relevant
to specific subgroups of patients. To bring it back to the
regulatory context though, I'd like to push on what you all see as some of the best,
unexplored or potential uses of real world evidence
in the regulatory context and how to then use it sensibly. If the FDA decisions, or some of the most important decisions
are about the safety and effectiveness of a new
treatment, well obviously, it's gonna be hard to
get real world evidence on a treatment that
hasn't been approved yet, we've highlighted this
morning, a focus potentially on label extensions, does that need the same level of evidence? You all highlighted some other uses. Could you talk a little bit
more about where you think the best opportunities are to expand out the regulatory uses of
real world evidence? – Yeah you know, real world
evidence can be thought of as well as feeding back to
the, think about what trials you're running and how you design them, real world evidence can tell
you that we're seeing outcomes in the real world that had not been built into earlier trials, and
therefore when we do our next set of trials, whether it's within
class or different class, we might wanna consider
other outcomes that matter to patients and that affect
their quality of life, mortality, morbidity and so
forth, so it's a feedback to the outcomes, it's also
feedback to the selection of patient groups and
baseline characteristics, as well as those subgroups
that you may wanna follow longitudinally through the trial and after the trial, so all
the way through phase four. So it's feedback to trial
design for which patients, what outcomes over what period of time. And then, also Mark,
in the context of what those other evidence seekers
are gonna be seeking, a smart pharma-bio company and an ongoing, enlightened Food and Drug Administration
is gonna be thinking, you know we also want
efficiency in the system, so while we're doing our traditional or quote traditional RCTs
to get regulatory approval, can we be smart about
being more efficient about the kinds of data we might
wanna start collecting now, so that when the thing
does hit the market, in its first few years
after it hits the market, we'll have more data sooner to help the other decision makers in the realm. – [Mark] Janet. – Yeah, well to your last point Cliff, certainly the EMA is experimenting
with getting the payers, the patients and everybody together before a development program
goes forward, because we know the tech assessors mainly are looking at, what they're looking at differently, I have to differ with you, okay, they want quality of life
information to do the QALYs, because that's how they do their payment, whether it's worth
paying for or not, okay, and typically like that
information isn't collected, and typically we don't have
instruments to collect it, or if we do they're more like
health status instruments. So I think when, the first
thing you were describing was actually research, which
is, what we have to recognize I think, is that most
health records don't have any patient reported outcomes
in them either, alright. You know, they're filling out
a form on a computer, okay, (laughs) that's what
they're doing, and it's not an instrument to figure
out how the patient feels, it's how maybe the doctor
thinks the patient feels, okay, so you know, there is research
needed in the real world, I agree on, if you wanna
know what outcomes matter to patients you have to ask
them and then you have to go through a structured
process to figure out how to capture that data
in a way that's valid, so that you can, then
but that's really what the European tech assessors wanna look at, is does this, if it doesn't
impact survival, okay, does it impact quality of life? Those are the two things
they care about, right. So typically those are not
strictly the end points for a lot of trials, even
for symptomatic diseases, they might have a scale, they
might have a Likert scale for pain, or something like
that, but they, you know, we try to put in quality of life. So that goes back to what
Bill was talking about, about with the end points that
the patient and everything, so we might have to do
research in real world settings to figure out what patients want. But I think that's my
understanding, I've talked to Hans-Georg and all
the folks over there, and they are doing pilots
where they're getting together with the tech assessors
and patients early in the program, before the
efficacy trials are initiated, and trying to figure out what
outcomes should be structured in there that would meet the needs of all the different parties. – Yeah, I'm just gonna say,
not all the tech assessors in Europe or Australia or elsewhere use cost per quality adjusted life years as the threshold criteria.
– [Janet] Yeah that's true. – But they all care about some aspect of quality adjusted life,
– [Janet] They do. – Whether it's a EuroQol
5D or other measures and so forth.
– [Janet] Right. – So there is a pan, well
an international interest in better measures, and you are right, of quality adjusted
life and similar things, and again the patient advocacy groups and the patients and researcher
groups are developing these and validating them further. I will say though, in
so far as producing data that's of use to decision makers, you're now seeing groups
in the United States, ISA, ACC, AHA, (mumbles)
caring with its advocates and so forth, you are
seeing different groups trying to quantify value
in multiple dimensions, that are using some of those measures and including the economic component, we're hearing about cost per
quality adjusted life year, now, not from our government, but from our non-profit private sector
groups, that are trying to provide information to inform these very same decision makers. – Mark, just an additional
comment to your question, which really again focuses
on the regulatory framework. We of course think, and I
think Janet would agree, that maybe the early applications
of real world evidence, beyond the few examples
that I've provided, really will be in that
post approval space, thinking about how,
potentially new indications can be supported by these data,
this could be very helpful in situations like
breakthrough, where you do have fairly good effect size and
evidence that you are helping, good benefit-risk assessment, but that you're still
not 100% sure, right. And so you could use
post approval to continue that assessment, not only
of safety but of benefit. I would maybe editorialize
by saying that no one is 100% sure about benefit
risks, really, at the time of decision making, and so ultimately can RWE be useful for all medicines? And then finally, the last frontier, Bob, will be whether RWE, I'm
looking at you Bob (laughs), last frontier will be
whether RWE can be useful in some aspects of pre-approval decisions, I think it's certainly,
I think you told me once that not in my lifetime
will we ever have RWE and RCTs mixed pre-approval, so I dunno how long–
– [Cliff] It still looks good. – [Janet] I have a comment,
I'd like to raise a point. – But the point is that I think we need to continuously learn how you
can maybe add RWE to RCTs, I don't think we'll get rid
of RCTs as the gold standard. – [Mark] Well we'll hear
from Bob Temple soon, probably should have had
you on this panel as well, but Janet. – Well this is somewhat
tangential to this discussion, sort of, but we're seeing
with precision medicine, okay, we're seeing that we're
shifting away from, to some extent, the traditional
definition of disease, and we're moving into pathways and where there are underlying
pathophysiologic defects that actually underlie
presentations of sometimes rather disparate diseases,
and where I would foresee is that what typically
would happen in the past is, okay we'd approve it for some
small subset of this, okay, where the pathway, where
there are enough people and the pathway was known,
and you do a targeted precision medicine program,
with a randomized trial, and you get the drug on the market, and then people would use it
off label for all the different other defects where you
could identify the defect, and that situation would
go on for a long time, because that's how it's always been, okay. But I could foresee that you could, with good data collection if there was enough natural history understanding, you could add more
subgroups of people with that underlying defect, to the label based on experience in treating
them, and so treating them off label wouldn't be wasted
anymore, you'd actually collect that information, eventually
you might understand and we could rapidly get
to decide where there is an appropriate treatment
for these other groups in other diseases, so that's
one example where I think the follow on, and we'll see
that in cancer, of course, but it's actually happening in
a lot of other diseases too. – [Mark] I think Greg
you got a comment too? – Yeah, I just wanna go back
to a key concept that Cliff brought up about the sort of
re-balancing of the evidence, and I think I'm quoting you
Mark, or maybe paraphrasing, but it used to be where, you
know, we get as much evidence as we can during clinical
trials, and then the point of approval we just sort
of hope for the best, and it's clear among this group,
and what we've heard so far that there is lots of value
in continuing to learn the experiences of what
happens once these products are on the market, and the
opportunity to re-balance pre-market data collection with
post-market data collection is something that our colleagues at CDRH are heavily involved in
now and are putting a lot of emphasis on building a
new national medical device evaluation system that
leverages the experience that patients have that are
brought forth with registries and potentially linking that
to sources like Sentinel and like PCORnet and I
realize, you know, this meeting is focused on drugs and
there are lots of differences between drugs and devices, but
perhaps utilizing real world evidence in new
indications like we've seen in devices, one particular
registry, the TAVR, the transcatheter aortic
valve replacement therapy, there's a registry on that
that was built primarily for payment, CMS, coverage of
evidence development program where the data and evidence
that were being produced were primarily supposed to be
used for coverage decisions, to understand how this is
working within the population was actually very valuable
information that was collected but was useful for a new
indication, that FDA used that evidence and
registries are difficult, there's a lot of burden on
providers, but linking in and figuring out how to make
better use of real world data outside of registries can
be an important aspect. – I wanna open this up to
comments from the rest of you who are here in just a minute. But picking up on this notion of learning from off label uses, I
think Janet you said that that potential evidence
wouldn't be wasted anymore, this notion that we definitely
seem to be moving farther and farther away, if we
ever really were there, of just approving drugs and
then hoping for the best, payers certainly aren't
satisfied with that, in getting value out of
the system, and patients are increasingly dissatisfied
with that, in terms of being able to get information
that's relevant to them, and to the outcomes that
they care most about. Greg talked earlier
about an infrastructure for real world evidence,
several of you have referred to real world evidence systems,
I have to say that seems more aspirational now in many
ways than well established, we're gonna hear today from
some of the leading efforts around building a system
like FDA's Sentinel and efforts by PCORI, efforts
by payers including CMS, the support registries, but
it still seems like there are a lot of gaps, and I'd
like to keep this focused on kinda FDA and regulatory decisions, how can the efforts that
we're talking about today, in terms of trying to use
more real world evidence more effectively in regulatory decisions, how can that contribute to
a better infrastructure? Better systems of real world evidence? – I was just wondering
whether Janet could comment on the Guardian System, I've
heard a lot about the system that will maybe be kind
of a mark two Sentinel to look at benefits. The other side comment would
be, you mentioned PCORnet, Janet you and I sit on a steering group, obviously our great colleagues
are sitting back there and will talk about PCORnet,
but what is the great potential of a PCORnet to be able to
allow us, prospectively, to be able to utilize those networks that are being developed
to be able to effect the use of RWE in a better way? – [Mark] Including in
regulatory decisions. – Including regulatory, yep. – Oh absolutely. Well what I think has
to be done next, or what we have concluded internally
at FDA has to be done next, is we need to do some demonstration
projects, experiments, we need to try this out,
see how far we can get, and what are the gaps
in the infrastructure, running Sentinel for a
while has helped us to learn what is missing and
what is the power of that in actual use cases where
we try to answer safety questions only, either we
can or we can't, you know, and we learn what is missing. So what we would like
to do is, first of all, PCORnet has the electronic
health record data as I said, and so that's one piece,
that's one kind of information, unfortunately that information
often is not interoperable among systems, and
missing different pieces, the Sentinel has the claims data, right, and then there are other pieces out there that we're also trying to bring in, and there's our federal
partners in Sentinel, such that we can't directly analyze the Medicare databases,
we can't just use them, we can't access them, we
can't just merge them. So the question is can we do
some demonstration project where we're asking different questions other than safety questions,
and see how well it works, and I think what you
suggested Bill was a good one, can we do a retrospective
prospective experiment where we already know the
answer from randomized data, go back and see, I like
to do stuff like that, Mark, like actually see how the methodology performs,
does it answer questions you already know the answer
to, because it doesn't make a lot of sense to me to ask
questions you don't know the answer to when you don't
know how your methodologies gonna perform, so we
would like to do that, and do some projects, maybe
merging across PCORnet and Sentinel as much as we
can, see how we can link, so we'll learn a lot about that,
and also try to investigate some questions, I know in
Sentinel we did a safety one where there's an ongoing
randomized trial, and then we did it also in the Sentinel,
okay, so doing those things, to get your point Mark,
we'll start to get real on the ground understanding
and what can we and can we not do, what is
feasible, what has to be fixed, what bridging do we have to
do amongst these and so forth, but that's what I would think. And Guardian, we have to have
a different name, okay, 'cos Sentinel is our safety system, it's statutorily mandated
some of this stuff we do, we can use the same
infrastructure from Sentinel for other things, you know
we've been urging to do that for almost a decade, (chuckles) let's have one infrastructure
for all this stuff, but this other activity
is not like a safety type of activity, so that's
what we're planning to do, and I think that will,
maybe in a year or two, we can come up with a whole bunch of things that have to be fixed so
to speak, or we have to do this project and this
project and this project in order to be able to
rapidly utilize data from existing networks. – One infrastructure of
multiple uses and more clarity and opportunities coming
on these additional uses to strengthen the infrastructure. – That's right. – We have time for a couple
of questions or comments if there are any from the
participants here in the room. Bob, (laughs) you deserve a comment. – [Voiceover] I think a lot
of this has been said already, it seems critical to
me to be very specific about the uses we're talking about. I mean Janet, these things in
a way are historical controls, that's what they really all
are, you're not randomizing, you're collecting data that exists already, that's what historical controls are. Well those are plausible as
we've written about repeatedly when the effect size is so large
like Janet's oncology examples and when you think they're
actually collecting the data. But, as also Janet said,
there's places one might look, just as a candidate,
there's been a recent study called Pegasus showing that
long-term use of ticagrelor after a year or two is
better than not using it, and these anti-platelet
drugs, we don't know how long to use them, there's
mixed data, it's uncertain, so now somebody did a study,
showed a 20% reduction in major events in a randomized
study with 20,000 people, well there were three groups, it really would have been 18,000 people. It would be really
interesting to look within these databases at outcomes
like heart attack and death, and see whether longer
term use of Ticagrelor in these settings corresponded
to an improved outcome, and that's a small effect,
20% reduction is a big deal, but that's at the limits
of historical controls, but it would be very interesting to see, and I'd be interested in the examples that Cliff was giving where
people have discovered all kinds of stuff, by
looking at these things, did they publish it, I mean
did they change their payment, do we know the payment wasn't
changed because they wanted something cheaper, you know,
it's really worth looking at examples 'cos this
is uncharted territory. – Well I would add something
to what Bob is saying, you have to have a large
absolute number of events too, if the 20% reduction was
off six people or something, or ten people, then that's
not gonna be the kind of thing you're gonna see with so
much noise in observational, so you have to have fairly
large background rate of events and then a large
proportional reduction to find, I imagine. Well we gotta start with
that, we gotta start with something that is viewed as relatively
easy, not, you know, a .1% reduction in mortality
in a 40,000 person trial.
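To put rough numbers on that point, here is a minimal sample-size sketch; the 20% relative reduction echoes the figure discussed above, but the background event rates and the choice of formula are illustrative assumptions, not anything the panel stated.

```python
from math import ceil

def n_per_arm(p1, rel_reduction, z_alpha=1.96, z_power=0.8416):
    """Approximate patients per arm to detect a relative risk reduction
    between two proportions (normal approximation, two-sided 5% alpha,
    80% power by default)."""
    p2 = p1 * (1 - rel_reduction)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_power) ** 2 * variance / (p1 - p2) ** 2)

# A 20% relative reduction on a 10% background event rate: roughly 3,200 per arm.
print(n_per_arm(0.10, 0.20))
# The same 20% reduction on a 1% background rate: roughly 35,000 per arm,
# before any allowance for the extra noise of observational data.
print(n_per_arm(0.01, 0.20))
```

The sketch only illustrates the speaker's point: a modest proportional reduction becomes detectable only when the background event rate, and therefore the absolute number of events, is large.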
– [Mark] We're gonna come back to those issues in the next session– – You also can pick your
patients, you can try to pick a patient population just like
the highly rich population they included, so it'd
be interesting to see, is 20% below the limit, forget
about it, or does it lean, or– – Hold the thought of
which are the best examples to start with, what would
be the key places to start in terms of these regulatory applications, we're gonna come back to
that in the next sessions. I have time for one more
comment, if there is one. Yes, please go ahead. If you can manage the logistics
with the mic, (laughs). Thank you. – [Voiceover] Thank you, Grace Lee Brumen. I have a question, so I
agree that sort of we say that not in our lifetime
we'll use real world data for first approval, I see it due
to, one, what we talked about, the infrastructures, the data, but also one other is
looking at very different new endpoints, and it seems
to me we almost may have a great opportunity of creating endpoints that meet what the patients
are really looking for and needing, what the physicians
really are looking for, what the payers are looking for, and sort of I'm thinking what's the process? How can we get there and also make sure that it meets the regulatory expectations, so I'd like to start that discussion, I know it's something
that might take years. – Just to make that a
little bit more pointed, are there reasons why, if
it's what the patients want and it's what the doctors
want, and what the payers want at least for value, or
at least the outcome side of value, why would that
not align with what's needed for regulatory decisions? – Mark here. Point well taken. The healthcare system, you
know, new drug development in the healthcare system
is not done for the purpose of satisfying a statute, we
are really glad the statute is in place because it does
protect us from unsafe things and it helps establish some
evidence about whether or not something works, but
satisfying the statute is not what we're trying to do
in the healthcare sector, so if we can think a
little bit further ahead, we want the FDA to please
keep doing what it's doing, but realize that it is
part of a broader tapestry, a broader framework which is
getting people to feel better, and the work that the
FDA does can be perhaps better integrated early
and more richly with other evidence generation needs and
communication of those results and getting feedback from the
patients whom we're trying to serve, so I love the
statute, it's done a great job for the most part, but it's only part of a bigger mission
that we're all pursuing. And so your point on those patients, their outcomes, is great. – [Mark] I'm gonna go
to Janet and then Bill and then we're gonna wrap up. – Well I don't disagree at
all, but I will reiterate my point, many of the
outcomes that patients are interested in are not
well captured in the current real world evidence.
– [Cliff] Not yet. – And so that's where we're
doing a patient centered drug development to try and integrate. Those can be captured in
clinical trials as well, and the patient groups, I
think, are starting to wake up and figure out that part of
their role is to determine what is measured and considered benefit, and that's a good thing, but
then you have to figure out how to measure it and that's… And then, for real world
evidence, somehow the clinician has to have entered that,
(chuckles) or somebody. In the future we will
have more, as you said, patient involvement, that we
can use electronic whatever, capture systems for
patient centered outcomes, but we'll have to build
that in, it's not there now, it's not in claims data, it's generally not in electronic health records. – I too agree, it's a great
point, and I think the problem right now is whether it's
common data models et cetera, we don't maybe include all
the things that we might need, but of course the challenge
for all of us is understanding the science of patient
output, input rather, and understanding what
PROs and COAs might be, but it's possible that
using real world evidence will begin to at least ask
hypotheses of issues that might in truth be important, and
true endpoints for patients, but then need to be tested,
so I think that could be a great lever in order to
make headway in that area. – [Cliff] Good point. – I wanna thank you all for
getting a lot of key issues on that table, we're gonna
take a 10 minute break now, when we come back we're going to
focus in on what the criteria should be for these
kinds of regulatory uses of real world evidence, thank you. (audience applause) – Welcome back, we're gonna
go ahead and get started on our next session, so if
you could please make your way to the tables. (background chatter) Can you circle the. I'd like to go ahead and invite our next speakers to
join me up on the stage, we'll introduce them in a moment. And we're all mic'd and
sitting in the right chair? (chuckles) Great so I'd like to welcome
our next round of speakers, Bob Temple is Deputy Center
Director for clinical sciences and acting Deputy Director of
The Office of Drug Evaluation at the Center for Drug
Evaluation and Research. Bill Capra is Senior Director
and global head of oncology for real world data science at Genentech. And then Rob Metcalf is Vice President, global regulatory affairs US
and global medical quality at Eli Lilly and Company. Building on an expert workshop
that we held back in January, this session will begin to
lay out a specific piece of a real world evidence
regulatory use framework that we've been working on. And so you've heard this
morning, we need to get specific, we need to identify
the best opportunities, so these next two
sessions, this one and then the following session
will begin to ask when, so this session is when do we
think that real world evidence to support regulatory decision
would be most appropriate? The session after this will
begin to explore how one might go about developing the evidence to fulfill a regulatory use. The specific use case
that we're talking about is a new indication for a medical product that's already on the market. And so I'd like to start things off, I think we're gonna go in
a bit of a different order than what's in your agenda,
we're gonna start off with Bob, he's got some slides,
then we'll move to Bill, and then we'll round that out with Rob. So Bob. – I'm just gonna use that. – Okay. – Yeah. Well in some ways I have the easy task, I mean this isn't using drugs as a sign, this is randomizing within
the real world setting, so the main question is can you do it? What are the nuances of how to do it? Are the data any good? But the concept is,
okay you're randomizing. So that's, I have the easy task. What are some things to think about? Well one, and this isn't trivial, is finding potential
participants, that's not a small problem these days. A second issue that I'll address briefly is how you get consent
under these circumstances. A third is, do the data you
have on identifying patients tell you everything you need to know? Or do you really have
to get into it and have the investigator confirm it
all, which is more complicated than just using the data. Can we reliably know that
the disease is really present from the data that's collected? And then there are all kinds
of enrichment patterns that people use in trials,
can you learn those from the data? This is all about whether you've
picked the right patients. You gotta consider whether
you need to have blinding, and whether there's gonna be a placebo, and whether that can be
done in these settings. And then are the electronic
health record outcomes gonna be enough for you,
or do you really wanna know much more, they don't
collect symptoms very well, so even if you're doing
an outcome study in, say heart failure, you sometimes also want the Minnesota living
with heart failure scale, well you're not gonna get that from the electronic health records, and there's probably a
million other issues. So, what strikes me as unbelievably
simple, straightforward and very virtuous, is
helping you find patients. We hear all the time
that companies take years to find the patients for their trials. And the most obvious possible use of electronic health records is to find the candidates for a trial. I think this should be accompanied
by a system wide urging of patients to join trials
because it's good for everybody, but that's a different question. This should be relatively
easy for a marketed drug, it has known properties and
you ought to be able to screen in the electronic
records for whether the patient has the
disease, you ought to know what the current treatment
is, you might be able to take a good guess at
severity, and you could probably pick up certain
kinds of enrichment features, like prior events, laboratory findings, at least if they're standard,
not if they're exotic, and the duration of illness,
all of which might help you decide whether this
is a good candidate. (coughs) Excuse me.
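As a minimal sketch of what that kind of screen could look like, here is an illustrative query over a flat EHR extract; the file name, the column names, and the heart failure thresholds are all hypothetical placeholders, and any list it produced would still need investigator confirmation, as noted above.

```python
import pandas as pd

# Hypothetical flat EHR extract; the file and column names are illustrative only.
ehr = pd.read_csv("ehr_extract.csv", parse_dates=["diagnosis_date"])

today = pd.Timestamp("2016-06-01")
candidates = ehr[
    ehr["diagnosis_code"].str.startswith("I50")            # heart failure (ICD-10)
    & (ehr["current_medication"] == "drug_of_interest")    # already on the marketed drug
    & (ehr["ejection_fraction"] < 40)                      # crude severity / enrichment proxy
    & ((today - ehr["diagnosis_date"]).dt.days > 365)      # duration of illness over a year
]
print(f"{len(candidates)} potential participants to refer for screening")
```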
I was thinking about where you might wanna do this, where it would be useful,
and I'm just thinking of candidates. How long should you give
bisphosphonates to people to protect their bones,
well there were a large outcome studies, with
many thousands of people that were done, wouldn't that
be a very good thing to do within the healthcare system,
that is find out people who are on them, and then
see if you can randomize to how long to take them, because, again, they have side effects,
they're not benign, people might like to know how
long it's worth taking them, so that's a possibility. Another example, just
that I sort of mentioned is the recent ticagrelor
study called Pegasus, a very large study, compared
two doses and placebo, both added to aspirin to see how long to give an anti-platelet
drug, again, you really need to know the answer 'cos a benefit could be reduced heart attacks and
strokes, but the downside is there's more bleeding
if you stay on it, so people have an interest in knowing whether to stay on them or not. And so I think you could
conceivably do such a study in a randomized environment
of people who are say on the drug for a
year and they wanna see how long to take them,
another good candidate. Another possibility is,
and this has been done in randomized withdrawal studies, how long should you use
adjuvant chemotherapy? There are consequences to
being on adjuvant chemotherapy, you wanna know if it
lasts after three years, lasts after four years, and
there've been trials like this. Those all seem like great candidates to do in this sort of environment. Now then there's the
question of doing the trial, you still need to get
consent, there's no way not to do that, which
means there has to be, not necessarily, there probably
has to be an interaction with an investigator to
do a reasonable consent, someone to answer questions, but it's not out of the question that
you could do this online, at least initially online,
probably with a follow up, and that's sort of been tried. Probably, if what you're
testing is whether to maintain therapy, you
really need investigators to see the patients
periodically, you have to have, within this system of real
world evidence, you have to have actual investigators who've
agreed to be part of the study, so you still have to
get them and enroll them and do all that stuff,
doesn't save you that. Whether the system can
collect the endpoints for you, or you need to have
patients come in and visit and fill out forms and all
that stuff is gonna depend. These systems are not very good actually at collecting death, Sweden is, we know, 'cos they have registries
and you can use them, but it's not so easy in
the US, I dunno what, I'm not sure what to do about
that, they probably collect whether a person had a heart
attack, but we don't know whether they make the
diagnosis correctly or not, and everybody's working
very hard on getting good diagnostic criteria
for a heart attack, it's not so simple, I don't
know whether, if you do it on the basis of the evidence
collected, it's as good as or as precise as having the investigator and then a judgement
committee fill it out. If you're interested in a
heart failure treatment, yes you can get hospitalization
for cardiovascular illness, you can probably get that from the system, but you won't get New York
heart classification, probably, you certainly won't get
the Minnesota score, you won't get an exercise test, maybe you want all that stuff. So even if you use these
systems to identify patients, there may be a lot of other
stuff you have to collect, and that's gonna have to be decided. And there's a whole range of
other stuff, depression scores, ADAS-cog, that is not gonna
be collected automatically. So you're gonna need to have investigators that people are gonna have to visit. For a recognized approved
drug it's often possible to greatly reduce the
amount of safety data that needs to be collected, we
have a guidance on this, informally referred to
as phase three four lite, but it basically says if you
know the drug, the only things we wanna know are adverse
reactions that kill you or impair you or make you
drop out of the study, and you don't have to collect
the routine other stuff. Most of the time you're gonna
want to blind the treatments, so you have to get into
the question of how you do a placebo controlled trial
under these circumstances, there might be some
cases where you wouldn't. So the limitations of all
this, for randomization within such a system, I
think used to find people for enrollment, that's great, that's easy. The use of collected
data has a lot of issues that need to be worked on,
I dunno what that says, impression hides, anyway.
(audience chuckles) Oh no no no no, it's inaccuracy hides. To the extent that the
data that are collected are not precise and not good,
they make it harder to show an effect, everybody knows
that, noise obscures, so if the effect is
large and overcomes that, the conclusion is perfectly reasonable, it would be very hard to imagine a persuasive non-inferiority
study in this setting 'cos you don't have any
prior historical experience with this environment, you
couldn't really usually set a non-inferiority
margin, the exception to that is where the effect is just enormous. So, you know, it's interesting. The effect of coumadin and other drugs in reducing thromboembolic stroke in people with atrial fibrillation
is something like 70%, so I'm not even sure you couldn't
do a non-inferiority study with an effect that large
in a setting like this, it would be interesting to do it.
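As a rough illustration of why an enormous effect is the exception, here is the usual effect-preservation arithmetic for a non-inferiority margin; the roughly 70% figure comes from the remark above, but the 50% preservation fraction is only a common convention, and a real margin would also have to discount the historical estimate for its uncertainty.

```python
from math import exp, log

# Historical effect of warfarin versus no treatment on stroke in atrial
# fibrillation: roughly a 70% relative risk reduction (RR about 0.30).
rr_warfarin_vs_none = 0.30
rr_none_vs_warfarin = 1 / rr_warfarin_vs_none      # about 3.3

# One convention: the margin must preserve at least half of the
# historical effect on the log scale (the fraction is a judgement call).
fraction_preserved = 0.5
margin = exp((1 - fraction_preserved) * log(rr_none_vs_warfarin))
print(f"Non-inferiority margin on the relative risk scale: {margin:.2f}")  # about 1.8
```

With an effect that large the margin ends up generous enough that even noisy real world data might have a chance of excluding it; with a 20% effect the same arithmetic leaves almost no room.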
And then a lot depends on how good the registry is, we've all had the recent
example out of Sweden, of the TASTE trial
for thrombus aspiration in myocardial infarction,
and it was possible because it was only one
treatment, it was just one day of treatment and you
didn't have to follow, you didn't have to maintain treatment. All-cause mortality in
Sweden is easy to ascertain, you could do it, and then
they had other registries that collected additional
secondary endpoints in a way that I'm not sure we have. But if you have all that
stuff it's pretty plausible, and it's a fascinating example, maybe our registries will get better. Or maybe where there is a registry already of some particular thing,
you can use those data, that's not quite real world evidence, but it sort of has plausibility to it. Okay, end. – [Greg] Great, thank you. Next Bill. – Good morning. I'll be putting forth
an example which may be the ideal setting for
assessing real world data for regulatory label expansion. First I want to highlight what
we mean by real world data, as some of the differences
in the types of RWE impact how we can
potentially use these data for regulatory approval. The strength of the
insurance claims databases are the large sample
sizes, but as we know, these were collected for and
built for insurance purposes, not for clinical research,
so the challenge then is to link the claims with outcomes, potentially through a chart review. Health provider claims
databases have data completeness as they are closed systems,
but the sample sizes are often much smaller. Registries are generally
either disease specific or product specific, but
often the amount of data collected per patient is limited. Electronic medical records
link up all records from the general practitioner
and specialty clinics, but often the key
endpoints are buried within unstructured data, such as physician notes or radiology reports. But there are trends in
advancing this and trends on improving the data
quality and completeness through technology, such
as the ability to link up the insurance claims with the EMR and with molecular information. Now randomized controlled
trials have been the best tool for measuring the efficacy
and safety of a medication, and the differences between
RCTs and real world data have been well documented. But in highlighting these
differences, we can see an opportunity for real
world data in rare diseases where the large number
of treatment centers provides the ability to identify patients. Now this will require new tools, the tools around clinical trials have
been around for decades, such as case report forms, but many of the real world data
sources are relatively new, so it's gonna require new
tools for data collection and even for endpoint definition. So in rare diseases, because
of the small patient numbers, an RCT is generally not feasible. And even a single-arm
clinical trial can take years to enroll because of the limited number and the fixed number of study centers. So rare diseases may be the
ideal setting to assess the use of a real world
data for label expansion. So an example where this could be applied is a rare cancer based on both anatomy and biomarker alteration. The setting here is a small
population with high mortality or high morbidity, and no
effective treatment option. The experimental
medication would have been previously approved in a
larger cancer population, so that the safety and benefit
risk are well established. Now it's essential that
there's a biological basis for the activity in the
rare cancer, for example an agent that targets a
specific biomarker pathway, and there may be existing
phase one clinical trial data showing safety and efficacy
on a very limited number of patients with that rare cancer. And now we provide the basis for a larger confirmatory cohort obtained
in a real world setting. The primary endpoint for
this rare cancer example is treatment response
in a real world setting. Now to overcome the challenge
of the unstructured data in an EMR, patient
vignettes could be prepared using redacted physician
notes and radiology reports. The small sample size in
a rare disease population enables a patient by
patient review to assess the treatment effect, this
can lead to an opportunity to define a standard
definition of response in a real world setting,
and then that could be used for future applications
in the same disease area, extending this beyond just one product.
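To make the manageable study size point concrete, a small single-arm cohort with a high response rate can already exclude the response rate expected without treatment; the sketch below uses an exact binomial interval, and the counts and the 5% comparator are illustrative assumptions rather than figures from the talk.

```python
from scipy.stats import beta

def clopper_pearson(k, n, alpha=0.05):
    """Exact (Clopper-Pearson) confidence interval for a response rate."""
    lower = beta.ppf(alpha / 2, k, n - k + 1) if k > 0 else 0.0
    upper = beta.ppf(1 - alpha / 2, k + 1, n - k) if k < n else 1.0
    return lower, upper

# Illustrative only: 12 responders among 25 real world patients, judged
# against a 5% response rate assumed plausible without treatment.
k, n, null_rate = 12, 25, 0.05
lower, upper = clopper_pearson(k, n)
print(f"Observed response rate {k/n:.0%}, 95% CI ({lower:.0%}, {upper:.0%})")
print("Excludes the null rate" if lower > null_rate else "Does not exclude the null rate")
```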
Many of the common issues with real world data can be addressed in this
rare cancer example. The data quality and completeness
concern can be overcome through obtaining redacted physician notes and radiology reports. Patient vignettes from
a limited sample size remove the need to pool data. In a setting where a single
agent results in tumor shrinkage the need for randomization is alleviated. And patient privacy can
be maintained through use of a third party to de-identify data. In summary, real world data
in a rare disease setting is an opportunity to
demonstrate efficacy and safety in a population where a
randomized controlled trial is not feasible. The manageable study size
enables an in-depth review to confirm clinical activity. And then this could be
extended to larger populations, where clinical activity may be limited to extending stable disease. So in the oncology setting,
overall survival can be obtained from real world data sources and then this would suggest a randomized design. In other broad populations
there would be a need to create data standards
with well defined endpoints in the specific diseases,
a pragmatic design would then be an option
here if there are not available agents with a similar mechanism of action for crossover. Thank you. (audience applause) – Thanks Bill, Er Rob. – Thank you Greg. I think both Bob and Bill have
given nice specific examples where RWE can potentially be
used in regulatory context, regulatory decision making. I'm gonna go a bit broader in terms of some of the contextual elements
of what we might consider and some of the criteria from that regard. First starting with a comment
that many of the speakers this morning made, that
substantial amounts of electronic health data are
being generated every day, given advancements in the quality and the documented validity of these data, along with improved
understanding of appropriate study designs and analytical
methods, we are now in a position to consider
these new opportunities to incorporate such real world evidence into a range of regulatory decisions. This use has the potential, I think, to significantly improve
the quality and efficiency of evidence development and
ultimately drug development. These will help ensure the
safe and effective therapies reach patients with the
best information available in labeling to guide
healthcare professional and patient decision making
at the point of care. The use of real world evidence is growing, albeit in a fragmented and
somewhat disconnected manner. Post-marketing safety evaluation
of pharmaceutical products, as we've heard this morning,
in the real world practice has been performed for many
years, becoming a mainstay of regulatory safety
assessment with the creation of FDA's Sentinel system. By embracing application
of real world evidence through potential application
and evaluation of efficacy, we now have the
opportunity to move toward a more effective and efficient
means of drug development. These additional uses
have only been considered more recently, I see
multiple potential uses in benefit assessment that are emerging, and each of these uses needs
clarity as to how they fit into the existing regulatory framework. A focus of our discussion
today is on the use of randomized designs
in the clinical setting, in other words, pragmatic
study approaches, to support approval of new indications for already marketed medical products. While this is a logical area
of focus, other potential uses for real world evidence
have been alluded to, and could include expanding or restricting approved patient
populations, refining dosing and product labeling, confirming
efficacy as well as safety in the post-approval
setting, and establishing, as we've heard repeatedly,
historical controls for clinical trials. Evidence can be garnered
from real world data sources using a multitude of study designs ranging from pragmatic
clinical trials to prospective and retrospective observational studies to case series and case reports. The possibilities for use
of RWE to either augment or in some cases potentially replace traditional data sources are exciting and potentially quite extensive. Experience to date with the
use of real world evidence for benefit assessment has largely been on a case by case basis, in
a number of these instances the criteria seem to include
the following characteristics, situations where there
are limited or no adequate alternate therapeutic options,
where the disease state is serious and life
threatening, rare diseases with small patient numbers,
and when the drug effect is large and measurable. These early experiences
provide a great foundation for the application of real
world evidence approaches in the regulatory context,
and with the opportunity to broaden and make more
consistent the situations where real world evidence
is considered as meaningful supplementary or alternative data sources to traditional randomized
clinical trial approaches. Clarity in terms of regulatory
transparency, predictability and consistency to
establish best practices around the regulatory
framework for consideration of real world evidence to
inform regulatory decisions will encourage more investment
in this emerging field. As a specific use case
there's a growing interest in the utilization of real world evidence to support approval of line extensions for marketed products, this
would ideally be pursued in situations where the
clinical safety profile of the product is
adequately characterized, the patient population
is easily identified in sufficient numbers in
real world data sources, and the outcome of interests
are appropriately captured in the real world setting.
Further, medical practice should guide us as to when
we should be considering evaluation of these medically
accepted alternate uses, randomized pragmatic trials
are increasingly offering the benefit of high external validity through ongoing advancements
in data sources, methodologies and designs, including
means to minimize bias that could potentially be created by unmeasured confounding variables. This is an important
consideration as we work to establish causality,
causality is certainly an important objective when
considering the approval of new uses of previously
approved medicines, causality is assessed
using multiple factors, and these factors have been
documented to form the basis of modern epidemiological research. In the assessment of
causality one must consider the biological plausibility
and temporal relationship between intervention and the
outcome, the reproducibility of the finding in multiple
studies and the strength of evidence and the
strength of the effect size and dose response among other factors. Pragmatic trial designs have the potential to offer an alternate or a supplement to randomized controlled
studies to assess these factors. Approaches for new indications
traditionally require two adequate and well controlled trials, but both regulation and precedent
also allow for approvals based on single trials
with supportive evidence. In situations where reliance
on real world evidence is a possibility,
consideration could be given as to whether one
traditional clinical trial plus evidence gathered
from real world data, such as that created in pragmatic trials may be adequate to meet standards. This appears to be a viable
and logical place to start as we build a more
consistent and transparent regulatory framework for the
use of real world evidence. As I mentioned earlier, and
has been mentioned previously by others, there are potential multiple other regulatory applications
of real world evidence that we could consider. One of the opportunities
that is most exciting to me is the use of real world
evidence for label changes which could include expansion, restriction and refinements within
already approved indications. As many have already articulated, data from traditional clinical
trials, while very important, is collected in somewhat of
an artificial environment and a very limited patient
population relative to the use of the product in the real world. Data from real world experiences
could help us provide better information to patients
that aren't tightly matched to the exact type of patient studied in traditional clinical trials. Label changes resulting
from evidence collected in the real world setting will ensure the healthcare professionals and patients are making decisions based on
the best information available at the point of care. A third category of potential
use that was alluded to earlier in talks would
be the establishment of historical controls, which the FDA has a number of precedents around. There are multiple examples in
oncology and orphan diseases where traditional
approaches to control arms may be very challenging due to
the small patient population or the life threatening
nature of the disease. Use of historical controls
could substantially augment the power of a study in a very
efficient and effective manner. Of course appropriate attention
must be given to avoid bias and confounding factors,
but through diligence and matched demographic characteristics, provider skill level, modes
of care, technology use, analytical methods and
other factors, one could go a long way to ensure that results are relevant and comparable. I'd like to conclude with the strong push that a coordinated effort
is needed to address the potential use of real world evidence in regulatory decision
making, in this regard the efforts exemplified by this workshop are critically important. I've described a few but not all of the many potential applications
for real world evidence. We have access to
inordinate amounts of data in the real world, which can be converted into real world evidence. Answers to a variety of
questions have the potential to be researched more
efficiently in terms of cost, time and resource utilization,
while still delivering high quality robust information. Greater reliance on RWE has the potential to significantly enhance our
approach to drug development and regulatory decision making. Considerations of pragmatic
clinical trials as part of the data package
submitted seeking approvals of line extensions is an
excellent area of focus. Guidance from FDA is strongly
needed in order to form the foundation for the
regulatory expectations to encourage investment and
to enhance predictability in the use of pragmatic
clinical trial designs, as well as other types of RWE. Our early experience and 21st
century science enables us to pursue the great
opportunity in front of us. Thanks. (audience applause) – Great, thanks to all of our speakers. So looking at the
presentations that I saw, and Bill your examples were
in a very specific case where real world evidence
could support a new indication for a product that's been
already on the market that has a well established
safety and efficacy profile within that approved
use, but your example was I think suggestive of where you could use real world evidence in a case where you weren't randomizing potentially. And Rob you brought up scenarios where potentially randomization
would be important, but the other uses of real world evidence, and I think outside of a new indication, but potentially other kinds
of regulatory decisions. So to Bob, and all of
you, let's first tackle I think randomization versus
not, and maybe narrow it to a new indication,
and maybe explore that a little bit further
on are there scenarios for a new indication, and
what would be the qualities of those scenarios, where one might say that even randomization
may not be necessary. So Bob, do you? – Yeah is this, yeah it's on. This all turns on how big the effect is. I mean the use of… I mean this is not a new idea, these are historically controlled trials of a particular kind, and
nobody really believes historically controlled trials are useful if the effect size is very small. So all of the cardiovascular
uses, claims that I think of, which have made a huge
difference in extending survival and all that stuff, it's
very hard for me to imagine that a non-controlled
trial, non-randomized trial is gonna be credible, the
effects of these drugs are like 20%, that's great you know, that's a lot of lives saved,
but without further evidence I find implausible that
those will be credible with an uncontrolled trial. But one could look, we
have lots of trials, I gave one, the recent
trial on how long to give an anti-platelet drug. There's another study, the DAPT study, that did the same thing. If someone wants to make the case that a historically controlled
trial without randomization could be useful here, let 'em do it, find those populations and
see what the results are like. With the gain of experience
maybe we'll learn how to adjust the population
and stuff like that. But without that they are, to me, patently not credible. Now our experience with
the novel anticoagulants is sort of interesting,
we've got a lot of things to look at, bleeding, for various reasons that we're worried
about, it's interesting, it varied consistently, dabigatran has had fewer intracranial
hemorrhagic strokes, and that has shown up
repeatedly, it's always better, and that's at least partly
because the hazard ratio is .5, compared to coumadin
there, it's a large effect, it's a two-fold
effect, on other matters, whether there's GI bleeding, the results go all over the place. So how big the effect is
has a lot to do with this, if the effect in a secondary use is very
large, the examples Janet gave, you know, whether it shrinks the tumor, that's perfectly credible,
for other things like that I think people have to get
enough data to convince us that those are plausible,
and you know we're learning more about how to adjust people's risk, maybe that'll make it possible,
but until somebody shows it, preferably by showing the
historically controlled trial and that environment came out the same way as the randomized trial,
I find it hard to imagine that that's gonna be credible. – [Greg] Bill. – Yeah, the example I
was bringing up was one where there'd be tumor
shrinkage as a single agent, and I know the agency has
approved clinical studies on the established trials. I think the reason why that
I think is a nice example to look at is because of the
challenges we talked about earlier today, was about how
do you pull the information out of the real world data
source, it's different than the RECIST criteria in a
clinical study where you have very fixed visits scheduled,
fixed visit assessments. In real world setting it's
gonna be very much different. How do you do that in a
very large randomized study? It's gonna be difficult
to interpret relative to a small rare population
where you could actually go through the small patient
numbers and actually look to see, did that individual
patient really respond, 'cos it's gonna be a
different way of looking at it than the RECIST criteria. – And tumor shrinkage is one thing, we're usually interested
in time to progression, is that even possible? – [Bill] Yeah
– I dunno until people look. – So getting, I think,
in a small population and looking at that, getting
our comfort around that before we can extend it
into a larger population where we'd be looking at
time to disease progression in a real world setting,
which I agree, that's gonna be a very difficult one to interpret. – You know I think Bob
alluded to the magnitude of effect, I think the other
is, is it well characterized from a safety standpoint,
and is the data there, in those real world data
sets, to give us confidence around that, are there other places too where you can leverage more of that, in terms of new indications
in various fields, where you're potentially
looking at combinations of new therapeutics
that are used out there, oncology is a prime example of this space, if the data sources are amenable to that, if the practice of medicine
has been doing that, are these data sources that can supplement or complement randomized
studies in terms of their use patterns and provide,
through pragmatic approaches, or other approaches,
additional information that we could see as sufficient
for regulatory decision making. So I think there's a couple
of these areas to explore. – One of the points made
in the previous panel is that these things may
or may not be sufficient to prove something but they're very good at raising hypotheses, and I think that should be recognized,
one group given a drug seems to do worse, older
women or something like that, you wanna go looking,
maybe you wanna go back and look at your trials,
see if it was there too, but the large amounts of data
make possibilities like that very good, and if it's important
enough it could suggest that further trials should be done, that's of very valuable use. – So supplemental evidence
on specifically in the case where we're looking at a
new indication, I'm gonna sort of steer away from that for a moment and then go back to it, I
think 'cos we got a lot of time on this panel, but it sounds
like, at least for now, at least a demonstration
example or showing the agency that the conditions are
right for potentially not randomizing to show an
effect in a new population as long as the data
are well characterized, can be collected, the
effect size is really large, perhaps randomization is
either impossible or not feasible, then there might be unique scenarios where that could be the case. So the departure that I
wanna take is maybe around some of the other regulatory decisions outside of new indication. So maybe a little bit less,
so like label changes, so you're still looking
in the same population, you're still looking, but
you're looking at potentially different dosing,
different duration of use, but the population are
the same essentially, are there scenarios where,
does that sort of relax it a little bit in someway,
where non-randomized design, maybe a well performed
prospective study that includes registries for the
necessary data elements, some of them are prospectively collected by the clinical sites, but do
leverage real world evidence from electronic medical
records, lab results et cetera. So in those scenarios,
where you relatively have the right population
in terms of the label, can you foresee a framework
where these real world evidence designs could actually be
useful outside of randomization? – Well I can see one. It's well known that in clinical trials compliance is probably better
for a variety of reasons, the harassment from
investigator and all that, then it is in the real world setting. So if we, for one reason
or another, approved a dose of X, and you find out in
the real world that nobody stays on it, everybody stops
it, that could either suggest that you wanna say something
about that in labeling, I mean it might be real world
evidence of whether people will tolerate the side
effects, that strikes me as something to go on the label, now you'd have to decide
whether the lower dose worked in that, so you'd bring your
dose response information to it, but we often approve a higher dose even if the evidence that
it's better than a lower dose is not overwhelming, I mean
you don't have enough data to know precisely whether
dose one is different from dose two, so you
sort of look at the curve and make your best guess,
but if you discovered that it's very poorly tolerated
and people don't take it, I'm not sure that couldn't go on labeling, I can't think of any examples of it, but it doesn't seem out of the question. That's an area where real
world evidence might trump the clinical trial
setting, where the tendency to encourage and stimulate
and prod people who don't take the drug might
give you a distorted idea of whether a dose is tolerable or not. So that's one possible example, I haven't seen any but– – To me the dosing is a
logical place as well, but if you wanna expand,
Greg what you talked about a little bit, narrow it, keep
it in the same population, and you might get meaningful
information on sub populations, in the elderly or combined
dosing in the elderly, because clinical trials
won't be able to cover all those scenarios and
all those doses as broadly as you do in the real world. If we have data sets, if we
have abilities to collect that type of information, to
mine that type of information, to think about study methodologies
that might pull that out, you're now thinking about
populations within populations that might respond
differently could be an area of opportunity, to look and
say where real world evidence might put us, to better
ways to use that medication in the marketplace. – So turning then, to what
this panel's really about, and what we're trying to focus on, randomization in the clinical
setting for a new indication, in our previous expert
workshop I think somebody brought this up, I'm not
sure if it was Marc Berger or Sean Tunis or maybe even Adrian, that's where if you think about
a pragmatic clinical trial, you have randomization on
day one, and then everything after that is an observational study. So it's not a situation where
you have a well controlled traditional clinical
trial and you're bringing in data elements from an
electronic medical record, it's actually where you
have, maybe relaxed controls on switching, or adherence
to the medication, all of these things that
happen in the real world that actually can come
in to play in the study after you've randomized,
for those kinds of designs, the usefulness for a new
indication is the question at hand for this panel, so what we heard in the earlier panel is
that it would be great, and I think Bill brought
this up, if we had sort of a Latin square, some
way to understand more of what the
conditions need to be in order for a pragmatic design to
bring beneficial information that would be sufficient
for a new indication. So one of the ones that
we already talked about is a large effect size, so if
there is a large effect size, that would be important, also
if we were able to collect the relevant endpoints with
a high degree of validity and reliability, that would
be an important aspect, but what are some others,
what are other conditions that you all think of, at
least with the concept of starting slow, identifying
some potential pilot projects or demonstration projects,
and we have some examples, but I wanna push a little
bit more on, sort of, what the conditions need to
be, that might lend themselves to this sort of design. – Yeah, so when we think
about, you've been comparing what the pragmatic design
would be compared to an RCT, then focusing on randomization,
and as you brought up, the other part is control:
are our patients staying on within a control group or
not, and then maybe they need to cross over to another
effective medication, and I think that's really, I
think one of the conditions that needs to be thought about is,
is there another medication with a similar mechanism of action
that potentially patients could cross over to; if
there's not, then maybe that's a way to leverage
the cost of randomization, so if in some
centers that's not a payment option,
potentially you randomize those centers versus
others where there could potentially be crossing over. I think that's the question. – It's very hard to have this conversation without specific examples. Lots of large, big outcome
trials are pretty large and pretty simple, and
they're pretty pragmatic, they don't, I mean we
very strongly discourage excluding people for this
or that, some of them are less pragmatic 'cos they
have enrichment features and stuff like that, which helps you win, so you wanna be careful
about getting rid of those. Many of the things you're
talking about, crossing over, stopping and not complying
make it very hard to show an effect, so if your
goal is to show a new effect, and you're allowing all
kinds of noise to enter, your chance of winning
goes away, it falls, whether that's an attractive
feature 'cos it's simpler, I dunno, but if everybody
after a week crosses over to a different drug
and then you don't have the placebo group anymore,
so you know, I dunno. I think these things
have to be worked out, a lot of people agree that
large numbers of exclusions are bad, and we've
strongly discouraged those, people used to exclude
everybody over 75 from trials, or over 65, and we and the
ICH are very much trying to discourage that, so they're
pragmatic in that sense, we always try to tell
people, don't exclude people who are on other drugs because
then you won't find out whether there are drug
interactions, that's silly. So more and more trials,
large trials anyway, are getting to be what some
people would call pragmatic, but that doesn't mean the
noise doesn't obscure, so there's a tension here,
it's not clear the trials will win if they're too noisy,
and maybe that's real world, but it doesn't help you
know whether the drug that's out is of potential value.
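[Editor's note: a rough way to see the tension Dr. Temple describes is to work through how crossover and non-adherence dilute an intention-to-treat effect and inflate the sample size a trial needs. The sketch below is purely illustrative; the event rates, crossover fractions, and the simple dilution model are assumptions, not figures from the panel.]

from scipy.stats import norm

def required_n_per_arm(p1, p2, alpha=0.05, power=0.9):
    # Standard two-arm sample size for comparing proportions (normal approximation).
    z_a, z_b = norm.ppf(1 - alpha / 2), norm.ppf(power)
    p_bar = (p1 + p2) / 2
    numerator = (z_a * (2 * p_bar * (1 - p_bar)) ** 0.5
                 + z_b * (p1 * (1 - p1) + p2 * (1 - p2)) ** 0.5) ** 2
    return numerator / (p1 - p2) ** 2

# Hypothetical event rates with full adherence.
p_control, p_treated = 0.20, 0.15

# Simple dilution model: patients who stop or cross over effectively get the
# other arm's outcome, shrinking the intention-to-treat difference.
crossover, drop_in = 0.20, 0.10
p_treated_itt = (1 - crossover) * p_treated + crossover * p_control
p_control_itt = (1 - drop_in) * p_control + drop_in * p_treated

print("n per arm, full adherence:", round(required_n_per_arm(p_control, p_treated)))
print("n per arm, with crossover:", round(required_n_per_arm(p_control_itt, p_treated_itt)))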
– So based on that response, the question I would ask is: why would a company ever wanna
do this if there's so much of a potential for this
attenuation of the effect? So it's great that we have two industry members on this panel, maybe
you can help us understand from your own decision making,
and Rob you sort of brought this in a little bit in your
comments, but for you all, as you're thinking about
designing the right studies if you were contemplating
more pragmatic designs, what factors go into that decision? – I think, you know, in
addition to what Bill and Bob have alluded to, I think
as we think about it from a regulatory context
is, is the disease fairly well known, is it
fairly well characterized, is there good regulatory
precedent around it, if you take a well-known disease
like diabetes for example, we know a lot about the
history of that and so on, versus a less well characterized disease from historical regulatory
standpoint, so to me as we look at risk around that, do
we see the opportunity where we could, as I said, meet the substantial evidence
standard, not maybe solely from one source or another,
but by supplementing, and by using, in the right circumstances, RWE with a well controlled
randomized clinical study, that's sort of the risk
benefit that we'll look at. The other thing that we'll look at is, is there regulatory precedent
around that, and what does the guidance look like out there, so it's a risk based
assessment on a whole host of factors, both scientific
and regulatory, based on what's available from the framework and what the precedent is that's out there. – [Bob] Go, oh no, your turn. – I was gonna say, I think
another consideration is also the complexity, so
as Bob was talking about, there's more noise entered
into a pragmatic design. More noise means larger sample
sizes, that's gonna make it difficult if the endpoint
is very complicated, something like a time to
disease progression endpoint, versus a hard overall survival endpoint, which may be a little bit easier to do than
disease progression. – [Greg] Bob did you have a? – Well there's some areas,
I dunno if they're mostly of interest to drug companies,
but there are some areas that could be studied in
a real world environment that need a look, and
need better assessment, I mean if you had to name
the greatest health problem in the United States
and probably the world, it's failure to use
drugs that we know work. The off therapy rates
for antihypertensives are probably still 30 to 40% after a year, people are gonna die
because of that, you know, we know that, same thing
for staying on your statins, maybe 'cos it's all the
noise that people use to object to them, but
something like 50% are off them at the end of a year, well
that doesn't make any sense, and large numbers of people
don't use anti-coagulants because they're too
worried about bleeding. It would be of great interest
to see if you could improve retention rates and use rates
for some of these major gains, I mean we're talking about
effects that reduce mortality by 20 and 30 and 40%, and do
it over a long period of time, it's the major gain in the
health system these things, and they're not used, there
might be interventions that could help people stay
on them, lottery tickets are my primary example, but
I'm sure there are other ways that you could probably do
it, and that's worth studying, I don't know if that's of
interest to the industry, but it ought to be of interest
to payers and other people, unless they'd rather
have them stop, I dunno, I dunno what the rules
are, but those are areas that have to be done in
real world environments, there's no choice but to do them there, and I think that needs a lot of thought. – I think those are great
examples, even for groups like PCORI and NIH Collaboratory
that we talked about before, but sort of the
growing opportunities to leverage real world evidence
to bring important questions about value, quality et
cetera, one of the things that we also talked about
in the, you guys weren't at the expert, at least Rob,
not at the expert workshop, was around the relative cost
of doing a pragmatic design when you think about, you
know, because of the noise the population has to be much larger, that could be a benefit because
with real world evidence we have the opportunity to
generate data and evidence from larger populations,
but on the other side it's really hard if the
electronic medical records aren't routinely
collecting the information, if medical records across sites
aren't able to be, sort of, efficiently linked, and things like this. So there is a sort of barrier
in terms of technical linkage, but also cost, from a company perspective, how are you all sort of
viewing that challenge and how does it play into your
strategy and thought process for exploring more pragmatic designs? – So the electronic medical
records are built primarily for physician's use with patients, not for pharmaceutical research,
the way we're looking at it right now is if there's
other ways that we can obtain some of that information through,
potentially, third parties who de-identify the data, but
it's really, I think a lot of what we're using that for right now is understanding,
hypothesis generating, as Bob talked about,
understanding the disease space, understanding how treatments
are being used today, to inform how we
best design our actual RCTs. – Yeah, I would say very similar. – Okay, great. So we have a lot of
time left on this panel, but a lot of people in the
room, so that's a good thing, and I suspect many of you might
have questions about this, particularly the panelists
for our next session who are folks that are
already designing, conducting and have experience with
these kinds of studies, we're gonna hear about how to do that, but any questions from the
group to the idea of developing at least a framework for when
these designs would be useful and when we might think
about approaching them? Oh thank you, Adrian. – [Voiceover] So I was
particularly interested in Bob's opening comments about
under what conditions this would be important,
so certainly the belief that randomization is important
and so that would be key for this, but I guess when you
speak to the so called noise in the system, that's a
concern that it would be hard to detect efficacy, I mean
one thing that we often hear is that we don't necessarily
know about the heterogeneity of effects, of efficacy, and
the only way to get to that is have larger sample
sizes that would have suitable subgroups to evaluate that more. So would that be a context
where say a new drug had an expanded indication to
promote having a large study and more records in the
real world to evaluate different subgroups? – Well there's no question,
I mean it depends on the area you're talking about, but
if you look at labeling and you wanna see subgroup
analysis, once you leave the cardiovascular area,
you won't see them, maybe you'll see men and
women if you're lucky, black and white, but you
won't see any of the others, disease severity and stuff, we don't show these forest plots for
anything, so a large enough sample size to be able to
detect, factors that influence how you do would be attractive. I wanna mention one other
thing, maybe it wasn't clear. If you're just using
these systems to discover potential candidates,
that's sort of a no brainer, I mean there's nothing to
lose there, it doesn't harm the study because you're
then gonna do a study that will have the same
endpoints you would have if you were just doing
a regular old trial, a question is whether
you can collect the data from the real world, from the
evidence already collected, but that's a separate question. But just using it to identify
patients, if it works as we all hope it would,
is really a no brainer, everybody should do that, and
one reason is to have a trial large enough to be able to
look at subgroups, as you know if you read labels, for
most symptomatic conditions it's hard to identify a whole
lot of subgroup analysis, now which ones are important
remains to be seen, we do demographics all
the time, and sometimes we do disease severity, but
there could be other things that influence how you do,
your income, for example, could influence whether you
stay on therapy or what kind of healthcare system you're
in, all kinds of things could be influential, and it would be nice to be able to know that,
so one attractive feature of having an easier way to find
patients is that you can do a trial that's a little larger,
hard to argue against that. – Any other questions from the– Yes, what's your name? – Hallar (muffled) with Genentech. I wanna come back to the question of noise in the pragmatic clinical trial
design, and also something that Janet touched upon
earlier, with respect to data capture systems
and the data quality. Could we say is the problem
of noise in the data and confounding and bias
addressable through more complete and more accurate data
infrastructure and data capture, so that we can actually mitigate the challenges associated with them? – I think it depends upon
the type of confounding, I mean if we were talking about
the challenge of when patients cross over on day two
to another medication, no amount of different data
capture or different analysis is gonna remove that, so
I think it depends upon what you're talking about,
if we're talking about different data systems and
how the data are captured, but patients are remaining
on their controlled therapy then that's a different
question that we can adjust for. – I mean if your endpoint
is hospitalization for heart failure, and in
the data they can't tell whether you entered the
hospital for pneumonia, heart failure, a new myocardial
infarction or a stroke, and they get it all mixed
up, you're not gonna be able to find your effect
unless it's really huge, so there need to be
some precision on this, and we see this all the
time, you know we do these diabetes studies, and the endpoint is either MACE or MACE-plus
or something like that, for what it's worth, when you use MACE which is dying or having
a heart attack and stroke which are pretty precise,
your numbers are more precise than when you use
MACE-plus, because MACE-plus includes more subjective things,
so that's always a concern, how precisely the endpoint can be done, everybody's, there's a
process underway right now in trying to give better
definitions for essentially all cardiovascular outcomes;
if those became incorporated into real world data and
stuff, I imagine you could hope to be using those outcomes as endpoints better than you do now,
but most studies have a precise definition for what
they mean by a heart attack, I don't know how that relates
to the definition used in the doctor's office,
and if that's important, you're gonna lose precision
and make more noise and have more difficulty
showing a treatment effect.
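[Editor's note: to put a number on how endpoint imprecision eats into an effect, here is a back-of-the-envelope sketch: if some recorded events are really something else and some true events are missed, the observed difference between arms shrinks toward zero. The event rates, sensitivity, and false-positive rate below are invented for illustration.]

def observed_rate(true_rate, sensitivity, false_positive_rate):
    # Apparent event rate when some true events are missed and some
    # non-events are wrongly counted as events.
    return true_rate * sensitivity + (1 - true_rate) * false_positive_rate

true_control, true_treated = 0.10, 0.07   # hypothetical true event rates
sens, fpr = 0.85, 0.03                    # hypothetical ascertainment quality

obs_control = observed_rate(true_control, sens, fpr)
obs_treated = observed_rate(true_treated, sens, fpr)

print("true risk difference:    ", round(true_control - true_treated, 3))
print("observed risk difference:", round(obs_control - obs_treated, 3))
# The 0.030 true difference shrinks to about 0.025, so a noisier endpoint
# needs a larger trial to detect the same underlying effect.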
– Yeah, so just to add on to that point: a big thing that came up this morning, and even Janet brought it up,
is one of the big challenges is the data themselves, and the ability for electronic medical
records, claims, et cetera, to collect the endpoints
that we really wanna study, and there are a lot of
challenges as Bob mentioned, particularly in the claims,
you might be able to know a little bit more from the
electronic medical record, that hospitalization with heart
failure, but a lot of energy is being spent on identifying
what those challenges are for measuring the outcomes in claims data, coming up with claims based
algorithms, so you might find that just one claim with an
ICD-9 code for heart failure, and the place of service was the hospital, is not sufficient 'cos you
sort of pull a sample of those, go to the medical records
and you find out that most of those patients went into the
hospital for something else, maybe it was history of heart failure or something like that, so you develop more sophisticated algorithms,
and folks at Sentinel and individual organizations
are spending a lot of energy in developing more
sophisticated algorithms to really identify the
endpoint that you want, and some of these come up
with positive predictive values of over 90%, which is pretty good.
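[Editor's note: as a concrete, hypothetical illustration of the kind of claims-based algorithm being described, a naive rule (any inpatient claim with a heart failure code) can be tightened, for example by requiring the code in the principal-diagnosis position, and its positive predictive value checked against a chart-review sample. The column names, codes, and data below are invented; validated algorithms such as those used within Sentinel are developed far more carefully.]

import pandas as pd

claims = pd.DataFrame({
    "patient_id":       [1, 2, 3, 4],
    "place_of_service": ["inpatient", "inpatient", "office", "inpatient"],
    "dx_code":          ["428.0", "428.0", "428.0", "486"],   # ICD-9 codes
    "dx_position":      [1, 3, 1, 1],                          # 1 = principal diagnosis
})

# Naive rule: any heart failure code where the place of service is the hospital.
naive = claims[(claims.dx_code.str.startswith("428"))
               & (claims.place_of_service == "inpatient")]

# Tighter rule: the heart failure code must be the principal discharge diagnosis.
refined = naive[naive.dx_position == 1]

# Hypothetical chart-review "gold standard" for the flagged admissions.
chart_review = {1: True, 2: False}   # patient 2 was really admitted for pneumonia

def ppv(flagged_ids, truth):
    reviewed = [truth[i] for i in flagged_ids if i in truth]
    return sum(reviewed) / len(reviewed) if reviewed else float("nan")

print("naive PPV:  ", ppv(naive.patient_id.tolist(), chart_review))
print("refined PPV:", ppv(refined.patient_id.tolist(), chart_review))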
But the other challenge is that it's not necessarily the quality of the
data, it's that the data just aren't captured, so like
patient reported outcomes and things like that, so I
would say that it seems like as we continue to improve
our ability to utilize these data, as we continue to
understand the true validity and reliability of these
data, and put investments into enrichment, so being
able to link that data to the medical record or
patient surveys or even today, you know, health plans
are doing more and more about collecting quality
of life information, large provider systems are
collecting quality of life information, so as we get
better data it seems like, combined with an example where
there's a large effect size, other things seem
straightforward for randomization and more pragmatic
design, that these efforts would be worthwhile. – It also seems important
to distinguish between what a billing record might show, and what some of these
others, I mean I don't think the billing records are the
same as what an HMO collects, my bet is the VA,
which is really an HMO, has vastly more data,
probably of better quality, than a billing record
does, or the Sentinel has some HMOs in it and it
has some billing records, one could probably look to
see if there's a difference in the quality of data there,
or what you can collect. – Ah, Rich. – [Voiceover] Does the FDA
have a regulatory interest in effectiveness, as I'm
trying to make a silk purse out of the problem that
you pointed out Bob, that if you do a pragmatic
trial you're often left with the fact that the participants
will take the meds the way they take their meds,
but if we could not show a difference in that
circumstance, shouldn't we care, and is there anything in
the way the FDA is charged to operate, or that sponsors
think about evaluating their products that should play into the regulatory decision making? – One of my least favorite
distinctions is between efficacy and effectiveness, but we'll let that go. I mean what we're obliged
to know is that the drug has the effect it's
claimed to have, that's how it's described, it doesn't say efficacy, it doesn't say effectiveness,
doesn't use those terms. If there were something in the
patient or in the properties of the drug that really
made people not use it, we would of course want to
know that, and there are plenty of drugs that make people
drop out right away if you don't use them right,
so we worry about titrating, you know the alpha-blockers
and stuff like that, nobody would use them if you
took the full dose right away. So they're all labeled
to be titrated up, so if you want to, that's sort
of an effectiveness concern 'cos nobody will take the
drug if you just go at it full dose right away,
and we do worry about that, we worry about when you
see an adverse reaction, you wanna know if it's dose
related, if it occurs early, if it can be mitigated by
tapering, by slowing the dose more or by watching closely for this or that. So we're definitely all
worried about that, probably doesn't make you, if that's
a problem, it probably doesn't make you think
the drug doesn't work, but it's very important
to giving good advice on how to use it, so I would
say we're very interested in that, and if studies in
which control is less tight, which is sort of what effectiveness
studies are defined as, all of a sudden just can't
show anything, and someone can try to figure out what
the reason is, that could be very informative, it doesn't make us think the drug's ineffective, but
we would be very interested. Does that answer your
question, or did it evade it? – Adrian? – [Voiceover] Yeah, so I have
another follow up question about the outcome ascertainment,
so Bob you mentioned the precision was really
important for outcomes because of, certainly, to
evaluate efficacy and make sure it would be more helpful
'cos that'll be there, just even in clinical trials
where we do event adjudication as it goes through quality
assurance processes where events are recycled,
even with expert panels, that we don't get 100%, that's one thing. Secondly is even when
studies have evaluated CEC-determined events versus site reported, for the most part the
interpretation of inferences don't change, and so if
you think about leveraging electronic health records or claims for outcome ascertainment there probably are certain categories where
there are strong incentives to be accurate because of
reimbursement practices and there's some that perhaps less so, so the contrast would be, as you noted, say hospitalization for
heart failure or a procedure, that would be highly focused as opposed to say a gout flare where that
may not be captured that well, does that seem a reasonable interpretation of what you've been talking about in terms of outcome ascertainment? 'Cos that seems a critical aspect here. – Your comments are right
on, whether adjudication of events is useful is highly
debated, I don't really have a position on that, we've
accepted trials that use what the investigator
said, and you're right, it usually doesn't show
a major difference, I'm more worried about not
having any definition at all, I mean even in a trial where
you rely on the investigator, there is some definition
of what constitutes an MI. I just think you have to
know how good the system is to know what it's gonna
do to your findings, if it's very noisy, because
it's billing records, it's gonna make it smaller, won't give you the wrong direction or anything
like that, but it might make the effect smaller so that
you have trouble detecting it, it's just gotta be something on your mind. It doesn't mean you can't
do it that way, even noise, if it's just added on to it,
it won't obliterate an effect if the study's large enough. – [Greg] Mark. – [Voiceover] So I wanna
suggest where noise can be your friend not your enemy, and this is where I have
seen the biggest opportunity for pragmatic clinical trials. If you have two different
therapies that have very different modes of
administration and dosing schedules, there could be different
preferences by patients to use them, a number of
years ago we were working on an oral asthma controller medication which, clearly in kids was
preferable because kids didn't want to have to use
their inhaler in school, and the question is, if you did
a randomized pragmatic trial could you show that there's
a difference in adherence and persistence that translated
into difference in outcomes, so I think noise can be
your friend, when you think that you're gonna have a
difference in persistence and adherence between you
and a comparative medication, and that's a place where
I think is a good example where it could be an entry point where a randomized pragmatic
trial could be of value. Now, are there many examples like that? I don't know, but you
know, clearly, this is not well studied very much, and
I think that, you know Bob, you brought up the point
that looking at persistency of effect, that is another good
place that you might wanna, you could use a study
like this, so I think there are entry points and it
doesn't necessarily have to be that noise is always the
enemy, what you wanna know is, how is it going to play in the real world, so you could look at patients
who have many comorbidities and are taking six or eight
medications versus those that are taking one or two medications. We know that persistency and
adherence changes depending on comorbidities and how
many medications people have to take, so I think
there is there there, but I think we need to look
at noise as our friend, not as our enemy. – By noise I meant
imprecision, you're describing bona fide differences in how
people are willing to stay on, maybe because of side effects,
that's worth finding out, that's the value of real
world studies where you don't harass people to take the drug, and you let them decide,
you can find out things about intolerability and things like that, that would be very important
to know, I think that's good. – [Greg] Great, Sean. – [Voiceover] Yeah hi, Sean Tunis from The Center for
Medical Technology Policy. So I just wanna make sure
I'm not misunderstanding what I think is the converging
view of this panel about the potential use of
pragmatic clinical trials in kind of a healthcare system setting, like Collaboratory or
PCORnet to support approval of a new indication for an
already approved drug, right, so it sounds to me like,
and doing it in a way that's maybe 10 to 20% cheaper and
faster than would otherwise be required for getting a new
indication, 'cos otherwise why would we care? Right, the whole idea
here is cheaper, faster and it still gets you a new indication. And it sounds to me like the
answer is that's probably not gonna happen, so I probably
misunderstood something, but I'm just wondering,
before we go to a break, if the panelists could sort
of say, yeah I don't see a way that's gonna happen, or yes
I can see that happening, cheaper, faster, using
Collaboratory, PCORnet or something, and we can get a new indication, and under what circumstances
is that even feasible or imaginable in the next five years? – Let's be sure we have
the definition right. Pragmatic clinical trials, to
me, doesn't mean uncontrolled, or observational or historical, it means you're not excluding, I
mean I have that thing where there are nine factors,
but that still includes randomized trials, so my view
is that a pragmatic trial that has fewer exclusions,
and doesn't see the patient as often and all that stuff,
is a perfectly fine way to get a new claim if you
win, I don't think there's any impediment to that at all, and it has some attractiveness
because it's a little closer to how it's gonna work in
real life, I can understand why people would like that, so am I misunderstanding your question? I mean pragmatic doesn't
mean uncontrolled. – [Voiceover] No, I'm
thinking of pragmatic as a randomized trial,
and as you pointed out, probably, it couldn't be
pragmatic in the sense of allowing lots of crossovers
and all the ways in which your effect would probably be eliminated, so those wouldn't be the
domains of the PRECIS wheel or whatever, that would be more pragmatic. But I'm still trying to
figure out, well what would be more pragmatic that
would be simpler, cheaper and still adequate, you know
what's an example of that? 'Cos I get how it's
theoretically attractive, but it just seems like
there's so many complications and barriers and hesitancy to
use it for regulatory purposes that it's just hard to
see when it would work. – Yeah I'm not sure why
you perceive hesitancy about those, we generally
have a bias toward including a full range of patients,
don't exclude 'cos of the age, don't exclude 'cos of other
therapy, don't do those things. I'm not sure there's a bias
against pragmatic trials that are still well controlled trials, I don't have one. – I think what we're hearing
from the agency especially, and this is consistent with
what Janet was saying earlier, is that there are
potentially good examples where more pragmatic designs that leverage real world evidence would
bring valuable information, I think Bob you even
outlined a few examples that would be sort of
ripe for some of these early demonstration cases,
I think the barriers, Sean that you're hearing from
the group, is that well maybe those first couple of
demonstration cases may not be 20% cheaper, and we're
gonna hear some examples in the next panel and
then throughout the day that because of the, and this is not a regulatory uncertainty
or regulatory barrier, but just simply because of
the barriers that we're seeing in the technical ability to
bring electronic medical records all together, and to
measure the right thing, and to make sure that
we're valid and reliable in what we're measuring,
having the right linkages with registries and claims et cetera, there's a whole infrastructure
that's not quite, maybe it's not quite
prime yet, maybe it is, but the infrastructure to
be able to do these studies routinely, efficiently,
that's not necessarily there, and we're gonna spend some time
even tomorrow talking about, okay if that's where we have some pieces of that infrastructure, but
what would it take to get more investment, to get more participation and to really, as a group,
improve this infrastructure, and the reason for doing that,
there's many, but one of them is that it would enable
more efficient, and maybe it would enable folks
to see 20% lower cost, and being able to do these
studies, but the data are there, and the data are of such high
quality, that we can identify, maybe not all studies, but certain parts of regulatory questions
where the data are sufficient to answer these questions. So I think you're right that it seems like this is all theoretical
and right today maybe the business case isn't there to do this, but this has been, there
are examples where this has been done and in certain
scenarios, the business case might be there, but we do
need to put a lot more energy into the data infrastructure. – I would've thought that even
being able to identify patients, to inquire about being in the
trial in a more efficient way than we now have, would
be worth a vast amount of time and money, am I wrong about that? – Mark. – [Voiceover] Bob we're
doing that already. So we are mining data
sets, both claims data and EMR data, to look at
what are the scenarios if you manage your inclusion
exclusion criteria, what does that do to
the target population, and there are tools available
through Optum actually, where you can then identify
physicians that have large clusters of
patients, and by doing that we've been able to, on
a number of occasions, accelerate the enrollment
in clinical trials, and we're not the only
ones who are doing that, everybody's doing that, so that's the current state of affairs now. – Well it seems very
attractive, very important and there's sort of no risk to it, I mean it doesn't damage the
data in any way, why not? – You know I think from
a company's standpoint, there are examples out there
where this has been done, where real world evidence has been used in a regulatory context to help support, even initial indications, let
alone follow on indications. I think the challenge is how
do we work towards a system that makes that more predictable? And Greg you mentioned
a lot of those things. I'd say the other thing
is how do we do some of these test pilots that we talked about a little bit this morning,
so that we can get a better regulatory understanding,
through position papers or through guidance, of
where these opportunities are seen routinely, where
do we start with that? And we have talked about
some criteria around that: a large effect size, well-
characterized disease states, information databases around
endpoints that are very clear. How do we make those more routine? I think it's a multifactorial
element of how do we make the data more robust,
how do the methodologies continue to evolve, we see this
every day within our company I know you do as well, and
then how do we make sure we've got the external onus
to be able to put a system in place that makes it more routine. We'll still do this one
off, trust me, when we see an opportunity, you see an opportunity, we'll do this one off, I think
the goal is how do we make this part of the system
of drug development and information delivery to
prescribers and to patients in addition to payers
and other constituents. I think that's really what
we're trying to accomplish. – [Greg] Jennifer, I think you have. – [Voiceover] Jennifer Graff with National Pharmaceutical Council. So one thing that I didn't hear previously was the word adaptive,
so where you have an RCT and then you adapt into a
real world evidence type more pragmatic approach,
and one of the elements that comes into my mind
is infectious disease, you know we saw this week the
HPV vaccine where they looked at longer term outcomes, you
have a certainty of whether or not the exposure occurred,
you have randomization, but I'd like to hear, is
that a test area to do some of these demonstration
projects in some of these short term diseases like
infectious disease vaccines et cetera, and how would that
play in if it was an adaptive, more pragmatic feature? – So is your question, so the
examples where it's already on the market, right? – [Voiceover] It's already on the market, there's already been ef… That could be one, would
you ever allow more of that real world evidence to look
at those longer term outcomes, or for new vaccines that
are coming to the market and that might be coming
out, would you consider an extension to the primary
phase three to allow for that more adaptive studies come
back with the label enhancement when those studies are completed? – Well I can't quite answer
as I'm not sure I understand what it is, so you're
saying you start with a randomized trial,
vaccine versus no vaccine? – [Voiceover] Could be, yes. – And then what's the adaptive part? You follow those patients in there– – [Voiceover] Follow those
patients out so you'd have an interim point that would
be your end of phase three, you would allow that to go out beyond and use real world evidence data capture for those longer term endpoints. – Well, not quite like that,
but some of the NIH studies of lipid lowering drugs have
had 14, 15, 16 year follow ups, long after the trial was
closed, where they got the data I can't tell you, I think they
had the names of the people, but they've done 12 year,
15 year, 20 year follow ups on some of those, which
is not so different from what you're talking
about, and it's still the randomized population,
you don't have them all, but it's better than
nothing, and I think people have tried to do that. – Just to build on that, would
the agency be open to a design where you would have a
randomized clinical trial, and then based upon a surrogate endpoint, getting accelerated approval,
and then the long term follow up that's required
for the conversion would be obtained in a real world setting rather than in another RCT? – [Bob] Do you mean the same patients? – Potentially the same patients, yeah. – Well if it was the same
patients followed later you'd worry about retention and all that, but that doesn't seem crazy
if it's the randomized trial. That seems pretty good. I guess another question is if you had, well this is a question we always have, if you have your surrogate
marker, would you now accept, if the effect was very
large, would you now accept essentially historically controlled data, the answer is, as Janet
said, if the effect is huge, and you have great data
on response and you have, now you're looking for
survival, and the effect is very large, that's credible,
we've used stuff like that, but again it has to be pretty large. – Great, that takes us to
the end of this session. I know we have a couple more questions, but we're gonna be tackling
these topics throughout the rest of the day, I'd like
to thank all of our panelists, we do have lunch provided for
you, it will be in the lobby, we'll be eating lunch in
here and then we'll begin back up at exactly 1 o'clock, thank you. (audience applause) (background chatter) – Good afternoon everyone, good afternoon. We're gonna start our afternoon
session in just a minute so I'd like to ask everyone
to head back to their seats, get whatever you need to get
and we're gonna get started. (background chatter) Alright once again, welcome
back for our afternoon session, we're gonna continue our discussion on regulatory applications
of real world evidence with this focus on randomization. Regulatory use cases that
involve randomization, you may remember before
lunch we talked about some example cases and some
of the issues that come up in these examples, we talked
about issues like the clarity of the intervention being
defined and the nature of the outcomes, the ease
of verifying the data, and other issues that
affect the feasibility of randomization and the
value of the evidence emerging from such real world studies. In this session we're gonna
switch gears to focus more on the how, so study methods and data development
considerations for the efficient implementation of real world
evidence, randomized studies for regulatory purposes, and I'd like to just briefly introduce our speakers. We're gonna start with
some opening comments from three distinguished
speakers, and then we're gonna proceed to
discussions, similar in format to the sessions this morning. First we'll be hearing from Lisa LaVange, the director of The
Office Of Biostatistics and The Office of Translational Science at the Center for Drug
Evaluation and Research at FDA. Then Adrian Hernandez, the
faculty Associate Director and Director of health
services and outcomes research at The Duke Clinical Research Institute. And Martin Gibson, who's
the Chief Executive Officer of NorthWest Ehealth, Martin
thanks for coming over here from England to join us. Lisa's gonna help present
some of the issues for this session. Adrian and Martin I think
are gonna talk about some specific examples, again
focusing on the how to's, so moving from the when and
the settings and the issues to determine whether randomized real world evidence development is appropriate to designing these studies in
ways that make them effective for regulatory use,
obviously the studies need to be designed to be
sufficient for answering the questions at hand, and again
we're focusing on approvals for new indications, and at
the same time, be feasible, minimizing the burden
for sponsors, researchers and patients, so we're gonna
focus on design considerations that fall into four broad
categories, study design, patient identification and selection, data collection and quality, and outcomes and endpoints of interest,
we're gonna draw on some worked examples that
illustrate the potential promise and challenges that these studies have in the current clinical study environment, and how to overcome those challenges. So Lisa, I'd like to turn
this discussion over to you. – Okay, thank you, good afternoon, thanks for the introduction. I can see my slide, but you can't, oh there we go, great. My task is to talk about some
study design considerations that come up if you're
trying to figure out what you can do with real world evidence, and I'm just going to skip
across the top of a few topics that I think are related,
and try and build on what you've heard
already from this morning. So the first was just to
make a plug for the use of electronic health
records or information in randomized clinical trials, what I call the low hanging fruit. We've talked about this
in similar meeting before, but there have been in
use for a while now, trials that look like your
standard randomized trial but supplement the data collection with electronic health records,
so some thing are skipped in the case report form
and pulled directly from the electronic record,
this might be medical history, could be demographics,
concomitant meds and so forth. And this can be more
efficient than extracting and retyping or reentering
the data onto a CRF, and it also can reduce
the source data monitoring because the electronic
health record is the source in this case, but there are
problems in implementing this, and the biggest is that,
at least in the US, we use different systems
and different data formats, and just as a disclaimer,
I successfully did this in one healthcare system, but
had trouble rolling it out for a patient registry in another area. Although I will say that the
system I used has now adopted something more standard, so
I think it could be easier. And then we always have to
worry about data quality, and you've heard a lot this
morning about the quality issues that people are concerned with, with electronic health
records, but that being said, I think that this is something
that is happening now and will probably happen
more in the future. And you'll hear more about
this when Martin gives his talk which is an actual example,
I'm talking in the abstract. Alright then the other
thing we talked about today is the importance of randomization. So as a statistician I
always find it helpful to remind people why we
care about randomization, 'cos we seem to wanna
hold on to it so strongly. And the fundamental
issue with randomization is if I do have a good
sturdy randomized trial, high quality, the only
difference between the groups should be due to the treatment,
and that lets me attribute the observed effects back
to the treatment itself. Now there are problems
with what can go wrong, first of all it's easier
to do some of these things when you're masked a treatment assignment, and that's not always possible. The study does need to be
well controlled and you have to understand your control
group, you have to worry about the delivery of the
intervention and you really have to worry about follow
up, and we know that as soon as you have a problem with missing data or differential drop out,
then that attributability, you know, is gone because
your comparability is gone. But that being said it's
easier if you can start with randomization than if you can't, one of the criticisms of randomized trials in the traditional sense
is that their inference is limited to the clinical
setting, or the population that's being studied, and
while the randomization lets you make inference
about treatment effects through comparative analysis,
you have to be concerned about how you
can extrapolate to a larger population, so that's where the pragmatic clinical
trials come in, now before I say too much about pragmatic
trials, I wanna just talk about a word that I
think it's misused some and that's representativeness,
we just had a meeting last week at FDA about
representing minority subgroups in our clinical trials
and how to make inference about treatment responses
and different segments of the population, and I think
people use representative to mean several things, so
I wanna just take a moment to think about this, so we are interested in drawing inference to the
population who would take a drug after the drug is approved,
that's what we care about at the FDA right, is the
safety and effectiveness, we want good safe drugs to be out there for people that need them. But representative means something very particular statistically,
so if you are in a sample survey setting
where you have drawn a probability sample
from a target population, then every member of that population has a known nonzero
chance of being selected, that's what makes the
sample a probability sample, that's the definition, and
that means that if you estimate something from the sample you
have a good solid estimate for that characteristic in the population, so the percent who would be compliant, the percent where the hospitalization, the percent who died or whatever. And the makeup of the
sample mimics or mirrors the population, that's what you get, if you do probability
sampling in the right way, then your sample looks
like your population. So you have the same proportion
or relative proportion of your demographic groups, for example.
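[Editor's note: a minimal sketch of what that definition buys you: with known, nonzero selection probabilities, weighting each sampled person by the inverse of their selection probability recovers the population proportion even when one stratum is oversampled. The population, strata, and probabilities below are simulated assumptions, not survey data.]

import numpy as np

rng = np.random.default_rng(0)
N = 100_000

# A synthetic population with two strata; compliance differs by stratum.
stratum = rng.random(N) < 0.5
compliant = rng.random(N) < np.where(stratum, 0.75, 0.50)   # true mean ~0.625

# Known, nonzero selection probabilities; one stratum is oversampled.
p_select = np.where(stratum, 0.02, 0.005)
sampled = rng.random(N) < p_select

# Inverse-probability (Horvitz-Thompson style) estimate of the proportion.
weights = 1.0 / p_select[sampled]
weighted_est = np.sum(weights * compliant[sampled]) / np.sum(weights)

print("true proportion:     ", round(compliant.mean(), 3))
print("weighted estimate:   ", round(weighted_est, 3))
print("unweighted estimate: ", round(compliant[sampled].mean(), 3))  # biased high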
of the population we are interested in, those
who would take the treatment, but they may not be fully
representative, and that's because we don't draw probability
samples for them. So the randomized trial you
could think of as probably the least representative
in terms of going back to a broader population,
observational studies especially driven by
healthcare records gets you a little bit closer,
but it's probably not, it may not be as
representative as you think, so just something to consider
when you're trying to weigh the pragmatic and randomized trials. Okay so then back to the pragmatic trials, they are randomized,
and so we've got that, they're often not, I think
they're most often not masked to treatment assignments,
so you worry about that, I think they're thought to generally have a more meaningful
comparator, more actual use. In a randomized trial,
pre-market in particular, you often choose the actual
comparator as something that helps show superiority,
so it may not be the same, we worry about that a little
bit, and then a pragmatic trial is gonna be more diverse,
is it all the way to representative, you know
maybe not, but it would be more diverse in terms of the
types of patients you get, however the diversity,
especially in the choice of clinical centers, could impact the study conduct and the data quality. And then the outcomes are usually broader, there's interest in quality of
life and cost effectiveness, that's because there's more stakeholders behind these trials, and
the only concern there is just to make sure
the assessment of what we're worried about in terms of approving for another indication or
assessing the risk once the drug's used by a broader population,
that those assessments aren't impacted by the other assessments that have been added to the
trial, you wouldn't have found in a more traditional randomized trial. And then the interventions
in a pragmatic trial, they're often adapted
to the clinical setting, that's what gives you the
real world-ness in some sense. Where as a randomized controlled trial you're gonna be highly standardized. You could have, we talked
about this in the last session, there could be some
loss of power because of that variability, you could think of it as inducing a little bit of noise, so that's something to think about. The results of pragmatic
trials, because you've got this adaptation in some sense, are useful to decision makers, and it is helpful if you collect information
about any variability in the delivery that might happen. And then I think when
this happens it maybe hard to determine how to
attribute the effectiveness that you might observe
or there are safety risks that you might observe if the
delivery of the intervention is changing that much,
and by intervention here, just as we talked in
the last, it maybe more than just a treatment,
it maybe the treatment plus the compliance plus other treatments or rescue meds, depending
on how you define the things that you're comparing. Now this to me, a natural follow on to pragmatic trials, it's
something that I think is getting more and more
interest today, and these are sequential multiple
assignment randomized trials, these are really thought
of as another type of randomized controlled trial,
not necessarily pragmatic, but they allow multiple
randomization points, multiple clinical decision
points about the next treatment, and they're very real world in the sense that the outcome of one
treatment assignment decides what the randomization choices are for the next treatment
assignment, and there can be multiple of these, so it's like a whole personalized
medicine taken into action, I think they have some
of the positive aspects of pragmatic trials, in more mimicking the real world treatment
that really happens, and the nice thing about these is they are very tightly controlled,
the decision points and the randomization arms
are all specified in advance, so you've gotta do a lot
of agreement upfront. And the statistical
analysis is complicated, but thanks to a lot of very
smart people who've been working on this for a few years,
with the techniques of machine learning, reinforcement
learning and Q learning is one example of that,
the analysis is possible. The quandary for us as regulators
is, if I get to the end of the trial and I've
had multiple treatments, how do I know which treatment to attribute the good outcome to,
because I need to know that if I'm trying to approve a drug. But certainly in a post-market
setting, these things make a lot of sense, and that's
where we're seeing them now.
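[Editor's note: for readers unfamiliar with how these designs are analyzed, here is a toy sketch of the backward induction behind Q-learning on a simulated two-stage SMART: estimate the value of each second-stage treatment within each history, then credit each first-stage treatment with the outcome it would earn if the second stage were chosen optimally. The data and effects are simulated assumptions; real analyses use regression or machine-learning models rather than simple cell means.]

import numpy as np

rng = np.random.default_rng(1)
n = 5_000

# Stage 1: randomize A1; observe an intermediate response.
a1 = rng.integers(0, 2, n)
response = rng.random(n) < np.where(a1 == 1, 0.6, 0.4)

# Stage 2: re-randomize A2; the final outcome depends on the whole history
# (here, A2 is assumed to help non-responders more than responders).
a2 = rng.integers(0, 2, n)
y = (0.3 * a1 + 0.5 * response
     + np.where(response, 0.1, 0.4) * a2
     + rng.normal(0, 1, n))

# Stage-2 Q-function: mean outcome in each (A1, response, A2) cell.
q2 = {(x1, r, a): y[(a1 == x1) & (response == r) & (a2 == a)].mean()
      for x1 in (0, 1) for r in (False, True) for a in (0, 1)}

# Pseudo-outcome: the value each patient would get under the best stage-2 choice.
pseudo = np.array([max(q2[(x1, r, 0)], q2[(x1, r, 1)])
                   for x1, r in zip(a1, response)])

# Stage-1 Q-function: expected pseudo-outcome under each first-stage treatment.
q1 = {x1: round(pseudo[a1 == x1].mean(), 2) for x1 in (0, 1)}
print("stage-1 values assuming optimal stage 2:", q1)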
And then, I know this is the topic of the next session, but I'm just moving along my
sequence of study designs, I wanna say just a couple of
things about external trials. So we do allow for the use
of historical controls, Dr. Temple cites the wording
from the rule and regulation, he does this for me quite often, it's usually in the
case where randomization to a concurrent control
group isn't feasible or isn't ethical, and
single-arm studies may be the only choice such as
in an ultra-rare disease or some oncology settings. So I'll make a plug here for the use of the actual historical
or external control data, sometimes we call it externally controlled but we're really just taking a
response rate and treating it like it's the truth, but in fact if you have patient level data
on an actual controlled group then you can adjust for
confounding factors, if the two groups don't look comparable. And this is necessary
if you really don't know with so much certainty
what the response rate would have been if you hadn't
treated these patients. Well controlled, of course,
implies a certain level of rigor and for us that means
pre-specification, simple things like don't pick your control
patients after you know what the outcomes were
in the treated group, the more you can do upfront the better. And so continuing with that,
the importance of planning can't be over emphasized,
it's hard for us to sometimes be able to interpret results
when the external controls were selected post hoc. And then the availability of
the actual control patients is important so you can
check for comparability to the new patients and
adjust for confounding as I've already said, but
also to assess the variability in the estimate of the response
rate, and you may realize you don't know it as well
as you thought you did. And there are some, Rich Simon
is one who's published this, that if you don't have
these data you really have no business doing an
external controlled trial. Now there's some recent
work that's been published recently about using
historical or external control to supplement a concurrent
control if you're in a setting where the population is
limited, and that you do this to increase your power, avoid
randomizing too many patients if you're in a rare disease
setting, and possibly even do an interim analysis to
check the comparability and adjust afterwards, so
that's an exciting opportunity, I think, for leveraging.
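[Editor's note: one hedged illustration of why patient-level external control data matter: with individual records you can model the probability of being in the treated cohort versus the external cohort and reweight the controls so their covariate mix resembles the treated patients, rather than borrowing a headline response rate. The covariates, data, and model below are simulated; this sketches the idea, not a validated analysis plan.]

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
n_trt, n_ext = 300, 1000

# Simulated cohorts: treated patients are younger and less sick than
# the patient-level external (historical) controls.
age_t, sev_t = rng.normal(55, 8, n_trt), rng.normal(1.0, 0.5, n_trt)
age_e, sev_e = rng.normal(62, 9, n_ext), rng.normal(1.4, 0.5, n_ext)

def respond(age, sev):
    # Response depends on covariates only; there is no true treatment effect.
    logit = 2.0 - 0.03 * age - 0.8 * sev
    return rng.random(len(age)) < 1 / (1 + np.exp(-logit))

y_t, y_e = respond(age_t, sev_t), respond(age_e, sev_e)

# Model the probability of being a treated patient, then weight the controls
# by the odds so their covariate mix resembles the treated cohort.
X = np.column_stack([np.r_[age_t, age_e], np.r_[sev_t, sev_e]])
z = np.r_[np.ones(n_trt), np.zeros(n_ext)]
ps = LogisticRegression().fit(X, z).predict_proba(X)[:, 1]
w_ext = ps[n_trt:] / (1 - ps[n_trt:])

print("naive difference:   ", round(y_t.mean() - y_e.mean(), 3))      # looks like an effect
print("adjusted difference:", round(y_t.mean() - np.average(y_e, weights=w_ext), 3))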
So I just copied a few selected references on pragmatic trials, that you're familiar with I'm sure, and historical controls going
all the way back to the 70s, and mentioning that CDRH,
the devices group has a good discussion of
historical control data in their pivotal trial guidance. And then the SMART trials,
if you're interested, these are two of the methodology papers. So just a few design
thoughts, and the two examples you're about to see will bring
all this together I think, and show you that a lot of
what I think is important is actually possible with real world data. Thank you. (audience applause) – Okay, so hopefully my
voice makes it through this. I've actually, over the last few days, wanted to be part of a randomized
trial because I'd go to the flu and cold symptom counter
looking at all the choices that are there for relieving
my sore throat and runny nose and cough, and yet when
I look at the label I actually can't tell which one's better. And I certainly would want to
be part of a randomized trial so the next person could
have a faster recovery. So I'm gonna actually
talk about two examples that will hopefully frame our discussion. But the main context that
we're all seeing is that every day there are millions
of patients who are walking through the doors of the health systems, and there should be a rationale to learn from those experiences and
make sure the next patient is better, and if we could
only insert randomization, we'd have even stronger evidence for that, and what I'll highlight
is actually examples where we're actually going to do that. And Bob nicely formulated
key questions that we have to address, and some that are
easier and some that are hard. And so one notion is that
certainly people are using EHR to facilitate the recruitment,
but actually having piles and piles of data for which
you know there are patients who are eligible for trials,
is necessary but it's not sufficient,
there actually has to be other parts to this in terms
of promoting engagement and actually wanting to participate. And so for the example
on cardiovascular trials, we easily can see that there
are hundreds of thousands, if not millions of patients
through our healthcare systems that would be eligible for
trials, and so it's not just actually being able to
understand that they are there, they're actually there but can we make things more efficient? The next kind of question would
be, well for just describing who the population is, so the
table one of understanding where is that population
and how's it compare to what we expect in treating
population, and can the EHR reliably provide such
baseline characteristics, and I think probably a
lot of people would argue that that would be, yes, that's the case, but there still probably
needs to be some validation of that scenario to actually compare what the electronic health record has versus what is manually entered, and actually what is better quality? What can be done in a
more systematic fashion? And does direct extraction
from the medical record, is that better than, say, human entry. The last question,
which is probably one of the most important ones,
is actually the outcomes, can we have valid outcomes
collected in the endpoints, and are they reliable? And so really can the EHR,
or claims, if you will, find events in a systematic fashion? So right now what we have
is a world where patients come into a site visit,
perhaps every three months, perhaps in some other longer term studies, every six months, and then we
expect a study coordinator, in a 10 or 15 minute
interview, ask if you've had any events, and also to kind of peruse the electronic health
record, which now has become so complicated to see if
there are any other events. So is that a good enough system
as opposed to doing something that systematically queries
electronic health systems and claims to ensure that
we're asking the question in the same way, highly automated way. And then getting down
to the bottom of this is that ultimately we do
need to ensure that results are of high quality, and so
I think we have to go back to guiding principles on
this and as I illustrate the case examples, hopefully
people can think about it in these principles, are we
enrolling the right patient? Is it representative of
the patient population that we ultimately treat? Did we actually employ
the right intervention, do they have the right exposure? Did we have the right outcomes? Are they defined correctly and are we systemically collecting
'em with completeness? And similarly for
important safety outcome, and then also are we doing
the right thing the studies? And through all of this I think one thing that should be recognized
is that we have built a system for clinical trials
that collects a lot of data that actually is never used. So we wind up collecting data that often is for risk mitigation, but in terms of what's the top line results
that actually becomes the deciding factor of
whether a drug or device gets an indication, the top
line results in the efficacy is what is a primary focus. And then secondly, the importance
of the safety outcomes. So just to go through the
couple of examples here. So one is the adaptable
trial, which is PCORnet's first pragmatic trial, and
the context here is that there are two doses of Aspirin, and there's high line
variation in practice for using those two doses,
and so the goal is actually to look at the two doses of Aspirin, 81 versus 325 among
patients with heart disease, to look at MASE, essentially
mortality, MI and stroke. And then also importantly look
at differences for bleeding. The way the study's gonna be conducted is identifying patients
with cardiovascular history using electronic health records
by a computable phenotype, those patients will be
contacted electronically, and then have electronic consent. And then they'll be
randomized onto 81 versus 325, and then we'll follow
them up electronically, over about a two year period. And the outcomes are similar
to other cardiovascular outcomes in terms of what
we've looked at before. This just illustrates that one can define a computable phenotype that can leverage electronic health records to identify a generalize-able population
of people that we would all say, yes, this is patients with heart disease, plus an enrichment factor.
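[Editor's note: a computable phenotype is, in essence, a reproducible query over the health record. The fragment below is a hypothetical, much-simplified illustration of what such a definition could look like for "patients with heart disease plus an enrichment factor"; the table layout, code lists, and criteria are invented and are not ADAPTABLE's actual phenotype.]

import pandas as pd

patients = pd.DataFrame({
    "patient_id": [101, 102, 103, 104],
    "age":        [67, 54, 72, 49],
    "dx_codes":   [{"I25.10", "E11.9"},   # ischemic heart disease + diabetes
                   {"I25.10"},            # ischemic heart disease only
                   {"I21.4", "N18.3"},    # prior MI + chronic kidney disease
                   {"J45.909"}],          # asthma only
})

ASCVD_CODES = {"I25.10", "I21.4", "I63.9"}   # hypothetical heart disease code list
ENRICHMENT_CODES = {"E11.9", "N18.3"}        # hypothetical enrichment conditions

def eligible(row):
    has_heart_disease = bool(row.dx_codes & ASCVD_CODES)
    enriched = bool(row.dx_codes & ENRICHMENT_CODES) or row.age >= 65
    return has_heart_disease and enriched

patients["screen_positive"] = patients.apply(eligible, axis=1)
print(patients[["patient_id", "screen_positive"]])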
And the informed consent meets all the regulations and will ensure that patients understand what participation means,
and will also provide even more context for illustrative purposes, such as a video of what it will mean to
be part of the study. And this is just a mock-up
of the screens that patients would go through for doing that. So it's actually a bit
different than a 14-page consent that people often get. So at a basic level, this
illustrates how we would e-screen, electronically
enroll and electronically follow up, but also have
a systematic fashion so that we collect outcome
data, not only through the PCORnet common data
model, but also systematically through claims, and then, importantly, collect patient reported
outcome data, in terms of their health styles and other
issues that maybe important for taking Aspirin, such
as bleeding or GI problems. And then finally because we
wanna ensure that we have complete outcome data, we'll
also have a backup system where we can call patients as well. And when you compare it,
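A minimal sketch of the computable-phenotype screen and the tiered outcome capture described above. The table layout, field names, diagnosis codes and tier contents are illustrative assumptions, not the actual PCORnet common data model or the adaptable trial's specification.

```python
# Illustrative sketch only: hypothetical tables and codes, not the real
# PCORnet common data model or the actual adaptable phenotype definition.
import pandas as pd

# Toy diagnosis table as it might be extracted from a common data model.
diagnoses = pd.DataFrame({
    "patient_id": [1, 1, 2, 3, 3, 4],
    "dx_code":    ["I25.10", "E11.9", "I21.4", "J45.909", "I25.2", "I10"],
})

# Hypothetical computable phenotype: any ischemic heart disease code
# (ICD-10 "I21"/"I25" prefixes, purely as an example); a real definition
# would add enrichment factors such as diabetes or prior MI.
mask = diagnoses["dx_code"].apply(lambda code: code.startswith(("I21", "I25")))
eligible = set(diagnoses.loc[mask, "patient_id"])
print("e-screen eligible:", sorted(eligible))  # -> [1, 2, 3]

# Tiered outcome capture: health-system encounters first, then claims,
# then patient report / phone follow-up as the backstop.
cdm_encounters  = {1: "MI hospitalization (CDM)"}
claims_events   = {3: "stroke hospitalization (claims)"}
patient_reports = {2: "GI bleeding (portal / phone)"}

def find_outcome(pid):
    """Return the first outcome found, searching the tiers in order."""
    for source in (cdm_encounters, claims_events, patient_reports):
        if pid in source:
            return source[pid]
    return "no event captured"

for pid in sorted(eligible):
    print(pid, "->", find_outcome(pid))
```

The point of the tiers is the one made later in the discussion: the absence of an event in one health system's data doesn't mean it didn't happen, so claims and direct patient contact sit behind the common data model as successive backstops.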
And when you compare it, everything that's been done for a traditional trial has actually been done, but with a different method, working in this universe. So, that's an un-blinded trial,
but I just wanna highlight it is possible to do this
with a blinded trial. And so this is a study
that's being funded by NIH, that's being led by these investigators who have willingly wanted to share this study concept with the community. And it's testing two doses
of influenza vaccine. So one can imagine that
there's a lot of uncertainty around that, and that's
true, and that influenza actually has important
public health outcomes. And knowing the right dose, in terms of cardiovascular endpoints
would be very important. And so in this study it's
similar to adaptable, where patients who have
acute coronary syndrome or heart failure are
randomized to high dose versus standard dose influenza vaccine, in a blinded fashion, and then
patients would be followed up in six to nine months for
primary outcomes of death or cardiopulmonary hospitalization. Importantly patients will be identified through electronic health records. But most importantly, we'll actually use this system to capture outcomes, and the outcome of cardiopulmonary hospitalization is pretty well defined in terms of having a hospitalization for the cardiac and pulmonary illnesses that are associated in this population, so it can be done in the same manner and it can also be validated. And similarly, you can see that
you can use the same systems to do a pragmatic trial, by
simply changing the intervention and the types of outcomes
by using the same systems. So just in conclusion
here, I think there are these opportunities and
it's up to us to define where they are and what are
the boundaries for doing these pragmatic trials with EMRs, and also making sure we understand the purpose,
so it fits in some scenarios but may not fit in all, and there are a lot of ways to define that. So thanks, and I'll
transition over to Martin. Okay.
(audience applause) Oh the water. – Good afternoon, I want
to explain to start with, why these slides are
just a dreadful color. (audience chuckles) This pink color is actually the corporate color of the city of Salford, and
all of the vans and everything that go around the city are this color, so that you can spot them. And if you wanted to
know where Salford is, there is a map of the UK, and you can see approximately where it is. We are actually running a late phase pragmatic trial in Salford and the surrounding areas, which includes around about 7200 people, and they're monitored in pretty close to real time
for both safety and outcomes using the linked electronic record system. And the results of the first study will be coming out later this year. There's actually two
studies, there's one in COPD which is the first one, and then that's a 2800 patient study, and then
there's a 4400 patient study which is the asthma study
which is still ongoing. And to put this in
context we're doing this, this is a GSK sponsored
trial and I think probably most people in this
room are aware of this. But do RCTs represent the real world? And the answer is of course no. If you look at the left
hand side of this bar chart you can see around about 100% of people that would be eventually
prescribed a drug, in this case for COPD,
and then you can look at the bars decreasing in size
as you take out the things that are the most common exclusions in COPD trials, and you end up in a normal kind of phase three COPD trial, with around about 1.7% of people that would ultimately be prescribed the drug actually getting it in the trial situation.
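To make the attrition behind that 1.7% concrete, here is a toy calculation of how a handful of typical entry criteria compound multiplicatively. The criteria and retention fractions below are invented for illustration; they are not the actual figures behind the slide.

```python
# Toy illustration of how entry criteria compound multiplicatively.
# The criteria and retention fractions are invented, not the actual
# figures behind the 1.7% quoted for phase three COPD trials.
exclusion_steps = [
    ("age and smoking-history criteria",      0.70),
    ("FEV1 within the required range",        0.50),
    ("no significant cardiovascular disease", 0.55),
    ("exacerbation-history requirement",      0.40),
    ("run-in adherence of at least 80%",      0.60),
]

remaining = 1.0
print("start: 100.0% of the treated population")
for name, keep_fraction in exclusion_steps:
    remaining *= keep_fraction
    print(f"after {name}: {remaining * 100:.1f}% remain")
# A handful of ordinary-looking criteria already leaves only a few percent.
```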
So, another aspect is what about adherence to medication in these kinds of respiratory studies? On the left hand side of
this chart you can see what happens in a normal
randomized controlled trial, we basically really
strongly get people to take their medication, and
in fact there's usually an enrollment phase, where
if you don't have 80% of your medication in that
phase you're not gonna get into the study at all. So on the left hand side
you can see the three that are set at 80%, then
there's two more after that, one seems to have managed over
120% which is interesting, and then that contrasts with what happens in a real world environment,
where you can see in the studies that have been reported, people basically don't
take their medication, you've gotta take that into
account if you want to do an effectiveness study, which
is what we're trying to do. So the ambition of GSK
and ourselves in doing this Salford lung study was
to make it as near real world as we could, but we were doing that with a pre-licensed
medication at the time, so it also had to be acceptable from a regulatory perspective. So we wanted to embrace the
heterogeneity of the population, normalize the patient
experience as much as possible, and provide usual care in each arm, with relevant endpoints being collected, but intentionally we wanted
to maintain scientific rigor, so it is interventional, it is
randomized, it is controlled. And so on the left hand
side of this slide, down the left hand panel you've
got what the challenges were to doing this, and down
the right hand side you've got some of our solutions to that. How do we recruit patients? So we need all comers in this
trial, for all the reasons that Lisa was talking
about, as many as possible. Broad inclusion criteria, want
the results to be applicable to the whole population. Pragmatic diagnostic
criteria, so if your GP says you have COPD, you have COPD. And very few exclusions. So the solution to that over
on the right hand side is to recruit patients through
their primary care provider, so every GP in our area has to
be the principal investigator for their site, and
you'll see how many sites that is in a moment. The next bullet-point down
is, How do you ensure normal care for patients during the study? With minimal study
procedures, normal prescribing and normal dispensing. And I should point out
that normal prescribing in this situation is your
GP prescribes your asthma or COPD medication, and you pick it up from your local retail
pharmacy just down the road. So the solution to that
was that we had to get the study drug to be accessed through the local high-street pharmacy,
and you'll see how many of those we had to get
onboard in a moment. No additional review. And no change to care as usual. And the final point on this slide is how do you monitor patients
to regulatory standards without carrying out frequent reviews? How do you minimize the Hawthorne effect? How do you ensure patient
safety which is vital? And how do you ensure that we
collect the endpoints we need? And as everybody knows, 'cos
that's what we're here for, we do that through the integrated
electronic patient record, and that means that we actually
capture things in real time, and you'll see some of the
benefits of that in a minute. So here's a study outline
for the COPD trial, the asthma one is not that dissimilar. 2800 patients you can see
in the left hand little box, and those are very open
criteria for a COPD trial. You'll notice also that you're randomized, and either to new treatment
or existing treatment. And then we have, I know this
is a little bit confusing 'cos there are really only two visits, and they are labeled
visit two and visit six, but take it from me, there
really are only two visits. The first one is actually consent. Then you have visit two where
we collect some information and the study specifics. And the same at the
end which is visit six. The three, four and five, for information, are if the patient hasn't
seen anyone in the NHS, The National Health Service,
during the 12 month period of the trial, we give 'em a phone-call, and that was because the regulator
asked us to do that. And that means that all of the rest of it, all of the outcomes and all of the safety is done by the electronic systems. This, you may wanna
just scribble this down. (audience laughter) This is the simplified
schema showing the linkage between the different
data systems that we have, and you won't be able to
read it, but basically the ones down the left hand
side are national systems that tell whether people
have changed their name, moved, died, whatever. The next ones across are
actually local systems, where we get data from
primary care records, out of hours services. Along the top you have more
study specific pathology data. Down the right hand side,
the two different hospitals that are now involved in the study. And those data flows are happening
pretty much in real time. This is a slightly more
understandable version of what you've just seen,
which is that all of the data flow in from the left in electronic form, they're put into a linked
database system in the middle, and then there are outflows
from that for different things. So for safety monitoring,
for regulatory monitoring, for the actual study analysis, and for the management purposes.
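A schematic sketch of the linkage just described: several electronic feeds joined on a shared patient identifier, with separate outflows derived from the linked table. The feed names and fields are placeholders, not the actual Salford data systems.

```python
# Schematic sketch of linking several electronic feeds on a patient
# identifier; feed names and fields are placeholders, not the actual
# Salford data systems.
import pandas as pd

national_register = pd.DataFrame(
    {"patient_id": [1, 2, 3], "vital_status": ["alive", "alive", "died"]})
primary_care = pd.DataFrame(
    {"patient_id": [1, 2, 3], "exacerbations_12m": [2, 0, 5]})
hospital_admissions = pd.DataFrame(
    {"patient_id": [1, 3], "admission_reason": ["pneumonia", "COPD exacerbation"]})

# One linked table in the middle (left joins on the shared identifier).
linked = (national_register
          .merge(primary_care, on="patient_id", how="left")
          .merge(hospital_admissions, on="patient_id", how="left"))

# Separate outflows, as in the schema: safety monitoring and study analysis.
safety_feed = linked[linked["admission_reason"].notna()]
analysis_feed = linked[["patient_id", "exacerbations_12m", "vital_status"]]

print(linked)
print("flagged for safety review:", safety_feed["patient_id"].tolist())
```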
Just to give you a bit of scale, I mentioned this before, so there are 120, sorry there
are 88 primary care sites in the study, so that's 88
principal investigators, most of those people were completely naive to doing a late phase three
trial, which is what this is, so we had to train them all. That means that over
3000 people in the end were trained to GCP standards. 128 community pharmacies participate, they're all in competition
with each other normally, so we had to bring them together. We had to learn how to extract the data from their diverse electronic systems, and bring them together into the study. And the rest is as it says. This is a map of the Manchester
area, and I don't really, this is just to show
you that it's quite big, that's about, I guess
about 30 miles across that map that you're looking at. And you'll see there
are little dots on it, those are all the different pharmacies, so we have to go out and we
have to recruit them all, we have to get all the data from them, they're all in different chains. One thing that we are
particularly pleased with in this study, and this is I
think important to how we take this kind of technology forward,
is the safety monitoring. I do a lot of industry sponsored trials, and as an example, I might see
a patient on a visit today, and they might tell me about something that happened three weeks
ago, maybe they were admitted to hospital with a myocardial
infarction, they're fine, but that's the first I know about it, I then have to report that
within the usual time-frames, but the event was three weeks ago. With this we get the
events in almost real time, so a study nurse will, if you
take it from the top left, a study nurse will tag the patient in EMR, say that patient is admitted
through the emergency room, there'll be a pretty
much instant note through to this independent
safety monitoring team, who can then track that
patient in real time through the hospital and
see exactly how that event is evolving, they can do an
initial unlocked SAE submission to the sponsor, we can then
process all of the data over that next period
of time, and send a link through to the PI, who can
then establish causality and severity, and then that
gets sent through to the sponsor and then subsequently the regulator. This is very very good,
in fact, we can talk about this later, it's almost too good.
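A minimal sketch of that near-real-time safety workflow, from the EMR tag through to the regulator. The stage names, timings and class are illustrative inventions, not GSK's or Salford's actual tooling.

```python
# Minimal sketch of the near-real-time SAE workflow described above;
# stage names and timings are illustrative only.
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Tuple


@dataclass
class SAEReport:
    patient_id: int
    event: str
    history: List[Tuple[str, datetime]] = field(default_factory=list)

    def log(self, stage: str, when: datetime) -> None:
        self.history.append((stage, when))


def run_workflow(patient_id: int, event: str, admitted_at: datetime) -> SAEReport:
    sae = SAEReport(patient_id, event)
    sae.log("study nurse tags patient in the EMR", admitted_at)
    sae.log("independent safety team notified", admitted_at + timedelta(minutes=5))
    sae.log("initial unlocked SAE sent to sponsor", admitted_at + timedelta(hours=6))
    sae.log("PI assigns causality and severity", admitted_at + timedelta(days=1))
    sae.log("final report to sponsor and regulator", admitted_at + timedelta(days=2))
    return sae


report = run_workflow(42, "admitted via the emergency department",
                      datetime(2016, 9, 1, 14, 30))
for stage, when in report.history:
    print(when.isoformat(timespec="minutes"), "-", stage)
```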
So how has this influenced our thinking? So here's a kind of slide that you'll have seen a lot of, which is, on the top I've put some arbitrary numbers, from zero years out to 16 years from the original concept of a drug through to getting it used in regular care. And then at the top I've put your health technology agencies, we have in the UK something called NICE, which you'll be familiar with, probably. And on the bottom the regulatory agencies, and you can see the
different phases there. So on the left hand
side you have efficacy, on the right side effectiveness. And interestingly the Salford lung study sits right in the middle bit of that. So we've now got experience,
which is really fortuitous, because that means we've got experience of a late phase three trial,
but also how you use it for effectiveness, and I
think there was some talk about that this morning. And we now have confidence
that we can build this kind of technology
earlier, so you could use this from phase two onwards. But you need some additional
bits to be able to do that, and so we have also designed
to work alongside this, a series of other tools which allow us to do rapid protocol
design and feasibility, those tools also enable
recruitment, and then subsequently, after you've done the trial,
because of what we've built with the lung study we can do
long-term pharmacovigilance, safety monitoring and look
at cost effectiveness. So that's now, what does
the future look like? And this is a real world presentation. So I don't think this
animation's gonna work very well having seen the last one, but we'll just see, it should all sort of sort itself out. So that in the future, maybe, if anything's gonna move on this slide, we actually
start to overlap efficacy and effectiveness, and
ideally we can then shrink the whole timeline, which
is I think what everybody would like us to do, and
that gives us the cost saving and the benefits, and
patients get the drugs sooner. So what have we learned from all this? The Salford lung study,
or Salford lung studies, are the first of their type. They maintain scientific rigor, they're randomized,
there's an active control and there's a robust primary endpoint. It's actually a hybrid
of an RCT and real world, and I think that's
important to bear in mind, because I think that's a
really good way of looking at these things, we initially built a whole eCRF with this, it does have an eCRF but you can flex it, you can flex the safety
monitoring as well. That means that we've
got a really flexible trial design system, based on the stage of the medicine's development. So you can have something
that looks a little bit like a standard RCT and you
could run it with this, right through to something
that's much more like an observational study,
and I think one thing it has taught us is that
we're gonna get an awful lot of information on how to
conduct real world studies, and probably a whole load
of information on how not to conduct real world studies. Thanks for listening. (audience applause) – Alright, thanks very much
for all the presentations, thank you Martin. I'd like to ask the
panelists to come up on stage and we're gonna turn to
the discussion segment of this session, and
certainly a lot to discuss as we move from the right circumstances in which to do real
world evidence studies, particularly randomized
studies, to the mechanics of how to actually do
it, and I think in both of these examples that
we've walked through, Lisa, they paid a lot of
attention to the issues that you would have thought were important in study design. I might start with, remember
we talked at the beginning of this session about
a number of key issues, like study design, patient
identification, data collection and outcomes tracking. Let me start with randomization, how hard was this in the applications that you all have set up? And how important, did you
think about any other ways of any other levels of
randomization, cluster, other, or is that just too central to the mechanics of answering the question at hand. – Alright, so these studies are
in the process of launching, and so there have been important aspects towards ensuring that we randomize
appropriately, and actually in that context here,
we actually did think about cluster randomization,
so both for the Aspirin study or the influenza study, you
could certainly think about how you can randomize
practices for instance, but because of different
aspects, people really thought that patient level
randomization was necessary for the questions we're addressing. Other examples though,
the Collaboratory is where we do a lot of cluster
randomization at practice level and hospital level. – I'm slightly glossing
over the fact that we did quite a lot of work before
we started on this study, and one of the things we looked at was, in what you might call
real world data settings, so we actually took all of
the data of the patients with COPD exacerbations
for several years before the start of the study, and we modeled what it would look like and
how many numbers we would need, what power we would
need, and that actually would quite quickly rule out
a cluster based randomization for a study like this, so we wanted to do a patient level randomization,
but there were certain other things we needed
too, we needed to make sure that pretty much everybody
could get into this trial for all the reasons that Lisa
was talking about earlier, so we wanted to make sure that
everybody had the possibility of doing that, so cluster
randomization would have given us a problem because practices
are different sizes, and have different aspects,
different capabilities to run the trial, they
might have several nurses working for them, no nurses
working for them and so on, so we wanted to make it so
that everybody could get in, everybody could have
access, so we went for a patient level randomization
based on the power that we knew from previous exacerbations, over the previous few years. – And in these examples
at least, it sounds like, I guess Adrian it's still,
you're still in process, but it sounds like the
randomization mechanics are working out okay–
– [Adrian] Yeah. – In terms of identifying the patients, getting the informed consent done and getting them signed
to treatment, right? – And I'd say the example,
the influenza vaccine, that would certainly lend itself to doing cluster randomization
'cos you can imagine different practices getting
randomized to one dose or another, there are
other aspects of that study that people wanted to do
patient level randomization, but that had to do with
understanding more the immunology, but that's a perfect example where that you could do a cluster randomization, collect the same
outcomes, and the fidelity of the exposure would be
very high in that setting, and whereas the Aspirin study the concern is about the fidelity of the exposure. – Before I move onto
some of the other issues, Lisa, any other comments that you've got about study design concerns
here, do you feel optimistic about where we're headed
based on these examples? – (laughs) I do, I do, I
mean I think the examples are impressive, just one comment
on the cluster randomization, I think The Center for
Biologics, my counterpart, Estelle Russek-Cohen is
in the audience there, says that they do, I
think, I dunno how often, but they have taken cluster randomization, and the other setting we've seen recently is emergency room settings where there's logistical
advantages to having everyone coming into this care setting,
receive the test treatment and everyone coming into
that receive the other and then doing a randomization,
and if you can have a large enough sample of practice sites then you can control, you know
there will be differences, but there'll be differences regardless in terms of practice
sites, and in some ways randomizing that out can help you. The power loss depends on the outcome 'cos you would still be treating the patient, and not treating the site, and you would be measuring the patient and analyzing at the patient level, but you'll have a cluster effect, so your effective sample size will be something between the number of patients and the number of sites, but it won't be so severe as to just be the number of sites.
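For the point about the effective sample size landing somewhere between the number of patients and the number of sites, the usual back-of-the-envelope tool is the cluster design effect; the numbers below are illustrative, not from any of the trials discussed.

```python
# Effective sample size under cluster randomization:
#   design_effect = 1 + (m - 1) * icc,   n_eff = n / design_effect
# where m is the average cluster (site) size and icc is the intraclass
# correlation. The numbers are illustrative, not from any trial discussed.
def effective_sample_size(n_patients: int, n_sites: int, icc: float) -> float:
    m = n_patients / n_sites                  # average patients per site
    design_effect = 1 + (m - 1) * icc
    return n_patients / design_effect

n, sites = 20000, 100
for icc in (0.0, 0.01, 0.05, 1.0):
    print(f"ICC={icc:<4}: effective n = {effective_sample_size(n, sites, icc):,.0f}")
# ICC of 0 keeps all 20,000 patients; ICC of 1 collapses to the 100 sites;
# realistic values land in between, which is the trade-off described above.
```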
And I think we've talked about this in the setting of antibiotic trials where people want, the patient groups tell us they
don't want to be randomized, 'cos they're nervous about if they have a terrible antibiotic
or bacterial infection, to agree to be randomized,
but if you randomize the sites then you come in and
consent them and tell them, if you agree to be in
this study you will get X, because you'll know that that
site has the new treatment or doesn't, and there's some
idea that that might help because, I think it's
been used with antivirals in some cases and it helped. So it's something to consider, and it doesn't have to be a big power dip, but there needs to be a reason for it, and in the Salford case I
don't think there was a reason for it, you didn't need to go that route. – Now both of you
emphasize in your slides, I guess the funnel equivalents of typical carefully controlled, traditional randomized studies have narrower patient populations as opposed to something that's more representative, as Lisa noted in her opening remarks, of the population that
was likely to be treated. I guess that, and that's good,
I guess the flip side of that as it keeps coming up in a
lot of the work that PCORI is doing, is that well,
the point of having a broader population is not
to get a more precise estimate of an average effect in
that whole population, but to be able to look
at potential differences across sub-populations,
how does that play, how has that been playing
in your experiences? Are the data rich enough,
and enough data there to support real subgroup analysis? – Yeah, so just using, say,
adaptable as an example, so we can immediately understand how well our sample will be relative to the overall population because we'll have information from all the health systems for that, and so just among five of these networks, representing about 25 health systems, there is close to a million people who're
eligible for adaptable, so you can have not
only good representation for demographics but actually
racial and ethnic minorities, you can also look at
special areas of interest in terms of kidney disease
et cetera, diabetes and the severity of that,
and so it allows hopefully a broader base of population
with the comorbidities that we encounter in healthcare routinely. – [Mark] And you're seeing
high enough study participation to enable those subgroups to be– – Well that's always the goal,
so that would be the goal. – This is a difficult one for me to answer because the actual analysis
is ongoing at the moment so I don't wanna queer the pitch too much, but as you can see from
the last slide I left up, there's a lot of data here,
and there will be plenty of opportunities to do
a secondary analysis. But I mean I think the
question from our perspective is how generalizable will the findings be, because we've had a really
good response rate 50, 60% of people that have the condition
are actually in the trial, but then that still means
there's quite a big percentage that are not, this is
the only study I know where we'll actually know
what happens to them as well, because we collect that
information, so we'll be able to see whether participants in the
study do as well as people who didn't wish to be in the
study, we will know that, we're also doing a separate
study with CPRD in the UK, which is The Clinical
Practice Research Datalink, which will actually compare
the effect in our site compared to lots of other sites
across the UK, so we'll get a good idea of whether
actually putting this study into Salford has changed
behavior of practitioners or not. – So you're gonna have an estimate of the famous study effect? – Yes. – One thing I will note is
the contrast between adaptable and invested is that for
adaptable we're aiming for 95% enrollment
electronically, so patients who get contacted electronically
and enroll electronically. And I'm sure there are
gonna be differences between that method compared
to what people would do in terms of, say, a direct
coordinator for instance, but we'll be able to understand on that whether there are differences or not. But if you gain efficiencies
that way by being able to do larger studies, more
efficiently or cheaper then you could answer many more questions, so there's always gonna be that trade off. – And we've talked some about
identification of patients and representativeness, talked some about study design issues. Wanted to turn to data collection,
in both of these examples using a combination of electronic records and maybe administrative
claims and other data sources. I can't reproduce your whole
chart Martin, but a lot of information coming together there. Any broad comments
about how big challenges of using (muffled), are they
sufficient for capturing the variables that you were interested in, any other challenges
in selective data loss that you're concerned about? – I think two things at least
are important in the US, is that we do have health
systems that are not fully integrated, and so
understanding the complete outcome is important, so at least
the way we've approached it is actually kind of a
tiered approach so that we can ensure that we
collect the encounters that do hit the health systems
that are part of PCORnet, but not just solely relying
on that, because the absence of an event doesn't mean
it didn't happen elsewhere, and so being able to link to other data such as Medicare claims
and then also contacting the patients directly,
electronically or by phone to ensure we haven't missed any outcomes, so we have three layers that we go through to have complete outcome data, and then on the definition side,
again, for certain things it can be very good,
but say if our outcome was acute flare of gout
that may or may not be able to be as good as say
hospitalization for heart failure and hospitalization for MI. – My answer is a little bit anecdotal. When we first started the
study we thought we might need a whole eCRF to run it, and
we soon realized that was a really bad idea, because that
meant a huge army of nurses transcribing information
that was actually already in the electronic patient record,
so we pretty quickly stopped doing that, or did a
lot less of it anyway, there's still some eCRF
related information. But it gave us a little bit
of an opportunity to look at how much information that
we're getting from an eCRF, and how much information we're
getting from source data. And what we discovered was
we were getting far more information from the source
data than we were from the eCRF. So I think we can be confident
that source is better. – I'm good, sorry. – [Mark] No go ahead. – I'm not the moderator (laughs). – I was gonna actually turn to you next, Adrian you had a similar
comment about data that you were collecting
from source versus, we call it the 15 minute case interviews. – Right, I actually didn't
show the picture here, but if you take a picture
of a study coordinator now looking at, say, an
electronic health record that has all sorts of notes there, and especially if a
patient's been in and out of the healthcare system,
it's very easy to miss things, and then also, recall
can also be a problem. And so we're hopeful that
a more systematic way of capturing and actually
using the data system to query it will actually
be more complete. – So after the experience, what
do you do to try and come up with some assessment of the quality? I mean you got CRF data,
you've got EHR data, you've got people looking at
both, because you determined that you didn't have to
transcribe everything, it was better just to take the e. So then what does that
reduce the monitoring to? Because I know with my limited experience, we had trouble coming up
with anything to monitor, since we were collecting the source. – I can't remember the numbers off hand, but we started off with
an awful lot of CRAs going out and doing a lot of checking, and we've got a lot less of them now. So– – A kind of a qualitative
assessment (laughs). – Yeah (laughs). – I think our approach
is, for the outcomes seen, we're gonna be validating 'em, so again, the definitions and algorithms that we use to collect those endpoints, do some validation work in terms of abstracting the medical record or having a human look through and seeing does this actually fit. So it's not too different in a way than what one would do for event adjudication, except again it's using sampling to identify any quality issues, or reassure us of their quality, and if there is, I mean we're actually going through that just as a kind of identification process using computable phenotypes, then say for the first hundred making sure that people understand, like yes, this person does meet the eligibility criteria we intended to approach.
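One simple way to summarize the chart-review validation just described is the positive predictive value of the algorithm in the abstracted sample, with a confidence interval. The counts below are made up for illustration.

```python
# Sketch of summarizing a chart-review validation sample: positive
# predictive value of an EHR-derived endpoint with a normal-approximation
# confidence interval. The counts are made up for illustration.
import math

def ppv_with_ci(confirmed: int, sampled: int, z: float = 1.96):
    ppv = confirmed / sampled
    se = math.sqrt(ppv * (1 - ppv) / sampled)
    return ppv, max(0.0, ppv - z * se), min(1.0, ppv + z * se)

# e.g. the first hundred algorithm-flagged cases abstracted, 93 confirmed.
ppv, lo, hi = ppv_with_ci(confirmed=93, sampled=100)
print(f"PPV = {ppv:.2f} (95% CI {lo:.2f} to {hi:.2f})")
```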
– So just to comment, you mentioned the word validation, because that, we've spent
an awful lot of money doing validation on all
of this, so that we know that if we take data
from a particular source and move it across here,
and amalgamate it with this, that is exactly what we've
done, we've done a huge amount of testing on that, and
similarly the safety algorithms and triggering systems that
we use, again we validate every single one of those,
and that's probably cost more in terms of the development
work than any other single bit of the work. – In your source data did
you mostly pick up fields that were kind of discrete, or did you do any text mining or any, you
know, with all the notes that are available, did
you get into any of that kind of data extraction? Either one of you. – So, no, we haven't done text mining, we mainly used coded data in the system, and actually what I might
call hard data, numbers, from pathology results and so on. However when it comes to
the safety monitoring, we can access all of that
information if we wish to, which is an added bonus. – As to our plans there, we're really focused on the key primary outcomes and key safety outcomes, and in that context, I mean that's part of the validation work, we'll be looking at records manually for a sampling, but not doing it in any way at scale, because it's just not feasible,
and again for the purpose here, everyone should recognise whether it's a clinical trial
done traditionally or not, there's confidence in the roles between all the information we collect. – Now Lisa you did comment
in your opening statements about the importance of data completeness and selective missing
data, and how FDA has a lot more experience with the
case report approaches to the clinical trial data
reporting as opposed to this, what's your sense about,
have you seen many studies come in like these, and what's your sense about the validity of what you are seeing and what you're hearing about
in these practical trials? – So my sense, I mean
this is conjecture really, but my sense is that with
the larger pragmatic trials they might actually be
easier because you've got the mechanisms set up to
go after simple outcomes, even if you've lost a
patient, like vital status or something, because
you're in the system, you've got access to the health records, and maybe it's easier with
you, because you've got all kinds of data on people
you don't even study, whereas we don't usually have that, but we could in a healthcare system, I mean the missing data
problem is everywhere, and it can break all
of the wonderful things that we get out of
randomization and the other trial design aspects that we
rely on, but if you've got a system set up where you can
access people periodically and , I dunno, with phone
calls or EHR scanning or something, get any kind of
information about their status then that helps certainly, you may not, and of course you have
fewer visits anyway, visit two and six, I
love the idea of calling your second visit number six. So there's less things
to be non-compliant with in a rudimentary sense,
but it's still a problem. – Well hopefully one that's getting better as the experience
accumulates and the systems for collecting the data
become more robust. Which actually brings me to
the last question I wanna ask before opening this up to
some broader discussion, and that's about sustainability, I dunno if either of you all are in
a position where you could summarize what sort of
the cost of implementing the study has been, each
of your studies has been, one would expect that to be
a somewhat large number given all of the initial work
that you had to do, there haven't been, at least
for some of these systems, in your case Adrian, these health systems haven't participated in
a study quite like this, and Martin I think you described your work as somewhere between, what
was it, truly real world and a controlled trial, and this kind of augmented structure on top of it. The hope I guess would be that
between some significant cost related to validation,
as Martin was describing, and some significant cost
related to just getting this core infrastructure
in place and be able to put data together, enable investigators and systems to be put in
place that could conduct these analysis that the
cost is gonna come down significantly, what's
your sense about how far we're getting to sustainability
of these kinds of roles? – Yeah, I mean in the setting of PCORnet, I mean there were significant
funds to establish an infrastructure, and so
now that's been transitioned to actually implementation,
execution of the studies, and so after that the
incremental cost for adaptable is estimated at $850 a patient, so
that's in contrast to say– – [Mark] That's pretty good. – Yeah, so in contrast
to say another example, NIH trial that was coordinating
(muffled) called PROMISE, which is about $3350 a patient,
that was 10,000 patients and considered very pragmatic,
and compared to another type of trial that was less pragmatic
that's $13,000 a patient, so we really are trying to
change the scale of efficiency, but it does rely on the
systems that have been, an investment of getting
electronic health data, mapped to a common data model,
the engagement of patients, clinicians and health systems together. – So, I don't know how much this has cost. (laughter) And I don't work for GSK,
but I'm reliably informed by the people that do work
for GSK that this has cost no more than it would for a
phase three trial of this type, however, we've done some
stuff that we only needed to do once, so the validation,
the actual algorithms and everything else that we've built, the data linkage systems which actually would be applicable across
the whole of the UK NHS, because they're the same,
means that we can scale it, and indeed we have scaled
it already, 'cos we started with just Salford, and we've
doubled the catchment area, brought in new hospitals
and new practices, so this particular example,
I guess not that cheap, but most of that development work doesn't need to be done
again, so it will be cheaper and it will be scalable. – I would consider it the capital investment, if you will; as more and more studies are done, then the return on investment comes through, and the final concept is that, right now, we have, we do a study, we break it down, then we do another trial, even if it could be the same
population, the same outcomes, so the ability for the network
for instance to implement say the next trial, like
invested, should be easier based on the experience from adaptable, and so having systems
that can be reusable, and tools that are reusable are key. – Now you've got 235 million rows of data, so if you look at the
cost per data element it might be a bargain. – That's actually out of date now. – That's a good way to think about it. I'd like to open this
discussion up to the rest of you who are participating
in the conference today, so if you do have any
questions or comments please raise your hand
and we'll get to you. Yes, please go ahead. – [Voiceover] I have a question for Adrian about the vaccine study which was blinded, how did you actually capture
vaccine information in the HR, since it was blinded? – Yeah so the study is
being launched and so this is a hybrid, if you
will, so like there are some elements that we have to retain say from a traditional trial, and again because of the interests in
keeping the vaccine blind is so there's not any,
there maybe some concerns about patients wanting
the high dose for instance as opposed the standard
dose, it was felt to maintain a blind, and so for that
part, the drug allocation, that'll be done similar to
what a traditional trial is, and that information will be available from the actual trial
database that augments from the records. – [Mark] Other questions? Yes, Sally. – [Voiceover] Thank
you, good presentations, I really enjoyed them. I'm Sally from PatientsLikeMe. I'm truly interested in
thinking a little bit more about the real time notion, right
now the real time capture, for example I think in the
adaptable study is when someone ends up in a system environment
that becomes real time understanding of something
that may have occurred, I'm wondering whether you have a tolerance for interest in expanding real
time to be what the patients experiencing outside of your
gaze, so that they're still a part of the study, but they're
gathering or contributing some information on their
own to also give you insights into things that may not
end them up in the system, but may still be of
importance of interest? – Yes definitely, so we
have a two pronged approach, so while I was emphasizing
the encounters that happen in the health system through
the adaptable patient portal, will actually be reaching out
to the patients periodically to capture their experiences, likewise if they have something
they can go into the portal to report their experiences,
and there we're actually randomizing the follow up
in the portal, so three versus six months, because
we actually don't know, like what's the right
touch points to optimize patient engagement or
participant engagement and collecting information,
some people feel like if I get pinged every
month, that's too much, and then if you get
pinged every six months that's not enough, and so we're actually going to randomize. – And yes, is the short answer,
we really wanna do that, and from our point of view
that is another data feed that we want to include,
we already do that in some of our direct care
provisions, so for example all of our renal patients are
part of a renal view system that allows them to contribute
to their information, see all of the data themselves,
similarly we're rolling that out for diabetes, but we
are very very very interested in getting that information
directly into the source data, it would become source data. – [Mark] As you move forward
on efforts like that, what kind of concerns do
you have about, I guess, the selective, selectively
missing data that Lisa was talking about earlier
when applied to now data that's perhaps selectively
coming in from patients. – Yeah, so we're actually
wrestling with this, so originally some people
thought well why not just get patient information
and have that be the gold standard in terms of as it's
reported through the portal, but I think because of the
concerns about the validity and the completeness and any
kind of recall issues that that would be problematic, and so we have a tiered approach in
terms of, if we see it say in the health system, that's
very clear, it counts, if we have it reported through
the portal we also wanna ensure we validate it by
looking at the health system or if it wasn't at the
health system that was part of PCORnet to find that information. – So nothing coming from
the portal was prompted? It's just– – No, we'll prompt, we'll
ask, because we are also looking to make sure that we can capture out of system events, and so
just like a study coordinator asks did you get hospitalized any time in the last three months,
we'll be doing the same thing but through the portal or
through the call center, and if someone reported
something, we'll validate it, that we capture that in the
PCORnet common data model, if we don't have it there
then we'll go and say, you know, where was it and
ensure that we have that record from that health system. – [Mark] Sally do you have a follow up? – [Voiceover] Just one quick follow up, thank you, I appreciate that. One last thing, I was also
wondering whether you have any mechanism for giving information back to the participants with some regularity and what that looks like when you do? – Yeah, so that kind of community, creating the adaptable
community is really key, and so we actually have,
they name themselves, called the adapters, so these
are patient representatives that have basically informed
everything that we've done, including the logo and so
forth, and also we'll be able to send information back
to the community about what's the latest about heart
disease and what have you, and then at the end of the
day, once we know the answer, then we'll be able to transmit that answer to the participants as well as
potentially eligible patients who are similar to the
adaptable population, so that is very much on our mind. – And that is a really
interesting question, because it depends on what
you're trying to find out. What we're trying to find
out with this particular series of studies is
whether people would adhere to their medication more
or less depending on whether it's an easy to use medication
or not, so we deliberately do not want them to
know they're in a trial, so we spend no time
whatsoever reminding them that they're in a study, or
asking them to contribute to it, however, the system is set up
so that you could run a study exactly like that, and you
could have far more interaction, it really depends on
what you're trying to do. – [Mark] Do you have those
same kinda concerns Adrian– – For us, because we want,
really we need to have engagement of the participants to ensure that we have complete follow up and
adherence to the therapy, that's adherence to the
325 or 81, like that's an important aspect, and so we
want to have that engagement, so it's a different setting
or different context that we're aiming for,
and so from that lens, that's were our adapters
group has defined how we do certain things, and
actually even the web portal and the consent process was designed with potentially eligible
patients, and we modify, we actually thought we were pretty smart, and we totally were wrong,
and so we went through a number of modifications
based on the responses from the adapters. – Yeah aside from getting,
sounds like some really good practical useful advice on
how to design the reporting and the portal and so forth,
you're not gonna have results to tell them for a while, so
what are you telling 'em now? – We will periodically
tell them information, or new information, about
cardiovascular health, and so again, it'll be
a learning community, and so while we won't
know the specific answer for the question, we still
want them to understand that if something new comes along, we should make sure they
understand, know about it. – [Mark] Johnathan, did you have a– – [Voiceover] Sure, I have
a comment and a question. So the comment, today we're focused on the FDA regulatory use
I assume, or EMA of this, but there obviously are many
other uses including payers will use this information,
and I'm thinking about the SMART trial design
Lisa was talking about, one of the areas that we really neglect is clinical trial guidelines,
and due to the vagaries of when a drug comes along to the market, particularly in oncology
where you may have seven drugs that are approved for
metastatic renal cell carcinoma, some of them were approved
for first line therapies, some were approved for
second line therapies, some were approved with no
differentiation, but it was based largely on the trial that was performed for practical considerations
of who they could recruit, both ethically and feasibly,
and what we really need is a blueprint because
it's being used, you know all these drugs are being used
out there in varying orders, you know as patients
fail on one drug, go into the next drug, and so
these SMART trials would be very helpful, this SMART
design to do sequential, even though you couldn't
say for certain which drug had the benefit from a
purely regulatory, are we gonna approve the drug based
on that, it does give you a lot of information that would be useful for the medical community. So my question is more
rhetorical, and maybe a little bit devil's advocate. So going to Professor
Gibson's slide about the COPD, and the 100%, and the 1.7% making it into the randomized controlled
trial, so for many drugs that we approve based on that 1.7%,
we give a broad indication, we extrapolate and say that
that 1.7% is reflective, we don't know of any
reason not to approve it for all of COPD, and
not list every exclusion that was in there, sometimes
we do, but oftentimes we don't, and so a drug will
get on the market for treatment of COPD, even though it was just that 1.7% that had all these restrictions
and who could enter the trial due to exclusions,
and we kinda know why they do those exclusions,
at least in theory, it's to decrease the noise
and have a better chance at having a positive outcome
whatever they're comparing it to and the amount of
patients that they're willing to pay for in both practical
and monetarily to enroll in a trial, so now we go into
the era where we're doing a real world evidence thing, whether it's, let's ignore for the moment randomization, and just say whether it's
observation of all 300 million Americans being treated with
a drug, and we're making a comparison between something,
and we have everybody, we don't have any exclusions,
but we see it only works, and this is gonna make her cringe, without pre-specifying it in that 1.7% that didn't have all those things, so we didn't pre-specify it
(laughs), or even if we did let's say, so would we
just approve the drug, so this is somewhat
rhetorical, but would we say this is only approved in
people that have exclusion one, two, three, four, five,
and not in everybody else, and then of course would
it be used that way? – So yeah, we'll be able to
tell you whether you were right or not to approve it, is the first answer. Be nice to be able to say you were wrong, or really you should–
– [Mark] Or partially right. – Or partially right. But then seriously, I mean
you know, in most COPD studies people with heart disease are
excluded, they're all smokers, you know, where's the sense in that. – I mean I'm just saying on a kind of a clinical practice guideline, or from professional societies, while there's certainly a
lot of interest to look at the totality of evidence,
which is kind of what you're bringing up, it's pretty hard
to say this is what everyone should be doing as kind of
recommendations based on studies that are not pre-specified
or observational nature, it is hard to do that because
there have been a number of instances, at least in
the cardiovascular field, where something made biological sense, we had good observational
data to support it, and then went and did a
trial and got surprised. And so after you've been burned with some of those scenarios, and it
comes up often, so we may see very extreme results that
look like reincarnation, and so then you gotta
question is that really valid? But then the results that are
more modest, that may be true, but you start getting
nervous about, you still have residual confounding that
you haven't understood, so it becomes a challenge. – [Mark] Yeah, the level
of evidence backing many of the guidelines
still remains limited. – Right, like actually even,
so the ordering of medications, I mean we wind up saying
background therapy is because of historical
nature, but if someone had just chosen to study
a drug earlier that was in the development, then
that would have been the background therapy, and
actually would have been, maybe had a stronger effect
but we just never saw because of its later
stage in development. – I mean I don't think you were directing your question me, but–
– [Mark] Go ahead. – It's an opportunity to
make a couple of comments. So we're trying to
decide if the drug works, I mean we talk about
this a lot internally, and so we may be in the
1.6%, and we want the, and we exclude the
people with heart disease from a COPD trial,
which doesn't sound like it makes any sense, but
if you're trying to see if the drug works in a clinical
setting you wanna get down to the somewhat artificial
environment where you can get the cleanest signal that you can, and that's a different question
than asking, okay I know the drug works, now I wanna
figure out who it works for, and the statistics maybe
very different for those two. We have a lot of conversations
in FDA about subgroups because it's like missing
data and subgroups, that's our job security in statistics.
(audience laughter) But you know, and I can't count
how many times people say, is this subgroup finding for
real or is it due to chance? And I just go well.
(audience laughter) And you know you can
compute, you can compute, we actually can compute, we publish, we have a working group,
Estelle was on it, subgroup working group, we
published a paper in December, we actually put in there
the probability that if you have a trend overall and a significant finding in the other direction in a
subgroup, how likely is that versus a trend in the other
direction, it's more likely than you think it is, when
there's nothing going on. And for that reason
statisticians are very cautious about subgroups, especially
post type subgroups, and there are methods that
borrow information to say okay the subgroup works different
but maybe it needs to be shrunk a little bit with the shrinkage estimator back to the overall mean, but still allow that it looks different. But the way I see this
playing out with these types of trials is pre-market we
decide if the drug works, post-market is a great
opportunity to determine who it works for, and pragmatic
trials will let you do that if you get a more
heterogeneous population, but I think the statistical test might not be what you're used to.
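A small simulation in the spirit of that working-group point: with no true treatment effect at all, how often does at least one of several subgroups show a nominally significant effect in the opposite direction to the overall trend? The trial size, number of subgroups and thresholds are arbitrary choices for illustration, not the method in the published paper.

```python
# Simulation: with no true treatment effect, how often does some subgroup
# show a nominally "significant" effect opposite in sign to the overall
# trend? Sizes, subgroup count and thresholds are arbitrary illustrations.
import math
import random
import statistics

def one_null_trial(n_per_arm: int = 1000, n_subgroups: int = 8) -> bool:
    treat = [random.gauss(0, 1) for _ in range(n_per_arm)]
    ctrl = [random.gauss(0, 1) for _ in range(n_per_arm)]
    overall = statistics.mean(treat) - statistics.mean(ctrl)
    size = n_per_arm // n_subgroups
    for g in range(n_subgroups):
        t = treat[g * size:(g + 1) * size]
        c = ctrl[g * size:(g + 1) * size]
        diff = statistics.mean(t) - statistics.mean(c)
        se = math.sqrt(statistics.variance(t) / len(t) +
                       statistics.variance(c) / len(c))
        if diff * overall < 0 and abs(diff / se) > 1.96:
            return True   # opposite-direction "significant" subgroup found
    return False

random.seed(0)
trials = 2000
rate = sum(one_null_trial() for _ in range(trials)) / trials
print(f"{rate:.0%} of null trials show an opposite-direction 'significant' subgroup")
```

A shrinkage estimator of the kind mentioned above would then pull each subgroup estimate part-way back toward the overall mean, in proportion to how noisy that subgroup is.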
– What about expanding indications, so the two examples I gave, like certainly, I mean
they're not being intended to have a label change,
but one could imagine that say, I mean influenza vaccine is– – [Mark] Why couldn't they
support a label change? – It's actually just, right
now it's on the market for influenza, but the goal of
the trial is actually to see if it can improve cardiovascular outcomes. And so that's seemingly
an important outcome, and it's already in the market. Similarly for Aspirin, that
if there was a difference between the two doses then
that would maybe be important. – Bob, I'm sure Bob has
something to say here. – [Bob] Well it would
get in, I mean Aspirin isn't exactly labeled but
if you knew that one dose was better than the other,
that would change everything, I mean it's not as though
people haven't been worried about that for a long time. – But how would you feel about
that evidence coming from a, sort of a real world trial,
whatever you wanna call it? – [Bob] Well as I said before,
I don't think real world is the issue, I think if it's randomized and if you collected the data. – Pre-specified. – [Bob] And if you
actually, I mean the worry about all these data is
that they're noisier, but if you still show
something, I mean this goes to whether you'd believe non-inferiority and equivalence in such a
trial, and that's harder, if you actually saw a difference
that's perfectly credible, noise doesn't have a direction. – In that context, so
if you saw something, because you worry about noisy data, that it's more believable. – [Bob] Right I think failure
to, in this environment where we're not that
experienced, failure to see a difference is less certain,
I mean you'd be hard put to know, I mean you do
have reasonable estimates of what Aspirin does, so
you could try to write a non-inferiority margin,
but my nervousness would be in this environment, I dunno what it would really be, but
you'd have to talk about that, but if you actually saw a difference, that would be pretty credible.
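For Bob's point about writing a non-inferiority margin from reasonable estimates of what Aspirin does, one widely used approach is the fixed-margin method: take a conservative bound on the historical effect and require that a pre-specified fraction of it be preserved. The numbers below are invented for illustration, not a proposed margin for the trial being discussed.

```python
# Fixed-margin sketch for a non-inferiority margin: take a conservative
# (lower-bound) estimate of the historical active-control effect, M1, and
# preserve a pre-specified fraction of it. All numbers are invented.
m1 = 0.15                      # conservative historical risk reduction
fraction_preserved = 0.50      # require at least half of M1 to be retained

m2 = (1 - fraction_preserved) * m1   # allowable loss of effect = the margin
print(f"M1 = {m1:.2f}, non-inferiority margin M2 = {m2:.3f}")
# The nervousness voiced above is that in a noisier real-world setting both
# the historical estimate and the new trial's precision are harder to pin
# down, which is why a demonstrated difference is the more credible finding.
```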
– But we have to be careful about that criteria of seeing a big difference,
because you could also have a big confounding effect, and– – [Bob] But they've randomized. – You know beauty's in the
eye of the beholder so, if you see, well if you
randomized then you– – [Bob] Yeah, that's what he did. – If you don't have
differential follow up, if this if that, I mean
there's a lot of things that have to happen here, and certainly in an observational setting,
if you see a big difference, we like to say we believe
it more, but it's really that we believe that
the confounding effect wouldn't be as big as the
difference that we saw, and you have to rationalize
that, and a small difference might be fine if there's
no confounding whatsoever, so I, you know– – So if we randomized
and we actually had 100% complete follow up, and even
if we had showed a small effect but it was significant, I
mean that would translate in terms of a larger
population healthcare. – And you had the subgroups
specified beforehand and so forth, and it seems
like you could potentially run through a lot of
questions in one big trial, like the ones that you all have described. A lot of questions that are
relevant for label modification, also frankly be relevant
for clinicians, patients and payers in looking at some
of these subgroup effects. – I mean, in fact we've put
up the protocol publicly for comments or advised it
publicly, as we do things we'll have it there so it's
not just pre-specified, it's out there for everyone. – Bob did you have a further. – [Bob] I forgot is the
Aspirin study blinded or not? – It's not blinded. – [Bob] (muffled) – Can you get your microphone? – [Bob] Oh sorry. If people have some view
about what aspirin does or doesn't do and they
stop more in one group than the other, that
would make you nervous, so blinding would be better but probably not as easy to do. – I mean here the outcome is death and cardiovascular hospitalization. – [Bob] Well no, but there's
still the question of, well for example, if you're on the 300 you can't take ticagrelor, okay, well ticagrelor works pretty well, it adds a lot to Aspirin,
so could that lead to differential treatment
of the two groups in an important way? Maybe, I mean if more, if a
lot more people on the 300 didn't get ticagrelor
and the other people did, that could affect the
results of the study. – Well actually for that
specific one we actually asked the community about whether
we should include patients on ticagrelor, would there
be willingness to randomize, and it was 51% versus 49%. – [Lisa] And that's significant (laughs). – And it was, but because of the concerns about it's current label
so we exclude patients with ticagrelor, so you
won't actually be able to– – [Bob] You could include 'em and include that as a subgroup. – That's what we– – [Bob] It would go the
opposite, if 300 was better in general and went the other way, that's exactly what I'd predict. – I mean it is, the sample
sizes that you're talking about do permit potentially a lot of subgroup analyses like this, I guess the
challenge comes in to how many of those can you really
anticipate well ahead of time, and where do you really wanna
put the bang for your buck, which subgroups. Bob more? – [Bob] I actually have
a question slightly– – Okay it's gotta be a quick question 'cos I'm about to wrap up. – [Bob] Alright it's slightly off, has anybody in a systematic
way tried to find examples of, or an enumeration of
the places where differences have been found between,
you know, the study that led to approval
and the real world data, that showed that they're
frequently different? I mean everybody sort of
assumes they might be, but has anybody sort of
collected all this to see when that happens, or how
many examples of it there are? – Well you have. – We'll have an example
for you later this year. – We have done comparative
effectiveness studies using observational methods,
and we do see differences periodically, I'd say where we see it is, so for devices, because
of selection of devices in the patients, we tend
to see results that seem to be more effective than
what the trial results showed for efficacy,
and there I think it is because of the selection of
patients and for the device or the procedure, for
medical therapies we often have found fairly similar
results too on efficacy. – [Bob] That's, I mean
there is this obsession with effectiveness versus efficacy, I'm just wondering, I'm
sure poor compliance in the real world could
make a big difference, we know that one, but
are there other things that also we're not smart
enough to have figured out? – So some of the scenarios
where we haven't seen the effectiveness translate into practice, when we look further, it's
because there was very low adherence or persistence to
the therapy, and so that became an explanation. – [Bob] Well that one you
could sort of predict, if they don't take the
drug it's not gonna work. – Still an important one in
practice, for clinical impact. Well I wanna thank our panelists
for a great discussion, and we're gonna take a break
now, and then build on this in the next session,
thank you all very much. (audience applause) (background chatter) – Okay we're gonna go
ahead and get started on our next session,
if I could ask everyone out in the lobby to begin making your way back to your tables. Is this on? Okay thank you, we're
gonna go ahead and start our next session, the
focus of this session is to begin looking at
the potential applications of purely observational data
in the regulatory context, I'd like to go ahead and
introduce our first speaker for the panel who is Jonathan Jarow, Jonathan is in the Office of the Director at the Center for Drug
Evaluation and Research at FDA. Jonathan's gonna give a presentation and then I'm gonna introduce
the rest of the panelists who will join me up here on the stage. So Jonathan. – Thank you Greg. Oopsie. Good afternoon everyone,
we have, we are now at the most exciting
portion of today's content, observational trials, to
make it even more exciting, to paraphrase Bob Temple,
not over my dead body. (audience laughter) So I mean, where to begin,
even though we've done it let's just start with some ground rules, so what is evidence from
clinical experience, so really most of today we've been talking about randomized controlled trials, yes within the healthcare
system, using different forms of data collection et
cetera, but for the FDA, as you've heard Bob say,
this is really a no brainer, randomized controlled trials
is something we've accepted for decades, and this is not
a big deal, there's a lot of things we need to
learn about how to do it, what settings it can work,
and what kind of endpoints we can collect, and the validity of those. But by and large that's
not the big challenge, the big challenge is this holy grail that no one will admit
to thinking of which is, if I'm a guy or person with a great idea, can I take an existing
database, punch a button, and compare one drug to
another therapy or device to another therapy,
whether it be no therapy or an alternative therapy,
show evidence of efficacy and get labeling of a
marketable product in the US? And that's really the
question, I mean the question is really about efficacy. So the definition would
be evidence obtained from observational studies,
or clinical experience, and this can be from registries,
electronic health records, claims data, social media et cetera. And the uses from a regulatory
perspective are safety or efficacy, and we
shouldn't spend a lot of time talking about safety, I will
go over some of this today. But basically all of our
safety collection to date has been observational,
we do not pre-specify the collection of safety data in most of our drug development
programs, with the exception of the big safety
outcome trials that we do for certain drugs looking
for cardiovascular risk. But it also could be used by healthcare, economic information, for
payers, research for academia or for a hypothesis generation
for drug development. So there are many potential uses outside of the regulatory sphere. So let' spend a little
bit moment going over the statute, and I heard people talk about the statute earlier this morning, kind of implying that the
statute requires people to use arcane non-informative,
non-clinically relevant endpoints in clinical trials. There's nothing further
from the truth, the statute basically says for drugs and biologics, one or more adequate
well controlled studies that demonstrate that the
product has the effect purported to have under the specific
conditions of use. It doesn't say you have
to use an arcane endpoint or a surrogate that maybe
doesn't mean anything to a patient, like blood
pressure, or lipid levels, or FEV1, we use those, FDA
collaborates with industry to use those because
they're very practical, and it's something that can be measured, it's something that
people have confidence in that does predict a clinical benefit, but maybe it doesn't, and
that it is an opportunity to win in other words to
demonstrate superiority over another drug or in
other cases non-inferiority over an already approved
established therapy. What's interesting is
that in the device side, CDRH, they have this pilot
program called parallel review, and this is a parallel
review by both FDA and CMS who are payer, for a medical
device that's coming to market and they have input on
the development program, and one of the participants
in this parallel review was studying the device
that was for the treatment of hypertension, and FDA
said blood pressure is fine for your primary
endpoint, show that you've reduced blood pressure. And CMS said, no, that's a
surrogate, you need to show that you've actually saved
lives, and was not happy with that surrogate even
though we've been accepting blood pressure as a surrogate
for regulatory action on drugs for decades. So you can see how this
develops and transpires. And the other thing, just
to get one more subtle point on that pilot, is that they
also wanted to make sure that there are a lot
of people over age 65, because CMS only covers patients over 65, except of course Medicaid. But they wanted to make sure
that that was in the study, because otherwise the study
wouldn't be applicable to their patient population,
so back to Martin's 1.7%. The other aspect of the
statute, the statute doesn't say you have to have substantial
evidence of safety, so we've never required
substantial evidence of safety, again with the exception that I
was talking about a moment ago. The other big difference
is drugs versus devices, and I assume that today
we're really focused on drugs and not devices, but that
statutory standard for devices is reasonable assurance
of safety and efficacy, it's not substantial
evidence, and the guidances that come out of CDRH say
that valid scientific evidence for reasonable assurance can include randomized controlled
trials, but also include observational studies,
registries and case series and large patient
experience, so all of those are included as potential
valid scientific evidence to establish reasonable
assurance of safety and efficacy. So really again back to our
holy grail, the holy grail is observational data
showing efficacy of a drug or biologic, is what
people are really after. So as, you know unfortunately
going to the end of the day everything that I was
planning to say has actually been said multiple times already, but just to quickly go over
the spectrum of evidence, and this is analogous to what
someone was talking about our clinical practice
guidelines being based on fairly weak evidence in
general, mostly expert opinion, but there's a spectrum
of evidence out there that's analogous to the levels
of evidence that we rate for clinical practice guidelines, and randomized controlled
trials and pragmatic trials, as long as we define
them as randomized trials in the healthcare space,
these are of the highest level of evidence, are they perfect? No, there's alsorts of
problems with these, and we get applications with
randomized controlled trials that are purported to show
an effect, and we review them and say no, you didn't
have substantial evidence of that effect, and Lisa
alluded to some of the problems, some of it's missing data,
some of it's crossover, some of it's picking a post
hock subgroup to look at 'cos they failed on the ITT population, some of it is just in
the way they, you know, if you look at progression free
survival in a cancer trial, and have a bias in the way
that the studies are performed in terms of the timing,
not being in the windows that were predetermined,
there are all sorts of issues that could cause informative
censoring of patients in the Kaplan–Meier curves,
so there are a lot of issues in randomized trials
so they're not perfect. And then we get to the
observational trials, and obviously we prefer
prospective observational trials rather than retrospective
hitting a buuton, and after all the data's been collected and see what you find, so
this is where you pre-specify what the endpoint's going
to be, you pre-specify what the treatments going
to be, you pre-specify your groups and you have
some control, at least over the safety data that's
collected, you can specify what specific side effects to look for. Registries are single arm by definition, and these are typically prospective, and you can pre-specify but
not all registries are equal. And then you have case
series or case reports, and this would probably
be considered, at least in our opinion, the
lowest level of evidence. So history, what has FDA done with observational data in the past? We've looked at safety for
new molecular entities, we use Sentinel for
post-marketing safety assessment. For efficacy, rare disease, we keep using this term rare disease,
lung cancer has become a rare disease, at least
the way FDA defines orphan designation, so
you can have rare diseases that are tens of thousands
of patients in the US, and you can have a rare
disease that's 20 patients in the US, and there's a big
difference between those two as you'll see in a moment, we
also use observational data at CDRH for devices, can we
use it for labeling changes of labeling updates? So let's go over some of the
case examples, so here's a list and Janet referred to some
of these this morning, of rare diseases where the
efficacy findings for that drug were based essentially on
case series or case reports. So FDA has already done this, despite our statutory requirement that
we have substantial evidence, defined as one or
more adequate and well controlled trials. We have for instance
Lumizyme for Pompe disease, survival data from an
international registry, Carbaglu, data on plasma
ammonia level reductions in a case series, Cholbam,
data on growth, survival and reduction in laboratory
parameters of cholestasis in a case series. Glucarpidase, sorry, for
methotrexate toxicity, this was based essentially
on a treatment protocol, expanded access use through the NIH. Metreleptin, another one that
was from an expanded access or compassionate use treatment protocol. And most recently a
treatment for 5-FU overdose, was also based on animal
data and a series of cases and expanded access use in the US. So there are times in
which we fully understand the disease very well, we
understand the pathophysiology, we understand what the treatment is doing, and there is a large effect
size, so we're not worried about confounders, and some
of these, almost every patient in the US was treated with
this drug, so we actually had the entire population,
we had Martin's 100%. What about labeling changes? We've heard a little bit
about the influenza vaccine being studied, but there was
actually an observational trial that was done retrospectively
based on Medicare claims, so it had almost a million
people on the high dose and 1.6 million on the
standard dose, and it was able to demonstrate what FDA
interpreted as a real effect of the high dose being superior
to the standard dose. Rabies vaccine is kind of
an interesting case example, this information came
out of a drug shortage, so the standard dosing of
five doses, there wasn't enough supply for everyone that needed it to get the five doses, so
people were routinely using a reduced dosing regimen,
and they showed an effect that was convincing enough
for CDC to make that the now recommended treatment, which is only four doses. For hypothesis generation,
the FDA in collaboration with the NIH and CDC and the
World Health Organization has this project for
neglected tropical diseases, looking at the world
experience of repurposing drugs for a treatment of
neglected tropical diseases with a website and a mobile
app, and a global reporting tool to report cases, so will there
be an approval based on this? Not over Bob Temple's dead body (laughs). But there may be, wait and see. So I'm gonna talk, just a
moment, about methodology, I am not a statistician, I'm
a urologist by background. I am a statistician's worst
nightmare, I have a stat program on my computer and I
love punching the buttons and seeing what kind of p-values I can get without knowing whether
it's the right test or anything like that, but
let's talk a moment about it. So one of the big concerns
with observational data, one that we don't have with
randomization, assuming that there's not a high dropout rate
and crossovers and the like, is that we do have to
worry about known confounders and the unknown confounders,
so now I'm sounding like Rumsfeld, so the
factors that patients have that we know will
influence their prognosis and potentially their
response to a therapy, predictive and prognostic,
but there are attributes of patients that we have no idea about, so before we knew about
graft, did we know, or eGFR, did we know what
patients would respond to one drug versus another drug? No. And so the question is,
in observational studies, when we have errors, whether
it's errors of missing data or errors of diagnosis or errors
of outcomes, is it random? Which you know, Bob keeps
saying, well if you win and it's just random errors,
noise, it should be equal in both arms of your observational
study, comparative study, and therefore you can
trust the results because if you're winning with the
presence of noise it must be an even bigger effect, but
what if it's systematic? And do we have tools to detect
whether something is random or systematic, and that's
really a question to you, and something that's going
to have to be established as we start looking at
using this type of evidence to make regulatory decisions. The other thing is analysis
in view, I referred earlier to we still have problems even with randomized controlled
trials and differentiating whether there is still a bias
introduced post randomization, and this is one that really
struck me when I started out in FDA in the reproductive urology and bone metabolism division,
and so I'm very familiar with this, we've been using
bisphosphonates to treat post-menopausal osteoporosis,
there are a number of drugs approved, been on
the market for a long time, and there are constantly new
safety signals coming out with this, like atypical hip fractures, and in 2010 there were
two studies published in the literature, one
in JAMA and one in BMJ, both looking at the
association of bisphosphonates and esophageal cancer, and
I was struck by the fact that both observational
studies were performed in the same database over
the same time periods, and they had the same people
that had esophageal cancer in both of these studies,
and one methodology, and I'm again not a statistician
so I won't make a comment about which one's right
or wrong, but one came out with no evidence of an
association with a relative risk of .96 and the confidence
we'll let you see, the other one found the confidence
invalid and excluded one, and said there's a 30% increased
risk of esophageal cancer for women taking
bisphosphonates versus never having taken bisphosphonate,
and then when they studied duration of bisphosphonates use it was even higher relative risk. So which way do we go? Which study do we believe? If they were two different
populations I would see it maybe understandable,
but this kinda makes me very uneasy with observational
data when you can have two studies of the same
population, just using different observational statistical
techniques, get different results. So what do we need? As discussed over and over again we need some demonstration projects,
we need to be assured that we can do this, we can
link the data between claims and EHR, that we can measure outcomes, that we can find out who
died using this thing, 'cos death, as mentioned earlier, is not typically in EHR or claims data, so we need demonstration
projects to determine both methods, problems
and then which outcomes this will work for, we always
seem to talk about MACE, you know, myocardial
infarction, stroke and death as an outcome measure,
that's the low hanging fruit, how are we gonna do
depression drugs, how are we gonna do overactive bladder
drugs, how are we gonna do a variety of drugs, I
mean many of our drugs are for symptomatic
diseases, are we gonna leave those out from all this? As mentioned by Janet, there
are some regulatory issues that we have to deal with,
and this is for FDA to do, is this going to be, how
are we gonna apply part 11, which is the electronic data,
and what needs to be ordered, what's gonna be considered source data, and of course part 50, if you're doing your randomized pragmatic trial
you're collecting consents at the outset of the trial,
if you're doing retrospective you run into a problem, so
let's say we have a device, as the typical example,
we ask the company to do a post-marketing study
to look at some outcomes, whether it be safety or
efficacy with their device, after approval for marketing,
and they can accrue, or they have trouble
accruing, and they come back and say "Can we use this registry? "There's a big registry
that was created in Europe, "this device has been
on the market there for, "you know, five years,
can we use the registry?" And FDA says "Sure, that's great." But now, can they use the registry? FDA requires, US law requires
that they get consent if it's a trial being done
for regulatory purposes, and so even though the common
rule would exclude that as part of standard care, low risk, FDA's current regulations
do not, so those issues have to be dealt with, and
as you'll hear tomorrow from Melissa, we're working on developing a federal evidence generation system. So we need a lot more academic research, Bob keeps on making pleas
for people to look at it. When is a pragmatic trial
showing a different effectiveness than the randomized controlled trial? When we have a randomized controlled trial can you duplicate those results in doing observational
studies after the fact? So these are all things that
need to be done and hopefully will be done soon, thank you. (audience applause) – Thank you Jonathan. I'd like to go ahead and
invite the other panelists for this session up. Richard Platt is Professor and Chair of the Harvard Medical School Department of Population Medicine at Harvard Pilgrim Health Care Institute. Marc Berger is Vice
President of Real World Data and Analytics at Pfizer. And then Marcus Wilson is
President of HealthCore, a subsidiary of Anthem. Welcome all, and we'll go
ahead and begin our discussion. So Rich, I think you have
some opening comments? – Right, I'm next. So I ought to begin by
saying I'm here as a poser, because it's above my pay
grade to answer the question what would it take for
purely observational data to be suitable for regulatory needs? If Jonathan's talk convinced
you of nothing else, it's that I'm not in a
position to advise FDA on what it should make, how it should make regulatory decisions. But I am prepared to talk
about what it would take to be able to use observational data to drive, to be high quality evidence
to drive clinical practice for instance, and as background
for it I do wanna comment on some of the remarks earlier today about the quality of observational data. See claims data has lots of
problems, I'd say at this stage of our lives EHR data
has many more problems. But as we got decades of
experience working with claims data and we understand a lot about what it can and can't be used for,
I think we're still in the wild west days for most EHR data. I mean I applaud the
many initiatives going, that are going on to work
towards standardization of EHR data, but for
the time being we live in an environment where customization of electronic health records so that they are mutually incompatible
in terms of what they contain is a feature and not a bug, and there is a substantial chance that part
of the harmonization efforts will get to a superficial
level of harmonization, while the actual coding of
what the clinician intends to the EHR may still be quite different. So I have high hopes for EHRs as sources of high quality data, but we still have an enormous amount to learn about that. I think the other thing to
say about observational data is we should explicitly
include prospectively acquired observational data by
reaching out to clinicians and patients about their experience, so to Jonathan's point
about the fact that many of the things that we
care about is information that really can only be obtained from the individuals themselves,
I think that's part of what we should consider to be
the observational data set, that is baked right into
the way PCORnet operates for instance, the notion that patients are really active participants
in the development of evidence, and I think that's gonna be an increasingly important
piece of our thinking about how to do these things. With regards to what kinds of evidence should we be attentive to, I
think we have to be prepared to be opportunists, in the
absence of randomization, there are certain circumstances
in which I think we can be more confident about
accepting observational data as evidence than others, so for instance the influenza vaccine study
that Jonathan mentioned is a very good example I think,
if you haven't had a chance to look at it, I recommend
it to you, it has several appealing features, but
one of them is that people who received high dose
vaccine had a better outcome than people who receive
regular dose vaccine, think if the result had been flipped, we said high dose vaccine,
people who received regular dose vaccine had a
better outcome than people who received high dose
vaccine, it would suggest to us that there was important
unmeasured confounding, because the expectation is
that if there are biases that we would expect them
to operate in the direction of the sicker people
getting higher dose vaccine, so I'd say that's a situation
in which, if I tell you that the effect estimate is 20% different, I'd say it starts off
seeming like better evidence if the advantage is for
the high dose vaccine than the low dose vaccine, and
I think that basic approach holds true for a number of situations. Secondly, that study was
remarkably well designed, as you read the method section,
the, what was it two million people who were studied approximately, in that they were drawn from
a much larger population of Medicare beneficiaries
for whom CMS had information about the vaccine they'd
received, but they restricted the population in a way that
makes it seem more plausible, that the population who
received the two doses were likely to be
comparable, one way they, an important way that
they did that was to say "We will only study people
who received their vaccine "at a pharmacy." And the reason for that was to
say, it gives you some sense that the people were
ambulatory, that there was some self-efficacy, and so we're
talking about a population that is likely to be
somewhat more homogeneous, just by the fact that
they went to a pharmacy. And then the next thing
they did was to say they matched within pharmacies,
that is they only accepted people who received their
vaccine at a pharmacy that had distributed both kinds of vaccine within a two week period. Those are really very
clever design features, I haven't read about 'em in any textbook, that is this is a design that
arises from deep knowledge of the, not so much of the
vaccine, but of the population that's immunized and the
way healthcare is delivered, so it's always good to make
sure that you don't just push the button and ask
for an answer to come out, because you can have more
confidence if you know how the trial was designed. I'll retreat those, my final point is, so I think that study is
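As a purely illustrative aside, a minimal sketch of the kind of restriction and within-pharmacy matching Rich describes might look something like the following; the column names (vaccination_site, pharmacy_id, vaccine_dose) and the two-week window are hypothetical stand-ins, not taken from the actual Medicare study.

```python
import pandas as pd

# Hypothetical claims extract; column names are illustrative only.
cohort = pd.DataFrame({
    "person_id":        [1, 2, 3, 4, 5, 6],
    "vaccination_site": ["pharmacy", "pharmacy", "clinic",
                         "pharmacy", "pharmacy", "pharmacy"],
    "pharmacy_id":      ["A", "A", None, "B", "B", "C"],
    "vaccine_dose":     ["high", "standard", "high",
                         "high", "standard", "high"],
    "vaccination_week": [1, 1, 2, 3, 3, 5],
})

# Restriction: keep only people vaccinated at a pharmacy, a proxy for
# being ambulatory and having some self-efficacy.
pharmacy_cohort = cohort[cohort["vaccination_site"] == "pharmacy"]

# Within-pharmacy comparability: keep only pharmacies that dispensed
# both doses within a short window (here, the same two-week period).
def dispensed_both_doses(group, window_weeks=2):
    high = group.loc[group["vaccine_dose"] == "high", "vaccination_week"]
    std = group.loc[group["vaccine_dose"] == "standard", "vaccination_week"]
    if high.empty or std.empty:
        return False
    return abs(high.min() - std.min()) <= window_weeks

eligible_pharmacies = [
    pid for pid, grp in pharmacy_cohort.groupby("pharmacy_id")
    if dispensed_both_doses(grp)
]
analytic_cohort = pharmacy_cohort[
    pharmacy_cohort["pharmacy_id"].isin(eligible_pharmacies)
]
print(analytic_cohort)
```

The point of the sketch is only that the comparability of the two dose groups is engineered through design choices like these before any outcome is ever analyzed.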
extremely high quality evidence, partly because of the
way the result came out. I think in some ways that kind
of result would be credible partly because of the
direction that it went, but we should also be true
to our intrinsic natures as Bayesians, that is all
of us approach decisions based on what we think
about the world anyway. And so I'd say the place
for observational studies should at least start in
environments where we think we have a pretty good
idea what the answer is, and we're trying to extend
that to a new situation. So if we have good evidence
about how a vaccine works in the under 65 population,
then observational study seems to me, that give a plausible result, is pretty credible if we
say we're gonna extend the age range, or we're gonna
extend it to a population that hasn't been well studied, we're not gonna do randomize, you usually
won't do randomized trials in that environment, and the
question's are we better off using the observational
data there to inform what we think we know about
effectiveness than we would be if we said we know nothing,
and so I'd say we start with a presumption that
we know a fair amount about how the, in this case the vaccine, works in a particular setting
for a particular population, and extending it to others,
or to other conditions, is probably among the safest ways we could use observational
data to support the development of good clinical practice,
evidence based guidelines. So where does that leave
us, unlike the situation of randomization where we should be able to trust randomization to deal
with so many of the problems that we care about, I think
for the foreseeable future we're gonna have to say
observational studies can only be useful for
certain kinds of questions, right at the outset we
should start off saying it will be unhelpful to
do an observational study in certain settings, and then
be prepared to only accept certain kinds of answer
from observational studies. Back to you. – [Greg] Great, thanks Rich. Marc? – So it's a pleasure to be here. First I'll start with a
disclaimer that my remarks are my personal remarks, do not represent the views of Pfizer Inc. And I'll also start by
saying that there's no doubt that the randomized
controlled clinical trial was the single most important
methodologic development in clinical research in
the last hundred years. There's no question about
that, we're not getting rid of the RCTs, but we are
entering a digital era of data, and this digital
era of data is gonna power our learning healthcare
system, and people are using it to make decisions today,
now whether it's good enough for regulatory purposes
that's the question today. But payers and providers, and I've worked for pharma companies,
I've worked for payers, I've been a provider,
they use it all the time, they use it to say which
patients might be better off getting pre-screened for
disease, which patients deserve to have case management or care
management, how do I design my benefit design to make sure I'm getting the best population outcomes. It's being used today, and
I imagine that if any of you walked into a surgeons office today and you needed some
surgery, you'd like to know of the last thousand patients
that surgeon operated on, that looked like you, same
age, gender, comorbidities, what was his outcomes with that surgery? Don't tell me anyone
here wouldn't wanna know that information. So evidence based medicine
says use the best available information at the time
to inform your decision. So that's one thing at the
bedside, it's another thing at a treatment guideline level, and it's a very different
thing at a regulatory level. So we have to, I grant that. And so you've already heard,
I'm not going to go into it, the FDA's already using
observational data, and the question is, are we
reaching a tipping point? can there be things that
the FDA can do today, or in the near future, to
more use observational data to inform regulatory decisions? And I'm mostly interested in labeling, because that's the holy
grail, if it's in a label you can talk about it,
if it's not in the label you can't talk about it. So we've heard there's
challenges, data quality, it's allover the place,
however that means that observational data does not
equal observational data does not equal observational data. The data from the Salford
study is probably better than most observational
data that's out there. And even if the data is not that clean, and has some missing data, it doesn't mean that the conclusions
you're gonna get from that are incorrect, because
how you analyze the data to deal with bias and
confounding varies dramatically in analytic and design rigor,
so not all studies are equal. So how can the FDA decide
whether the data is good enough and the design and
analysis was good enough to support a decision, well
I've been really privileged over the last 10 years of
my career, to be involved with a bunch of different
initiatives where we've been trying to set forth best practices about how to analyze observational data. And in 2009 I was part
of an ISPOR task force that put out good research practices for retrospective database studies, that was published in Value in Health. In 2010 The National
Pharmaceutical Council sponsored the GRACE principles, which was published in the
American Journal of Managed Care. And then there have been other things, and most recently there's
been a collaborative between ISPOR, The National
Pharmaceutical Council and Academy of Managed Care
Pharmacy, the pharmacists who do all the P&T
reviews to decide whether drugs go on formulary, to do a
CER collaborative initiative, and one of the things
we developed in there was a questionnaire to
help people who are trying to evaluate individual
pieces of observational data say what is the relevance
and the credibility of this data, it's not
a one zero, this is not a binary function, you
have to make a judgement about how good the data is. So in this questionnaire we divided it into two big domains, one is relevance, and you've heard about that today, is the population the
right population to which you wanna make a decision about? Are all the interventions
that you're interested, were they included in that study? Were the outcomes, both
primary and secondary, the ones you were interested in? And did you measure them over
the right period of time? What was the context? This is like, you know,
basic blocking and tackling. But under the credibility
side, we go through specific things about design, and in
design we make recommendations, so there's nothing wrong
with using observational data for hypotheses generation,
great, but if you wanna use it to make a recommendation
for a health policy or make a recommendation more generally to a population of
patients, you should have an a priori hypothesis, you
should have a formal protocol, this is where we went
with RCTs, we need to go to the same place with observational data. And we even argue that
the way to know that is that those studies
should be preregistered, now we went through the
preregistration conundrum with RCTs in the last 15 years, why? Because there's publication
bias, not every journal wants to publish a study
with negative result, but you need to know what
the negative results were and the positive results, and
you need to make sure that what people are reporting on
is actually what they did, well we should do the same thing if we wanna elevate observational studies to make them more credible for the FDA. Data quality, there were many
things we go through there, they talk about how you do data quality, I'm not gonna go into that. Analytics, Rich has mentioned one thing, if the result of the study
comes out and actually flies in the face of an expected
bias like channeling bias, that should give you greater confidence. Things that we talked about
in our, in that questionnaire was if you wanna build on an RCT, and now we think that
all observational data is built off, generally
speaking, most of it's built off of RCT data, do an analysis
that shows that if you restrict the population to look like
the population that was in the published RCT, and you
can reproduce the result you got in the RCT, then when you look at the more general
population, you should have greater confidence that
what you're finding there is not hooey, it should
make sense, and maybe you can draw conclusions
about that and make some regulatory decisions about it. More recently a lot of work has been done, David Madigan has led this
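A rough sketch of the kind of "reproduce the RCT first, then generalize" check Marc describes might look like the toy example below; the dataset, the column names, and the age/eGFR eligibility cutoffs are all invented for illustration, and a real analysis would of course use proper confounding adjustment rather than the crude risk difference shown here.

```python
import pandas as pd

# Hypothetical observational dataset; all column names and the RCT
# eligibility criteria below are illustrative assumptions.
df = pd.DataFrame({
    "treated": [1, 0, 1, 0, 1, 0, 1, 0],
    "age":     [55, 60, 72, 68, 45, 80, 59, 62],
    "egfr":    [75, 80, 40, 55, 90, 35, 65, 70],
    "outcome": [0, 1, 1, 1, 0, 1, 0, 1],
})

def risk_difference(data):
    """Crude treated-minus-untreated risk difference (no adjustment)."""
    treated = data[data["treated"] == 1]["outcome"].mean()
    untreated = data[data["treated"] == 0]["outcome"].mean()
    return treated - untreated

# Step 1: restrict to patients who would have met the published RCT's
# eligibility criteria (here, age 50-75 and eGFR >= 60 as a stand-in).
rct_like = df[(df["age"].between(50, 75)) & (df["egfr"] >= 60)]
print("RCT-like subset estimate:", risk_difference(rct_like))
# If this is close to the published RCT effect, the database and methods
# pass a basic sanity check for this question.

# Step 2: only then look at the broader, previously unstudied population.
broader = df[~df.index.isin(rct_like.index)]
print("Broader population estimate:", risk_difference(broader))
```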
effort, to show that in order to deal with systematic bias
and bias in the data sets, you need to include negative
and positive controls, and we talked about this
in the questionnaire, about using outcomes that
are completely unrelated to the intervention, it
allows you to look to see if there are temporal or
other systematic biases that's built within the
study, and we've done studies where we see that there are general biases built into the databases,
but even with that, when you use positive
and negative controls you can still find important associations that are real and powerful. I think you could make
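For readers unfamiliar with the negative-control idea, here is a deliberately simplified sketch of how it is often operationalized; the outcomes, the simulated data, and the crude relative-risk calculation are illustrative assumptions, not the calibration machinery actually used in that body of work.

```python
import numpy as np
import pandas as pd

# Hypothetical person-level data: one real outcome of interest plus
# several "negative control" outcomes believed to be unrelated to the
# drug (so their true relative risk should be about 1.0).
rng = np.random.default_rng(0)
n = 20000
treated = rng.integers(0, 2, n)
data = pd.DataFrame({
    "treated": treated,
    "outcome_of_interest": rng.binomial(1, np.where(treated == 1, 0.03, 0.02)),
    "neg_control_ankle_sprain": rng.binomial(1, 0.01, n),
    "neg_control_cataract": rng.binomial(1, 0.02, n),
    "neg_control_ingrown_nail": rng.binomial(1, 0.005, n),
})

def relative_risk(df, outcome):
    """Crude relative risk of `outcome`, treated vs untreated."""
    risk_treated = df.loc[df["treated"] == 1, outcome].mean()
    risk_untreated = df.loc[df["treated"] == 0, outcome].mean()
    return risk_treated / risk_untreated

controls = [c for c in data.columns if c.startswith("neg_control")]
control_rrs = [relative_risk(data, c) for c in controls]
print("Negative-control RRs:", np.round(control_rrs, 2))
print("Outcome RR:", round(relative_risk(data, "outcome_of_interest"), 2))
# If the negative-control RRs cluster well away from 1.0, that signals
# systematic bias in the database or design; if they sit near 1.0, an
# elevated RR for the real outcome is harder to dismiss as an artifact.
```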
regulatory decisions off of that. The FDA recognizes this is
coming, so we heard a preview, and there was the announcement
about repurposing Sentinel, which my good friend is
a very important part of, to start looking at effectiveness
and efficacy over time. And this day is going to
come, because patients are gonna be making decisions,
they're taking more control over that, they wanna get
the hold of their own data and they're gonna wanna know
what patients like them, what happens to them, and
so the label which changes from time to time will fall out of step, and the question is how do
you keep it more in step so that it actually is relevant
to the practice of medicine? So what is the place to start? And the first place to start
is in the relevance place. So, if as Rich has just mentioned,
if you're not going to do and RCT in a population
that wasn't included in the primary RCTs, but you
have good observational data and it makes biological
sense and it's plausible, why wouldn't you wanna
put information about that into a drug's label, somewhere? The credibility issue
is much more important, and this is where I think that we need to call and inspire the FDA
to get actively engaged in shaping the criteria for understanding what good credible evidence
is, as long as we look at all observational data and
tar it with a broad brush, we're not gonna get
anywhere, so what are things that you might do that
you might advocate for that might move the people who are doing observational studies
to go to best practices? Well, one of the things
we do with RCTs is we say, one RCT is generally no
good, you need to have two. Well with observational
studies, shouldn't you have multiple observational studies
that show the same result? And shouldn't you require that it uses multiple analytic approaches
to deal with the problem that Jonathan was talking about? Yes, so if you have multiple
observational studies that are using different
analytic approaches, and they all point to the same result, I gotta tell ya that's
about as good as it gets in evidence based medicine. We have to have
pre-specification of hypotheses, we have to have
preregistration of results, and we have to look at where we think the known confounding
and biases will push it and see if the results are
going in that direction or against that direction. So I think there is opportunity
today, and I realize that are some regulatory hurdles that need to be talked about,
but evidence based medicine would say we are using that,
this kind of information today to make the best available
recommendation to patients going forward how to
take care of themselves for their individual problem,
for who they are individually. And I think there is an
opportunity for the FDA to start looking at populations
that are not included in randomized controlled
trial, looking at dosing, as you already have, looking
at adherence and compliance, I think that there is enormous differences in adherence and compliance,
and whether or not that is due to the particular therapy, but the way the therapy is
administered or the way it's used or the fact that, you
know, the patients on seven other medications, isn't
that interesting information that a patient would like to know? That a treatment regimen
A in most patients has more adherence and compliance
than approach number B, that would be useful information. And I think down the line,
as we begin to do that, I do think there's an
opportunity for new indications, particularly off label
indications, where you're not gonna want to conduct an RCT. And I do think that, despite
my good friend Bob's concern, and I agree with you, all
relative effectiveness studies have relatively small
effect size, I think that if you do multiple well
designed, well executed, non-randomized observational
studies, and you use multiple different analytic
techniques, and they all go in the same direction, I
think you can infer causality, and there are a number of statisticians who believe that as well,
particularly if you put it in a Bayesian context, if
you have a baseline RCT that says we know in some
population it does work, it's plausible to work
in this other population, I think we can get there,
and I think we will be there, and I think the world's gonna
pull us in that direction. And so my plea is, rather than saying let's hold back the dawn,
I'd rather have the FDA be actively engaged and putting out, instead of saying look
at, you ever hope to get an observational study
to get a labeling claim? Don't come to me with
one observational study, come at me with multiple
studies using multiple different analytic approaches,
have your study preregistered and et cetera, if you
don't preregister it, and you don't have the
pre-hypotheses, then listen that's a great study, that's a hypotheses, now go do that in some other database, go do that with multiple
analytic approaches, show me you get the same result. I think we can elevate the
criteria and begin to decrease the separation between the
gold standard of the RCT and what you think is, I
think they're not as far apart as you think they are. Anyway I've been proselytizing
that for a long time, Bob knows that (laughs). – [Greg] Thanks Marc, Marcus? – So how do you follow that? I dunno. Jonathan's (muffled)
anything you wanna say it needs to get said as
you go down the line, so. I'll try to be, I can't be
as provocative as Marc was, but I actually really
agree with the last point you made on this. And I think really what
we're trying to accomplish, and what I think is really
the primary objective of our efforts leading
up to today and then even our efforts after
this, is to begin to develop a blueprint, because if
we reframe the problem, when I came into this
discussion it really wasn't me trying to find a way that we
can include pragmatic designs or observational designs or
real world evidence development in regulatory decision
making, it was much more about how do we address a very
big issue that we're facing in our system, and that
is, is that when a product receives regulatory approval,
then those of us who're have to make decision post that approval, have woefully inadequate
levels of evidence to make the decisions,
whether that's policy related decisions and payment decisions, or prescriber level decisions,
we really don't have enough evidence to make an
informed decision at that point, so think about it in the context
of, and it's been mentioned several times, but I'll give
a little stats about it, and that's the quality of the evidence that goes into therapeutic guidelines, which as payers we use
a lot of the guidelines to frame out our quality
measures, as providers we use a lot of the guidelines,
they help guide therapy and choices we make, and so
they're really in many ways the gold standard of care
for a lot of organizations and provider types, and
so if we think about the quality that goes into it, then think about
cardiovascular guidelines, when you look at a very
systematic review of the evidence that goes into cardiovascular guidelines, depending on the guideline
that you look at, the worst case scenario is
there's 0% good evidence, where almost all of it is expert opinion, the best case scenario, which is actually congestive heart failure
is 25% of the time we know what we're doing, 75% of the
cases we really don't know what we're doing, for
oncology it's much worse, it goes from zero to 18%
with an average of 6%, for infectious disease
it's a little better, but it's only 15 to 30%,
meaning that most of the time when we're providing care,
making decisions of any type, we really don't know if
it's the right decision, and we don't have a
systematic way of following up to see if it really worked out right. I think it was we were at a conference, it was a CMTP conference,
and maybe it was Sean who said this sidebar
conversation we were having, and we talked about variability and care, and it's one of those things that everyone always thinks, well if
we just get everyone to follow the guidelines we're gonna
achieve a high quality affordable healthcare system,
and I just gave you evidence that that's probably not the
case and that the variability in care is probably
justified to a large extent. And one of the things that
Sean said that was really getting to on this is that
it's really not the variability that occurs, it's not being
able to study that variability that's the real tragedy in all of this. And Sean I don't know if I'm
attributing it to you or not, but it's actually a really good comment, but that's the, from my
perspective, I think that's really our problem we're looking to solve. So my question is if it's
not using real world evidence then what are we gonna
do to solve that problem? And so I'd really like to look
at it a little differently and say not why we can't
do something or why we can, it's really how are we
gonna solve these issues? How are we gonna solve these problems? So if I think about it in
a sort of mathematical way, if I think about the
improvements we're making in randomized clinical trials, and the increasing
investment in those including using real world data to
help facilitate the conduct of those studies, perhaps
we can get to, in time, where RCTs, defined as
traditional RCTs may get us a third of the way there,
I actually think pragmatic type designs, and I know
that there's not necessarily clear distinction between those, but I think pragmatic designs
may get us another third of the way, but there's still
gonna be a pretty large gap left after that, there's something, whether that's observational
data or some other form, has to begin to address
that gap, because it's still pretty darn big, especially
if there's decisions, if you're fitting in that
last third of that gap, and your decisions are
made about you or your child or your parent or your sibling
or your wife or your husband. So I think from my
perspective we still have a pretty big gap. Now the challenge I think for
all of us that we're trying to get to, and this is a
little bit of going toward what we're looking at I think
tomorrow in the discussions, I hope, is that if we
look at pragmatic designs and observational designs,
we base our judgement on how well they fit
into regulatory decisions or other decisions, based
upon real world data that goes into those designs today, then we're probably
accurate when we really put a lot of qualifiers on
the value of the data. And so what I think is
important is that we think about what are the things that
would amplify the utility of the data that exists today. And I think there's really
three major amplifiers within that we really need to focus on. One is integration of data,
now mind there's a difference between linking data to
enrich and integrating data, and I think it's beginning to
resolve some of those issues that we referred to earlier
around really understanding how good the quality is, but linking data an integrating are actually
different in my mind. So the more we can integrate
to give a more robust view and enrich that view of that individual, the better off I think we're gonna be, so that's a major amplifier
we've talked about. We've talked about
another one, only briefly, in a couple of remarks that came across, and that is the
longitudinality of the data, so right now we get snapshots
of data and when someone within Anthem health plan
leaves and goes somewhere else we loose the follow up with
them, or when they're coming to us from someone else,
we actually don't have that historical perspective,
and we're working on ways to try to solve that,
starting in California we have an initiative called Cal INDEX, or California Integrated Data
exchange, where we're going in with other payers to try to
build that longitudinal view, sort of putting data, if you would, in a pre-competitive
space and saying look, what's more important to
all of us than anything is we create an integrated
longitudinal view of the population, and then
we'll compete after that, but let's do that because
most important thing is, is that's gonna power a lot of decisions that we can't make today,
because we don't have the integrated view, and I think that, getting providers involved in that as well is really big aim and trying
to create a longitudinal, deep and longitudinal view. But there's also a third piece of this, and I think it's just as
important as the first two, and that is, is that
as we build that view, that integrated view, we especially today, and Rich said it extremely
well, is we really don't know how to use these data,
claims data we do have a lot more experience with,
we do have a lot better sense of what we don't, we know what we know, we know what we don't know,
and what we don't know what we don't know is a very
small sliver with claims data, within the electronic health
record data it's a big, we don't know what we don't
know, that piece is very huge. So from my perspective
that's a big part of this. But that third piece, the third amplifier is maintaining provenance,
a robust provenance, we need to make sure that
we stay close to the systems that generated the data so we
understand the circumstances under which it was generated. 'Cos if we don't we
can't look at things like non-apparent confounders,
non-apparent sources of bias that we may find reasons that
we find things in the data that we can't explain,
maybe we can trace back in the systems, and
sometimes we uncover really what that was and adjust for that. Or in some cases, what we've
found working with some of the provider groups, is
that when we're looking at data that goes into the electronic
health record system, and we're pulling it out and
we're realizing there's 30, they wanna try to clean it
up too, especially if there's a reason for them to
do it, we did a project just outside of New York
City with a very large group practice there,
and when we pulled all their EMR data out and
started integrating with our claims data, with our Empire
Blue Cross BlueShield plan there we realized that all
the height and weight data were really bad, really bad,
I mean it was almost unusable, and so we went back to that
practice and said here's what we found and we
started trying to look at your height and weight,
simple entries into the chart, they went back in and
systematically corrected that, they didn't realize how bad,
the fact that they're including some other note, some
other unstructured field and they weren't really
paying attention to the units of measure, they
realized that there was a consequence to that,
so they started changing how they entered that,
and I think that's part of the feedback, like a
part of that blueprint that we have to get to,
we're saying look it's not where it needs to be,
what do we need to do to get it there? And what are the things we need to inform to make sure we get it to that right spot? So I'll end with that,
really just with a plea for all of us, is not
really think about why, justifications for using
real world evidence in regulatory decision making,
but how are we gonna solve that big problem that
exists today post-approval of a product, and getting
evidence that we really need to make continuation of good decisions after that post-approval, thank you. – Great, thanks Marcus. So just building up on
the last two comments, I mean it sounds like the onus
is really on the researcher and the people who are
putting this data together, so Marcus you walked through
steps to make the data better, steps to, and we heard a
lot about that this morning, the challenge tends to be
in the quality of the data, is it measuring, is it
showing what we think, is it measuring what we think
it's supposed to be measuring and how can we use that, and
so Marcus you walked through how to improve the usability of that data. And Marc, you also sort of
focused on what researchers need to do to elevate the credibility of the actual research, rather than, you know what we didn't
hear is a lot of, sort of, what FDA needs to do, you know, it's more on producing better
research and demonstrating that we have better research. So Marc you mentioned that there are a lot of best practices, ISPOR,
NPC, a lot of groups have gone through and put
together best practices in an attempt to, say
if researchers did this then it would be elevating the
credibility of the results, but I suspect as those groups were putting the best practices together
they probably weren't thinking of regulatory usage,
traditionally they were thinking of payers making a decision or providers making a decision. Are the best practices sufficient, so it's a run on, but I've
got a question in here. (audience laughter) Is the problem that typically
researchers are just not following the best practices
that are already out there, or do we need to improve
upon the best practices to really make sure that these
studies are at the high level of credibility that you're talking about? – Listen this is a new field. I mean health services
researchers have been around for 30 years, 35? Epidemiology's been around maybe 50. Outcomes research less, and
editors don't really understand how to really evaluate the
quality of the information. The pre-review process is not
as robust as it is for RCTs. And so the quality is allover the place, at the same time methods
have advanced dramatically, and so in terms of
dealing with confounding, one of the bigot advents during
the late 90s was the advent of propensity score analysis
to match a population. It's not perfect, but it was
a lot better than what it was before, and so the methods
are improving as well, and researchers respond
to incentives, so when the international consortium
of medical journal editors put out some criteria
saying, you wanna get your study published you
have to do x, y and z, guess what, researchers do x, y and z. And this is where I think
the FDA can have a role, if the FDA wants to engage, it could say, you wanna have any chance of
getting an observational study to get you a change in a label
that's not to do with safety, then I want you to do at least x, y and z, FDA has not engaged with
that as well, that would be a very powerful incentive. Around the world, this data is being used, it's being used by
regulators, it's being used by HTA authorities, it's
clearly being used by payers and by providers, and yes,
all of these best practice things have been really
focused on the consents of evidence based medicine,
so evidence based medicine has a different mandate, and the mandate of evidence based medicine
is to use the best available evidence to make the decision
at the patients bedside, which means you're always
operating at the penumbra of what you know and you
don't know, 'cos you can't know everything for a particular patient. Having said that, you can use
those kinds of best practices and those recommendations
to think about how you would apply it if you wanna make a population recommendation, whether it's a treatment guideline coming
out of a medical society or whether it's labeling
decision made by the FDA. I think we can apply those
same kinds of standards. The question is, how
certain do you have to be and what's the regret for
making the wrong decision? Well if you're always starting,
if you're starting from the point of view you're gonna root it in, I have baseline RCT that tells
me that this therapy works in this population I have proven it, okay, and if the results are
biologically plausible, and if you do multiple
analysis multiple ways, and the results go against
what you think the prejudice should be or the bias
should be, I gotta tell you, as a good Bayesian most clinicians
would say I believe that, and I would think that
there is an opportunity for the FDA to believe that too. – [Greg] Marcus? – You ask what I think the FDA could do, and to build on those
comments, is really– – [Greg] No, I didn't ask that. (laughter) I think Marc asked that just to be– – I wasn't sure of the run on. I got some good stories about Greg, Greg used to work for us– – Marcus actually gave me a award, I don't think you remember
this, about putting together a learning module for how to
do propensity score matching in your company, but– – Yeah I know. – Okay. But that's an old method according to– – So can we have the reward back? So I think the one thing
I would say is that I really do believe that we
have to not just think of real world designs if
you would, broader type definition of this, in
the post-approval space, and I think we really need to contemplate the using of them, and I
would love to hear the FDA think that way, you don't have longevity in your genes, right Bob? (audience laughter) Oh jeez okay (muffled)
not in your lifetime. So I really do think it's
essential that we think about that because I think we're trying
to close a gap sooner, we're trying to get to
relevance much faster than we can today, and to wait
for post-approval time period we're just adding on to that,
we're really not accelerating it to the great, it's not transformed, it's evolutionary but not
revolutionary kind of thinking about it, and if there's a way, and what I would like to
see from the FDA by the way, is to say, okay, look Marcus,
if that's gonna be the case, then here's what has to
happen, here's what has to, here's where we really need,
here's the basic principles that have to be met in order
for something like that to occur, I think that's
when you begin to get everyone shaping their
data development efforts in the right direction,
that's when you start getting folks thinking about designs differently than they have today, they
start thinking about methods, they start thinking about
their drug development process perhaps differently, I think
it's that signalling aspect that says look, we have to fix a problem, if we're gonna do it in a
much more sustainable rapid way then here's the roadmap
we need to get through. So that's what I would, that's my opinion. – I have a lot of empathy for
you, 'cos our job is difficult to some degree, but your
job is an order of magnitude more difficult, so to get
a drug into the marketplace you just have to show efficacy somewhere, and then, not because
of, for whatever reason, we wash our hands of it
and say we don't regulate the practice of medicine,
and so once it's approved for COPD, if a doctor wants
to start prescribing it for an off label use, whether
it's a different dose, a different dose regimen,
a completely new indication, that is their prerogative. Marc working for Pfizer, wants
us to put it in the label so they can communicate about it. But your job is much more
difficult because we deal with whether it works, whether
the risk benefit profile, there's a great deal of
uncertainty about that at times, sometimes more or sometimes less, so we don't have 100% certainty
about a drug when we put it on the market, but you've
gotta make a value decision, you have to determine clinical value, we don't say, as a
statutory requirement that a new anti-hypertensive drug
has to beat every other drug on the market in order to get approved, it technically only has to beat placebo, it could be worse than any other drug that's currently on the market,
but if we think it provides another option for those patients who potentially can't tolerate the other drugs, that can be enough; we don't require them to show that they're better than everything else, except for the accelerated approval pathway, where you do have to be better than available care. And so then you're stuck, 'cos now this drug is out there and everybody wants to prescribe it, and it costs a lot more, potentially, than some generic drug, so what's the value, is there a value to it? And so the decisions that you have to make as a payer are, I think, an order of magnitude more difficult than what we have to do to determine whether the benefit outweighs the risk of the drug and whether it is efficacious. Having said that, when a drug is approved, even after approval,
there remain, and Janet said this earlier, thousands of questions about that drug that in an ideal world we would love to answer with randomized controlled trials, covering every subgroup and every dose. You know, most of the drugs that we approve have had a very poor development program when it comes to figuring out the right dose. We would love to have a plethora of randomized controlled trials to answer all of those questions, but we're not gonna get that; it's not feasible, it's not practical. And so many of these questions can be dealt with by real world evidence, observational data, and we do that for the most part, basically for all of the safety work. We had this esophageal cancer question, where they reviewed multiple studies and epidemiologic databases to come to the opinion not to make a labeling change for the class of bisphosphonates and add esophageal cancer as a new risk. So we do do it all the time. Have we put out a guidance on how we do it? We have not, and I think it would be worthwhile for us to talk about this and get some general principles for it. Because I agree with you: if the pharmaceutical industry saw a direct benefit in doing observational studies and submitting them to FDA as a pathway for a new indication, or expanding the population, or contracting the population that's indicated, or finding better doses, or anything about their drug, I think there would be more money invested in this space and therefore potentially greater development and greater confidence. I guess I'm not a Bayesian; you said all physicians are Bayesians, but I'm a little concerned when someone makes the statement that if it shows a result we expect, I'm gonna believe it. That just worries me. I had an old professor teach me that research is a little bit like skydiving: it's best done with your chute open. So it's best to have an open mind about what can come out, and what we expect to be the result is typically not the result when you do pure bench-side research, so. – Having said that, I meant when it's built upon an RCT, not just– – It wasn't you who said it, actually; it was Rich who said it. – I would say we are all Bayesians, you know we– – Well, physicians generally treat the patient in front of them based on their experience with their last patient, and not with all of their experience. And so we prefer evidence based medicine, but quite frankly most of medicine's not evidence based; there are still a lot of problems with physicians prescribing, for instance, anticoagulants for people with atrial fibrillation, and there may be some variation in patterns of care, but that's not a situation where you would really want practice to vary. But again, getting back to being serious about the task at hand, I think there is a role for observational data; maybe not everyone within FDA sees it that way. I think that we already use it, as everyone keeps on saying, we already use it. I understand that there's interest in us using it more, and we need to get more information about our drugs: how best to use 'em, which ones are better than others, what sequence is best, and all that. We're not gonna be able to do randomized controlled trials for all of this, so we're gonna have to use the best available evidence, and what crosses the line to make it into labeling? You had it happen with the vaccine; it didn't make it into the labeling. They didn't believe them. – [Voiceover] The Medicare
study did not show up in the label; the company came in with a separate randomized study that ultimately wound up on the label, but the effect size was in the same direction: the high dose was better than the regular dose. – Well, there you have it; so even that we didn't put in the labeling (laughs). So where was the channeling bias? It went the opposite of what we expected. So I think we have a ways to go, and you know, again, it is an issue that needs to be worked on, and I can't go any further than that. – Okay, I think we have a minute or two; I'd like to see if there are any questions in
the audience for the group. Bob. – Not a question, but look, we already can, under our rules, use historically controlled
data, and that's what this is, that's what all of these are, that's what they are. So we can use them if we're persuaded. We're usually not persuaded when the effect is modest and it's not very visible. But the suggestions Marc made, which we never see, mind you, are very good at helping people do it: one is doing it twice and showing the results are the same; another is having a protocol for one of these studies, and a protocol we never see. And this is not a new thought; I wrote an editorial in JAMA in 2000 or thereabouts saying you should always replicate your historically controlled trial, but it's not commonly done. So the other thing is,
we take into account what we know already. I mean, if you know our evidence document, it says that when you're looking for a second claim that's very similar to the previous ones, we accept one study; what's more Bayesian than that? We have a law that says we can rely on a single study plus confirmatory evidence; they didn't write it as Bayesian, but that's sort of what it is, you've got priors, you've got beliefs that they all fit, and the
idea of modifying a claim slightly when you already, as Rich said, have controlled trials
showing that it works, it's intuitively obvious
that you don't need quite as much data, but there's
no experience with this, I'd be very interested in seeing
all the things that people want to do to refine
treatment of heart failure and stuff like that,
and what kind of studies might expand current
labeling to help you do that. I mean just for one
thing, all of our labeling on heart failure drugs
assumes that you're on all of the other drugs, because that's how all the studies were done,
I'm sure in the real world lots of people haven't been on a diuretic, lots of people haven't been on the other things. Maybe there's a way to look into that; that would be very informative to us, it would help modify our labeling. And some real world examples where people do this and then replicate it, that just seems like a great idea. For what it's worth, studies
don't always come out the same. Rich knows more about it than I do, but the OMOP analysis of the data in Sentinel was very distressing, because the results came out different at every site all the time. So is that a problem, or is that something you can overcome by doing it right? I don't know. But we already can use persuasive historical controls. – [Greg] So we are running out of time, but any quick reactions? – I'd say one of the
very illuminating things about this conversation is that we've been talking about the fact that there are degrees of evidence, and yet regulation tends to force us into a black and white classification of it. So as long as we're clear that there's a gradient, and that we need to know where we are on that gradient, then the hard job is that the FDA has to decide where on the gradient white has changed to black. – [Greg] Marcus. – So I'll go back to this:
there's been a couple of points made today about there being really no distinction between efficacy and effectiveness, and no real distinction between randomized controlled trials and pragmatic clinical trials, especially where randomization is involved. And here's a consequence of us, of the FDA, not drawing that distinction: if any of you have ever gone back into a pharmaceutical company starting out with a pragmatic idea, and then taken that back into the development process within that pharmaceutical company and tried to keep it pragmatic, or real world in any way, you'll realize that the gravitational pull back to a traditional randomized clinical trial is extraordinary. You almost never survive it; you have the greatest concept, and immediately, or not long after, it gets pulled right back into a traditional design because of their concern with the FDA. And so the point is
that if you really are that open to it, then be clearer about it: write better language about it, draw some distinction to send the message that these are a little different, that we're good with them being different, and specifically when that might be. Because right now it creates that black and white scenario inside these organizations, because of the fear of the FDA, and I think that's what we've gotta try to overcome, in part with the messaging behind it. – Can I say, if pragmatic
means randomized trials with fewer exclusions
and things like that, what used to be called
the large simple trial, we don't have any reservations about that, we never have, so I'm not sure what everybody's worried about. – [Marcus] This is quite different. – So our perspective from the FDA's side, and we meet with companies
at the end of phase two to discuss the phase three development trials, is that we are encouraging no exclusion criteria, and they want those because they're fearful. Again, it comes back to the fact that most drugs that end up being approved have a small incremental effect size. So let's go back to what was being asked a moment ago: if there was a drug for metastatic melanoma that had a 100% cure rate at one year, we would approve it; we would be committing grievous harm by not getting that on the market as quickly as possible. So again, it's understanding: is there some confounder for death? No; people don't just stop dying on their own. If historically everyone was dead between six months and 12 months, and now everyone's alive past one year with no evidence of disease, and you have the historical control, then even if it was a single arm, totally observational from clinical practice, we would have a hard time not approving that. The problem is that you take a BRAF inhibitor that's approved for melanoma, let's say, and you
move it into a new tumor type. You do the ASCO study, the TAPUR trial, where they have evidence from clinical experience in a group of oncologists using it off label for a different tumor, and it results in two months greater survival, or progression free survival if you want, either one, than what you would have expected; what do we do with that? We have no idea what to make of that. And in a randomized controlled trial, that may be all it does, but we have a randomized controlled trial and we're confident in that result. So again, it's the effect size and understanding that that kind of result is almost a can-never-happen event. Now, is there something in between? Obviously yes, but that's where the difficulty is: figuring out where that something in between is. – Okay, great discussion, we're
gonna continue many of these tomorrow, I know Mark
is gonna take us home with a final conclusion,
thank you, thank you. (audience applause) – We are just finishing up
now, I wanna thank the panel for, as Greg said, a great discussion. And thank all of our panelists today, and all of you for your
contributions to this meeting on enhancing the application
of real world evidence in regulatory decision making. We heard a general agreement
that the increasing availability of real world data, and of methods to turn those data into evidence, presents important opportunities for improving regulatory decision making and patient care. But we also noted some challenges around data collection and causal inference, and around the wide range of methods that can be applied in real world data settings, from retrospective observational studies all the way through pretty sophisticated randomized controlled designs. As those methods become clearer, the discussion suggested that there is a set of identifiable and potentially addressable issues in using real world evidence to address regulatory needs, to help promote better evidence for regulatory decision making as well as better evidence for doctors and patients to use in care. Tomorrow we are going to focus more on some of these policy implications and on how to accelerate the development of a better infrastructure for decision making and evidence development from real world settings. So we hope you'll join us for that; it's the same place, same time. I do wanna say that we're scheduled to have a half day meeting and the weather is scheduled to include some snow and other complications. We are keeping an eye on it, and, it being Washington DC, I never wanna say these storms aren't gonna amount to much, but the forecast at this point doesn't look like it will be very disruptive for activities like our conference. If that changes we'll notify everyone about any adjustment in our start time or anything else, but at this point we're definitely planning to go ahead as scheduled tomorrow morning, right here, 9 o'clock. Given that we are webcasting and can have people join remotely if necessary, we don't think we're gonna need to make any changes, and we hope that all of you will be able to join us as scheduled in the morning. The schedule tomorrow includes sessions on building a 21st century evidence development infrastructure: what it should look like, and how it can address some of the challenges and build on the progress that we've heard about today. We'll talk about incentives and policy options to support this, and you heard on the last panel a lot of agreement that more clarity about how this evidence can be used in regulatory decision making could contribute to that. And we'll talk about engaging providers and patients as partners
in this effort as well. So that's all coming
tomorrow, before we finish today I wanna quickly thank our speakers for their contributions,
especially those who were members of our planning group for this
set of real world evidence activities, and that planning group which has contributed a
lot of expertise and time over the last few months
includes Janet Woodcock, Jonathan Jarow, Lisa LaVange, Melissa Roberts, Sally Okun, Rich Platt, Adrian Hernandez, Sean Tunis, Marcus Wilson, Marc Berger and Marc Overhage. Also wanna thank our staff
from the Duke Margolis Center who contributed to the
planning and execution of these events including Greg Daniel who you just saw, Morgan
Romine, Adam Atton, Pernov Aurora, Jonathan
Bryan, Joanna Klassman, Christina Flores and Brittany Weisgater. And most of all I'd like to
thank all of you for attending and for the thoughtful
comments and contributions throughout the day. All of the event material from today will be available on our website. There will be some further
steps from this conference, we'll talk more about those tomorrow. In the meantime have a great
evening and we hope to see you all in the morning,
thank you very much. (audience applause)
