
“No Proof Probiotics Work”: The Confusion Caused by Evidence-Based Medicine’s Futile Hunt for the One True Effect

I was recently asked for my comments on a Daily Mail article claiming that probiotics may not actually work. The article did so on the basis of a recently published study in the Journal of Gastroenterology, which took a deep dive into all the available evidence to provide a conclusion on whether these oft-used agents actually work, with the result questioning one of the most accepted doctrines in the nutrition space.

As a practitioner who has observed profound improvements in gut health, eczema, sleep and more following the use of probiotics, I took a look into the study itself. To briefly summarize, the researchers sought to determine how effective probiotic supplementation is for gastrointestinal issues and, to do so, ran a meta-analysis of 82 different randomized controlled trials. Together, these trials tracked 10,332 individuals diagnosed with some form of IBS. After tallying their results, they coarsely signed off by noting that there were ‘some’ bacterial strains that ‘may’ be beneficial in IBS, but that they had little confidence in these results.

Of course, such conclusions were easily crowbarred into punchy headlines:

And, understandably, such headlines induce frustration in the public, driving sentiments like: I wish they’d make up their bloody mind about what works and what doesn’t.  And: so probiotics were good for us yesterday but now they’re not?

While such contrarian headlines are reliably good for clicks, they leave the public disillusioned with a profession that seemingly can’t make up its mind on probiotics (or any other factor, for that matter) and frustrated with the confused messaging (‘they were useless, then they were miraculous, now they’re useless again’). Most of all, they leave individuals unable to confidently take steps to address their own health aims.

As a professional in this area, I can acknowledge that there are areas of human physiology that will always remain complex. Yet the potential value of probiotic use stands out as one subject that need not be confusing. I don’t know a single nutritional therapist who would disagree with the premise that probiotics are a useful tool when addressing gut health.

And while it’s very easy to say, “the conclusion is wrong and should be discarded”, or even to point out the downsides in taking guidance from over-caffeinated academics who have never worked with a single patient, papers like these represent an opportunity to look into what the data says, what the authors actually concluded, and then consider why we consistently see contrasting headlines. In doing so, we can understand both what probiotics do and how they can help and, perhaps more importantly, lay out the fatal flaws that are deeply embedded in the Evidence-Based Medicine paradigm used to reach these conclusions (which are then duly over-summarized by the media as clickbait).


What Did the Study Actually Say?

The first thing worth pointing out is that the researchers didn’t exactly say that probiotics don’t work. Their precise words: “Some combinations of probiotics or strains may be beneficial in IBS. However, certainty in the evidence for efficacy by GRADE criteria was low to very low across almost all of our analyses.”

One could be forgiven for thinking that the results were negative. They were not; the results showed an impressive effect. The average response across the trials was a 22% reduction in symptoms, which is especially impressive given how many of the trials were conducted on populations with no fair chance to respond (such as those with bloating, which is classically driven by excessive bacterial activity and therefore not something you want to throw more bacteria at).

There were further findings that could be teased out: poor results for bloating generally (what a surprise!), different reductions for different types of probiotics, and much better results in IBS-D (diarrhoea-dominant) compared to IBS-C (constipation-dominant).

So why the negative tone? The researchers’ misgivings were based on the possibility of publication bias and also on how the trials conformed to the GRADE criteria, a ‘framework for grading the quality of evidence’; most of the trials ‘failed’ in that they did not fully account for all patients who were lost to follow-up when reporting the results.

What does this mean? It relates to a problem that is endemic in all scientific research conducted on humans: some people have bad reactions to the treatment and withdraw from the trial, some people move out of the area, others lose interest. To over-summarize, the GRADE criteria stipulate that researchers should report what happened to these participants in a specific format. Very few papers actually do.

So we could go deeper on this issue. And it does seem valid to question why we are so concerned about papers on probiotics getting low grades for data presentation when there were no ‘red flags’ in the number of drop-outs (while the famous BHF Heart Protection Study, in which 26% of those taking statins dropped out in the first month, remains highly quoted and continues to inform clinical practice). But we will not.

Because this would come down to arguments for and against whether we should use the GRADE criteria to help determine if the body of literature, the one that indicates that probiotics reduce IBS symptoms by 22%, should be trusted.

But that’s the wrong question.

Ask the wrong questions, get the wrong answers


The basic problem here is one that plagues almost all meta-analyses, which is to say the questions posed and the over-reliance on statistical averages.

What are the problems with the questions posed? First of all, the study is presented as assessing the benefit of probiotics for gastrointestinal issues. So, while it does include sub-analyses on different strains in the results section, the conclusion section is tasked with the awkward challenge of summarizing the divergent findings on ‘probiotics’ as one entity. Worst of all, it uses ‘IBS’ as the studied condition.

IBS is not a condition. It is simply a languid term that tells us nothing about the underlying issues, other than that there are symptoms occurring somewhere between the mouth and the anus. In other words, those with bloating get a diagnosis of IBS, those who report constipation get a diagnosis of IBS and those who present with diarrhoea get a diagnosis of IBS. The label provided is the same whether the observed patterns occur due to stress, appear after specific foods or are present on a constant basis. Some symptoms may be driven by excess bacteria thriving in the wrong places (eg. small intestinal bacterial overgrowth, aka SIBO), others by raw imbalance in the bacterial populations (eg. dysbiosis). Some of these issues are not driven by bacterial imbalance at all (eg. acute stress will shut down intestinal activity regardless of the composition of our microbiome, and undue tension can drive dysfunction of the ileocecal valve no matter what the microbial populations). Others are caused by difficulties in breaking down food compounds (eg. lactose intolerance, reactions to lectins, etc). Others tend to occur due to interactions of microbial balance with other factors (eg. food intolerances, whereby the interactions of the bacteria with the gut lining can prime the immune system into overactivity, with the problems then manifesting under the pressure of further triggers, such as dietary patterns, immune regulation and efficiency of digestive breakdown).

Despite having different causes and different symptoms, all of the above are diagnosed as ‘IBS’. This is not a sound basis upon which to base a study.

Nonetheless, researchers did just that. Primary outcomes were set as the effects of probiotics, compared with placebo, on persistence of global IBS symptoms, abdominal pain, or abdominal bloating or distension after completion of therapy.

When these are the questions, it becomes purely academic how highly the papers score for data presentation or on potential publication bias. Such analyses only serve to provide confidence in the answer, but the answer is irrelevant. Because any answer produced this way will only inform us on what we can expect if we randomly selected a probiotic product by lucky dip and gave it to someone with unidentified digestive issues.

No-one does this. Thus, no-one needs to know the statistical likelihood of such a haphazard intervention working, and no-one needs to deploy standardized grading tools to determine the confidence we should have in the figure calculated.


Evidence-Based Medicine and The Infatuation with Meta-Analyses

This study crunches a huge amount of trial data and, as always, there are a number of interesting questions we can pose.

Whenever we see a spread of results in trials that take a look into the effect of ‘probiotics’ on ‘IBS’, we end up with useless conclusions if we average out the different results and treat this average as The One True Effect. By definition, we are looking at the average effect of multiple probiotic formulas across separate problems with different causes and varying symptoms, the one link between them being that someone in a white coat has awarded them the label of ‘IBS’.

This is why meta-analyses like these serve a logical role in assessing the relative efficacy of similar drugs in a specific disease, by which I mean a medical issue that is defined by a particular error of metabolism that drives distinctive symptom patterns (for just a few examples, consider Wilson’s disease, that results in cellular accumulation of copper; Zellweger syndrome, which affects processing of fatty acids; or Sickle Cell Anaemia). However, meta-analyses are limited to a statistical role outside of these circumstances.

What I mean here is that they can still serve a role in public health decisions but, even then, they only provide credible data when comparing the effectiveness of rival drugs in a defined population (eg. comparing three brands of beta-blockers in a population with high blood pressure). Even in these circumstances, any conclusions can only serve cost/benefit analysis at a public health level: a well-conducted meta-analysis allows healthcare trusts to make accurate predictions on what percentage of the population is likely to respond to each intervention, as well as the expected rate of side-effects, and to construct policies accordingly (whether each drug is worth paying for, which drug should be considered ‘first line therapy’, etc).

However, any conclusions that are produced this way do not translate to individual care. Yet, according to the rules of Evidence-Based Medicine (the prevailing paradigm that holds modern healthcare in its grasp), meta-analyses are the ‘best’ form of science and, when available, should automatically take precedence over other types of evidence. See the ‘EBM Hierarchy’:

Just one problem. Treating individuals is about providing an intervention that achieves a therapeutic response in that individual, not pushing something on them because it statistically outperformed a placebo in a population that shared some key symptoms.

Decisions for individual care should be based not on WHAT is effective, but WHEN something is effective and in WHO. This will always involve consideration of HOW an item works (the mechanism) and what the root cause is in any specific problem we are seeing.

Let’s use dizziness as an example. Dizziness can be experienced and described but not measured, making it a perfect example of a condition that can be labelled but is not a disease. Let’s oversimplify things and say, in a randomly-selected sample of 10,000 dizzy people, 40% are suffering due to a lack of hydration, 20% due to neurotransmitter imbalances, 10% due to traumatic injury, 10% due to postural issues affecting blood flow in the cerebellum, 9% due to the side-effects of a medication, 1% simply because they have spent too long spinning round in their office chair, and the remaining 10% due to assorted other causes.

What is the ‘best’ intervention here? It obviously depends on which sub-group you fall into, that is to say, what the root cause is in your case. An Evidence-Based Medicine approach would prioritize the conclusions made from meta-analyses conducted on Dizziness Disorder, and find that the most effective intervention is electrolyte drinks because, on average, they recorded the biggest improvement versus placebo in this population. But this was only a function of the population under study, in that it contained 4x more people who were primed to respond to this particular intervention versus, say, chiropractic adjustments.

What I hope is obvious from the above is that we better serve individuals by understanding why the intervention helped. This goes back to HOW, WHEN and in WHO. If you began experiencing dizziness after you picked up a neck injury playing rugby and are already well hydrated, then the likelihood of your dizziness improving from the ‘most effective’ intervention (electrolyte drinks) is nearly zero. However, we can be confident that taking steps to resolve this postural issue will prove very helpful, even though it is ‘less effective’ according to the meta-analysis. If you have spent too long spinning around clockwise in your office chair, then you can expect quick relief from simply spinning anti-clockwise (even though your doctor has repeatedly explained to you that multiple meta-analyses have found that ‘Spinning Anti-Clockwise On An Office Chair Is Not Effective To Resolve Dizziness’).
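For those who like to see the arithmetic, the dizziness thought experiment can be put into a few lines of Python. This is only a rough sketch under a loud assumption of my own: each intervention fully resolves dizziness in the one sub-group whose root cause it targets and does nothing for anyone else. The shares are the made-up figures from the example; the listed causes add up to 90%, so the remainder is lumped into an ‘other’ bucket. All names in the code are hypothetical.

```python
# Illustrative shares of root causes among the dizzy population (from the example).
# The listed causes sum to 90%; the remainder is lumped into "other".
CAUSE_SHARES = {
    "dehydration": 0.40,
    "neurotransmitter imbalance": 0.20,
    "traumatic injury": 0.10,
    "postural issue": 0.10,
    "medication side-effect": 0.09,
    "office chair": 0.01,
}
CAUSE_SHARES["other"] = round(1.0 - sum(CAUSE_SHARES.values()), 2)

# Assumption: each intervention only helps the sub-group whose cause it targets.
INTERVENTION_TARGETS = {
    "electrolyte drinks": "dehydration",
    "postural adjustment": "postural issue",
    "spinning anti-clockwise": "office chair",
}


def average_response(intervention: str) -> float:
    """Fraction of the whole mixed population that responds (the 'meta-analysis' view)."""
    return CAUSE_SHARES[INTERVENTION_TARGETS[intervention]]


def individual_response(intervention: str, cause: str) -> float:
    """Chance of a response in a person whose root cause is known (the clinic view)."""
    return 1.0 if INTERVENTION_TARGETS[intervention] == cause else 0.0


# Rank interventions the way a population average would.
ranking = sorted(INTERVENTION_TARGETS, key=average_response, reverse=True)

for name in ranking:
    print(f"{name}: helps {average_response(name):.0%} of the mixed population")

# The rugby player with a neck injury: 'best' on average, useless for them.
print(individual_response("electrolyte drinks", "postural issue"))
print(individual_response("postural adjustment", "postural issue"))
```

The ranking puts electrolyte drinks first purely because dehydration is the biggest slice of the sample. For the individual with a postural cause, the ‘winner’ scores zero and the ‘also-ran’ scores one.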


The Average Response is Not The One True Effect

While a little facetious, the last point once again speaks to EBM’s insistence on treating the average response as The One True Effect. If you have been awarded the same label as the rest of the group on the basis of similar symptoms, yet the underlying cause is particularly unusual, then the existing paradigms are mathematically guaranteed to sell you short. If only 1% of the group are suffering due to a specific cause, then you can have a treatment that is fully effective for this minority but, no matter how spectacular the effect, it hardly impacts the average response across the entire population under study. And, thus, this intervention – the one that may be exactly what you need – is judged “ineffective”.

We see this play out especially in the case of multifactorial maladies, most famously Chronic Fatigue Syndrome. EBM insists there are no effective treatment options for the condition, as no trial has shown good results in the literature.

But consider this. Perhaps you have ME/CFS and your central nervous system is in a state of sympathetic dominance (‘fight-or-flight’) and you need to tend to mould exposure, adrenal output, magnesium status and mitochondrial performance (which, in your case, calls for the supply of carnitine and B1). Attend to these issues in a suitable order and at a suitable pace – and alongside steps that give adequate attention to the central nervous system conditions – and we can expect a great outcome.

Does this mean magnesium + carnitine + B1 + licorice root + anti-stress measures are effective for Chronic Fatigue Syndrome? Nope. Take the next 99 CFS patients, some of whom are in parasympathetic dominance (‘freeze’-type stress response) and have mast cell issues, glutamate/GABA imbalances and different mitochondrial blockages (lack of copper and B2), and others whose rate-limiting factors come in the form of allergies, dysbiosis and insufficient glutathione, etc etc, and they will not respond to the same plan.

In other words, what works well for you might only help 1% of the studied population and, as a result, can never have a chance of achieving a statistically significant improvement across the entire group. Such is the folly of treating the group average as The One True Effect.
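To show how heavily the numbers are stacked against the 1%, here is a minimal sketch in Python. The figures are assumptions of mine rather than data from any trial: a placebo response rate of 30%, 200 participants per arm, and a treatment that works every time in the 1% who have the matching root cause while everyone else responds at the placebo rate. It uses expected response rates (no random sampling) and a standard pooled two-proportion z statistic, with function names that are purely illustrative.

```python
import math


def diluted_response_rate(responder_share, placebo_rate, effect_in_responders=1.0):
    """Expected response rate in the treatment arm when only a sub-group truly benefits."""
    return responder_share * effect_in_responders + (1 - responder_share) * placebo_rate


def two_proportion_z(rate_treatment, rate_placebo, n_per_arm):
    """z statistic of a pooled two-proportion test with equal arm sizes."""
    pooled = (rate_treatment + rate_placebo) / 2
    standard_error = math.sqrt(pooled * (1 - pooled) * 2 / n_per_arm)
    return (rate_treatment - rate_placebo) / standard_error


PLACEBO_RATE = 0.30  # assumed, for illustration only
N_PER_ARM = 200      # assumed, for illustration only
Z_THRESHOLD = 1.96   # conventional cut-off for p < 0.05, two-sided

# Treatment is fully effective, but only for the 1% with the matching root cause.
treatment_rate = diluted_response_rate(0.01, PLACEBO_RATE)
z_diluted = two_proportion_z(treatment_rate, PLACEBO_RATE, N_PER_ARM)

# Same treatment, same trial size, but everyone enrolled has the matching cause.
z_targeted = two_proportion_z(diluted_response_rate(1.0, PLACEBO_RATE), PLACEBO_RATE, N_PER_ARM)

print(f"treatment arm response rate: {treatment_rate:.3f}")
print(f"z (mixed population): {z_diluted:.2f}, significant: {z_diluted > Z_THRESHOLD}")
print(f"z (matched population): {z_targeted:.2f}, significant: {z_targeted > Z_THRESHOLD}")
```

Under these assumptions the treatment arm improves from 30% to just 30.7%, nowhere near the conventional threshold, while the identical intervention in a matched population clears it easily. The intervention did not change between the two runs; only the population did.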


Are meta-analyses useless?

We should be careful not to throw the baby out with the bath water. We can often advance our understanding of a topic by such simple steps as looking into which trials showed unusually good results and which did not, then considering why. Was it the strain used? What was different about the population that responded versus those that did not? What does this tell us?

To highlight just one example, we can take a brief glance at the Forest plots provided for the trials that used ‘mixed probiotic’ formulas:

We can see straight away that there are five trials (Michael, 2011; Ko, 2013; Mezzasalma, 2016; Ishaque, 2018; Bonfrate, 2020) that reported substantially better results than the other 28. How much better? Each of these trials observed a 67-74% reduction in symptoms, to be exact, when the average response was a 28% drop. What is different about these particular trials?

Michael 2011 used a bifido and lactobacillus combination, in “IBS-D” (IBS with diarrhoea as the dominant symptom) and saw a 70% drop. Ko 2013 also used a bifido and lactobacillus combination, and also studied an IBS-D population. Of the remaining three trials to demonstrate great results, two of them – namely, Mezzasalma 2016, and Bonfrate 2020 – also used a bifido/lactobacillus combo and also conducted the trial on an IBS-D population.

We are immediately presented with questions as to why probiotics might regularly demonstrate such obvious improvements in diarrhoea, but fail to do so reliably in cases of constipation. But there is always nuance, and this comes in the form of an outlier study (Hod, 2017). In this trial, the researchers also provided a bifido/lactobacillus combination and also applied these to an IBS-D population. Yet, unlike the studies above, they did not see such a powerful reduction in symptoms. In fact, they saw a 32% increase in symptoms. What gives?

When presented with such obviously conflicting results, human nature often pushes us towards picking the study that appears correct and then finding some sort of flaw that allows us to comfortably dismiss the competing study as invalid (“they didn’t adjust for socioeconomic status” and “they used the wrong dose” being classic dismissals; they are often valid but, just as often, used for hand-waving).  And while it is true that some articles are just pure garbage – as the CDC have amply demonstrated repeatedly since March 2020 – there is often value in identifying why the researchers arrived at such different conclusions, starting with the population under study and the methodology.

Therefore, we want to ask: was there anything in the inclusion/exclusion criteria in the Hod study that would generate a population that responds differently to probiotics? Unsurprisingly, there was. This study selected individuals who had both a) raised CRP, a marker of systemic inflammation, yet b) no inflammatory symptoms in the month prior to recruitment. This does not invalidate the findings in any way, but undoubtedly generates a particularly unusual group in that there is ‘something’ causing inflammation without falling into any of the usual brackets. Does this group have a strong leaning towards mast cell issues? Do such individuals have a high rate of chronic infections? If so, what underlying burdens leave them prone to these problems? How stressed must they be to sustain this metabolic set-up?

Of course, no such answers are available because, in the current scientific climate, it is very rare to find trials that measure the metabolic drivers in play in each individual. We have no access to the results of their stool tests; in almost all cases, no such measurement was even taken. We cannot know if the participants even had a fair chance of responding to probiotics; not one study screens for stomach acid production or supply of digestive enzymes / bile / etc (something any experienced practitioner would do by default before making recommendations for probiotics). We don’t know if the individual was even ready to invest in their digestive function (by which I refer to their autonomic tone, aka stress level, which can be quantified through tests for adrenal output and heart rate variability).

This remains the biggest ‘gap’ between conclusions produced by academics who lean on EBM paradigms and lessons learned by practitioners from years of observing individual responses on the frontline, where fair consideration is given to both the cause of the intestinal issues and the factors that impact on conditions in the gut.

In short, it is not logical to take an individual who is suffering from rampant insomnia, has rampant nutritional deficiencies, demonstrates a total collapse of stress tolerance (aka ‘freeze’-type stress response) and provide them with a probiotic capsule to see if probiotics work.

As a result, this study follows many that have gone before it in that it provides some pointers (probiotics aren’t likely to help when there is already bacterial overgrowth, eg. SIBO and bloating) but leaves us with more questions than answers: what separated the responders from the non-responders (eg. what were the differences in baseline microbial populations between the two sub-groups)? What was the autonomic tone of those who didn’t respond versus those that did (eg. how many were subject to excessive stress), given the impact this has on the local environment in the gut? Were there obvious differences between the two groups in regards to the patterns of their diarrhoea (eg. constant versus stress-induced)?

In summary

This study follows many that went before it in that it focuses on an interesting area, assesses masses of trial data and deploys strong scientific methodology. And yet, because of its adherence to EBM orthodoxy, it is unable to tell us what we want to know and its legacy will ultimately be that it is used as media fodder to sow confusion and generate more clicks. However, there are two conclusions we can make:

  1. providing different types of probiotics in populations with an assortment of digestive conditions, brought on by a medley of causes and with variable metabolic obstacles, will see a mishmash of outcomes. Averaging these patterns in an attempt to establish The One True Effect is neither sensible nor useful.
  2. probiotics are a cornerstone of treating digestive issues and remain one of the most reliable interventions open to those with such challenges, provided that the condition is driven by microbial imbalance and that the individual taking them has a fair chance to respond to them (which is the case only some of the time… more on this in Part II, where we look at the wins and the limitations when using probiotics on the frontline).

