Background: Noninferiority testing is used to demonstrate that a new treatment is not unacceptably worse than an existing treatment. Such analyses are useful, for example, when placebo-controlled studies are unethical, or when the new treatment has an advantage in other respects (e.g. convenience, cost). Prospective noninferiority trials in orphan diseases are difficult to coordinate because they often require large sample sizes to detect small margins of difference between treatments. In this report, we present a retrospective study of noninferiority testing in the rare disease Guillain–Barre syndrome (GBS).
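Methods: Meta-analysis results of plasma exchange (PE) versus a control group (n=623) were used to derive noninferiority margins for two endpoints: (1) the proportion of subjects with improvement by at least one grade on the GBS disability scale after 4 weeks, and (2) the mean improvement in disability grade after 4 weeks.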
Results: For endpoint 1, the noninferiority margin of the risk ratio was 0.865. The risk ratio of intravenous immunoglobulin (IVIg) versus PE was 1.08 (95% confidence interval [CI]: 0.94 to 1.23). Since the lower bound of the CI is above the noninferiority margin (0.865), IVIg can be considered noninferior to PE on this endpoint. For endpoint 2, assessing change from baseline on the GBS disability scale, the noninferiority margin was 0.315. The treatment difference (IVIg − PE) was −0.02 (95% CI: −0.25 to 0.20). Since the upper bound of the 95% CI (0.20) is less than the noninferiority margin (0.315), IVIg can be considered noninferior to PE.
Conclusion: The results demonstrate noninferiority of IVIg to PE in GBS when the noninferiority margins are retrospectively applied. Retrospective noninferiority analyses may also be used in evaluation of treatment effects for other rare diseases.
Keywords: noninferiority margin, orphan disease, Guillain–Barre syndrome, plasma exchange, intravenous immunoglobulin
Randomized, placebo-controlled trials are the standard method by which the efficacy of medical treatments is determined. However, in serious diseases where there is a known effective treatment, allocating one patient group to a placebo arm may be unethical. Furthermore, in some situations, a new treatment may not be expected to be more effective than an existing treatment on the primary endpoint, but may have advantages in terms of secondary endpoints, such as safety, convenience, compliance or cost.^{1}^{,}^{2} The noninferiority trial is a vital tool when evaluating the efficacy of a novel therapy compared with an existing therapy. It aims to demonstrate that the test product is not worse than the comparator by more than a prespecified, small amount. This amount is known as the noninferiority margin, or delta (Δ).^{1} Guidelines developed by the European Medicines Agency and the US Food and Drug Administration^{3} recommend predefining the noninferiority margin. This margin can be derived from previous studies using historical data, and the study medication is typically expected to retain at least 50% of the original treatment effect over placebo or the standard of care to be considered noninferior.
After a noninferiority margin is established, a prospective noninferiority trial is usually conducted to confirm the noninferiority of the new product compared with the existing product. Noninferiority trials typically require considerably larger sample sizes than placebo-controlled trials.^{4} This is because the noninferiority margin is often much smaller than the treatment difference that a placebo-controlled trial must be powered to detect. For this reason, trials of orphan drugs in rare diseases face significant challenges in recruiting enough subjects to formally assess prospectively defined noninferiority and in completing the trial within a realistic timeframe.
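As a rough illustration of this scaling, assume a normal approximation for comparing two means with a common standard deviation $\sigma$, a one-sided significance level $\alpha$ and power $1-\beta$ (these symbols and the formula are assumptions introduced here, not taken from the cited guidance). The per-group sample size then varies with the inverse square of the difference $\delta$ to be detected or excluded:
$n=\frac{2\,(z_{1-\alpha}+z_{1-\beta})^{2}\,\sigma^{2}}{\delta^{2}}$
Under this approximation, halving $\delta$ (for instance, by setting the margin to preserve 50% of the active-control effect) multiplies $n$ by four.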
Here, we present a practical method for demonstrating noninferiority of drugs for rare diseases. This method is based on aggregated data from smaller studies that have been analyzed in previously published meta-analyses,^{5}^{,}^{6} and is illustrated using the example of intravenous immunoglobulin (IVIg) compared with plasma exchange (PE) for the treatment of Guillain–Barre syndrome (GBS).
GBS is a rare inflammatory disease affecting the peripheral nerves and causing weakness, numbness, breathing difficulty and paralysis. The disease affects between 0.5 and 2 per 100,000 persons per year.^{6} Although still under investigation, the cause of GBS is believed to be an autoimmune response.^{5}^{,}^{6} In some patients, the condition can have a lasting impact after the end of its acute phase.^{5} Supportive care for GBS can include the administration of heparin, and the use of pressure stockings to prevent the onset of deep vein thrombosis in bedbound patients, along with the monitoring of pulse, blood pressure, autonomic disturbances and respiration. Rehabilitation focuses on exercise to encourage strengthening, proper limb positioning, posture and orthotics.^{7}
There are two effective immune therapies for GBS: PE, which involves separation of plasma from cells and reinfusion of those cells back into the patient, and IVIg, which uses antibodies purified from plasma that has been pooled from at least 1000 donors.^{5}^{,}^{6} Administering IVIg is simple compared with PE. PE requires access to two veins, one of which has to permit high flow volumes, and frequently necessitates the insertion of a central venous line, a PE machine and specially trained personnel. IVIg requires access to only a single peripheral vein, and no special equipment or specially trained staff are necessary. Consistent with the difference in ease of administration, a Cochrane Review found that treatment was less likely to be discontinued in the IVIg group than in the PE group, with a risk ratio (RR) of 0.14 (95% confidence interval [CI]: 0.05 to 0.36). In addition, there is some evidence that adverse events are more frequent with PE than with IVIg.^{5}^{,}^{7}
The clinical benefits of PE in GBS, as measured by improvement on the GBS disability scale developed by Hughes et al.,^{8} have been confirmed in a Cochrane Review,^{6} which included six randomized controlled trials (RCTs).^{9}^{–}^{14} Few trials comparing IVIg with placebo have been conducted because PE was the standard of care when IVIg was introduced for GBS. However, a number of studies^{15}^{–}^{21} show that IVIg speeds recovery from GBS to a similar extent as PE, as concluded by a Cochrane Review.^{5}
Due to the rarity of GBS, the majority of studies comparing IVIg and PE have used small sample sizes with limited statistical power and were not formally designed as therapeutic equivalence or noninferiority trials.^{5} This may help to explain some inconsistency in the findings, and it is possible that some studies finding no significant difference between treatments reflect a lack of power to detect a difference rather than true noninferiority. The Cochrane Review by Hughes et al.^{5} thoroughly reviewed all individual studies and performed a meta-analysis, but did not formally assess therapeutic equivalence or noninferiority. A conclusion of no treatment difference cannot be automatically translated into either equivalence or noninferiority.
Given the strong safety profile of IVIg,^{22} as well as the convenience of its use in the clinic,^{15} the current analysis was undertaken to formally establish the noninferiority of IVIg to PE using existing studies from comparisons of PE versus supportive care, where much more data are available. A Cochrane Review of the benefits of PE in GBS^{6} was used to establish the noninferiority margin, and this derived noninferiority margin was then retrospectively applied to results from a Cochrane Review of IVIg benefits in GBS^{5} to demonstrate the noninferiority of IVIg to PE.
In noninferiority trials, one of the critical steps is to define the noninferiority margin. This margin can be derived from previous studies using historical data, and the study medication is typically expected to retain at least 50% of the original treatment effect over placebo or the standard of care to be considered noninferior. In the example in GBS, the noninferiority margin was derived using results from the meta-analysis of previous trials comparing PE versus supportive care.
Raphael et al.^{6} conducted a meta-analysis of five studies (623 subjects); a summary of the included trials is shown in Table 1. The RR of PE versus supportive care for the proportion of subjects with improvement of at least one grade on the GBS disability scale was calculated as 1.64 (95% CI: 1.37 to 1.96) (Table 2). For this endpoint, the noninferiority margin for the RR can be derived using the fixed-margin method or the two 95% CI approach.^{1}^{,}^{3}^{,}^{23}^{–}^{24} For the purposes of this study, the new treatment is IVIg and the active control is PE. The fixed-margin approach involves determining the treatment effect (M1) of the active control group over the placebo (or no treatment) group by using the lower bound (or upper bound, depending on the direction) of the 95% CI from previous placebo-controlled trials or meta-analyses of trials. Here, M1 = 1.37, which is the lower limit of the 95% CI of the RR. Typically, preserving at least 50% of M1 from active control versus placebo (or no treatment) is recommended;^{3} that is, the RR of IVIg versus no treatment must be greater than or equal to $1+(M1-1)\times 50\%=1.185$. The noninferiority margin (M2) is excluded by ensuring that the lower bound (or upper bound, depending on the direction) of the 95% CI is greater than M2; that is, the RR of IVIg versus PE must be greater than $M2=\frac{1.185}{1.37}=0.865$.
| Study | Trial design | Participants | Interventions | Endpoint | Notes |
|---|---|---|---|---|---|
| Greenwood^{11} | RCT, multicentre, open, parallel groups | n=29; acute GBS only; all ages; no mild forms | PE versus supportive care; five PE in 10 days, 55 mL/kg per PE | 1, 2 | Unblinded |
| McKhann^{9} | RCT, multicentre, open, parallel groups | n=245; acute GBS only; all ages; no mild forms | PE versus supportive care; three to five PE in 5 days, 40 mL/kg per PE | 1, 2 | Unblinded; SD of the mean was not available; mean difference could not be estimated |
| Osterman^{12} | RCT, multicentre, open, parallel groups | n=38; acute GBS only; adults only; no mild forms | PE versus supportive care; three to eight PE in 7 to 10 days, 3 L per PE | 1 | Alternate randomization; unblinded; disability scale used was different from that used by all other trials; omitted from analysis of endpoint 2 |
| Raphael^{13} | RCT, multicentre, open, parallel groups | n=220; acute GBS only; adults only; all forms | PE versus supportive care; four PE in 8 days, 3 L per PE, diluted albumin or fresh frozen plasma | 1, 2 | Unblinded |
| Raphael^{14} | RCT, multicentre, open, parallel groups | n=91; acute GBS only; adults only; mild forms | PE versus supportive care; two PE every other day, 3 L per PE, diluted | 1, 2 | Unblinded |

Table 1 Trials of PE versus supportive care included in meta-analysis of endpoints 1 and 2^{6}
GBS, Guillain–Barre syndrome; PE, plasma exchange; RCT, randomised controlled trial; SD, standard deviation
| Endpoint | PE | Control (supportive care) | Statistical test | Point estimate (95% CI) | M1 (PE/Control) | M2 (IVIg/PE) |
|---|---|---|---|---|---|---|
| Endpoint 1: The proportion of subjects with improvement by at least one grade after 4 weeks | 176/308 (57.1%) | 110/315 (34.9%) | Risk ratio | 1.64 (1.37 to 1.96) | 1.37 | 0.865 |
| Endpoint 2: Mean disability grade improvement after 4 weeks | N=290 | N=295 | Mean difference | −0.89 (−1.14 to −0.63) | −0.63 | 0.315 |

Table 2 Meta-analysis results and derivation of M1 and M2
CI, confidence interval; IVIg, intravenous immunoglobulin; PE, plasma exchange
A further meta-analysis was also performed on four studies (585 subjects) (Table 2) to assess change from baseline to week 4 using the GBS disability scale (endpoint 2). The treatment difference (PE − supportive care) was calculated as −0.89 (95% CI: −1.14 to −0.63).^{6} For the mean change from baseline on the GBS disability scale, the noninferiority margin for the treatment difference can be derived using the fixed-margin method or the two 95% CI approach.^{1}^{,}^{3}^{,}^{23}^{–}^{24} The fixed-margin approach involves determining the treatment effect (M1) of the active control group over the placebo (or no treatment) group by using the upper bound (or lower bound, depending on the direction) of the 95% CI from previous placebo-controlled trials or meta-analyses of trials. Here, M1 = −0.63, which is the upper limit of the 95% CI of the treatment difference. Preserving at least 50% of M1 from active control versus placebo (or no treatment) is recommended;^{3} that is, the treatment difference of IVIg − no treatment must be less than or equal to $M1\times 50\%=-0.315$. The noninferiority margin (M2) is excluded by ensuring that the upper bound (or lower bound, depending on the direction) of the 95% CI of the IVIg − PE difference is less than $M2=-0.315-(-0.63)=0.315$. The detailed derivation is shown below.
Endpoint 1: Improvement of at least one grade on the GBS disability scale
The treatment effect (M1) for PE versus Control (supportive care) is defined as the lower limit of the 95% CI of the RR.
$M1=\frac{P(PE)}{P(Control)}=1.37$ (i.e. lower limit of CI) (Table 2), where P is the proportion of subjects with improvement of at least one grade on the GBS disability scale. Assuming a need to preserve 50% of the treatment effect of PE versus Control to show that IVIg is noninferior to PE, the treatment effect of IVIg must be:
$\frac{P(IVIg)}{P(Control)}=1+(M1-1)\times 50\%=1.185$
The noninferiority margin (M2) for IVIg versus PE can be calculated as follows:
$\frac{P(IVIg)}{P(PE)}=\frac{P(IVIg)/P(Control)}{P(PE)/P(Control)}=\frac{1.185}{1.37}=0.865$
Therefore, the noninferiority margin of the RR is 0.865, and IVIg is noninferior to PE if the lower bound of the 95% CI of the RR of IVIg versus PE is greater than 0.865.
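The arithmetic of this fixed-margin derivation can be sketched in a few lines of Python. The function name and the `preserve` parameter are illustrative choices, not part of the cited guidance; the only input taken from the text is the lower 95% confidence limit of 1.37.

```python
def fixed_margin_ratio(m1_lower: float, preserve: float = 0.5) -> float:
    """Noninferiority margin (M2) on the risk-ratio scale.

    m1_lower: lower 95% confidence limit of the RR of active control
              versus no treatment (M1), where RR > 1 means benefit.
    preserve: fraction of the M1 effect the new treatment must retain.
    """
    # RR the new treatment must reach against no treatment.
    retained_rr = 1 + (m1_lower - 1) * preserve
    # Dividing by M1 re-expresses that requirement as new versus active control.
    return retained_rr / m1_lower


retained = 1 + (1.37 - 1) * 0.5      # IVIg versus no treatment
m2_rr = fixed_margin_ratio(1.37)     # IVIg versus PE
print(f"retained RR = {retained:.3f}, M2 = {m2_rr:.3f}")
```

Setting `preserve` to a larger fraction moves M2 closer to 1, which is a stricter margin.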
Hughes et al.^{5} conducted a meta-analysis of six studies (567 subjects).^{15}^{–}^{20} An overview of the trials included is given in Table 3. The RR of IVIg versus PE was 1.08 (95% CI: 0.94 to 1.23) for the proportion of subjects with improvement of at least one grade on the GBS disability scale. Since the lower bound of the 95% CI (0.94) is above the noninferiority margin (0.865), IVIg can be considered noninferior to PE on this endpoint (Table 4).
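As a rough cross-check of this comparison, the sketch below computes a crude risk ratio and a log-scale (Katz) 95% CI from the pooled counts reported in Table 4, then applies the margin. This is an assumption-laden simplification: pooling the counts ignores the per-study weighting used in the Cochrane meta-analysis, so the result (roughly 1.07, 0.93 to 1.24) is close to, but not identical with, the published 1.08 (0.94 to 1.23). The function name is hypothetical.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def crude_risk_ratio_ci(events_new, n_new, events_ref, n_ref, level=0.95):
    """Unstratified risk ratio with a log-scale (Katz) confidence interval."""
    rr = (events_new / n_new) / (events_ref / n_ref)
    # Standard error of log(RR) for two independent binomial proportions.
    se_log_rr = sqrt(1 / events_new - 1 / n_new + 1 / events_ref - 1 / n_ref)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return rr, exp(log(rr) - z * se_log_rr), exp(log(rr) + z * se_log_rr)


# Pooled counts from Table 4: IVIg 177/293, PE 154/274.
rr, lower, upper = crude_risk_ratio_ci(177, 293, 154, 274)
M2 = 0.865  # noninferiority margin derived for endpoint 1
noninferior = lower > M2
print(f"RR = {rr:.2f} ({lower:.2f} to {upper:.2f}); noninferior: {noninferior}")
```

Even with this cruder interval, the lower bound stays above 0.865, so the decision is the same as with the published estimate.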
| Study | Trial design | Participants | Interventions | Endpoint | Notes |
|---|---|---|---|---|---|
| van der Meche^{20} | Randomized, national, multicentre, parallel group | Adults and children; N=150 | IVIg 0.4 g/kg daily for 5 days versus PE 200 to 250 mL/kg over 7 to 14 days | 1, 2 | Unblinded |
| Bril^{16} | Randomized, single-centre, parallel group | Adult; N=50 | IVIg 0.5 g/kg daily for 4 days versus PE 40 to 50 mL/kg on five occasions over 7 to 10 days | 1, 2 | Unblinded |
| PSGBS Study Group^{15} | Randomized, international, multicentre, parallel group | Adult; N=383 | IVIg 0.4 g/kg daily for 5 days versus PE 250 mL/kg over 8 to 13 days versus PE followed by IVIg | 1, 2 | |
| Diener^{17} | Randomized, multicentre, parallel group | Adults (possibly children); N=74 | IVIg 0.4 g/kg daily for 5 days versus PE 40 to 50 mL/kg on five occasions within 14 days versus immune absorption on five occasions (4000 mL on two occasions and then 2000 mL on three occasions) within 14 days | 1, 2 | Unblinded |
| Nomura^{19} | Randomized, multicentre, parallel group | Adult; N=47 | IVIg (Teijin brand) 0.4 g/kg daily for 5 days versus PE total 200 to 250 mL/kg in up to seven sessions over 4 weeks | 1, 2 | Unblinded |
| El-Bayoumi^{18} | Open, parallel-group, randomized, controlled trial | Children (age not specified) with GBS requiring artificial ventilation | IVIg 0.4 g/kg daily for 5 days versus one plasma volume PE daily for 5 days | 1 | Unblinded |

Table 3 Trials of IVIg versus PE included in meta-analysis of endpoints 1 and 2^{5}
GBS, Guillain–Barre syndrome; IVIg, intravenous immunoglobulin; PE, plasma exchange; PSGBS, Plasma Exchange/Sandoglobulin Guillain–Barre Syndrome
| Endpoint | IVIg | PE | Statistical test | Point estimate (95% CI) | Noninferiority margin (M2) | Noninferiority of IVIg versus PE |
|---|---|---|---|---|---|---|
| Endpoint 1: The proportion of subjects with improvement by at least one grade after 4 weeks | 177/293 (60.4%) | 154/274 (56.2%) | Risk ratio | 1.08 (0.94 to 1.23) | 0.865 | Yes |
| Endpoint 2: Mean disability grade improvement after 4 weeks | N=273 | N=263 | Mean difference | −0.02 (−0.25 to 0.20) | 0.315 | Yes |

Table 4 Meta-analysis results and determination of noninferiority
CI, confidence interval; IVIg, intravenous immunoglobulin; PE, plasma exchange
Endpoint 2: Mean change from baseline on the GBS disability scale
The noninferiority margin can be derived using the two 95% CI approach.^{1}^{,}^{23} The treatment effect (M1) for PE versus Control (supportive care) is defined as the upper limit of the 95% CI of the treatment difference.
$M1=PE-Control=-0.63$ (i.e. upper limit of CI) (Table 2)
To demonstrate noninferiority of IVIg versus PE, the treatment effect of IVIg must preserve 50% of M1:
$IVIg-Control=-0.63\times 50\%=-0.315$
The noninferiority margin (M2) for IVIg versus PE is calculated as follows:
$IVIg-PE=(IVIg-Control)-(PE-Control)=-0.315-(-0.63)=0.315$
Therefore, 0.315 is the noninferiority margin for the mean change from baseline on the GBS disability scale. IVIg can be considered noninferior to PE if the upper bound of the 95% CI of mean difference of IVIg versus PE is less than 0.315.
Hughes et al.^{5} conducted a meta-analysis of five studies (536 subjects) (Table 3).^{15}^{–}^{17}^{,}^{19}^{,}^{20} The treatment difference (IVIg − PE) was −0.02 (95% CI: −0.25 to 0.20). Since the upper bound of the 95% CI (0.20) is less than the noninferiority margin (0.315), IVIg can be considered noninferior to PE (Table 4).
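The endpoint 2 margin and decision rule can be sketched the same way on the difference scale, where a more negative change means greater improvement. The function name and the `preserve` parameter are illustrative; the inputs are the upper 95% confidence limit of −0.63 for PE versus supportive care and the upper limit of 0.20 for IVIg versus PE.

```python
def fixed_margin_difference(m1_upper: float, preserve: float = 0.5) -> float:
    """Noninferiority margin (M2) on the mean-difference scale.

    m1_upper: upper 95% confidence limit of (active control - no treatment),
              where negative values mean benefit.
    preserve: fraction of the M1 effect the new treatment must retain.
    """
    # Largest acceptable value of (new treatment - no treatment).
    retained_diff = m1_upper * preserve
    # Subtracting M1 re-expresses it as (new treatment - active control).
    return retained_diff - m1_upper


m2_diff = fixed_margin_difference(-0.63)   # IVIg - PE margin
upper_ci_ivig_vs_pe = 0.20                 # upper 95% limit of IVIg - PE
noninferior = upper_ci_ivig_vs_pe < m2_diff
print(f"M2 = {m2_diff:.3f}; noninferior: {noninferior}")
```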
This analysis provides an illustration of how data collated from a number of small studies may be used to enable retrospective noninferiority comparisons of treatments for rare diseases, for which it is often impossible to have adequate sample sizes for prospectively designed noninferiority studies. In the example analysis presented here, the treatment effect of IVIg for GBS was compared with that of an established treatment (PE) for this condition with efficacy proven in RCTs. Based on this evaluation, we can conclude that IVIg is noninferior to PE for the treatment of GBS.
Post hoc analyses of noninferiority have limitations, such as differences in study design, treatment regimens and patient characteristics across trials. Ideally, a prospective clinical trial should be undertaken to assess the noninferiority of IVIg. However, based on the noninferiority margin derived in this study, a prospective trial of IVIg versus PE with 80% power for endpoint 1 would need a sample size of more than 462 subjects, before allowing for dropouts, assuming a rate of 60% for IVIg and 56% for PE. Similarly, a trial with 80% power for endpoint 2 would need more than 622 subjects, before allowing for dropouts, assuming no treatment difference between IVIg and PE and a standard deviation of 1.4 for both treatments. Such numbers would be a considerable challenge for a disease as rare as GBS. In addition, since previous studies have shown the benefit of IVIg in treating GBS, it is quite challenging for a sponsor to perform a large-scale, prospective noninferiority study. Instead, this retrospective assessment made use of previously collected data, permitting noninferiority of IVIg compared with PE to be demonstrated.
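The endpoint 2 figure can be approximated with a standard normal-approximation formula for a noninferiority comparison of two means. The text does not state which formula or significance level was used, so the one-sided alpha of 0.025 and the formula itself are assumptions; the function name is hypothetical. Under these assumptions the total comes to 622 subjects, in line with the figure quoted above.

```python
from math import ceil
from statistics import NormalDist


def n_per_group_noninferiority(margin, sd, true_diff=0.0,
                               alpha=0.025, power=0.80):
    """Per-group sample size for a noninferiority test of two means.

    Normal approximation with equal allocation and a common SD.
    alpha is one-sided; true_diff is the assumed (new - control) difference.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    z_beta = NormalDist().inv_cdf(power)
    # Distance between the assumed true difference and the margin.
    distance = margin - true_diff
    return ceil(2 * (z_alpha + z_beta) ** 2 * sd ** 2 / distance ** 2)


n = n_per_group_noninferiority(margin=0.315, sd=1.4)
total = 2 * n
print(f"{n} per group, {total} in total (before dropouts)")
```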
A 1997 study conducted by the Plasma Exchange/Sandoglobulin Guillain–Barre Syndrome (PSGBS) Trial Study Group established that IVIg is therapeutically equivalent to PE. Treatments were considered equivalent if the 95% CI of the difference in mean improvement in GBS disability scale after 4 weeks between the two groups excluded a true mean difference of more than 0.5 of a grade. Although a change of 1.0 of a grade could be reliably measured and was clinically meaningful, a mean change of less than 0.5 of a grade was considered to be insignificant. However, this equivalence value is subjective and is not based on data from randomized clinical trials. In the current study, a noninferiority margin of 0.315 of a grade was derived using retrospective data; this is more stringent than the equivalence margin of 0.5 of a grade used in the PSGBS Study Group study.^{15}
This study demonstrates that, in the case of rare diseases where formal prospective noninferiority design is rendered unfeasible by the large sample sizes required, retrospective data analyses can be undertaken to ascertain whether a new treatment meets criteria for noninferiority. We recommend that this strategy be considered in other orphan diseases as a practical means to establish noninferiority of treatment efficacy when prospectively designed noninferiority studies are not feasible.
Using the example from GBS, this study presents practical methodology for retrospective noninferiority analyses which can be used in evaluation of treatments for rare diseases where formal, prospective noninferiority studies are not possible.
The authors are all employees of Grifols Inc., manufacturer of Gamunex®-C and Flebogamma® (both IVIg products).
CD and KH initiated the idea for developing this paper. CD and JC carried out the calculations and performed the statistical analysis. CD drafted the manuscript, the final version of which was reviewed and approved by all authors.
None.
Author declares that there are no conflicts of interest.
This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and building upon the work non-commercially.