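Effect size serves two purposes: it compensates for the limitations of statistical significance testing, and it makes effects comparable. Only by combining statistical significance with effect size can an appropriate statistical conclusion be reached. An effect size should have certain basic properties, including independence from the unit of measurement, monotonicity, and freedom from the influence of sample size. The internationally popular mediation effect size kappa-squared (κ²) drew criticism and further study precisely because it lacks monotonicity, which ultimately ended its legitimacy as a mediation effect size. R-squared-type mediation effect sizes suffer from the same lack of monotonicity. The article closes by discussing how to report mediation effect sizes and which questions remain open.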
Since Preacher and Kelley (2011) proposed kappa-squared (κ²) as a mediation effect size measure, it has become popular in mediation analyses, as shown by its appearance in the research literature (e.g., Athay, 2012; Field, 2013). A dedicated online calculator for kappa-squared also became available, making the measure very convenient to use in research practice. Unfortunately, Wen and Fan (2015) recently demonstrated both logically and mathematically that kappa-squared has fatal flaws in its definition and calculation, which should put an end to its use in mediation analysis. This article evaluates the appropriateness of the current mediation effect size measures in light of the characteristics an effect size is expected to have. Effect size plays at least two roles in research practice. First, it provides supplemental information that compensates for the limitations of null hypothesis significance testing (NHST). Second, it makes research findings comparable across studies in which different measures may have been used. For example, in a difference analysis involving two groups, the mean group difference is often the quantity of research interest. When NHST reveals a statistically "significant" difference, we learn that the two group means differ by more than would be expected from sampling error alone, but we do not learn how large the difference is. Primarily for this reason, it has been advocated that an effect size measure be used to supplement NHST (Fan & Konold, 2010; Wilkinson & the Task Force on Statistical Inference, 1999). Why can't we directly report the effect (such as the mean group difference) that represents the original quantity of interest?
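A small numerical sketch may help make the question concrete. It compares the raw mean group difference with a standardized mean difference when the same data are expressed in two measurement units. Cohen's d with a pooled standard deviation is used here only as one familiar standardized effect size; the text above does not name a specific measure at this point. The data values and function names are hypothetical and serve purely as illustration.

```python
from math import sqrt
from statistics import mean, variance


def raw_difference(group_a, group_b):
    """Mean group difference, expressed in the original measurement unit."""
    return mean(group_a) - mean(group_b)


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled sample variance."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Hypothetical scores for two groups, recorded in centimetres.
group_a_cm = [172.0, 168.0, 181.0, 175.0, 169.0]
group_b_cm = [165.0, 170.0, 162.0, 168.0, 160.0]

# The same observations converted to inches (an arbitrary change of unit).
group_a_in = [x / 2.54 for x in group_a_cm]
group_b_in = [x / 2.54 for x in group_b_cm]

# The raw difference depends on the unit; the standardized one does not.
print("raw difference (cm):", raw_difference(group_a_cm, group_b_cm))
print("raw difference (in):", raw_difference(group_a_in, group_b_in))
print("Cohen's d (cm):", cohens_d(group_a_cm, group_b_cm))
print("Cohen's d (in):", cohens_d(group_a_in, group_b_in))
```

Under these assumptions the raw difference changes by the conversion factor while the standardized difference stays the same, which is the unit-independence property an effect size is expected to have.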
It turns out that the original quantity (e.g., mean group difference) is usually not comparable across studies because different measures across the studies usually have different and arbitrary measureme