Review The Learning Resources Related To Hypothesis Testing Meaningfulness And Statistical Significance
Question Description
As a scholar-practitioner, it is important for you to understand that a hypothesis test indicating a relationship between an intervention and an outcome, a difference between groups, or a correlation between two constructs does not by itself tell you how important that finding is. Statistically significant results can still reflect very small relationships, very small differences, or very weak correlations. In the end, we need to ask whether the relationships or differences observed are large enough to justify a practical change in policy or practice.
For this Discussion, you will explore statistical significance and meaningfulness.
To prepare for this Discussion:
- Review the Learning Resources related to hypothesis testing, meaningfulness, and statistical significance.
- Review Magnusson’s web blog found in the Learning Resources to further your visualization and understanding of statistical power and significance testing.
- Review the American Statistical Association’s press release and consider the misconceptions and misuse of p-values.
- Consider the scenario:
- A research paper claims a meaningful contribution to the literature based on finding statistically significant relationships between predictor and response variables. In the footnotes, you see the following statement, “given this research was exploratory in nature, traditional levels of significance to reject the null hypotheses were relaxed to the .10 level.”
By Day 3
Post your response to the scenario in which you critically evaluate this footnote. As a reader/reviewer, what response would you provide to the authors about this footnote?
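One way to weigh the footnote is to work out what relaxing the threshold costs in false positives. The sketch below is illustrative only: it assumes every null hypothesis is true and that the tests are independent, and the function name `familywise_error_rate` and the test counts of 1, 5, and 10 are hypothetical choices, not details from the scenario.

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one false positive across n_tests independent tests,
    assuming every null hypothesis is actually true."""
    return 1.0 - (1.0 - alpha) ** n_tests


# Compare the conventional .05 threshold with the relaxed .10 threshold.
# The test counts are hypothetical; the footnote does not say how many
# hypotheses the paper tested.
for n_tests in (1, 5, 10):
    at_05 = familywise_error_rate(0.05, n_tests)
    at_10 = familywise_error_rate(0.10, n_tests)
    print(f"{n_tests:>2} tests: alpha=.05 -> {at_05:.3f}, alpha=.10 -> {at_10:.3f}")
```

For a single test, the relaxed threshold doubles the false-positive risk from 5% to 10%, and the gap widens as more hypotheses are tested. A reviewer might therefore ask the authors how many tests were run and whether the .10 level was chosen before the data were analyzed.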
Scenarios are listed as follows:
1. The p-value was slightly above conventional threshold, but was described as “rapidly approaching significance” (i.e., p = .06).
An independent samples t test was used to determine whether student satisfaction levels in a quantitative reasoning course differed between the traditional classroom and on-line environments. The samples consisted of students in four face-to-face classes at a traditional state university (n = 65) and four online classes offered at the same university (n = 69). Students reported their level of satisfaction on a five-point scale, with higher values indicating higher levels of satisfaction. Since the study was exploratory in nature, levels of significance were relaxed to the .10 level. The test was significant t(132) = 1.8, p = .074, wherein students in the face-to-face class reported lower levels of satisfaction (M = 3.39, SD = 1.8) than did those in the online sections (M = 3.89, SD = 1.4). We therefore conclude that on average, students in online quantitative reasoning classes have higher levels of satisfaction. The results of this study are significant because they provide educators with evidence of what medium works better in producing quantitatively knowledgeable practitioners.
2. A results report that does not find any effect and also has small sample size (possibly no effect detected due to lack of power).
A one-way analysis of variance was used to test whether a relationship exists between educational attainment and race. The dependent variable of education was measured as number of years of education completed. The race factor had three attributes of European American (n = 36), African American (n = 23) and Hispanic (n = 18). Descriptive statistics indicate that on average, European Americans have higher levels of education (M = 16.4, SD = 4.6), with African Americans slightly trailing (M = 15.5, SD = 6.8) and Hispanics having on average lower levels of educational attainment (M = 13.3, SD = 6.1). The ANOVA was not significant F(2, 74) = 1.789, p = .175, indicating there are no differences in educational attainment across these three races in the population. The results of this study are significant because they shed light on the current social conversation about inequality.
3. Statistical significance is found in a study, but the effect in reality is very small (i.e., there was a very minor difference in attitude between men and women). Were the results meaningful?
An independent samples t test was conducted to determine whether differences exist between men and women on cultural competency scores. The samples consisted of 663 women and 650 men taken from a convenience sample of public, private, and non-profit organizations. Each participant was administered an instrument that measured his or her current levels of cultural competency. The cultural competency score ranges from 0 to 10, with higher scores indicating higher levels of cultural competency. The descriptive statistics indicate women have higher levels of cultural competency (M = 9.2, SD = 3.2) than men (M = 8.9, SD = 2.1). The results were significant t(1311) = 2.0, p < .05, indicating that women are more culturally competent than are men. These results tell us that gender-specific interventions targeted toward men may assist in bolstering cultural competency.
4. A study has results that seem fine, but there is no clear association to social change. What is missing?
A correlation test was conducted to determine whether a relationship exists between level of income and job satisfaction. The sample consisted of 432 employees equally represented across public, private, and non-profit sectors. The results of the test demonstrate a strong positive correlation between the two variables, r = .87, p < .01, showing that as level of income increases, job satisfaction increases as well.
© 2016 Laureate Education, Inc.
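To judge meaningfulness rather than significance alone, the summary statistics reported in the four scenarios can be converted into effect sizes. The following is a minimal sketch, not part of the original handout: the helper names (`pooled_sd`, `cohens_d`, `eta_squared`) are hypothetical, Cohen's d is computed with a pooled standard deviation, and eta squared is recovered from the reported F statistic and its degrees of freedom.

```python
import math


def pooled_sd(sd1: float, n1: int, sd2: float, n2: int) -> float:
    """Pooled standard deviation of two independent groups."""
    return math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))


def cohens_d(m1: float, sd1: float, n1: int, m2: float, sd2: float, n2: int) -> float:
    """Standardized mean difference (group 1 minus group 2)."""
    return (m1 - m2) / pooled_sd(sd1, n1, sd2, n2)


def eta_squared(f_stat: float, df_between: int, df_within: int) -> float:
    """Share of variance explained in a one-way ANOVA, derived from F."""
    return (f_stat * df_between) / (f_stat * df_between + df_within)


# Scenario 1: online (M = 3.89, SD = 1.4, n = 69) vs. face-to-face (M = 3.39, SD = 1.8, n = 65)
d_satisfaction = cohens_d(3.89, 1.4, 69, 3.39, 1.8, 65)

# Scenario 2: F(2, 74) = 1.789
eta2_education = eta_squared(1.789, 2, 74)

# Scenario 3: women (M = 9.2, SD = 3.2, n = 663) vs. men (M = 8.9, SD = 2.1, n = 650)
d_competency = cohens_d(9.2, 3.2, 663, 8.9, 2.1, 650)

# Scenario 4: r = .87, so r squared is the shared variance
r_squared = 0.87**2

print(f"Scenario 1 Cohen's d:   {d_satisfaction:.2f}")
print(f"Scenario 2 eta squared: {eta2_education:.3f}")
print(f"Scenario 3 Cohen's d:   {d_competency:.2f}")
print(f"Scenario 4 r squared:   {r_squared:.2f}")
```

Under these assumptions, Scenario 3 yields a d of roughly 0.11, below the 0.2 that Cohen's rough benchmarks label a small effect, even though the test was significant with more than 1,300 participants. Scenario 1 yields a d of roughly 0.31 with a p-value above .05. Scenario 2 shows eta squared of about 0.05 from only 77 participants, so the non-significant result may reflect low power rather than the absence of a difference.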
Press release as follows:
AMERICAN STATISTICAL ASSOCIATION RELEASES STATEMENT ON STATISTICAL SIGNIFICANCE AND P-VALUES
Provides Principles to Improve the Conduct and Interpretation of Quantitative Science
March 7, 2016
The American Statistical Association (ASA) has released a “Statement on Statistical Significance and P-Values” with six principles underlying the proper use and interpretation of the p-value [http://amstat.tandfonline.com/doi/abs/10.1080/00031305.2016.1154108#.Vt2XIOaE2MN]. The ASA releases this guidance on p-values to improve the conduct and interpretation of quantitative science and inform the growing emphasis on reproducibility of science research. The statement also notes that the increased quantification of scientific research and a proliferation of large, complex data sets has expanded the scope for statistics and the importance of appropriately chosen techniques, properly conducted analyses, and correct interpretation.
Good statistical practice is an essential component of good scientific practice, the statement observes, and such practice “emphasizes principles of good study design and conduct, a variety of numerical and graphical summaries of data, understanding of the phenomenon under study, interpretation of results in context, complete reporting and proper logical and quantitative understanding of what data summaries mean.”
“The p-value was never intended to be a substitute for scientific reasoning,” said Ron Wasserstein, the ASA’s executive director. “Well-reasoned statistical arguments contain much more than the value of a single number and whether that number exceeds an arbitrary threshold. The ASA statement is intended to steer research into a ‘post p<0.05 era.’”
“Over time it appears the p-value has become a gatekeeper for whether work is publishable, at least in some fields,” said Jessica Utts, ASA president. “This apparent editorial bias leads to the ‘file-drawer effect,’ in which research with statistically significant outcomes are much more likely to get published, while other work that might well be just as important scientifically is never seen in print. It also leads to practices called by such names as ‘p-hacking’ and ‘data dredging’ that emphasize the search for small p-values over other statistical and scientific reasoning.”
The statement’s six principles, many of which address misconceptions and misuse of the p-value, are the following:
1. P-values can indicate how incompatible the data are with a specified statistical model.
2. P-values do not measure the probability that the studied hypothesis is true, or the probability that the data were produced by random chance alone.
3. Scientific conclusions and business or policy decisions should not be based only on whether a p-value passes a specific threshold.
4. Proper inference requires full reporting and transparency.
5. A p-value, or statistical significance, does not measure the size of an effect or the importance of a result.
6. By itself, a p-value does not provide a good measure of evidence regarding a model or hypothesis.
The statement has short paragraphs elaborating on each principle.
In light of misuses of and misconceptions concerning p-values, the statement notes that statisticians often supplement or even replace p-values with other approaches. These include methods “that emphasize estimation over testing such as confidence, credibility, or prediction intervals; Bayesian methods; alternative measures of evidence such as likelihood ratios or Bayes factors; and other approaches such as decision-theoretic modeling and false discovery rates.”
“The contents of the ASA statement and the reasoning behind it are not new; statisticians and other scientists have been writing on the topic for decades,” Utts said. “But this is the first time that the community of statisticians, as represented by the ASA Board of Directors, has issued a statement to address these issues.”
“The issues involved in statistical inference are difficult because inference itself is challenging,” Wasserstein said. He noted that more than a dozen discussion papers are being published in the ASA journal The American Statistician with the statement to provide more perspective on this broad and complex topic. “What we hope will follow is a broad discussion across the scientific community that leads to a more nuanced approach to interpreting, communicating, and using the results of statistical methods in research.”