By Qinling Li
FiveThirtyEight published an article titled “People Want News About Iran, But the News They Get is About Canada,” which finds that the countries covered most heavily in U.S. newspapers are not strongly correlated with the countries people search for. The correlation coefficient between the number of newspaper articles and relative search volume is 0.3, which is positive but relatively close to zero.
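For readers who want to see how such a coefficient is obtained, here is a minimal sketch of the Pearson correlation in plain Python. The function name `pearson_r` and the sample numbers are made up for illustration; they are not the article's data, and the article may have computed its figure differently.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical values, invented for illustration only:
# article counts and relative search scores for five countries.
article_counts = [120, 80, 60, 40, 20]
search_scores = [30, 100, 45, 60, 25]
r = pearson_r(article_counts, search_scores)
```

A value near 1 or -1 indicates a strong linear relationship, and a value near 0 indicates a weak one.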
The story compares two data sets: Nexis data, which covers the top 50 English-language newspapers published in the U.S., and Google Trends search data. After excluding the outlier, France, FiveThirtyEight finds an interesting mismatch between the two: Canada ranks first in the Nexis data, while Iran is the most-searched country in the Google data. To showcase the mismatch further, the article scores each country’s popularity in both data sets on a scale where the top-ranked country in each data set is set to 100.
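The scoring described above can be sketched as a simple rescaling. This is only an illustration of the idea: the function name `scale_to_top` and the counts are hypothetical, not taken from the article.

```python
def scale_to_top(counts):
    """Rescale raw values so the top-ranked country scores 100."""
    top = max(counts.values())
    return {country: 100 * value / top for country, value in counts.items()}


# Hypothetical article counts, invented for illustration only.
articles = {"Canada": 500, "Iran": 250, "Mexico": 125}
scores = scale_to_top(articles)
```

Applying the same rescaling to each data set separately puts both on a common 0 to 100 scale, so the rankings can be compared side by side.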
To some degree, the article does single out the outliers that distinguish the two data sets, but it leaves two questions unanswered: How different are news coverage and Google search records in how often each country appears? Are the two data sets independent of each other? To answer these questions, we could use the χ² test for homogeneity and the χ² test for independence to examine the relationship between the two data sets. A good approach is to select 10 to 20 countries based on their media influence, as defined by a third, independent standard. We can then use the χ² test for homogeneity to check whether these countries are equally popular in both data sets, and the χ² test for independence to check whether the two data sets are associated with each other. This way, we may get something more convincing than the correlation result, and the story angle may be more surprising than a focus on the outliers.
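As a rough sketch of what that test would involve, the snippet below computes the χ² statistic for a contingency table in plain Python. It assumes both sources can be expressed as counts (Google Trends reports a relative index rather than raw counts, so some conversion would be needed). The function name `chi_square_statistic` and the numbers are hypothetical. The homogeneity and independence tests use the same statistic; they differ in how the data are sampled and how the result is interpreted.

```python
def chi_square_statistic(table):
    """Return (statistic, degrees_of_freedom) for a table of observed counts.

    Rows are sources (news, searches); columns are countries.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if country shares were identical in both sources.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Hypothetical counts for three countries, invented for illustration only.
observed = [
    [30, 20, 50],  # newspaper articles
    [20, 30, 50],  # searches
]
stat, dof = chi_square_statistic(observed)
```

The statistic is then compared with the critical value of the χ² distribution for the given degrees of freedom (5.991 at the 0.05 level for 2 degrees of freedom). A statistic above the critical value would suggest that the countries are not distributed the same way in the two sources.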