To carry out this study, we used the media collection that Media Cloud called the US Latino Media subset (CUNY Project). This collection contains 41 news outlets in Spanish, detailed in the following table:
Media Cloud collects content distributed through the RSS (Really Simple Syndication) feeds published by the media outlets it monitors. How often stories are collected depends on each outlet, but collection is daily for most publications. The Media Cloud system can also follow hyperlinks contained in the articles to collect other stories that might match user searches. In the case of Univision, the content for 2019 and 2020 was obtained directly from the outlet, which provided us with a database of URLs for each article; that database was then integrated into the Media Cloud search system.
Since the end of February 2020, Media Cloud has been monitoring the content of nearly 500 US Spanish-language media outlets using our directory from The State of the Latino News Media. The new collection is now available to the public and can be used for future analysis with a broader and more diverse data source.
In total, we analyzed 667,247 articles published between January 20, 2017 and January 20, 2020. We explored the collected data with searches built from logical (Boolean) operators; each search generates a new database from its results.
Thus, for example, to analyze journalistic material that in some way addresses the issue of immigration, we looked for all of the stories that contain Spanish words derived from the root immigra*, including: “immigration,” “immigrants” or “immigrant.” Then, we validated a random sample of stories to verify that each one actually mentions the phenomenon of immigration or people’s status as immigrants.
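As a rough illustration of this kind of root-based search and spot check, the following Python sketch matches any word that begins with a given root and then draws a repeatable random sample for manual validation. The story texts, the function names `matches_root` and `sample_for_review`, and the sample size are all hypothetical; the actual searches were run through Media Cloud's own query system, not with this code.

```python
import random
import re

def matches_root(text, root):
    """Return True if any word in text starts with root (like the query root*)."""
    pattern = re.compile(r"\b" + re.escape(root) + r"\w*", re.IGNORECASE)
    return pattern.search(text) is not None

def sample_for_review(stories, k, seed=0):
    """Draw up to k stories at random for manual validation (seeded so it is repeatable)."""
    rng = random.Random(seed)
    return rng.sample(stories, min(k, len(stories)))

# Hypothetical story texts, for illustration only.
stories = [
    "New immigration rules announced this week",
    "Immigrants rally in downtown Los Angeles",
    "Local team wins the championship",
]
matched = [s for s in stories if matches_root(s, "immigra")]
to_review = sample_for_review(matched, k=2)
```

Anchoring the pattern at a word boundary keeps the root from matching in the middle of an unrelated word, and the manual review of the sample is still needed to confirm that each matched story is really about immigration.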
Another example is media coverage of federal health programs. In this case we ran the following Boolean search: Medicare OR Medicaid OR Obamacare OR Trumpcare.
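The logic of an OR search like this one can be sketched in a few lines of Python: a story matches if it contains at least one of the listed terms. This is only an approximation of how such a query behaves, under the assumption that terms are matched as whole words regardless of case; the headlines and the helper `matches_any` are invented for illustration.

```python
import re

def matches_any(text, terms):
    """Return True if text contains at least one term as a whole word (Boolean OR)."""
    words = set(re.findall(r"\w+", text.lower()))
    return any(term.lower() in words for term in terms)

HEALTH_TERMS = ["Medicare", "Medicaid", "Obamacare", "Trumpcare"]

# Hypothetical headlines, for illustration only.
headlines = [
    "Changes to Medicaid worry families",
    "Senate debates Obamacare repeal",
    "City opens a new park",
]
health_stories = [h for h in headlines if matches_any(h, HEALTH_TERMS)]
```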
In an attached section at the end of this report, we have included each of the Boolean searches that we ran against the Media Cloud database.
The data we obtained with each search were analyzed using normalized percentages of daily coverage over the three years. Additionally, we explored the monthly and weekly figures. For each search, the Media Cloud platform provides the number of stories that match the search parameters, the total number of stories published by all media, and a complete list of the matching stories, with the URL, the name of the outlet that published it, the publication date and an identification number for each one.
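One plausible reading of a normalized percentage of daily coverage is the share of all stories published on a given day that match the search. The sketch below computes that share from the two counts the platform reports. The formula is our assumption about the normalization, and the dates, counts and the function name `daily_coverage_percent` are hypothetical.

```python
def daily_coverage_percent(matching_by_day, total_by_day):
    """Map each date to the percentage of that day's stories that match the search."""
    percentages = {}
    for day, total in total_by_day.items():
        if total == 0:
            continue  # no stories published that day, so there is no share to compute
        percentages[day] = 100.0 * matching_by_day.get(day, 0) / total
    return percentages

# Hypothetical counts, for illustration only.
matching = {"2017-01-20": 30, "2017-01-21": 12}
totals = {"2017-01-20": 600, "2017-01-21": 400, "2017-01-22": 0}
coverage = daily_coverage_percent(matching, totals)
```

Dividing by the day's total keeps days with heavy overall publishing from looking like spikes in coverage of a topic; the same calculation applies to weekly or monthly counts.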
This also allowed us to perform manual reviews of thousands of articles to detect and analyze the use of terms such as “illegal immigrant,” “illegal immigrants,” “___’s lady” (wife), “___’s woman” (wife), “___’s man,” “their wives,” “Latinx,” etc.
We were also able to analyze the language used in the articles by counting the most frequently used words and compiling lists of bigrams and trigrams, that is, the two- and three-word phrases repeated most frequently in the articles.
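Counting bigrams and trigrams can be sketched as follows. This is a minimal illustration that assumes simple lowercase word tokenization with no stop-word removal, which may differ from what the Media Cloud platform does; the sample sentences and function names are hypothetical.

```python
from collections import Counter
import re

def tokenize(text):
    """Split text into lowercase words; \\w also matches accented Spanish letters."""
    return re.findall(r"\w+", text.lower())

def ngram_counts(texts, n):
    """Count every run of n consecutive words across all texts."""
    counts = Counter()
    for text in texts:
        tokens = tokenize(text)
        counts.update(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return counts

# Hypothetical article snippets, for illustration only.
articles = [
    "the border wall debate continues",
    "the border wall funding stalls",
]
top_bigrams = ngram_counts(articles, 2).most_common(2)
top_trigrams = ngram_counts(articles, 3).most_common(1)
```

N-grams are counted within each article separately, so a phrase is never formed from the last word of one story and the first word of the next.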