January 29th, 2021

Sources, Authors, Publishers, and even Filetypes: Are They Reputable? Can They Be Manipulated?

As a research librarian for the federal government, I weigh a few considerations with my colleagues when vetting information for credibility. They are as follows:

1. Information Source: If it is a news media organization, is it typically held in high regard or sponsored by entities such as Fortune 500 companies or government organizations? Is its content cited often by the same kinds of entities, as well as in scholarly content from think tanks and peer-reviewed journals? My team typically avoids social networking; however, if it were considered, we would look to see whether. If so, we would seek answers to the aforementioned questions. If it is a think tank, what are its political leanings?

2. Author: Who produced the information? What organizations are they affiliated with? What are their credentials? In short, to what extent are they an authority on said topic or subject?

3. Filetype: Is it published as a PDF, Word, HTML, or text file? Though this is subjective, I perceive information presented in PDF format to have a higher chance of being credible than information in other formats. In my observation, most PDFs are finished works published by peer-reviewed journals, official government websites, think tanks, etc. Information produced in any other format could be manipulated to mislead the public. Filetype is not the be-all and end-all, but it is something to consider.

We compile these resources in a handout that is posted to our library website, and we create LibGuides that organize them by topic or subject. We also conduct research consultations and presentations for one or more patrons who want to know more about the resources. Either way, they are disseminated to the people we serve.

Whether it is in a specialized setting such as mine or a Public/Academic/School setting, librarians have an obligation to provide access to information in a wide range of formats. With all of the misinformation we encounter, the expectations we are asked to meet with regard to mining the most credible sources will remain high. In addition, our resources should offer information reflecting diversity of thought. Understanding multiple sides of a topic or issue helps foster the common ground we need to move forward as a community.

Tags: Censorship, Disinformation, media literacy, Sources

Comments (5)


Wow. I never thought about file type -- and the fact that other docs can be changed or manipulated. That seems like one of those super easy filters people could use to start a search.



Absolutely, and it's very simple to use when searching! In Google, you would type filetype:x followed by your keywords, where x is the file extension. Here are some examples you can try:

filetype:doc coronavirus OR covid
filetype:pdf coronavirus OR covid
filetype:xls coronavirus OR covid
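
If you wanted to generate searches like these for several filetypes at once, a minimal Python sketch might look like the one below. The function names are my own, made up for illustration, and the code assumes the usual public search URL pattern (https://www.google.com/search?q=...). It only builds the query text and a link you can paste into a browser; it does not run any searches.

```python
from urllib.parse import urlencode


def build_filetype_query(filetype, keywords):
    """Return a query such as 'filetype:pdf coronavirus OR covid'.

    Hypothetical helper: joins the keywords with OR, as in the examples above.
    """
    return f"filetype:{filetype} " + " OR ".join(keywords)


def build_search_url(query):
    """Return a browser link for the query.

    Assumes the common https://www.google.com/search?q=... URL pattern.
    """
    return "https://www.google.com/search?" + urlencode({"q": query})


# Print one query and one link for each filetype in the examples.
for filetype in ("doc", "pdf", "xls"):
    query = build_filetype_query(filetype, ["coronavirus", "covid"])
    print(query)
    print(build_search_url(query))
```

Swapping in other extensions or keywords is just a matter of changing the values passed to the two functions.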

With search engines like Google being so familiar, we might as well teach communities the tricks of using it so they don't have to spend hours sifting through millions of hits. Below is a comprehensive list of filetypes indexed by Google:


Side Note: It is also a good tool for data scraping.


Filetype is an interesting metric! I haven't seen that before. So much of credibility is determining context and this adds another technical level. I wonder if there is any other metadata that might be appropriate to analyze.


Hi Paris, thanks for commenting! Yes, when I took a training on the realms of the internet, the section of the course that stuck out to me the most was searching engines like Google by filetype. The point stressed was that the "good stuff" was found in PDFs, but as I mentioned before, everything should be evaluated. It's certainly something to consider.