Google Introduces Fact Checking in Search

Three years ago Google launched “Fact Checking” to help people make more informed judgements about the information they encounter online. Google’s new 2020 venture intends to place power in the hands of journalists and webmasters when rating the “truthfulness” of others’ claims.

The Fact Check Search Engine

Fact Checking was implemented three years ago by Google and now appears more than 11 million times per day in Search results and in Google News. Fact Checking in Google News is currently available only in Brazil, France, India, the United Kingdom, and the United States.

The repository contains 40,000+ facts, and has a dedicated Search Engine where you can discover and research specific facts about people, things, or topics.

Researchers can access the repository and contribute through an open API.
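As a rough sketch of what read access through the API could look like, the Python snippet below builds a search request for the `claims:search` endpoint of the Fact Check Tools API. The endpoint URL and the `query`, `languageCode` and `key` parameters reflect our understanding of the public API and should be verified against Google's current documentation. The helper names and `YOUR_API_KEY` are placeholders of our own.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint of the Fact Check Tools API; verify against Google's docs.
API_ENDPOINT = "https://factchecktools.googleapis.com/v1alpha1/claims:search"


def build_claim_search_url(query, api_key, language_code="en"):
    """Return the GET URL for a claim search (no network access here)."""
    params = {"query": query, "languageCode": language_code, "key": api_key}
    return API_ENDPOINT + "?" + urllib.parse.urlencode(params)


def search_claims(query, api_key, language_code="en"):
    """Fetch matching fact checks and return the decoded JSON response."""
    url = build_claim_search_url(query, api_key, language_code)
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.load(response)


if __name__ == "__main__":
    # Only the URL is printed, so the sketch runs without a real API key.
    print(build_claim_search_url("climate change", "YOUR_API_KEY"))
```

Keeping the URL builder separate from the network call makes it easy to inspect the request before spending any API quota.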

The Search Engine is called Fact Check Explorer. Much like Google’s primary Search Engine, it is clean, easy to use, and sophisticated.

In action, the search results are clearly annotated with media and text detailing the overall reliability of each claim.

Now, in the left screenshot, the “rating” has been determined to be “Called in 2020,” which is obviously a misuse of structured data by the webmaster. This is currently the case for more than 60% of the reviews, making the Search Engine very difficult to use in its current state.

However, the robust structure and leveraging of structured data by webmasters will enable the tool to become powerful over time.

While this is the current Search Engine and repository, the way Google is integrating Fact Check into Search is completely different. In both Search and Google News, the Fact Check appears at the top of the results, detailing the “claim,” the “claimant,” and the “Fact Checker.” In this system, the Fact Checker is the site that gives its opinion as a rating, for example on a scale of 1-5, where the value closest to the best rating is the most true and the value closest to the worst rating is the most untrue. However, webmasters can define a different scale if they find it applicable.

Fact Checking in a Google Search

Any site owner or journalist can leverage this technology to signal whether a specific claim is true or false. This gives dramatic power to the webmasters of the Internet when producing content.

But isn’t this the exact problem that causes misinformation? Too much power to media sites that publish the content? Well, the idea is that by involving more and more webmasters, the quality of the information will eventually tend toward the real value. The expectation is that those who really know whether the information is true will contribute and cause the “rating” to shift from True toward False or vice versa.

The only problem with this theory is that it expects the general population to be competent in every topic they rate for truth. There will always be trolls who mis-rate information for the sake of it and end up spreading misinformation even further.

Luckily, the penalty for misinforming the public is being pinned in the “Fact check by xyz” section of the rich data extract. Knowing who has provided you with wrong information is a key deterrent for sites online, and a great reward for those who get it correct. This is particularly valuable in news and politics, where the dissemination of false information has real, serious consequences.

The Google team aim to make the Fact Check seamless and natural in Search. It should be light in terms of processing, but will slow down search results very slightly, as an additional repository of data has to be searched, ranked, and displayed.

This raises important questions for SEOs and media sites that want to feature in the “Fact Check” in a search. How are they ranked? What are some important indicators of “truthfulness”?

What we have uncovered here at SEOSPIDRE is that for the moment Google relies on general SEO principles, such as user traffic, pogo-sticking rate, and Domain Authority. This will likely change over time, as powerful domains do not necessarily mean truthful domains.

How it Works

The technology leveraged in Fact Checking requires explicit Schema markup from webmasters. The great thing about the way Schema.org has approached this is that its examples are available in four forms:

  • Non-markup
  • Microdata
  • RDFa
  • JSON-LD

One question is how Google could know, when crawling a site, that statements without markup are meant for fact checking. In the Schema.org example, the only signals are explicit statements in the visible text such as “Claim reviewed: something something” and “Rating: 1.” For example:

<!--
This would be part of a page such as
http://www.politifact.com/texas/statements/2014/jul/23/rick-perry/rick-perry-claim-about-3000-homicides-illegal-immi/
Simple example based on material from that page.
Earlier examples included Review and ClaimReview types while the latter
design was under discussion, but this is not strictly needed now.
-->
<p>
An example paragraph reviewing a claim expressed in another document.
<dl>
  <dt>Date published:</dt>
  <dd>2014-07-23</dd>
  <dt>Review url:</dt>
  <dd>http://www.politifact.com/texas/statements/2014/jul/23/rick-perry/rick-perry-claim-about-3000-homicides-illegal-immi/</dd>
  <dt>Review by:</dt>
  <dd><a href="http://www.politifact.com/">Politifact</a>
  <img src="http://static.politifact.com/mediapage/jpgs/politifact-logo-big.jpg" alt="Politifact" />
  </dd>
</dl>
<h3>Claim reviewed:</h3>
<blockquote>
More than 3,000 homicides were committed by 'illegal aliens' over the past six years.
</blockquote>
<div>Rating: 1 (best score: 6), "True".</div>
<img src="http://static.politifact.com.s3.amazonaws.com/rulings/tom-pantsonfire.gif" alt="Politifact Pants on Fire rating logo" />
<h4>Item reviewed:</h4>
<ul>
  <li>Claim author's name: Rick Perry. Job title: "Former Governor of Texas".</li>
  <li>Claim original document: "The St. Petersburg Times interview" (2014-07-17)</li>
</ul>
<img
 src="https://upload.wikimedia.org/wikipedia/commons/thumb/1/15/Gov._Perry_CPAC_February_2015.jpg/440px-Gov._Perry_CPAC_February_2015.jpg"
 alt="photo of R.Perry."/>
</p>

The key point here is that no explicit structured data is being used, as opposed to this Microdata example, which is extremely time-consuming to write:

<div itemscope="" itemtype="http://schema.org/ClaimReview">
  An example paragraph reviewing a claim expressed in another document.
  <dl>
    <dt>Date published:</dt>
    <dd itemprop="datePublished">2014-07-23</dd>
    <dt>Review url:</dt>
    <dd itemprop="url">http://www.politifact.com/texas/statements/2014/jul/23/rick-perry/rick-perry-claim-about-3000-homicides-illegal-immi/</dd>
    <dt>Review by:</dt>
    <dd>
     <span itemprop="author" itemscope="" itemtype="http://schema.org/Organization">
         <span itemprop="name"><a itemprop="url" href="http://www.politifact.com/">Politifact</a></span>
         <img itemprop="image" src="http://static.politifact.com/mediapage/jpgs/politifact-logo-big.jpg" alt="Politifact" />
         <link itemprop="sameAs" href="http://twitter.com/politifact"/>
     </span>
    </dd>
  </dl>
  <h3>Claim reviewed:</h3>
    <blockquote itemprop="claimReviewed">
    More than 3,000 homicides were committed by 'illegal aliens' over the past six years.
    </blockquote>
    <span itemprop="reviewRating" itemscope="" itemtype="http://schema.org/Rating">
      Rating: <span itemprop="ratingValue">1</span>
      (best score: <span itemprop="bestRating">6</span>),
      "<span itemprop="alternateName">True</span>".
      <img itemprop="image" src="http://static.politifact.com.s3.amazonaws.com/rulings/tom-pantsonfire.gif" alt="Politifact Pants on Fire rating logo" />
    </span>
  <h4>Item reviewed:</h4>
  <div itemprop="itemReviewed" itemscope="" itemtype="http://schema.org/CreativeWork">
   <ul>
    <li itemprop="author" itemscope="" itemtype="http://schema.org/Person">Claim author's name: <span itemprop="name">Rick Perry</span>.
        Job title: "<span itemprop="jobTitle">Former Governor of Texas</span>".
        <link itemprop="sameAs" href="https://en.wikipedia.org/wiki/Rick_Perry"/>
        <a itemprop="sameAs" href="https://rickperry.org/">rickperry.org</a>
        <img itemprop="image"
         src="https://upload.wikimedia.org/wikipedia/commons/thumb/1/15/Gov._Perry_CPAC_February_2015.jpg/440px-Gov._Perry_CPAC_February_2015.jpg"
         alt="photo of R.Perry."/>
    </li>
    <li>Claim original document: "<span itemprop="name">The St. Petersburg Times interview</span>"
      (<span itemprop="datePublished">2014-07-17</span>)</li>
   </ul>
  </div>
</div>

Realistically, if you’re a media site, you don’t have time to write Microdata for each of your entries. However, you don’t want to risk your data being overlooked by the Google crawler, so a good balance between the two is the simpler JSON-LD script, which can be placed in the head of each article.

JSON-LD centralises the markup in a single block that is as legible as Microdata and far less time-consuming to write, although it is not tied to the visible text in the way inline Microdata is. It is also far more precise about what the article expresses than no markup at all. Here’s an example:

<script type="application/ld+json">
{
    "@context": "http://schema.org",
    "@type": "ClaimReview",
    "datePublished": "2014-07-23",
    "url": "http://www.politifact.com/texas/statements/2014/jul/23/rick-perry/rick-perry-claim-about-3000-homicides-illegal-immi/",
    "author": {
        "@type": "Organization",
        "url": "http://www.politifact.com/",
        "image": "http://static.politifact.com/mediapage/jpgs/politifact-logo-big.jpg",
        "sameAs": "https://twitter.com/politifact"
    },
    "claimReviewed": "More than 3,000 homicides were committed by \"illegal aliens\" over the past six years.",
    "reviewRating": {
        "@type": "Rating",
        "ratingValue": 1,
        "bestRating": 6,
        "image": "http://static.politifact.com.s3.amazonaws.com/rulings/tom-pantsonfire.gif",
        "alternateName": "True"
    },
    "itemReviewed": {
        "@type": "CreativeWork",
        "author": {
            "@type": "Person",
            "name": "Rick Perry",
            "jobTitle": "Former Governor of Texas",
            "image": "https://upload.wikimedia.org/wikipedia/commons/thumb/1/15/Gov._Perry_CPAC_February_2015.jpg/440px-Gov._Perry_CPAC_February_2015.jpg",
            "sameAs": [
                "https://en.wikipedia.org/wiki/Rick_Perry",
                "https://rickperry.org/"
            ]
        },
        "datePublished": "2014-07-17",
        "name": "The St. Petersburg Times interview [...]"
    }
}
</script>
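For a site that publishes many fact checks, the script above is usually generated rather than written by hand. The following is a minimal Python sketch of that idea, not a definitive implementation. The function name and its parameters are our own, and it only covers a subset of the properties shown in the example.

```python
import json


def claim_review_script(review_url, claim_text, publisher_name, publisher_url,
                        rating_value, best_rating, rating_label):
    """Return a <script> tag holding ClaimReview JSON-LD for one article."""
    data = {
        "@context": "http://schema.org",
        "@type": "ClaimReview",
        "url": review_url,
        "claimReviewed": claim_text,
        "author": {
            "@type": "Organization",
            "name": publisher_name,
            "url": publisher_url,
        },
        "reviewRating": {
            "@type": "Rating",
            "ratingValue": rating_value,
            "bestRating": best_rating,
            "alternateName": rating_label,
        },
    }
    # Escape "</" so text inside the JSON cannot close the script tag early.
    body = json.dumps(data, indent=4).replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + body + "\n</script>"


if __name__ == "__main__":
    # Rating values copied from the example above; URLs are placeholders.
    print(claim_review_script(
        "https://example.com/fact-checks/example-claim",
        "An example claim.",
        "Example Fact Checker", "https://example.com/",
        1, 6, "True"))
```

Building the markup from a dictionary and serialising it with `json.dumps` avoids the quoting mistakes that creep in when JSON is assembled by string concatenation.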

Google are also working with the Duke Reporters’ Lab and the International Fact-Checking Network to implement Fact Checking metadata in multimedia. This would allow not only text to be fact-checked, but also the often misleading images distributed throughout the web. Fact Checking by Google is intended to go as far as identifying false videos, or even false aspects of videos where alterations have been made.

Multimedia checking requires far greater effort, and Alexios Mantzarlis, the News and Information Credibility Lead at Google, has mentioned it as a venture for 2020.


SEO and Webmaster Tools

Both SEOs and Webmasters have great control over the exercise of the Fact Check power. Google explains the implementation of the structured data in this reference.

There are three key structured data types. Depending on the circumstances of the claim, they can all be used together for the same claim or be used individually:

  • ClaimReview
  • Claim
  • Rating

ClaimReview

When writing your ClaimReview, a few properties are required:

  • claimReviewed: Short summary of the claim being evaluated. Keep it shorter than 75 characters.
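  • reviewRating: The assessment of the claim’s “truthfulness,” given as a Rating (described below). On a five-point scale this would be 1 – False / 2 – Mostly False / 3 – Half True / 4 – Mostly True / 5 – True, since values closer to bestRating are more true
  • url: Link to the page hosting the full Fact Check article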

ClaimReview may also have some additional, recommended properties:

  • author: Publisher of the Fact Check, not the publisher of the claim. Include both the “name” and the “url”
  • datePublished: Date when the Fact Check was published
  • itemReviewed: An object describing the claim being made
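The required and recommended properties above lend themselves to a quick pre-publish check. The sketch below is a hypothetical helper of our own, not a Google tool. It only tests that each property is present and that claimReviewed respects the 75-character guideline.

```python
REQUIRED_PROPERTIES = ("claimReviewed", "reviewRating", "url")
RECOMMENDED_PROPERTIES = ("author", "datePublished", "itemReviewed")
MAX_CLAIM_LENGTH = 75  # character guideline quoted in the list above


def check_claim_review(markup):
    """Return (errors, warnings) for a ClaimReview given as a dict."""
    errors = []
    warnings = []
    for name in REQUIRED_PROPERTIES:
        if not markup.get(name):
            errors.append("missing required property: " + name)
    for name in RECOMMENDED_PROPERTIES:
        if not markup.get(name):
            warnings.append("missing recommended property: " + name)
    claim = markup.get("claimReviewed") or ""
    if len(claim) >= MAX_CLAIM_LENGTH:
        warnings.append("claimReviewed should be shorter than 75 characters")
    return errors, warnings
```

Treating missing required properties as errors and everything else as warnings lets a publishing pipeline block only the markup that is certain to be ignored.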

Claim

When writing your Claim, a few properties are required to be addressed:

  • appearance: A link or inline description of a CreativeWork in which this claim appears
  • author: The author of the claim, not the author of the Fact Check. Do not include if the claim does not have an author
  • datePublished: Date when the claim was made public or popular
  • firstAppearance: A link to, or inline description of, the CreativeWork in which this claim first appears
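To make the Claim properties concrete, here is a small Python sketch that assembles a Claim as a dictionary, ready to be nested under itemReviewed. The function and parameter names are hypothetical, and describing the author as a Person is an assumption; an Organization would be built the same way.

```python
def build_claim(appearance_url, date_published, first_appearance_url=None,
                author_name=None):
    """Return a schema.org Claim as a dict, ready to use as itemReviewed."""
    claim = {
        "@type": "Claim",
        "appearance": {"@type": "CreativeWork", "url": appearance_url},
        "datePublished": date_published,
    }
    if first_appearance_url:
        claim["firstAppearance"] = {
            "@type": "CreativeWork",
            "url": first_appearance_url,
        }
    # Per the list above, author is left out entirely when the claim has none.
    if author_name:
        claim["author"] = {"@type": "Person", "name": author_name}
    return claim
```

Calling `build_claim("https://example.com/interview", "2020-01-15")` yields a Claim with no author key at all, which matches the guidance not to include an author when the claim has none.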

Rating

When writing your Rating, a few properties are required to be addressed:

  • alternateName: A text description of the truthfulness. For example: “mostly true but overall claim can be misleading.”

Rating may also have some additional, recommended properties:

  • bestRating: Upper limit of the range you intend to use. Can be any integer, but must be greater than worstRating. E.g. 100
  • name: Used instead of alternateName if necessary. Use alternateName unless otherwise specified in guidelines
  • ratingValue: Rating of the claim, the closer it is to bestRating the more true, and the closer to worstRating the more false. Must lie in between bestRating and worstRating
  • worstRating: Lower limit of the range you intend to use. Can be any integer, but must be lower than bestRating. E.g. 1
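Because every site can choose its own range, ratings are only comparable once they are put on a common scale. The Python sketch below applies the constraints listed above and maps a rating onto a share between 0.0 (most false) and 1.0 (most true). The function name and the 0 to 1 normalisation are our own illustration, not something Google prescribes.

```python
def truthfulness_share(rating_value, best_rating, worst_rating):
    """Map a rating onto 0.0 (most false) .. 1.0 (most true)."""
    if best_rating <= worst_rating:
        raise ValueError("bestRating must be greater than worstRating")
    if not worst_rating <= rating_value <= best_rating:
        raise ValueError("ratingValue must lie between worstRating and bestRating")
    # Linear position of the rating inside the publisher's chosen range.
    return (rating_value - worst_rating) / (best_rating - worst_rating)
```

On a 1 to 5 scale, a ratingValue of 3 sits exactly halfway, so `truthfulness_share(3, 5, 1)` returns 0.5.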

Impact on SEO Landscape

Beyond highlighting fact checks on our surfaces, Google has for years supported fact-checking projects around the world. In 2020, we’ll explore new models to support the long-term sustainability of the fact-checking field. Fact-checking matters, to Google and everyone who uses our products. We’ll continue to find ways to surface and support quality journalism on our products and beyond. 

Alexios Mantzarlis | News and Information Credibility Lead @ Google News Lab

This 2020 Fact Checking venture will likely open up new positions for full-time fact checkers within media and news agencies. These individuals will be responsible for writing the structured markup on journalists’ articles that flags falsities in other sites’ information.

The movement will likely, over time, improve the quality of information present on Google by rewarding truthfulness and penalising deceit. Few news sites could risk damage to their public reputation, and thus the dissemination of false information is naturally greatly discouraged.

In terms of repeal, there is no plan yet. Google doesn’t have a method of contact, or a system similar to DMCA takedowns, to remove information that has been falsely posted. So the information will still be live on the Internet, but will be tagged as false. This may slow the spread of false information, but can’t stem it completely.

Further, if you’re on the receiving end of a black-hat strategy to mark your content as “False,” there is technically no remedy for this. At SEOSPIDRE we expect this to be the next surge of black-hat activity across the web, with the only safeguard being the “weight” given to these reviews based on Domain Authority and traffic. Let’s hope Google has considered this!
