Structured data: what is it & what can you do with it?
Structured data allows web developers and SEOs to give meaning and context to content for search engines and their crawlers, so that this content can be displayed in a more understandable way. In this article, I describe what structured data is, how you can use it, what it has to do with SEO, and why it will matter in the future.
The term structured data is primarily a general term. Data is structured, in whatever context, when it follows a certain pattern or format, that is, when it is organized according to a certain scheme. This is where the term schema comes in, which I will come back to later.
Since search engines deal with an almost infinite amount of information, the challenge lies in sorting and interpreting this data. Only when the information is sufficiently understood can it be used to best answer queries. Structured data enables website administrators to help search engines understand and interpret their website and information.
The Schema.org collaborative project
In 2011, the most prominent search engines joined forces and launched the Schema.org initiative. Google, Yahoo, Bing (Microsoft), and Yandex all take part in this open-source project. The goal is to ensure that website content is uniformly labeled so that search engines can interpret it more easily.
Schema.org, in a sense, represents an ontology, that is, an all-encompassing conceptual scheme. For example, if I sell books in my online shop, I can mark individual product pages, so search engines know, “This is about [Book]!”
The entity [Book] usually stands for concrete literary works with specific characteristics. In a second step, I can describe these characteristics to the search engines. I fill in a more or less extensive form of properties, where each property represents a certain characteristic. For example, it makes sense to define the book’s title, the author, the number of pages, the language, the ISBN, and the price, among other things.
A language is not complete with vocabulary alone.
So, with schema.org, the major search engines have built a constantly growing dictionary. To create an understandable language from this, a grammar is needed: rules that make the language logical. This grammar is provided by the following formats: JSON-LD, HTML Microdata, and RDFa. Every grammar works differently, but they all offer me, as a website manager, the possibility to make my website easier to understand using structured data. I will discuss the pros and cons of these formats, and which is the best choice at the moment, in more detail later.
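To make the combination of vocabulary and grammar concrete, here is a minimal sketch of the book example expressed in JSON-LD, built as a Python dictionary. The property names come from the schema.org vocabulary; all values (title, author, ISBN, price) are made-up placeholders.

```python
import json

# Hypothetical example values; only the property names come from schema.org.
book = {
    "@context": "https://schema.org",
    "@type": "Book",
    "name": "Example Book Title",
    "author": {"@type": "Person", "name": "Jane Doe"},
    "numberOfPages": 320,
    "inLanguage": "en",
    "isbn": "9780000000002",
    "offers": {
        "@type": "Offer",
        "price": "19.99",
        "priceCurrency": "EUR",
    },
}

# Serialize the dictionary to the JSON-LD text that would go on the page.
book_json = json.dumps(book, indent=2)
print(book_json)
```

The vocabulary supplies the words (`Book`, `author`, `isbn`), while JSON-LD supplies the grammar that ties them together.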
As a website manager, what good is structured data?
Of course, it is nice for search engines that they understand the information better, but what good is that to me as a website manager? The answer is visually stronger search engine results. In addition to the title, description, and URL, these so-called rich results show other elements such as review stars, breadcrumbs, and other notable features.
Google search for ‘Most People Are Good’ with normal and rich results
Figure 2 shows a screenshot of a Google search for Most People Are Good, Rutger Bregman’s book. The difference between the first and second results is immediately visible, namely the rating stars and the number of reviews. The first result also shows a clear breadcrumb structure. The best example is the last one, from amazon.com, where the price and availability are also displayed. For a user, this is, of course, valuable information, which in any case attracts more attention than the other results.
All these elements, that is, breadcrumbs, reviews, price, and availability, are fed by structured data. Structured data is no guarantee, however: pages with structured data sometimes do not get these rich results. Pages that do not use structured data, by contrast, will certainly not get them.
Another form, which can be especially important for mobile devices, is the carousel. For example, the image below shows how Albert Heijn has provided its kale recipe with structured data so that it is included in this carousel. The probability of getting clicks in these types of carousels is much higher than in the normal search results.
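As a rough illustration of where these elements come from, the sketch below builds JSON-LD for a product with a rating, price, and availability, plus a separate breadcrumb list. All names, numbers, and URLs are placeholders and not taken from the screenshot.

```python
import json

# Rating, price, and availability live on the product itself.
product = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Example Book Title",
    "aggregateRating": {
        "@type": "AggregateRating",
        "ratingValue": "4.5",
        "reviewCount": "128",
    },
    "offers": {
        "@type": "Offer",
        "price": "19.99",
        "priceCurrency": "EUR",
        "availability": "https://schema.org/InStock",
    },
}

# Breadcrumbs are a separate item describing the path to the page.
breadcrumbs = {
    "@context": "https://schema.org",
    "@type": "BreadcrumbList",
    "itemListElement": [
        {"@type": "ListItem", "position": 1, "name": "Books",
         "item": "https://www.example.com/books"},
        {"@type": "ListItem", "position": 2, "name": "Non-fiction",
         "item": "https://www.example.com/books/non-fiction"},
    ],
}

rich_result_markup = json.dumps([product, breadcrumbs], indent=2)
print(rich_result_markup)
```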
Google search for ‘kale recipe’ including a carousel
By the way, don’t confuse these types of carousels with featured snippets, where Google gives a table, list, or text paragraph a prominent place at the top of the search results. Structured data does not affect these types of featured snippets.
Google search for ‘How do I cook an egg’ including a featured snippet
Google shows a featured snippet from a certain website here solely because it has assessed the content of that page as giving the best and most concise answer.
In the Search Gallery, Google shows other ways to turn snippets into rich results using structured data.
How do you implement structured data?
First of all, you can distinguish between requirements for the quality of the data and requirements for the technical implementation. One of the quality requirements is that you may only mark up original content with structured data. In other words, I can only mark up content created by myself or my users. The marked-up content must also be visible to users. For example, if you provide a product with structured data, that product must also be visible on the page. Finally, you may not mark up irrelevant or misleading content. This includes fake reviews and content that has nothing to do with the main content of a page.
Of course, it is a basic technical requirement that all content to be displayed in search results is accessible to search engines and their bots. This means that access should not be blocked by, for example, the robots.txt file. You also need to use JSON-LD, RDFa, or Microdata for the implementation. To avoid possible errors and duplicate markup, it is recommended to limit yourself to one of these three formats.
The implementation using RDFa is based on a slightly different approach than JSON-LD. Instead of combining the structured data into one block that is also human-readable, as JSON-LD does, the data is distributed over the entire content of a page. RDFa is an HTML5 extension: existing HTML elements are extended with extra attributes to mark up structured data.
Microdata also works with HTML extensions. Microdata is, however, considered outdated and is no longer actively developed. Nevertheless, I still regularly come across websites, including those of customers, that work with Microdata, so it is worth mentioning this option briefly here.
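The two technical requirements can be sketched with the Python standard library: checking that a URL is not blocked by robots.txt, and wrapping JSON-LD in the script tag that carries it on the page. The function names, the robots.txt rules, and the URLs are assumptions made up for this example.

```python
import json
from urllib.robotparser import RobotFileParser


def is_crawlable(robots_txt, url, user_agent="Googlebot"):
    """Return True if the given robots.txt text allows user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def to_script_tag(data):
    """Wrap a dictionary as a JSON-LD script element."""
    body = json.dumps(data, indent=2)
    # Escape "</" so a value containing "</script>" cannot end the element early.
    body = body.replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + body + "\n</script>"


# Hypothetical robots.txt that only blocks the checkout area.
robots_txt = "User-agent: *\nDisallow: /checkout/\n"

print(is_crawlable(robots_txt, "https://www.example.com/books/example"))
print(is_crawlable(robots_txt, "https://www.example.com/checkout/cart"))
print(to_script_tag({"@context": "https://schema.org", "@type": "Book"}))
```

A page under `/checkout/` would be blocked here, so any structured data on it could not be read by the crawler.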
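To show what "distributed over the content" means, the sketch below holds a small RDFa-annotated HTML fragment and collects the marked-up values with Python's built-in HTML parser. The fragment and its values are invented, and the collector is a deliberate simplification: it only reads `property` attributes with direct text content and is not a full RDFa processor.

```python
from html.parser import HTMLParser

# Hypothetical product page fragment: the data sits on the visible elements.
RDFA_HTML = """
<div vocab="https://schema.org/" typeof="Book">
  <h1 property="name">Example Book Title</h1>
  <span property="author">Jane Doe</span>
  <span property="numberOfPages">320</span>
</div>
"""


class PropertyCollector(HTMLParser):
    """Collect the text of elements that carry a `property` attribute."""

    def __init__(self):
        super().__init__()
        self.properties = {}
        self._current = None

    def handle_starttag(self, tag, attrs):
        # Remember the property name of the element that was just opened.
        self._current = dict(attrs).get("property")

    def handle_data(self, data):
        if self._current and data.strip():
            self.properties[self._current] = data.strip()
            self._current = None

    def handle_endtag(self, tag):
        self._current = None


collector = PropertyCollector()
collector.feed(RDFA_HTML)
print(collector.properties)
```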
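Because mixing formats is discouraged, it can help to check which formats a page already uses. The following is a heuristic sketch, not a validator: it looks for the JSON-LD script type, the Microdata attributes `itemscope` and `itemprop`, and the RDFa attributes `typeof` and `vocab`. The class and function names are made up for this example.

```python
from html.parser import HTMLParser


class FormatDetector(HTMLParser):
    """Record which structured data formats appear in a page (heuristic)."""

    def __init__(self):
        super().__init__()
        self.formats = set()

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "script" and attr_map.get("type") == "application/ld+json":
            self.formats.add("JSON-LD")
        if "itemscope" in attr_map or "itemprop" in attr_map:
            self.formats.add("Microdata")
        if "typeof" in attr_map or "vocab" in attr_map:
            self.formats.add("RDFa")


def detect_formats(html):
    detector = FormatDetector()
    detector.feed(html)
    return detector.formats


# Hypothetical page that mixes JSON-LD and Microdata.
page = """
<script type="application/ld+json">{"@type": "Book"}</script>
<div itemscope itemtype="https://schema.org/Book">
  <span itemprop="name">Example Book Title</span>
</div>
"""

found = detect_formats(page)
if len(found) > 1:
    print("Mixed formats found:", sorted(found))
```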
Which format is best to use?
Have you not yet implemented structured data? Then I recommend using JSON-LD. Thanks to support from Google, Bing (since July 2018), and Baidu, this format works for three of the big four search engines. Only the Russian search engine Yandex so far supports just Microdata and RDFa. If structured data has already been implemented using Microdata or RDFa, this can, of course, be maintained. However, you should regularly check whether the search engines still support these two formats.
Some markup types have additional requirements that you should consider during implementation. These specific guidelines are all in the ‘Feature Guide’ in the Google Developers Portal.
Common mistakes with structured data
Due to the many different guidelines, it is easy to make certain mistakes in the markup. Even Google’s testing tool for structured data isn’t always helpful. Some common mistakes are easy to overlook, so here’s a quick summary.
Ratings or reviews (review snippet)
While using (third-party) widgets to implement ratings or reviews is generally not a problem, there are a few things to keep in mind. It is especially important that the review markup always refers to the main content of a page. A common mistake is that a general review of a webshop is displayed site-wide via a widget and can therefore also be seen on individual product pages.
Since this review relates to the webshop itself and not to the individual products, Google can consider it spam. In addition, reviews that do not come from the website itself should always contain a link to the source. It must be clear to users and search engines where these reviews come from.
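A simple check for this mistake could look like the sketch below. It assumes the JSON-LD items of a product page have already been parsed into dictionaries and that each `@type` is a single string. The list of shop-level types is an assumption for this example, not an official rule set.

```python
# Types that describe the shop as a whole rather than a single product.
# This selection is an assumption for the sketch.
SHOP_TYPES = {"Organization", "Store", "OnlineStore"}


def misplaced_shop_ratings(page_items):
    """Return the types of shop-level items that carry a rating on this page."""
    return [
        item["@type"]
        for item in page_items
        if "aggregateRating" in item
        and isinstance(item.get("@type"), str)
        and item["@type"] in SHOP_TYPES
    ]


# Hypothetical product page where a site-wide widget adds the shop rating.
product_page = [
    {"@type": "Product", "name": "Example Book Title"},
    {
        "@type": "Organization",
        "name": "Example Shop",
        "aggregateRating": {
            "@type": "AggregateRating",
            "ratingValue": "4.8",
            "reviewCount": "2100",
        },
    },
]

print(misplaced_shop_ratings(product_page))
```

A non-empty result on a product page suggests the rating belongs to the shop rather than to the page's main content.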
Invisible content for users
Marked-up content must be visible on desktop and mobile devices. The user must be able to recognize where on the page the content shown in a rich result is located. The only exception you have as a website administrator is to place content behind a tab, which opens to users after a mouse click. The condition is, of course, that the search engine bot can also read the tabbed content.
Misleading or inappropriate content
This may seem obvious, but the marked content must be relevant and fit a particular page’s content. For example, an all-inclusive trip to the Foo Fighters concert cannot be marked as an event. Marking the concert itself, on the other hand, is a correct implementation of the event markup. In short, be as specific as possible when tagging or marking structured data.
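A minimal sketch of correct event markup for the concert itself, again as a Python dictionary serialized to JSON-LD. The date, venue, and address are placeholders. The trip around the concert is deliberately not part of the event.

```python
import json

# Only the concert is described; travel and hotel do not belong in the event.
concert = {
    "@context": "https://schema.org",
    "@type": "MusicEvent",
    "name": "Foo Fighters live",
    "startDate": "2030-06-15T20:00:00+02:00",  # placeholder date
    "location": {
        "@type": "Place",
        "name": "Example Arena",  # placeholder venue
        "address": "Example Street 1, Amsterdam",
    },
    "performer": {"@type": "MusicGroup", "name": "Foo Fighters"},
}

concert_json = json.dumps(concert, indent=2)
print(concert_json)
```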
Is structured data relevant to SEO?
A frequently asked question from customers is how relevant structured data is for SEO. Like many SEO questions, this one can be answered with: it depends. In most cases, the implementation of structured data has no direct influence on a page’s position in a search engine. Google and co. do not judge a page differently if it is marked up with structured data. But in many cases, rich results increase the click-through rate because they provide more information and simply stand out more from the regular results. Search results with review stars, for example, stand out more than results without such markup. In this case, an indirect relevance for SEO can indeed be established.
In addition, John Mueller (Senior Webmaster Trends Analyst at Google) recently emphasized during a webmaster hangout that even when a certain type of structured data does not contribute to obtaining rich results, it can still help Google understand certain content.
Articles, recipes, and some other search features are a special case. If your content is correctly marked up, it has the chance to appear in special boxes, such as the recipe carousel. Such carousels are usually at the top of the first results page, and ranking high in Google is ultimately the goal of search engine optimization.
Structured data tools
Fortunately, there are numerous tools for structured data that can help webmasters with the implementation or even check the rich results gained.
If this topic is relatively new to you, Google’s Codelab will get you started pretty quickly. Based on practical examples, you walk step by step through the various options. Note: Currently, Google’s Codelab for structured data is only available in English.
Schema Markup Generator
The “Schema Markup Generator (JSON-LD)” from Technicalseo.com is suitable for both beginners and advanced users. It contains all the usual markup types, and you can create valid structured data code in a few clicks.
Rich Results Test
One of the essential tools for testing structured data is Google’s own test tool. Previously, this was mainly the Structured Data Testing Tool, but Google recently announced that this tool would be discontinued. It is not yet clear when, but Google has offered a replacement in the form of the Rich Results Test.
The problem with this change is that the Rich Results Test only tests for rich results, whereas the Structured Data Testing Tool also checks certain markup that does not (yet) lead to rich results. This is an annoying development for many SEOs and webmasters. The question is whether there will be a replacement or whether the Rich Results Test will be changed so that it also allows you to check and validate other structured data markup.
A browser bookmark that passes the current page to Google’s test tool makes checking even quicker. When you surf to a page you want to check for structured data, all you have to do is click on the bookmark. A new browser tab will immediately open the corresponding URL in Google’s test tool.
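The idea behind such a bookmark can be sketched as a small URL builder. It assumes that the Rich Results Test accepts the page address as a `url` query parameter at the address below. Both the address and the parameter are assumptions that should be verified before relying on them, and the function name is made up for this example.

```python
from urllib.parse import urlencode

# Assumed address of Google's Rich Results Test; verify before use.
RICH_RESULTS_TEST = "https://search.google.com/test/rich-results"


def build_test_tool_url(page_url):
    """Return a test tool link with the page address as an encoded parameter."""
    return RICH_RESULTS_TEST + "?" + urlencode({"url": page_url})


print(build_test_tool_url("https://www.example.com/books/example?id=1"))
```

The page address has to be percent-encoded, otherwise its own query string would be read as parameters of the test tool.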
Structured Data – conclusion
Since the introduction of structured data, new data types have been added regularly. Schema.org lists what is currently in beta and could potentially be added by default in the (near) future.
For example, Google added the speakable content type in August 2018 and has supported it ever since. This type of structured data is intended for content that voice assistants and smart speakers can read aloud easily. At the moment, the speakable property is only intended for publishers. It is expected to become available for all websites in the future.
Google is likely to expand the number of structured data options in the coming years. In addition, structured data no longer only affects the classic results on Google’s search results pages, as the example of speakable content shows. It enables websites to position themselves in other search areas and thus increase their own traffic.
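A minimal sketch of what speakable markup can look like, assuming a news article whose headline and summary are selected with CSS selectors. The headline, URL, and selectors are placeholders. `speakable`, `SpeakableSpecification`, and `cssSelector` are schema.org terms.

```python
import json

article = {
    "@context": "https://schema.org",
    "@type": "NewsArticle",
    "headline": "Example headline",
    "url": "https://www.example.com/news/example",
    "speakable": {
        "@type": "SpeakableSpecification",
        # Placeholder selectors for the parts suited to being read aloud.
        "cssSelector": [".article-headline", ".article-summary"],
    },
}

article_json = json.dumps(article, indent=2)
print(article_json)
```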