In this post we’re looking at a topic that a lot of people have requested: PDF content. PDF content is often inaccessible, creates a poor user experience, and yet somehow, it persists. Whether you’re trying to understand if PDFs are really that bad, or you’re trying to make the case for getting rid of them in your organisation, this post is for you. It covers:
- Why PDF content is bad for accessibility, usability, SEO, and more (with evidence)
- Why people (stakeholders) like them anyway and how to respond to their concerns
- Examples of organisations that are moving away from PDFs – even for reports, research and ‘long’ content
- Times it’s actually okay to use a PDF
Why PDFs are worse than HTML webpages
PDFs are not accessible (usually)
A study by the non-profit organisation WebAIM found that over 75% of screenreader users said that PDF documents are likely to pose significant accessibility issues. Source: WebAIM
The best reason for getting rid of PDFs is that they’re hardly ever accessible. PDFs aren’t inherently inaccessible, but they often end up being so because of mistakes with the way they’re created, like:
- Image-based content with no alt text so users with visual impairments can’t access the information
- Missing or poorly structured tags (like headings and lists) making the layout and order confusing for people using a screenreader
- Complicated layouts that, again, make the content hard to follow for people who use screenreaders
- Design mistakes, like fonts that are hard to read or colour combinations that don’t have enough contrast.
So if you’re putting information into a PDF, you could be excluding disabled people. It’s even worse if that information is important stuff, like manuals, reports, research, job descriptions, forms, etc.
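The colour contrast point above is one of the few items on that list you can check with arithmetic. Here is a minimal sketch, assuming you measure contrast the way WCAG 2.x defines it (relative luminance and a 4.5:1 minimum for normal-sized text at level AA). The post itself doesn’t name a standard, so treat that threshold as an assumption, and the function names are made up for illustration.

```python
def _linearise(channel_8bit: int) -> float:
    """Convert one 8-bit sRGB channel to its linear value (WCAG 2.x formula)."""
    c = channel_8bit / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_colour: str) -> float:
    """Relative luminance of a colour written as '#RRGGBB'."""
    hex_colour = hex_colour.lstrip("#")
    r, g, b = (int(hex_colour[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linearise(r) + 0.7152 * _linearise(g) + 0.0722 * _linearise(b)


def contrast_ratio(foreground: str, background: str) -> float:
    """Contrast ratio between two colours, from 1.0 (identical) to 21.0 (black on white)."""
    l1 = relative_luminance(foreground)
    l2 = relative_luminance(background)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)


def passes_aa_body_text(foreground: str, background: str) -> bool:
    """Assumed threshold: 4.5:1, the WCAG 2.x AA minimum for normal-sized text."""
    return contrast_ratio(foreground, background) >= 4.5


if __name__ == "__main__":
    for fg in ("#000000", "#767676", "#777777"):
        ratio = contrast_ratio(fg, "#FFFFFF")
        print(f"{fg} on white: {ratio:.2f}:1, passes AA: {passes_aa_body_text(fg, '#FFFFFF')}")
```

A check like this only covers colour. It says nothing about tags, reading order or alt text, which still need checking in the PDF itself.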
People don’t like PDFs and they’re hard to use
“Burying information in PDFs means that most people won’t read it. Participants in several of our recent usability studies on corporate websites and intranets did not appreciate PDFs and skipped right over them. They complained woefully whenever they encountered PDF files and many who opened PDFs quickly abandoned them.” Source: Nielsen Norman Group
The second best reason to get rid of PDFs is that a lot of people really don’t like them (and plenty of people actively hate them) because they create an awkward, clunky experience. Again, it’s not the fault of the format itself – just the way we’re using it. PDFs weren’t designed to be part of a web experience. They were intended as a way to share documents across operating systems while preserving formatting. But when they’re used online they are:
- Hard to skim read (especially compared to a well-formatted HTML page)
- Not responsive and hard to read on mobile
- Jarring because they don’t look or feel the same as a website, navigation is inconsistent, and you might need to download software to read them.
PDFs probably won’t perform as well in organic search as HTML pages
Google can read PDF content, and will index it and show it in search engine results. It will even try to read text in image content in your PDFs using Optical Character Recognition (OCR). Source: Google
It’s often said that PDFs don’t get indexed by search engines – but it’s not true. However, you might find that PDFs aren’t a great choice for SEO. This is because when they’re created, people often miss a lot of the steps that are needed to make them search friendly (which also overlap with the ones that make the PDF accessible) like:
- Adding a title and a meta description
- Structuring and tagging it properly (alt text, H1, H2, H3, etc.)
- Giving it a readable, relevant file name
Also, you can’t add the structured data/Schema markup that you need to get Rich Snippets or Rich Results (where things like user ratings, product details, event details, get pulled into the search results).
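To make the structured data point concrete, here is a rough sketch of the kind of markup an HTML page can carry and a PDF can’t. It assumes the page is a report described with the schema.org `Article` type in JSON-LD. The function name, the choice of fields and the example values are all made up for illustration, and whether a search engine shows a rich result for any given page is up to the search engine.

```python
import json


def article_json_ld(headline: str, date_published: str, publisher_name: str) -> str:
    """Build a JSON-LD script tag describing a page as a schema.org Article.

    The fields chosen here are an illustrative subset, not a complete
    or required set.
    """
    data = {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "datePublished": date_published,  # ISO 8601 date, e.g. "2023-06-01"
        "publisher": {"@type": "Organization", "name": publisher_name},
    }
    # Escape "</" so a stray "</script>" in the data can't close the tag early.
    # "\/" is a valid JSON escape, so the payload still parses.
    payload = json.dumps(data, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(article_json_ld("Annual report 2023", "2023-06-01", "Example Organisation"))
```

The output goes in the HTML of the page. There’s nowhere equivalent to put it in a PDF, which is the limitation described above.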
Lots of PDFs don’t get downloaded – ever
“Nearly one-third of their PDF reports had never been downloaded, not even once. Another 40 percent of their reports had been downloaded fewer than 100 times. Only 13 percent had seen more than 250 downloads in their lifetimes.” Source: World Bank/Washington Post
It’s an old report, but this World Bank case study is good evidence that many PDFs never get used, most likely because they’re inaccessible, offer a poor user experience, and aren’t great for search.
They’re hard to update, and that can create risk
If all that wasn’t enough, PDFs are easier to forget about and tend to be harder to update than an HTML page.
You might need access to specialised software to edit a PDF, then you’ve got to upload the new file and make sure every link points to the new version. Logging into a CMS and making edits is a lot easier. And content that’s hard to update gets ignored and goes out-of-date, and that creates risk.
Anecdotally – based on working with a lot of organisations over the years – I can tell you that:
- Content that’s in a PDF is more likely to be out-of-date than content that’s on an HTML page and can be edited via a CMS
- Some content owners will put off editing content that’s in a PDF because it’s harder
- The biggest content-based liabilities I’ve ever seen have been in PDFs (think out-of-date advice that could have led to death)
The reasons stakeholders like PDFs don’t hold up
Here’s the tricky thing about PDFs – some of your stakeholders probably really like them. They’re going to create them, ask for them, and feel nervous when you try and move away from them. So you’re going to have to work with them to try and assuage their fears. Here are some objections you might hear, and how to respond:
- “They’re universal and consistent. You can read a PDF on any device or operating system and it will look good”
  - The same is true of a good website – more true even, because PDFs are rarely mobile responsive
- “You have more design freedom with a PDF. You can make it look great with fancy layouts, fonts, images. Our website doesn’t do what I want it to do.”
  - Fancy isn’t always readable or accessible. Investing in a content strategy, content model, and design will create a website that does what you need it to do.
- “It’s an application form – it has to be a PDF.”
  - PDFs are hard to fill in for lots of users. And having to download, complete and reupload/send a form creates extra work for the user compared to completing a form on an HTML page.
- “A PDF is more official and professional.”
  - If your website doesn’t look official or trustworthy enough you’ve got big problems that you need to address as soon as possible. Occasionally there might be a legal or compliance reason that compels you to use a PDF, but they’re not common for most organisations.
- “It’s too long to be a webpage. No one will scroll all the way to the bottom.”
  - People scroll: People use the scrollbar on 76% of pages. 66% of attention on a normal media page is spent below the fold. On mobile, half of users start scrolling within 10 seconds and 90% within 14 seconds. Users can read long, scrolling pages faster than paginated ones. (Source: Multiple via UX Myths)
- “They’re easier for us to make than a webpage”
  - This is a tricky one. If your reliance on PDFs is because your content owners don’t have the skills or access to create a webpage, creating a Word document and then converting it to a PDF might be the only way they can get something out. This will take more time and a bigger effort to address – but it’s worth it.
You can get rid of even the most stubborn PDF content
The kinds of content where I see people clinging onto PDFs most stubbornly are reports and research; things like annual reports and lengthy research papers. But this is all about habit and familiarity, rather than PDF being the best format – as per all the points in the section above.
If your stakeholders (or you) are struggling to imagine what non-PDF reports, research and long content could look like, here are some good examples:
- Fact checking charity Full Fact has been publishing its reports as web pages (with an accompanying PDF) for a few years now – see an example
- International affairs think tank Chatham House has a slick content type for publishing its long, detailed reports – see an example
- Healthcare policy organisation The Health Foundation also has a smart content type for its reports and long reads (again with the option to download a PDF) – see an example
There are a few scenarios where PDFs are fine
Begrudgingly, I can admit there are a few scenarios when PDFs are fine. It comes down to this: if the user needs to print the information or use it offline, a PDF might be the right choice.
This is an extreme example, but I was quizzing a stakeholder about why they had so many PDFs recently, and they told me that it was likely that users would need to reference the information while in an underground bunker with no phone signal or wifi. In that scenario, a PDF makes sense (but so does a Word document).
If you do need to share a PDF, just make sure it’s accessible. Go to the source: Adobe has information on how to do this.
Make PDFs secondary or use gateway pages
If you are going to use PDFs – whether it’s because it’s the right format or just to placate a stakeholder – you can make the experience better and less jarring for users by making the PDF secondary or using gateway pages.
By making the PDF secondary, I mean offering it as an option alongside the same content in HTML.
By a gateway page, I mean a page that hosts the PDFs, summarises the content, and makes it very clear to the user that they’re about to download a document. This Nielsen Norman Group article explains gateway pages in more detail.
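If it helps to picture it, here is a minimal sketch of the core of a gateway page: a title, a short summary, and a download link that says up front it’s a PDF and how big it is. Putting the format and file size in the link text is my assumption about what “very clear” looks like in practice, and the function names, markup and example values are hypothetical rather than taken from any particular CMS.

```python
from html import escape


def human_size(num_bytes: int) -> str:
    """Format a file size as a short label, e.g. '50 KB' or '2.4 MB'."""
    if num_bytes < 1024 * 1024:
        return f"{max(1, round(num_bytes / 1024))} KB"
    return f"{num_bytes / (1024 * 1024):.1f} MB"


def gateway_section(title: str, summary: str, pdf_url: str, pdf_bytes: int) -> str:
    """Return an HTML fragment that summarises a document and links to its PDF.

    The link text names the format and size so nobody is surprised by a download.
    """
    label = f"Download {title} (PDF, {human_size(pdf_bytes)})"
    return (
        "<section>\n"
        f"  <h2>{escape(title)}</h2>\n"
        f"  <p>{escape(summary)}</p>\n"
        f'  <a href="{escape(pdf_url, quote=True)}">{escape(label)}</a>\n'
        "</section>"
    )


if __name__ == "__main__":
    # Hypothetical document, for illustration only.
    print(gateway_section(
        "Annual report 2023",
        "A summary of what we did this year and what it cost.",
        "/files/annual-report-2023.pdf",
        2_500_000,
    ))
```

In practice this would be a template in your CMS rather than a script, but the ingredients are the same: summary first, then a clearly labelled link.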
You don’t have to make the switch from PDF to HTML overnight
The final thing I’ll leave you with is this: you don’t have to make the switch from PDF to HTML overnight. It can be a gradual journey.
Taking things slowly means you can test and learn. Try moving one piece of PDF content to HTML, gather some data and see what the impact is. Test out a gateway page versus having HTML with the option to download a PDF. Ask your users what they think. Take your findings to your stakeholders and show them why and how HTML is better. Start upskilling people so they can create a webpage, not just a PDF.
Thanks for reading.