Adobe PDF Document
If you upload your content as a PDF, you will find that most of its formatting is lost upon conversion. For this reason, we do not recommend uploading your content in PDF format.
There are a number of differences between the PDF format and the format Amazon KDP uses to display text on-screen. If you own the source from which the PDF was generated (such as a Microsoft Word .doc), we strongly suggest that you use the program you originally used to create the document to export or save the content as HTML. If saving as HTML isn't possible, try to export or save the content in a format that Microsoft Word can open, and save it as HTML from there. If you cannot get your content out of the PDF format, the following answers may help to explain the results you are getting.
1. Why doesn't Amazon KDP retain the exact layout of my content in PDF format?
Conversion usually cannot retain exact formatting because PDF is a fixed-layout format designed for printing, whereas Amazon KDP uses a reflowable format designed to be displayed on screens that differ greatly in size and aspect ratio.
2. Why is so much of the formatting lost?
The problem is that PDF is a "destructive" format in the sense that it discards all high-level information. A PDF stores characters and the coordinates of each character or word, so the information it contains is very much like a scan of the printed pages. The notions of words, lines, and paragraphs do not exist in the format, much less the flow of the text (what follows what). To convert a PDF document to a reflowable format that can be viewed on different screen sizes at any font size, our software has to apply "artificial intelligence"-like algorithms that simulate the way a human reader familiar with Western typographical conventions would read the text. Using these algorithms, it reassembles characters into words, words into lines, lines into paragraphs, paragraphs into columns, and so on. Even with these complex algorithms, some of the formatting is lost. The goal of our software is to extract the text from the PDF without corrupting it while retaining a minimum of formatting, which is already an achievement.
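To illustrate the kind of reassembly described above, here is a minimal Python sketch. It is not the conversion software's actual algorithm. The `Glyph` record, the tolerance values, and the function names are all assumptions made for illustration, and the sketch assumes PDF-style coordinates in which y increases toward the top of the page. It groups positioned characters into lines by baseline and inserts a space wherever the horizontal gap between two characters is large.

```python
from dataclasses import dataclass


@dataclass
class Glyph:
    """Hypothetical record for one positioned character on a page."""
    char: str
    x: float      # left edge of the character
    y: float      # baseline position (assumed to increase upward)
    width: float  # horizontal advance of the character


def group_into_lines(glyphs, y_tolerance=2.0):
    """Cluster glyphs whose baselines lie within y_tolerance of each other."""
    lines = []
    # Visit glyphs from the top of the page to the bottom.
    for g in sorted(glyphs, key=lambda g: (-g.y, g.x)):
        if lines and abs(lines[-1][0].y - g.y) <= y_tolerance:
            lines[-1].append(g)
        else:
            lines.append([g])
    # Within each line, restore left-to-right reading order.
    return [sorted(line, key=lambda g: g.x) for line in lines]


def line_to_text(line, gap_factor=0.3):
    """Join glyphs, guessing a word break wherever the gap is wide."""
    out = []
    prev = None
    for g in line:
        if prev is not None:
            gap = g.x - (prev.x + prev.width)
            if gap > gap_factor * prev.width:
                out.append(" ")
        out.append(g.char)
        prev = g
    return "".join(out)


def extract_text(glyphs):
    """Return the page text as a list of reconstructed lines."""
    return [line_to_text(line) for line in group_into_lines(glyphs)]
```

Even this toy version depends on guessed thresholds (`y_tolerance`, `gap_factor`), which hints at why real documents with multiple columns, varying fonts, or unusual spacing are converted imperfectly.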
3. Why are tables not converted?
As mentioned in the previous answer, information about paragraphs, columns, tables, etc. is not stored in PDF files. A table, for instance, exists only as lines drawn at various locations on the page. As of today, our software is not able to determine that these lines form a table.
4. Why are images not the same size as in the PDF?
Images may differ in size either because they had a different DPI (dots per inch, a measure of resolution) in the PDF or because the conversion process had to resize them to keep the size of the file reasonable.
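The arithmetic behind both causes can be sketched in a few lines of Python. The function names and the size limits used here are hypothetical; the limits the conversion process actually applies are not stated on this page.

```python
def printed_width_inches(pixel_width, dpi):
    """Physical width of an image when rendered at a given DPI."""
    return pixel_width / dpi


def scale_to_fit(width, height, max_width, max_height):
    """Shrink pixel dimensions to fit a bounding box, keeping the aspect ratio.

    max_width and max_height are illustrative limits, not documented values.
    Images that already fit are returned unchanged.
    """
    scale = min(max_width / width, max_height / height, 1.0)
    return round(width * scale), round(height * scale)
```

For example, the same 600-pixel-wide image is 2 inches wide at 300 DPI but more than 8 inches wide at 72 DPI, and `scale_to_fit(1200, 800, 600, 800)` halves both dimensions to fit the assumed limit.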
5. Wouldn't third-party tools do a much better job at converting PDF to HTML?
If you are dissatisfied with the conversion results, we suggest you try other tools to convert your PDF content to HTML yourself, as you may get better results. As noted above, if you own the original format that the content was created in, try to use that to "Save As" or Export the content as HTML, and upload the HTML to Amazon KDP instead.
6. Is there anything I can do to get better results?
Use the source document whenever possible. If your PDF was produced by converting a Microsoft Word or RTF document, for instance, use the Word or RTF import feature instead; it will give much better results. If your document was created with another publishing tool, try to find a "Save as" or an "Export" feature. If you can export as HTML, RTF, or Microsoft Word, re-importing that in the Creator will give much better results than importing the PDF file.
7. Why does PDF conversion lose so much information about formatting?
When storing digital assets, the key point is to store as much information as possible. The PDF format stores very little information and is not editable. As a general rule, be sure to have your digital assets in a structured format (XML, XHTML, Microsoft Word, raw text, etc.). This will allow you to reuse them in different types of electronic publications.
Please review our list of supported formats