You can now upload PDF files in the Projects App and translate them into your preferred format at no additional cost (i.e., without the use of Project Services).
How to Use PDF Translation
1 - Upload your PDF to the Projects App.
2 - Select your Pipeline Group.
3 - Choose the pipelines you want to translate to.
4 - Select the desired output format by choosing the correct file filter:
- PDF (default)
- Word (.docx)
- PowerPoint (.pptx)
- Plain text (.txt)
5 - Submit your project as usual.
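If you script any file handling around these steps, the output formats from step 4 can be kept as a small lookup. The sketch below is only an illustration: the dictionary keys and the `output_filename` helper are hypothetical names rather than part of the Projects App, and the `.pdf` extension for the default option is an assumption.

```python
from pathlib import Path

# Output formats listed in step 4, mapped to file extensions.
# The keys are hypothetical labels chosen for this sketch.
OUTPUT_EXTENSIONS = {
    "pdf": ".pdf",          # default output (extension assumed)
    "word": ".docx",
    "powerpoint": ".pptx",
    "plain_text": ".txt",
}


def output_filename(source_pdf: str, output_format: str = "pdf") -> str:
    """Return the file name the translated output would carry."""
    if output_format not in OUTPUT_EXTENSIONS:
        raise ValueError(f"Unknown output format: {output_format!r}")
    # Swap the source extension for the one matching the chosen format.
    return str(Path(source_pdf).with_suffix(OUTPUT_EXTENSIONS[output_format]))


print(output_filename("brochure.pdf", "word"))
```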
Limitations
PDF processing comes with additional challenges due to the nature of the file format. Complex PDFs in particular are harder to process and may come out with formatting inconsistencies.
- Formatting limitations may occur with non-Latin scripts, tables, emojis, and non-standard fonts.
- Image quality may be reduced in the output.
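If you already have the text of a PDF available (for example, copied from the document), a rough pre-check can flag two of the risk factors above: non-Latin scripts and emojis. The sketch below is a heuristic under stated assumptions, not a feature of the Projects App. The function name is hypothetical, letters are classified by their Unicode character name, and emojis are approximated by the Unicode "Symbol, other" category, which also includes some non-emoji symbols. It does not detect tables, non-standard fonts, or image quality.

```python
import unicodedata


def flag_risky_characters(text: str) -> dict[str, list[str]]:
    """Collect characters that may hit the formatting limitations."""
    non_latin: list[str] = []
    symbols: list[str] = []
    for ch in text:
        if ch.isalpha():
            # Letters whose Unicode name lacks "LATIN" belong to another script.
            if "LATIN" not in unicodedata.name(ch, ""):
                non_latin.append(ch)
        elif unicodedata.category(ch) == "So":
            # "So" (Symbol, other) covers most emojis; used here as an approximation.
            symbols.append(ch)
    return {"non_latin": non_latin, "symbols": symbols}


report = flag_risky_characters("Hello мир 😀")
print(report)
```

An empty result does not guarantee clean output; it only means these two character-level risk factors were not found in the text you supplied.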
For High-Quality Formatting Needs
If you require perfect formatting for your complex content, we recommend continuing to use DTP (Desktop Publishing) and PDF Handling Services for the best results.
What can I expect from the translation?
Depending on the output format you select, the translated file may look different from the source and may retain different parts of its content. Each output format is described below.
PDF to PDF translation
The PDF to PDF converter successfully extracts and translates all main content, including tables, headers, footers, and text formatting (bold, italic, underline). It excludes document properties, comments, and graphical metadata. If the original file contains revisions, they are automatically accepted and included in the output.
What works well
- Styles like bold, italic, and underline are preserved.
- Revisions are accepted automatically.
- Headers and footers are translated.
Limitations
- Layout may shift due to text expansion, especially in tables.
- Fonts are replaced with defaults, which may impact visual consistency.
- Hyperlinks and other interactive elements may not be preserved.
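Text expansion is the limitation most likely to move things around, so it can help to estimate it where you have source and translated strings to compare, for example from earlier translations of similar content. The sketch below is a simple character-count comparison. The function names are hypothetical, and the 1.3 threshold is an arbitrary illustrative value, not a Projects App setting.

```python
def expansion_ratio(source: str, translated: str) -> float:
    """Length of the translated text relative to the source text."""
    if not source:
        raise ValueError("source text must not be empty")
    return len(translated) / len(source)


def strings_at_risk(pairs, threshold=1.3):
    """Return the (source, translated) pairs that grow beyond the threshold."""
    return [(s, t) for s, t in pairs if expansion_ratio(s, t) > threshold]


# Table cells are the most sensitive, so short labels are a useful test set.
cells = [
    ("Speed limit", "Geschwindigkeitsbegrenzung"),
    ("Total", "Total"),
]
for source, translated in strings_at_risk(cells):
    print(f"{source!r} -> {translated!r}")
```

Character counts are only a proxy for rendered width, which also depends on the font used in the output.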
Best suited for
- High-quality PDFs with simple layouts. For better control of the layout, use DTP or convert the PDF to DOCX.
PDF to TXT
The PDF to TXT converter is for extracting plain text from PDF files. It’s ideal for use cases where layout, styling, and interactivity are not required.
What works well
- All text is extracted, including headers and footers.
- All formatting tags are removed.
Limitations
- No preservation of layout, images, styles (e.g., bold, italic), or interactive elements.
- Text order may not be preserved in more complex documents because of multiple layers.
- Only plain text is extracted; visual structure and design are lost.
Best suited for
- Use cases focused on raw text extraction for linguistic processing, search indexing, or integration with other backend services. Not recommended if formatting or visual design matters.
PDF to DOCX
The PDF to DOCX converter successfully extracts and translates all main content, including tables, headers, footers, and text formatting (bold, italic, underline). It excludes document properties, comments, and graphical metadata. If the original file contains revisions, they are automatically accepted and included in the output.
What works well
- Styles like bold, italic, and underline are preserved.
- Revisions are accepted automatically.
- Headers and footers are translated.
- The resulting DOCX is editable, making it easier for teams to fix or adjust content.
- Suitable for presentation-style outputs.
Limitations
- Layout may shift due to text expansion, especially in tables.
- Fonts are replaced with defaults, which may impact visual consistency.
- Hyperlinks and other interactive elements may not be preserved.
- Output is best suited to internal use, or to cases where the document will be manually cleaned up.
Best suited for
- Use cases where editable and visually structured output is important, and some manual cleanup is acceptable.
- Ideal for internal documents, or for customer-facing material after adjustments.
PDF to PPTX
The PDF to PPTX converter provides a visually accurate and editable output, making it one of the more flexible options for post-processing. It works best with simple PDFs but can also handle moderately complex documents with acceptable results.
What works well
- All content is extracted and converted.
- Layout and visual structure are generally well preserved.
- The resulting PPTX is editable, making it easier for teams to fix or adjust content.
- Suitable for presentation-style outputs.
Limitations
- Fonts are often replaced with defaults, affecting visual consistency.
- Text styles like bold, italic, and color may not be preserved in complex documents.
- Text expansion may cause layout shifts or overlapping content.
Best suited for
- Use cases where editable and visually structured output is important, and some manual cleanup is acceptable.
- Ideal for internal presentations, design reviews, or customer-facing material requiring adjustments.
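The "Best suited for" guidance across the four output formats can be condensed into a small decision helper. The sketch below is one possible reading of that guidance, not an official selection rule. The function name and parameters are hypothetical.

```python
def recommend_output(
    layout_matters: bool,
    editable: bool,
    presentation_style: bool = False,
    needs_perfect_formatting: bool = False,
) -> str:
    """Suggest an output option based on the guidance in this article."""
    if needs_perfect_formatting:
        # Complex content that must look perfect is outside PDF translation's scope.
        return "DTP and PDF Handling Services"
    if not layout_matters:
        # Raw text extraction: search indexing, linguistic processing, backend use.
        return "TXT"
    if editable:
        # Both formats are editable; PPTX fits presentation-style output.
        return "PPTX" if presentation_style else "DOCX"
    # High-quality PDFs with simple layouts.
    return "PDF"


print(recommend_output(layout_matters=True, editable=True))
```

The order of the checks reflects a design choice: the requirement for perfect formatting is tested first because none of the four output formats promises it.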