PDF to RTF in PHP

A pure PHP library for reading and writing word processing documents

PHPOffice/PHPWord

PHPWord is a library written in pure PHP that provides a set of classes to write to and read from different document file formats. The current version of PHPWord supports Microsoft Office Open XML (OOXML or OpenXML), OASIS Open Document Format for Office Applications (OpenDocument or ODF), Rich Text Format (RTF), HTML, and PDF.

PHPWord is an open source project licensed under the terms of LGPL version 3. PHPWord aims to be a high-quality software product by incorporating continuous integration and unit testing. You can learn more about PHPWord by reading the Developers' Documentation.

If you have any questions, please ask on Stack Overflow.

With PHPWord, you can create OOXML, ODF, or RTF documents dynamically using your PHP scripts. Below are some of the things that you can do with the PHPWord library:

  • Set document properties, e.g. title, subject, and creator.
  • Create document sections with different settings, e.g. portrait/landscape, page size, and page numbering
  • Create headers and footers for each section
  • Set default font type, font size, and paragraph style
  • Use UTF-8 and East Asia fonts/characters
  • Define custom font styles (e.g. bold, italic, color) and paragraph styles (e.g. centered, multicolumns, spacing) either as named style or inline in text
  • Insert paragraphs, either as a simple text or complex one (a text run) that contains other elements
  • Insert titles (headers) and table of contents
  • Insert text breaks and page breaks
  • Insert and format images, either local, remote, or as page watermarks
  • Insert binary OLE Objects such as Excel or Visio
  • Insert and format tables with customized properties for each row (e.g. repeat as header row) and cell (e.g. background color, rowspan, colspan)
  • Insert list items as bulleted, numbered, or multilevel
  • Insert hyperlinks
  • Insert footnotes and endnotes
  • Insert drawing shapes (arc, curve, line, polyline, rect, oval)
  • Insert charts (pie, doughnut, bar, line, area, scatter, radar)
  • Insert form fields (textinput, checkbox, and dropdown)
  • Create document from templates
  • Use XSL 1.0 style sheets to transform headers, main document part, and footers of an OOXML template
  • ... and many more features in progress

PHPWord requires the following:

  • PHP 7.1+
  • XML Parser extension
  • Laminas Escaper component
  • Zip extension (optional, used to write OOXML and ODF)
  • GD extension (optional, used to add images)
  • XMLWriter extension (optional, used to write OOXML and ODF)
  • XSL extension (optional, used to apply XSL style sheets to templates)
  • dompdf library (optional, used to write PDF)
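
A quick way to check these requirements on a given machine is to ask PHP itself. The sketch below is only an illustration: the helper name missing_extensions is hypothetical, and the extension names passed to it are an assumption about how PHP reports the extensions in the list above.

```php
<?php
// Hypothetical helper: returns the names of the extensions that are not loaded.
function missing_extensions(array $required): array
{
    $missing = array();
    foreach ($required as $name) {
        if (!extension_loaded($name)) {
            $missing[] = $name;
        }
    }
    return $missing;
}

// PHP 7.1+ is the stated minimum version.
if (version_compare(PHP_VERSION, '7.1.0', '<')) {
    echo 'PHP 7.1 or later is required, found ' . PHP_VERSION . PHP_EOL;
}

// Assumed extension names; every one except "xml" is optional per the list above.
$missing = missing_extensions(array('xml', 'zip', 'gd', 'xmlwriter', 'xsl'));
if ($missing) {
    echo 'Not loaded: ' . implode(', ', $missing) . PHP_EOL;
} else {
    echo 'All listed extensions are loaded.' . PHP_EOL;
}
```

A missing optional extension only matters if you use the feature it supports, for example Zip when writing OOXML or ODF files.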

PHPWord is installed via Composer. To add PHPWord as a dependency of your project, run the following to use the latest stable version:

composer require phpoffice/phpword

or, if you want the latest unreleased version:

composer require phpoffice/phpword:dev-master

The following is a basic usage example of the PHPWord library.

<?php
require_once 'bootstrap.php';

// Creating the new document.
$phpWord = new \PhpOffice\PhpWord\PhpWord();

/* Note: any element you append to a document must reside inside of a Section. */

// Adding an empty Section to the document.
$section = $phpWord->addSection();

// Adding Text element to the Section having font styled by default.
$section->addText(
    '"Learn from yesterday, live for today, hope for tomorrow. '
        . 'The important thing is not to stop questioning." '
        . '(Albert Einstein)'
);

/*
 * Note: it's possible to customize font style of the Text element you add in three ways:
 * - inline;
 * - using named font style (new font style object will be implicitly created);
 * - using explicitly created font style object.
 */

// Adding Text element with font customized inline.
$section->addText(
    '"Great achievement is usually born of great sacrifice, '
        . 'and is never the result of selfishness." '
        . '(Napoleon Hill)',
    array('name' => 'Tahoma', 'size' => 10)
);

// Adding Text element with font customized using named font style.
$fontStyleName = 'oneUserDefinedStyle';
$phpWord->addFontStyle(
    $fontStyleName,
    array('name' => 'Tahoma', 'size' => 10, 'color' => '1B2232', 'bold' => true)
);
$section->addText(
    '"The greatest accomplishment is not in never falling, '
        . 'but in rising again after you fall." '
        . '(Vince Lombardi)',
    $fontStyleName
);

// Adding Text element with font customized using explicitly created font style object.
$fontStyle = new \PhpOffice\PhpWord\Style\Font();
$fontStyle->setBold(true);
$fontStyle->setName('Tahoma');
$fontStyle->setSize(13);
$myTextElement = $section->addText('"Believe you can and you\'re halfway there." (Theodore Roosevelt)');
$myTextElement->setFontStyle($fontStyle);

// Saving the document as OOXML file.
$objWriter = \PhpOffice\PhpWord\IOFactory::createWriter($phpWord, 'Word2007');
$objWriter->save('helloWorld.docx');

// Saving the document as ODF file.
$objWriter = \PhpOffice\PhpWord\IOFactory::createWriter($phpWord, 'ODText');
$objWriter->save('helloWorld.odt');

// Saving the document as HTML file.
$objWriter = \PhpOffice\PhpWord\IOFactory::createWriter($phpWord, 'HTML');
$objWriter->save('helloWorld.html');

/* Note: we skip RTF, because it's not XML-based and requires a different example. */
/* Note: we skip PDF, because "HTML-to-PDF" approach is used to create PDF documents. */
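
The listing leaves RTF out. Since RTF is among the formats PHPWord writes, saving a document as RTF can be sketched as below. The helper name save_as_rtf and the output file name are hypothetical, and the sketch assumes PHPWord was installed with Composer and its autoloader is already loaded by the calling script. Note that this writes RTF from a document built in PHP; it does not read an existing PDF.

```php
<?php
// Assumption: vendor/autoload.php has already been required by the caller.
use PhpOffice\PhpWord\IOFactory;
use PhpOffice\PhpWord\PhpWord;

// Hypothetical helper: write an in-memory PHPWord document to an RTF file.
function save_as_rtf(PhpWord $phpWord, string $path): void
{
    $writer = IOFactory::createWriter($phpWord, 'RTF');
    $writer->save($path);
}

// Usage, skipped when PHPWord is not available.
if (class_exists(PhpWord::class)) {
    $phpWord = new PhpWord();
    $section = $phpWord->addSection();
    $section->addText('Hello, RTF!');
    save_as_rtf($phpWord, 'helloWorld.rtf');
}
```

The writer name 'RTF' follows the same IOFactory::createWriter() pattern the listing uses for 'Word2007', 'ODText', and 'HTML'.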

More examples are provided in the samples folder. For easy access to those samples, run php -S localhost:8000 in the samples directory, then browse to http://localhost:8000 to view them. You can also read the Developers' Documentation for more detail.

We welcome everyone to contribute to PHPWord. Below are some of the things that you can do to contribute.

  • Read our contributing guide.
  • Fork us and request a pull to the master branch.
  • Submit bug reports or feature requests to GitHub.
  • Follow @PHPWord and @PHPOffice on Twitter.

Convert PDF to RTF via Free App or PHP

GroupDocs.Conversion Cloud is a cloud-based document conversion service that lets developers convert documents to and from more than 153 file formats, including PDF, RTF, DOCX, XLSX, PPTX, HTML, and EPUB.

GroupDocs.Conversion Cloud provides free apps and REST APIs that can be integrated into web and mobile applications, so developers can add document conversion without installing any software locally. The service converts PDFs, word-processing documents, spreadsheets, presentations, images, RTF, and other file types, and offers options such as choosing the output format and applying watermarks.

With GroupDocs.Conversion Cloud, developers can automate tasks such as converting documents for archival purposes, generating reports, extracting text and images from documents, and adding conversion steps to an existing workflow. The service aims to give businesses accurate conversions that streamline document processing.

How to convert PDF to RTF

  • Open the PDF to RTF app and select a PDF file, or simply drag and drop it.
  • Click the Convert button to upload the PDF and convert it to an RTF file.
  • Click the Save button when it appears after a successful conversion.
  • That is all! You can use your converted RTF document as needed.

Frequently Asked Questions (FAQ)

How can I create my own app that converts PDF to RTF?

Check our SDKs at GitHub if you are looking for the source code to convert PDF file format to RTF in the Cloud.

Can I try PDF to RTF conversion for free?

The GroupDocs.Conversion Cloud app is completely free. You can convert as many PDF files to RTF as you like. If you are a developer and want to integrate this feature into your own app, you can try the GroupDocs.Conversion low-code APIs without any limitations.

I do not want to upload my confidential PDF or RTF files anywhere. What are my options?

GroupDocs.Conversion Cloud is also available as a Docker image that can be used to self-host the service. Alternatively, you can build your own services using the GroupDocs.Conversion high-code APIs, which currently drive both the free PDF and RTF conversion app and the REST APIs.

ruiicheese

I'm using the PDF parser from https://www.pdfparser.org/. It is one of the easiest and most accurate PDF parsers. You can download it from https://github.com/smalot/pdfparser.

  • This library requires PHP 5.3.
  • PDFParser is built on top of the TCPDF parser.
  • Composer downloads the library automatically from the command line.

convert_pdf_to_text.php – A PHP file to call the PDF parser.

<?php
// Assumed setup (not in the recovered listing): Composer autoloader and parser instance.
// $server_file must hold the path to the PDF file.
require 'vendor/autoload.php';

$parser = new \Smalot\PdfParser\Parser();

try {
    $pdf = $parser->parseFile($server_file);
} catch (\Exception $e) {
    // parseFile() throws when the file cannot be parsed as a PDF.
    $pdf = null;
}

if ($pdf !== null) {
    $original_text = $pdf->getText();
    if ($original_text != "") {
        $text = nl2br($original_text);          // Mark line breaks with <br />
        $text = clean_ascii_characters($text);  // Normalize dashes, drop control and non-ASCII bytes
        $text = preg_replace('/(?:<br\s*\/?>\s*){2,}/i', '<br />', $text); // Optional: collapse repeated breaks
        $text = addslashes($text);              // Escape quotes and backslashes ...
        $text = stripslashes($text);            // ... then undo the escaping
        $text = strip_tags($text);              // Remove any remaining HTML tags

        // Additional step to check formatting issues.
        // PDF extraction can leave words that are:
        // (a) joined, e.g. HelloWorld!Thereisnospacingbetweenwords
        // (b) split, e.g. H e l l o W o r l d ! E x c e s s i v e s p a c i n g
        $check_text = preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY);

        $no_spacing_error = 0;
        $excessive_spacing_error = 0;
        foreach ($check_text as $word) {
            if (strlen($word) >= 30) {
                // Assume no real word is 30 or more characters long.
                $no_spacing_error++;
            } elseif (strlen($word) == 1 && preg_match('/^[A-Za-z]+$/', $word)) {
                // Count single letters only; numbers are ignored.
                $excessive_spacing_error++;
            }
        }

        // Set the error limits you can accept. Here the text is flagged when there are
        // 30 or more $no_spacing_error or 150 or more $excessive_spacing_error issues.
        if ($no_spacing_error >= 30 || $excessive_spacing_error >= 150) {
            echo "Too many formatting issues<br />";
            echo $text;
        } else {
            echo "Success!<br />";
            echo $text;
        }
        // End of additional step
    } else {
        echo "No text extracted from PDF.";
    }
} else {
    echo "parseFile fns failed. Not a PDF.";
}

// Common function
function clean_ascii_characters($string)
{
    // Replace em dashes and en dashes (UTF-8 byte sequences) with a plain hyphen.
    $string = str_replace(array("\xE2\x80\x94", "\xE2\x80\x93"), '-', $string);
    // Remove control characters and any remaining non-ASCII bytes.
    $string = preg_replace('/[\x00-\x1F\x80-\xFF]/', '', $string);
    return $string;
}
?>

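The script above stops at plain text. One way to carry that text into an RTF file without another library is to wrap it in a minimal RTF envelope. The sketch below is not from the original post: the function names text_to_rtf and utf8_code_point are hypothetical, it assumes valid UTF-8 input, and it makes no attempt to recover fonts or layout from the PDF.

```php
<?php
// Hypothetical helper: decode one UTF-8 encoded character to its code point.
function utf8_code_point(string $char): int
{
    $bytes = array_values(unpack('C*', $char));
    $count = count($bytes);
    if ($count === 1) {
        return $bytes[0];
    }
    // The lead byte carries fewer payload bits as the sequence gets longer.
    $code = $bytes[0] & (0xFF >> ($count + 1));
    for ($i = 1; $i < $count; $i++) {
        $code = ($code << 6) | ($bytes[$i] & 0x3F);
    }
    return $code;
}

// Hypothetical helper: wrap UTF-8 plain text in a minimal RTF document.
function text_to_rtf(string $text): string
{
    $chars = preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        throw new InvalidArgumentException('Input must be valid UTF-8.');
    }

    $body = '';
    foreach ($chars as $char) {
        if ($char === "\r") {
            continue;                   // Treat CRLF as LF.
        } elseif ($char === "\n") {
            $body .= "\\par\n";         // Paragraph break.
        } elseif ($char === '\\' || $char === '{' || $char === '}') {
            $body .= '\\' . $char;      // Escape RTF syntax characters.
        } elseif (strlen($char) === 1) {
            $body .= $char;             // Single-byte characters pass through.
        } else {
            // Non-ASCII: one \uN? per UTF-16 code unit, N as a signed 16-bit value.
            $code = utf8_code_point($char);
            $units = array($code);
            if ($code > 0xFFFF) {
                $code -= 0x10000;
                $units = array(0xD800 + ($code >> 10), 0xDC00 + ($code & 0x3FF));
            }
            foreach ($units as $unit) {
                $body .= '\\u' . ($unit > 32767 ? $unit - 65536 : $unit) . '?';
            }
        }
    }

    return '{\rtf1\ansi\deff0{\fonttbl{\f0 Times New Roman;}}' . "\n"
        . '\f0\fs24 ' . $body . '}';
}
```

With the extracted text in $text, file_put_contents('output.rtf', text_to_rtf($text)) would write the result to disk; the file name is only an example.
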