In this short article, you will learn how to merge or split two or more PDF files using command line and GUI based tools. This is suitable for both beginners and experienced Linux users, so let’s get started.
PDFTK is a command line tool used to manipulate PDF files. It enables users to carry out several operations on PDF files like splitting, merging, encrypting, decrypting and many more.
For a complete guide on how to install and use PDFTK to merge or split PDF documents on Linux, follow this guide.
QPDF is a lightweight program used to carry out content-preserving and structural transformations on PDF files. It can copy objects from one PDF document into another and manipulate the list of pages in a PDF file. This enables the QPDF tool, which has few dependencies on other utilities, to split and merge PDF documents.
Developers of PDF generating applications will find QPDF's capabilities very useful. It can also be used to create PDF documents from scratch. QPDF, however, is not a PDF viewer or a converter from PDF to other formats, since it ignores the semantics of PDF content streams.
In order to install the lightweight QPDF tool, issue the sudo command below :
sudo apt install qpdf
When invoking qpdf, the basic syntax is as follows :
qpdf [ options ] input_filename [ output_filename ]
This converts the PDF file input_filename to the PDF file output_filename. The output document is functionally identical to the input file, though its structure may have been reorganized. The options outlined below control the transformations applied to the PDF files. The --empty parameter may be given in place of input_filename, which is useful if you only want to add pages from other files.
In the command below, qpdf is called with the --empty switch :
Two new pdf files are created separately by each command.
If @filename appears at any position in the command line, QPDF will read that file line by line and treat each line as a command-line argument. The special form @- reads the arguments from standard input. This enables qpdf to be called with any number of arbitrarily long arguments.
If the output_filename argument is just “-”, QPDF writes to standard output. To overwrite the input file with the output document, use the --replace-input option and omit the output file name.
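As a rough illustration of the @filename mechanism described above, the sketch below writes one argument per line to a file and passes that file to qpdf. The names qpdf_args.txt, input_file1.pdf, input_file2.pdf and merged.pdf are placeholders chosen for this example, and qpdf is only run when it is installed and both inputs exist.

```shell
#!/bin/sh
# Store one qpdf argument per line (all file names are placeholders).
args_file="qpdf_args.txt"
cat > "$args_file" <<'EOF'
--empty
--pages
input_file1.pdf
input_file2.pdf
--
merged.pdf
EOF

# Run qpdf only when it is installed and both input files exist.
if command -v qpdf >/dev/null 2>&1 && [ -f input_file1.pdf ] && [ -f input_file2.pdf ]; then
    qpdf "@$args_file"
else
    echo "Skipping qpdf run; arguments are stored in $args_file"
fi
```

Keeping the arguments in a file like this can help when the list of inputs is long or is generated by another program.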
QPDF can merge and split PDF files by selecting pages from one or more input files. The input file given on the command line is treated as the primary input file and used as the starting point. Its pages are then replaced according to the page selection given in the arguments of the command.
--pages input-file [ --password=password ] [ page-range ] [ … ] --
It is possible to specify multiple input files. Each one may be followed by a password, if the file is password-protected, and by a range of pages. The “--” marks the end of the page selection arguments.
In order to merge PDF files into one single file, the following command should be executed :
qpdf --empty --pages input_file1.pdf input_file2.pdf -- output_merged.pdf
Here the files input_file1.pdf and input_file2.pdf are merged into the PDF document output_merged.pdf.
If you wanted to merge all pdf files in the current directory into one single output file, you should run the command below :
qpdf --empty --pages *.pdf -- output_file.pdf
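One caveat with the wildcard form is that the shell expands *.pdf in sorted order and will also match the output file if it already exists in the directory. The sketch below is one possible way to guard against that. The helper name merge_all and the DRY_RUN variable are inventions of this example, not part of qpdf.

```shell
#!/bin/sh
# merge_all OUTPUT: merge every *.pdf in the current directory except OUTPUT.
# merge_all and DRY_RUN are hypothetical names used only in this sketch.
merge_all() {
    out=$1
    set --                              # start with an empty argument list
    for f in *.pdf; do
        [ -e "$f" ] || continue         # the glob matched nothing
        [ "$f" = "$out" ] && continue   # never feed the output back in
        set -- "$@" "$f"
    done
    if [ "$#" -eq 0 ]; then
        echo "no input PDF files found" >&2
        return 1
    fi
    # With DRY_RUN=1 the command is printed instead of executed.
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo qpdf --empty --pages "$@" -- "$out"
    else
        qpdf --empty --pages "$@" -- "$out"
    fi
}
```

Calling DRY_RUN=1 merge_all output_file.pdf first shows the exact command and file order before anything is written.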
When the option --collate is specified, the meaning of the option --pages changes so that the specified input files, as modified by their page ranges, are collated instead of concatenated. For instance, if you have two files odd_file.pdf and even_file.pdf, where odd_file.pdf contains the odd pages of a document and even_file.pdf contains the even pages, the command :
qpdf --collate --empty --pages odd_file.pdf even_file.pdf -- all.pdf
will collate the pages. The output takes page 1 from odd_file.pdf, then page 1 from even_file.pdf, then page 2 from odd_file.pdf, then page 2 from even_file.pdf, and so forth until all pages from both files have been included. It is possible to specify any number of files or page ranges. If any file has fewer pages than the others, it is skipped once all its pages have been included.
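The round-robin order described above can be sketched in plain shell without calling qpdf at all. The function collate_order below is a hypothetical helper that only prints which page would be taken next for two files with given page counts. It is a simulation of the rule, not qpdf's own implementation.

```shell
#!/bin/sh
# collate_order ODD_PAGES EVEN_PAGES
# Prints "file:page" in the order a two-file collation would pick pages.
collate_order() {
    odd_pages=$1
    even_pages=$2
    i=1
    while [ "$i" -le "$odd_pages" ] || [ "$i" -le "$even_pages" ]; do
        # A file is skipped once all of its pages have been used.
        [ "$i" -le "$odd_pages" ]  && echo "odd_file.pdf:$i"
        [ "$i" -le "$even_pages" ] && echo "even_file.pdf:$i"
        i=$((i + 1))
    done
}

# A 5-page document scanned as 3 odd pages and 2 even pages:
collate_order 3 2
```

For 3 and 2 pages, the listing alternates between the two files and ends with the third page of odd_file.pdf, since even_file.pdf has run out by then.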
To pick pages 1-7 from an input file named input_file.pdf while preserving all metadata associated with that file, run the command below :
qpdf input_file.pdf --pages input_file.pdf 1-7 -- outfile.pdf
If you wanted pages 1 through 5 from infile.pdf but you wanted the rest of the metadata to be dropped, you could instead run
qpdf --empty --pages infile.pdf 1-5 -- outfile.pdf
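The same page-range form can be repeated to split a document into fixed-size parts. The following is a minimal sketch under a few assumptions: page_ranges is a hypothetical helper, infile.pdf and the part_ prefix are placeholders, and the page count is read with qpdf's --show-npages option. The qpdf calls only run when qpdf and the input file are present.

```shell
#!/bin/sh
# page_ranges TOTAL CHUNK: print one "first-last" page range per line.
page_ranges() {
    total=$1
    chunk=$2
    first=1
    while [ "$first" -le "$total" ]; do
        last=$((first + chunk - 1))
        [ "$last" -gt "$total" ] && last=$total   # clamp the final range
        echo "$first-$last"
        first=$((last + 1))
    done
}

# Example use with a placeholder file name, 5 pages per part.
src=infile.pdf
if command -v qpdf >/dev/null 2>&1 && [ -f "$src" ]; then
    n=$(qpdf --show-npages "$src")
    page_ranges "$n" 5 | while read -r range; do
        qpdf --empty --pages "$src" "$range" -- "part_$range.pdf"
    done
fi
```

For a 12-page file and a chunk size of 5, page_ranges yields 1-5, 6-10 and 11-12, so three output files would be written.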
PDFUNITE is a utility that is part of the package poppler-utils, which means that you will get PDFUNITE when you install the package poppler-utils. After the installation is completed, you can immediately start merging your PDF files.
PDFUNITE has a pretty simple syntax :
pdfunite [options] Inputfile1.pdf Inputfile2.pdf .. MergedFile.pdf
Where the files Inputfile1.pdf, Inputfile2.pdf .. are the source files whereas the merged file should be placed at the end of the command line, i.e. MergedFile.pdf .
In order to merge PDF files into one single PDF document, the following command should be used :
pdfunite InputFile1.pdf InputFile2.pdf InputFile3.pdf merged_File.pdf
File names given without a path are looked up in the directory where PDFUNITE is executed. If your PDF files are located in different folders, provide the relative or absolute path to each of them.
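When the inputs are spread over several folders, a mistyped path is easy to miss. The sketch below wraps pdfunite in a small function that checks every input first. The function name unite and all paths in the usage comment are placeholders invented for this example.

```shell
#!/bin/sh
# unite OUTPUT INPUT...: verify that every input exists, then call pdfunite.
# Note the argument order: the output comes first here, whereas pdfunite
# itself expects the output file last.
unite() {
    out=$1
    shift
    for f in "$@"; do
        if [ ! -f "$f" ]; then
            echo "missing input: $f" >&2
            return 1
        fi
    done
    pdfunite "$@" "$out"
}

# Possible call mixing relative and absolute paths (placeholders):
# unite merged_File.pdf ./InputFile1.pdf ../reports/InputFile2.pdf /tmp/InputFile3.pdf
```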
Much like PDFUNITE, PDFSEPARATE is also part of the package poppler-utils.
The utility PDFSEPARATE has the following syntax:
pdfseparate [options] InputFile.pdf OutputFile_Pattern
PDFSEPARATE reads the input file InputFile.pdf and breaks it up into one or more PDF files named after OutputFile_Pattern, each of which contains one page.
The OutputFile_Pattern should contain the wildcard %d which will be replaced by the page number at the end of the operation. The input file should not be password protected.
There are mainly two options in the PDFSEPARATE utility :
-f number : Indicates the first page to be extracted. If omitted, the extraction will start with the first page or page 1.
-l number : Indicates the last page to be extracted. Extraction ends with the last page if omitted.
The following command :
pdfseparate InputFile.pdf InputFile-%d.pdf
tells PDFSEPARATE to extract every page of InputFile.pdf into its own file, i.e. if InputFile.pdf has 4 pages, there will be 4 files :
InputFile-1.pdf, InputFile-2.pdf, InputFile-3.pdf and InputFile-4.pdf
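The -f and -l options described above narrow the extraction to a range of pages. As a small sketch, the hypothetical helper expected_names below lists the file names that a given pattern and page range should produce, which can be handy for checking a pattern before running the real command shown in the final comment.

```shell
#!/bin/sh
# expected_names PATTERN FIRST LAST
# Lists the file names a pdfseparate run over pages FIRST..LAST should create.
expected_names() {
    pattern=$1
    page=$2
    while [ "$page" -le "$3" ]; do
        # PATTERN contains a %d placeholder, so it doubles as a printf format.
        printf "$pattern\n" "$page"
        page=$((page + 1))
    done
}

expected_names 'InputFile-%d.pdf' 2 3
# prints:
#   InputFile-2.pdf
#   InputFile-3.pdf

# The matching extraction of pages 2 to 3 would be:
# pdfseparate -f 2 -l 3 InputFile.pdf InputFile-%d.pdf
```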
Merging and splitting PDF files is not limited to command line tools; GUI based utilities can do the job as well. One of these applications is PDFSAM. It can also perform many other operations, such as rotating and extracting pages or splitting by bookmarks. PDFsam is a Java based tool which is available in most Linux distros. Its GUI is intuitive, simple and self-explanatory. For Ubuntu / Debian, you can run the APT command below in order to install PDFsam:
sudo apt-get install pdfsam
Once finished, just invoke the command :
This will bring up a splash screen which indicates that the application is starting up.
Finally, the PDFSAM graphical interface will appear.
Once you click on the ‘Merge’ button, the merge window will open.
Click on the ‘Add’ button to select the input PDF files you want to merge. Next, scroll down to the ‘Destination file’ section and click the ‘Browse’ button.
You will be able to select a location and a filename for the merged PDF file. Finally, click on the ‘Run’ button and you are done!
To split a document, just click on the ‘Split’ button in the main interface :
The principle is the same as in the previous section, but in this case you need to choose the file to be broken up using the file browser. Next, select the ‘Split settings’ and then the output directory where the split files will be generated. Finally, click on the ‘Run’ button.
You have seen how to merge and split PDF documents using command line based tools like PDFTK, PDFUNITE, PDFSEPARATE and QPDF. Those who do not feel at ease handling commands can choose PDFSAM, which is a GUI based utility.