Class PdfReader

    • Field Detail

      • unethicalreading

        public static boolean unethicalreading
        The iText developers are not responsible if you decide to change the value of this static parameter.
        Since:
        5.0.2
      • debugmode

        public static boolean debugmode
      • LOGGER

        private static final Logger LOGGER
      • pageInhCandidates

        static final PdfName[] pageInhCandidates
      • endstream

        static final byte[] endstream
      • endobj

        static final byte[] endobj
      • xref

        protected long[] xref
      • objStmMark

        protected java.util.HashMap<java.lang.Integer,​IntHashtable> objStmMark
      • newXrefType

        protected boolean newXrefType
      • xrefObj

        protected java.util.ArrayList<PdfObject> xrefObj
      • acroFormParsed

        protected boolean acroFormParsed
      • encrypted

        protected boolean encrypted
      • rebuilt

        protected boolean rebuilt
      • freeXref

        protected int freeXref
      • tampered

        protected boolean tampered
      • lastXref

        protected long lastXref
      • eofPos

        protected long eofPos
      • pdfVersion

        protected char pdfVersion
      • password

        protected byte[] password
      • certificateKey

        protected java.security.Key certificateKey
      • certificate

        protected java.security.cert.Certificate certificate
      • certificateKeyProvider

        protected java.lang.String certificateKeyProvider
      • ownerPasswordUsed

        private boolean ownerPasswordUsed
      • strings

        protected java.util.ArrayList<PdfString> strings
      • sharedStreams

        protected boolean sharedStreams
      • consolidateNamedDestinations

        protected boolean consolidateNamedDestinations
      • remoteToLocalNamedDestinations

        protected boolean remoteToLocalNamedDestinations
      • rValue

        protected int rValue
      • pValue

        protected long pValue
      • objNum

        private int objNum
      • objGen

        private int objGen
      • fileLength

        private long fileLength
      • hybridXref

        private boolean hybridXref
      • lastXrefPartial

        private int lastXrefPartial
      • partial

        private boolean partial
      • encryptionError

        private boolean encryptionError
      • memoryLimitsAwareHandler

        MemoryLimitsAwareHandler memoryLimitsAwareHandler
        Handler which will be used for decompression of pdf streams.
      • appendable

        private boolean appendable
        Holds value of property appendable.
      • COUNTER

        protected static Counter COUNTER
      • readDepth

        private int readDepth
    • Constructor Detail

      • PdfReader

        private PdfReader​(RandomAccessSource byteSource,
                          boolean partialRead,
                          byte[] ownerPassword,
                          java.security.cert.Certificate certificate,
                          java.security.Key certificateKey,
                          java.lang.String certificateKeyProvider,
                          ExternalDecryptionProcess externalDecryptionProcess,
                          boolean closeSourceOnConstructorError)
                   throws java.io.IOException
        Constructs a new PdfReader. This is the master constructor.
        Parameters:
        byteSource - source of bytes for the reader
        partialRead - if true, the reader is opened in partial mode (the PDF is parsed on demand); if false, the entire PDF is parsed into memory as the reader opens
        ownerPassword - the password or null if no password is required
        certificate - the certificate or null if no certificate is required
        certificateKey - the key or null if no certificate key is required
        certificateKeyProvider - the name of the key provider, or null if no key is required
        externalDecryptionProcess -
        closeSourceOnConstructorError - if true, the byteSource will be closed if there is an error during construction of this reader
        Throws:
        java.io.IOException
      • PdfReader

        private PdfReader​(RandomAccessSource byteSource,
                          ReaderProperties properties)
                   throws java.io.IOException
        Constructs a new PdfReader. This is the master constructor.
        Parameters:
        byteSource - source of bytes for the reader
        properties - the properties which will be used to create the reader
        Throws:
        java.io.IOException
      • PdfReader

        public PdfReader​(java.lang.String filename)
                  throws java.io.IOException
        Reads and parses a PDF document.
        Parameters:
        filename - the file name of the document
        Throws:
        java.io.IOException - on error
      • PdfReader

        public PdfReader​(ReaderProperties properties,
                         java.lang.String filename)
                  throws java.io.IOException
        Reads and parses a PDF document.
        Parameters:
        properties - the properties which will be used to create the reader
        filename - the file name of the document
        Throws:
        java.io.IOException - on error
      • PdfReader

        public PdfReader​(java.lang.String filename,
                         byte[] ownerPassword)
                  throws java.io.IOException
        Reads and parses a PDF document.
        Parameters:
        filename - the file name of the document
        ownerPassword - the password to read the document
        Throws:
        java.io.IOException - on error
      • PdfReader

        public PdfReader​(java.lang.String filename,
                         byte[] ownerPassword,
                         boolean partial)
                  throws java.io.IOException
        Reads and parses a PDF document.
        Parameters:
        filename - the file name of the document
        ownerPassword - the password to read the document
        partial - indicates if the reader needs to read the document only partially
        Throws:
        java.io.IOException - on error
      • PdfReader

        public PdfReader​(byte[] pdfIn)
                  throws java.io.IOException
        Reads and parses a PDF document.
        Parameters:
        pdfIn - the byte array with the document
        Throws:
        java.io.IOException - on error
      • PdfReader

        public PdfReader​(byte[] pdfIn,
                         byte[] ownerPassword)
                  throws java.io.IOException
        Reads and parses a PDF document.
        Parameters:
        pdfIn - the byte array with the document
        ownerPassword - the password to read the document
        Throws:
        java.io.IOException - on error
      • PdfReader

        public PdfReader​(java.lang.String filename,
                         java.security.cert.Certificate certificate,
                         java.security.Key certificateKey,
                         java.lang.String certificateKeyProvider)
                  throws java.io.IOException
        Reads and parses a PDF document.
        Parameters:
        filename - the file name of the document
        certificate - the certificate to read the document
        certificateKey - the private key of the certificate
        certificateKeyProvider - the security provider for certificateKey
        Throws:
        java.io.IOException - on error
      • PdfReader

        public PdfReader​(java.lang.String filename,
                         java.security.cert.Certificate certificate,
                         ExternalDecryptionProcess externalDecryptionProcess)
                  throws java.io.IOException
        Reads and parses a PDF document.
        Parameters:
        filename - the file name of the document
        certificate -
        externalDecryptionProcess -
        Throws:
        java.io.IOException - on error
      • PdfReader

        public PdfReader​(byte[] pdfIn,
                         java.security.cert.Certificate certificate,
                         ExternalDecryptionProcess externalDecryptionProcess)
                  throws java.io.IOException
        Reads and parses a PDF document.
        Parameters:
        pdfIn - the document as a byte array
        certificate -
        externalDecryptionProcess -
        Throws:
        java.io.IOException - on error
      • PdfReader

        public PdfReader​(java.io.InputStream inputStream,
                         java.security.cert.Certificate certificate,
                         ExternalDecryptionProcess externalDecryptionProcess)
                  throws java.io.IOException
        Reads and parses a PDF document.
        Parameters:
        inputStream - the PDF file
        certificate -
        externalDecryptionProcess -
        Throws:
        java.io.IOException - on error
      • PdfReader

        public PdfReader​(java.net.URL url)
                  throws java.io.IOException
        Reads and parses a PDF document.
        Parameters:
        url - the URL of the document
        Throws:
        java.io.IOException - on error
      • PdfReader

        public PdfReader​(java.net.URL url,
                         byte[] ownerPassword)
                  throws java.io.IOException
        Reads and parses a PDF document.
        Parameters:
        url - the URL of the document
        ownerPassword - the password to read the document
        Throws:
        java.io.IOException - on error
      • PdfReader

        public PdfReader​(java.io.InputStream is,
                         byte[] ownerPassword)
                  throws java.io.IOException
        Reads and parses a PDF document.
        Parameters:
        is - the InputStream containing the document. The stream is read to the end but is not closed
        ownerPassword - the password to read the document
        Throws:
        java.io.IOException - on error
      • PdfReader

        public PdfReader​(java.io.InputStream is)
                  throws java.io.IOException
        Reads and parses a PDF document.
        Parameters:
        is - the InputStream containing the document. The stream is read to the end but is not closed
        Throws:
        java.io.IOException - on error
      • PdfReader

        public PdfReader​(ReaderProperties properties,
                         java.io.InputStream is)
                  throws java.io.IOException
        Reads and parses a PDF document.
        Parameters:
        properties - the properties which will be used to create the reader
        is - the InputStream containing the document. The stream is read to the end but is not closed
        Throws:
        java.io.IOException - on error
      • PdfReader

        public PdfReader​(ReaderProperties properties,
                         RandomAccessFileOrArray raf)
                  throws java.io.IOException
        Reads and parses a PDF document.
        Parameters:
        properties - the properties which will be used to create the reader
        raf - the document location
        Throws:
        java.io.IOException - on error
      • PdfReader

        public PdfReader​(RandomAccessFileOrArray raf,
                         byte[] ownerPassword)
                  throws java.io.IOException
        Reads and parses a PDF document. Unlike the other constructors, only the xref is read into memory. The reader is said to be working in "partial" mode, as parts of the PDF are read only as needed.
        Parameters:
        raf - the document location
        ownerPassword - the password or null for no password
        Throws:
        java.io.IOException - on error
      • PdfReader

        public PdfReader​(RandomAccessFileOrArray raf,
                         byte[] ownerPassword,
                         boolean partial)
                  throws java.io.IOException
        Reads and parses a pdf document.
        Parameters:
        raf - the document location
        ownerPassword - the password or null for no password
        partial - indicates if the reader needs to read the document only partially. See PdfReader(RandomAccessFileOrArray, byte[])
        Throws:
        java.io.IOException - on error
      • PdfReader

        public PdfReader​(PdfReader reader)
        Creates an independent duplicate.
        Parameters:
        reader - the PdfReader to duplicate
    • Method Detail

      • getCounter

        protected Counter getCounter()
      • getOffsetTokeniser

        private static PRTokeniser getOffsetTokeniser​(RandomAccessSource byteSource)
                                               throws java.io.IOException
        Utility method that checks the provided byte source for junk bytes at the beginning. If junk bytes are found, it constructs a tokeniser that ignores the junk. Otherwise, it constructs a tokeniser for the byte source as it is.
        Parameters:
        byteSource - the source to check
        Returns:
        a tokeniser that is guaranteed to start at the PDF header
        Throws:
        java.io.IOException - if there is a problem reading the byte source
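        The junk-byte check described above can be illustrated with a minimal sketch. It assumes that "junk" means any bytes preceding the first "%PDF-" marker; the class and method names (PdfHeaderOffsetSketch, findHeaderOffset) are hypothetical and are not part of iText, whose actual implementation may differ.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of iText: locates the PDF header in a byte array.
class PdfHeaderOffsetSketch {
    private static final byte[] HEADER = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    /** Returns the index of the first "%PDF-" marker, or -1 if there is none. */
    static int findHeaderOffset(byte[] data) {
        for (int i = 0; i + HEADER.length <= data.length; i++) {
            int j = 0;
            while (j < HEADER.length && data[i + j] == HEADER[j]) {
                j++;
            }
            if (j == HEADER.length) {
                return i; // everything before index i would be treated as junk
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        byte[] withJunk = "junk\n%PDF-1.4\n".getBytes(StandardCharsets.US_ASCII);
        System.out.println("header offset: " + findHeaderOffset(withJunk));
    }
}
```

        A tokeniser built on top of such an offset would simply treat the returned index as position zero of the document.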
      • getSafeFile

        public RandomAccessFileOrArray getSafeFile()
        Gets a new file instance of the original PDF document.
        Returns:
        a new file instance of the original PDF document
      • getNumberOfPages

        public int getNumberOfPages()
        Gets the number of pages in the document. In partial mode, returns the value stored in the COUNT field of the pageref. In full mode, returns the total number of pages found while loading the entire document.
        Returns:
        the number of pages in the document
      • getCatalog

        public PdfDictionary getCatalog()
        Returns the document's catalog. This dictionary is not a copy; any changes will be reflected in the catalog.
        Returns:
        the document's catalog
      • getAcroForm

        public PRAcroForm getAcroForm()
        Returns the document's acroform, if it has one.
        Returns:
        the document's acroform
      • getPageRotation

        public int getPageRotation​(int index)
        Gets the page rotation. This value can be 0, 90, 180 or 270.
        Parameters:
        index - the page number. The first page is 1
        Returns:
        the page rotation
      • getPageSizeWithRotation

        public Rectangle getPageSizeWithRotation​(int index)
        Gets the page size, taking rotation into account. This is a Rectangle with the value of the /MediaBox and the /Rotate key.
        Parameters:
        index - the page number. The first page is 1
        Returns:
        a Rectangle
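        As an illustration of how a /Rotate value can be combined with a /MediaBox size, here is a minimal sketch. It assumes that a rotation of 90 or 270 degrees swaps width and height and that out-of-range values are first reduced to 0..359; the names (RotatedPageSizeSketch, normalizeRotation, sizeWithRotation) are hypothetical, and iText's own Rectangle handling may differ in detail.

```java
// Hypothetical helper, not part of iText: applies a page rotation to a width/height pair.
class RotatedPageSizeSketch {
    /** Reduces any rotation in degrees to the range 0..359. */
    static int normalizeRotation(int rotate) {
        int r = rotate % 360;
        return r < 0 ? r + 360 : r;
    }

    /** Returns {width, height} as seen after applying the rotation. */
    static float[] sizeWithRotation(float width, float height, int rotate) {
        int r = normalizeRotation(rotate);
        if (r == 90 || r == 270) {
            return new float[] {height, width}; // quarter turns swap the two sides
        }
        return new float[] {width, height};
    }

    public static void main(String[] args) {
        float[] size = sizeWithRotation(595f, 842f, 90);
        System.out.println(size[0] + " x " + size[1]);
    }
}
```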
      • getPageSizeWithRotation

        public Rectangle getPageSizeWithRotation​(PdfDictionary page)
        Gets the page size, taking rotation into account, from a page dictionary.
        Parameters:
        page - the page dictionary
        Returns:
        the page size with rotation applied
      • getPageSize

        public Rectangle getPageSize​(int index)
        Gets the page size without taking rotation into account. This is the value of the /MediaBox key.
        Parameters:
        index - the page number. The first page is 1
        Returns:
        the page size
      • getPageSize

        public Rectangle getPageSize​(PdfDictionary page)
        Gets the page size from a page dictionary, without taking rotation into account.
        Parameters:
        page - the page dictionary
        Returns:
        the page size
      • getCropBox

        public Rectangle getCropBox​(int index)
        Gets the crop box without taking rotation into account. This is the value of the /CropBox key. The crop box is the part of the document to be displayed or printed. It usually is the same as the media box but may be smaller. If the page doesn't have a crop box the page size will be returned.
        Parameters:
        index - the page number. The first page is 1
        Returns:
        the crop box
      • getBoxSize

        public Rectangle getBoxSize​(int index,
                                    java.lang.String boxName)
        Gets the box size. Allowed names are: "crop", "trim", "art", "bleed" and "media".
        Parameters:
        index - the page number. The first page is 1
        boxName - the box name
        Returns:
        the box rectangle or null
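        The allowed box names correspond to the page-boundary keys defined by the PDF format (/CropBox, /TrimBox, /ArtBox, /BleedBox and /MediaBox). The sketch below shows one way such a lookup could work, returning null for an unknown name in line with the "or null" contract above. The class and method names are hypothetical, and the exact matching rules used by iText (for example, case handling) are not specified here.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of iText: maps a short box name to its PDF key.
class BoxNameSketch {
    private static final Map<String, String> BOX_KEYS = new HashMap<>();

    static {
        BOX_KEYS.put("crop", "CropBox");
        BOX_KEYS.put("trim", "TrimBox");
        BOX_KEYS.put("art", "ArtBox");
        BOX_KEYS.put("bleed", "BleedBox");
        BOX_KEYS.put("media", "MediaBox");
    }

    /** Returns the PDF key for the given box name, or null if the name is not allowed. */
    static String boxKey(String boxName) {
        return BOX_KEYS.get(boxName);
    }

    public static void main(String[] args) {
        System.out.println(boxKey("trim"));
    }
}
```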
      • getInfo

        public java.util.HashMap<java.lang.String,​java.lang.String> getInfo()
        Returns the content of the document information dictionary as a HashMap of String.
        Returns:
        content of the document information dictionary
      • getNormalizedRectangle

        public static Rectangle getNormalizedRectangle​(PdfArray box)
        Normalizes a Rectangle so that llx and lly are smaller than urx and ury.
        Parameters:
        box - the original rectangle
        Returns:
        a normalized Rectangle
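        The normalization rule can be sketched on plain coordinates. This example works on four floats rather than on a PdfArray and a Rectangle, so it only illustrates the ordering rule; the class and method names are hypothetical.

```java
// Hypothetical helper, not part of iText: orders the corners of a rectangle.
class NormalizedRectangleSketch {
    /** Returns {llx, lly, urx, ury} with the lower-left corner never exceeding the upper-right. */
    static float[] normalize(float x1, float y1, float x2, float y2) {
        return new float[] {
            Math.min(x1, x2), // llx
            Math.min(y1, y2), // lly
            Math.max(x1, x2), // urx
            Math.max(y1, y2)  // ury
        };
    }

    public static void main(String[] args) {
        // A box written with its corners reversed: [612 792 0 0]
        float[] box = normalize(612f, 792f, 0f, 0f);
        System.out.println(box[0] + " " + box[1] + " " + box[2] + " " + box[3]);
    }
}
```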
      • isTagged

        public boolean isTagged()
        Checks if the PDF is a tagged PDF.
      • readPdf

        protected void readPdf()
                        throws java.io.IOException
        Parses the entire PDF.
        Throws:
        java.io.IOException
      • readPdfPartial

        protected void readPdfPartial()
                               throws java.io.IOException
        Partially parses the PDF.
        Throws:
        java.io.IOException
      • equalsArray

        private boolean equalsArray​(byte[] ar1,
                                    byte[] ar2,
                                    int size)
      • readDecryptedDocObj

        private void readDecryptedDocObj()
                                  throws java.io.IOException
        Throws:
        java.io.IOException
      • getPdfObjectRelease

        public static PdfObject getPdfObjectRelease​(PdfObject obj)
        Parameters:
        obj -
        Returns:
        a PdfObject
      • getPdfObject

        public static PdfObject getPdfObject​(PdfObject obj)
        Reads a PdfObject resolving an indirect reference if needed.
        Parameters:
        obj - the PdfObject to read
        Returns:
        the resolved PdfObject
      • getPdfObjectRelease

        public static PdfObject getPdfObjectRelease​(PdfObject obj,
                                                    PdfObject parent)
        Reads a PdfObject resolving an indirect reference if needed. If the reader was opened in partial mode the object will be released to save memory.
        Parameters:
        obj - the PdfObject to read
        parent -
        Returns:
        a PdfObject
      • getPdfObject

        public static PdfObject getPdfObject​(PdfObject obj,
                                             PdfObject parent)
        Parameters:
        obj -
        parent -
        Returns:
        a PdfObject
      • getPdfObjectRelease

        public PdfObject getPdfObjectRelease​(int idx)
        Parameters:
        idx -
        Returns:
        a PdfObject
      • getPdfObject

        public PdfObject getPdfObject​(int idx)
        Parameters:
        idx -
        Returns:
        a PdfObject
      • resetLastXrefPartial

        public void resetLastXrefPartial()
      • releaseLastXrefPartial

        public void releaseLastXrefPartial()
      • releaseLastXrefPartial

        public static void releaseLastXrefPartial​(PdfObject obj)
        Parameters:
        obj -
      • setXrefPartialObject

        private void setXrefPartialObject​(int idx,
                                          PdfObject obj)
      • readPages

        protected void readPages()
                          throws java.io.IOException
        Throws:
        java.io.IOException
      • readDocObjPartial

        protected void readDocObjPartial()
                                  throws java.io.IOException
        Throws:
        java.io.IOException
      • readSingleObject

        protected PdfObject readSingleObject​(int k)
                                      throws java.io.IOException
        Throws:
        java.io.IOException
      • readOneObjStm

        protected PdfObject readOneObjStm​(PRStream stream,
                                          int idx)
                                   throws java.io.IOException
        Throws:
        java.io.IOException
      • dumpPerc

        public double dumpPerc()
        Returns:
        the percentage of the cross reference table that has been read
      • readDocObj

        protected void readDocObj()
                           throws java.io.IOException
        Throws:
        java.io.IOException
      • checkPRStreamLength

        private void checkPRStreamLength​(PRStream stream)
                                  throws java.io.IOException
        Throws:
        java.io.IOException
      • readObjStm

        protected void readObjStm​(PRStream stream,
                                  IntHashtable map)
                           throws java.io.IOException
        Throws:
        java.io.IOException
      • killIndirect

        public static PdfObject killIndirect​(PdfObject obj)
        Eliminates the reference to the object freeing the memory used by it and clearing the xref entry.
        Parameters:
        obj - the object. If it's an indirect reference it will be eliminated
        Returns:
        the object or the already erased dereferenced object
      • ensureXrefSize

        private void ensureXrefSize​(int size)
      • readXref

        protected void readXref()
                         throws java.io.IOException
        Throws:
        java.io.IOException
      • readXrefSection

        protected PdfDictionary readXrefSection()
                                         throws java.io.IOException
        Throws:
        java.io.IOException
      • readXRefStream

        protected boolean readXRefStream​(long ptr)
                                  throws java.io.IOException
        Throws:
        java.io.IOException
      • rebuildXref

        protected void rebuildXref()
                            throws java.io.IOException
        Throws:
        java.io.IOException
      • readDictionary

        protected PdfDictionary readDictionary()
                                        throws java.io.IOException
        Throws:
        java.io.IOException
      • readArray

        protected PdfArray readArray()
                              throws java.io.IOException
        Throws:
        java.io.IOException
      • readPRObject

        protected PdfObject readPRObject()
                                  throws java.io.IOException
        Throws:
        java.io.IOException
      • FlateDecode

        public static byte[] FlateDecode​(byte[] in)
        Decodes a stream that has the FlateDecode filter.
        Parameters:
        in - the input data
        Returns:
        the decoded data
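        FlateDecode data in a PDF is zlib/deflate-compressed, so the decoding step can be illustrated with the JDK's own java.util.zip classes. The sketch below is not iText's implementation; the class FlateSketch and its methods are hypothetical, and the deflate helper exists only so that the example can produce its own input.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper, not part of iText: inflates zlib-wrapped data with the JDK.
class FlateSketch {
    /** Decompresses zlib data; throws IllegalArgumentException if the data is invalid or truncated. */
    static byte[] inflate(byte[] in) {
        Inflater inflater = new Inflater();
        inflater.setInput(in);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("truncated zlib data or preset dictionary required");
                }
                out.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("not valid zlib data", e);
        } finally {
            inflater.end(); // release native resources
        }
        return out.toByteArray();
    }

    /** Compresses data with the default zlib settings (used here only to build test input). */
    static byte[] deflate(byte[] in) {
        Deflater deflater = new Deflater();
        deflater.setInput(in);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        while (!deflater.finished()) {
            out.write(buf, 0, deflater.deflate(buf));
        }
        deflater.end();
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] original = "BT /F1 12 Tf (Hello) Tj ET".getBytes(java.nio.charset.StandardCharsets.US_ASCII);
        byte[] decoded = inflate(deflate(original));
        System.out.println(new String(decoded, java.nio.charset.StandardCharsets.US_ASCII));
    }
}
```

        This sketch rejects damaged input outright; a lenient mode, such as the one the strict flag of the FlateDecode helper overload suggests, would instead return whatever could be decoded.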
      • FlateDecode

        static byte[] FlateDecode​(byte[] in,
                                  java.io.ByteArrayOutputStream out)
        Decodes a stream that has the FlateDecode filter.
        Parameters:
        in - the input data
        Returns:
        the decoded data
      • decodePredictor

        public static byte[] decodePredictor​(byte[] in,
                                             PdfObject dicPar)
        Parameters:
        in -
        dicPar -
        Returns:
        a byte array
      • FlateDecode

        public static byte[] FlateDecode​(byte[] in,
                                         boolean strict)
        A helper to FlateDecode.
        Parameters:
        in - the input data
        strict - true to read a correct stream. false to try to read a corrupted stream
        Returns:
        the decoded data
      • FlateDecode

        private static byte[] FlateDecode​(byte[] in,
                                          boolean strict,
                                          java.io.ByteArrayOutputStream out)
      • ASCIIHexDecode

        public static byte[] ASCIIHexDecode​(byte[] in)
        Decodes a stream that has the ASCIIHexDecode filter.
        Parameters:
        in - the input data
        Returns:
        the decoded data
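        The ASCIIHexDecode filter turns pairs of hexadecimal digits into bytes, ignores white space, stops at the '>' end-of-data marker, and treats a trailing unpaired digit as if it were followed by 0. The following minimal sketch implements those rules with the JDK only; it is not iText's code, and the class name AsciiHexSketch is hypothetical.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical helper, not part of iText: decodes ASCIIHexDecode data.
class AsciiHexSketch {
    static byte[] decode(byte[] in) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        boolean haveHigh = false; // true when a high nibble is waiting for its partner
        int high = 0;
        for (byte b : in) {
            int ch = b & 0xff;
            if (ch == '>') {
                break; // end-of-data marker
            }
            if (isPdfWhitespace(ch)) {
                continue;
            }
            int digit = hexValue(ch);
            if (digit < 0) {
                throw new IllegalArgumentException("illegal character in hex data: " + ch);
            }
            if (haveHigh) {
                out.write((high << 4) | digit);
                haveHigh = false;
            } else {
                high = digit;
                haveHigh = true;
            }
        }
        if (haveHigh) {
            out.write(high << 4); // odd digit count: the missing digit is taken as 0
        }
        return out.toByteArray();
    }

    private static int hexValue(int ch) {
        if (ch >= '0' && ch <= '9') return ch - '0';
        if (ch >= 'A' && ch <= 'F') return ch - 'A' + 10;
        if (ch >= 'a' && ch <= 'f') return ch - 'a' + 10;
        return -1;
    }

    private static boolean isPdfWhitespace(int ch) {
        return ch == 0 || ch == 9 || ch == 10 || ch == 12 || ch == 13 || ch == 32;
    }

    public static void main(String[] args) {
        byte[] encoded = "48 65 6C6c 6F>".getBytes(java.nio.charset.StandardCharsets.US_ASCII);
        System.out.println(new String(decode(encoded), java.nio.charset.StandardCharsets.US_ASCII));
    }
}
```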
      • ASCIIHexDecode

        static byte[] ASCIIHexDecode​(byte[] in,
                                     java.io.ByteArrayOutputStream out)
      • ASCII85Decode

        public static byte[] ASCII85Decode​(byte[] in)
        Decodes a stream that has the ASCII85Decode filter.
        Parameters:
        in - the input data
        Returns:
        the decoded data
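        ASCII85 encoding represents each group of four bytes as five characters in the range '!' to 'u', with 'z' as a shorthand for four zero bytes and "~>" as the end-of-data marker. A final group of n characters (2 to 4) yields n - 1 bytes. The sketch below follows those rules using the JDK only; it is an illustration rather than iText's implementation, the class name Ascii85Sketch is hypothetical, and it does not check for groups whose value exceeds 32 bits.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical helper, not part of iText: decodes ASCII85Decode data.
class Ascii85Sketch {
    static byte[] decode(byte[] in) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int[] group = new int[5];
        int count = 0;
        for (byte b : in) {
            int ch = b & 0xff;
            if (ch == '~') {
                break; // start of the "~>" end-of-data marker
            }
            if (ch == 0 || ch == 9 || ch == 10 || ch == 12 || ch == 13 || ch == 32) {
                continue; // white space is ignored
            }
            if (ch == 'z' && count == 0) {
                for (int i = 0; i < 4; i++) {
                    out.write(0); // 'z' stands for four zero bytes
                }
                continue;
            }
            if (ch < '!' || ch > 'u') {
                throw new IllegalArgumentException("illegal character in ASCII85 data: " + ch);
            }
            group[count++] = ch - '!';
            if (count == 5) {
                writeGroup(out, group, 4);
                count = 0;
            }
        }
        if (count == 1) {
            throw new IllegalArgumentException("a final group needs at least two characters");
        }
        if (count > 1) {
            for (int i = count; i < 5; i++) {
                group[i] = 84; // pad the partial group with 'u' (value 84)
            }
            writeGroup(out, group, count - 1);
        }
        return out.toByteArray();
    }

    /** Converts five base-85 digits to a 32-bit value and writes its first byteCount bytes. */
    private static void writeGroup(ByteArrayOutputStream out, int[] group, int byteCount) {
        long value = 0;
        for (int digit : group) {
            value = value * 85 + digit;
        }
        for (int i = 0; i < byteCount; i++) {
            out.write((int) ((value >>> (24 - 8 * i)) & 0xff));
        }
    }

    public static void main(String[] args) {
        byte[] encoded = "9jqo^~>".getBytes(java.nio.charset.StandardCharsets.US_ASCII);
        System.out.println(new String(decode(encoded), java.nio.charset.StandardCharsets.US_ASCII));
    }
}
```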
      • ASCII85Decode

        static byte[] ASCII85Decode​(byte[] in,
                                    java.io.ByteArrayOutputStream out)
      • LZWDecode

        public static byte[] LZWDecode​(byte[] in)
        Decodes a stream that has the LZWDecode filter.
        Parameters:
        in - the input data
        Returns:
        the decoded data
      • LZWDecode

        static byte[] LZWDecode​(byte[] in,
                                java.io.ByteArrayOutputStream out)
      • isRebuilt

        public boolean isRebuilt()
        Checks if the document had errors and was rebuilt.
        Returns:
        true if rebuilt.
      • getPageN

        public PdfDictionary getPageN​(int pageNum)
        Gets the dictionary that represents a page.
        Parameters:
        pageNum - the page number. 1 is the first
        Returns:
        the page dictionary
      • getPageNRelease

        public PdfDictionary getPageNRelease​(int pageNum)
        Parameters:
        pageNum -
        Returns:
        a Dictionary object
      • releasePage

        public void releasePage​(int pageNum)
        Parameters:
        pageNum -
      • resetReleasePage

        public void resetReleasePage()
      • getPageOrigRef

        public PRIndirectReference getPageOrigRef​(int pageNum)
        Gets the page reference to this page.
        Parameters:
        pageNum - the page number. 1 is the first
        Returns:
        the page reference
      • getPageContent

        public byte[] getPageContent​(int pageNum,
                                     RandomAccessFileOrArray file)
                              throws java.io.IOException
        Gets the contents of the page.
        Parameters:
        pageNum - the page number. 1 is the first
        file - the location of the PDF document
        Returns:
        the content
        Throws:
        java.io.IOException - on error
      • getPageContent

        public static byte[] getPageContent​(PdfDictionary page)
                                     throws java.io.IOException
        Gets the content from the page dictionary.
        Parameters:
        page - the page dictionary
        Returns:
        the content
        Throws:
        java.io.IOException - on error
        Since:
        5.0.6
      • getPageResources

        public PdfDictionary getPageResources​(int pageNum)
        Retrieves the given page's resource dictionary.
        Parameters:
        pageNum - 1-based page number from which to retrieve the resource dictionary
        Returns:
        The page's resources, or 'null' if the page has none.
        Since:
        5.1
      • getPageResources

        public PdfDictionary getPageResources​(PdfDictionary pageDict)
        Retrieves the given page's resource dictionary.
        Parameters:
        pageDict - the given page
        Returns:
        The page's resources, or 'null' if the page has none.
        Since:
        5.1
      • getPageContent

        public byte[] getPageContent​(int pageNum)
                              throws java.io.IOException
        Gets the contents of the page.
        Parameters:
        pageNum - the page number. 1 is the first
        Returns:
        the content
        Throws:
        java.io.IOException - on error
      • killXref

        protected void killXref​(PdfObject obj)
      • setPageContent

        public void setPageContent​(int pageNum,
                                   byte[] content)
        Sets the contents of the page.
        Parameters:
        content - the new page content
        pageNum - the page number. 1 is the first
      • setPageContent

        public void setPageContent​(int pageNum,
                                   byte[] content,
                                   int compressionLevel)
        Sets the contents of the page.
        Parameters:
        content - the new page content
        pageNum - the page number. 1 is the first
        compressionLevel - the compressionLevel
        Since:
        2.1.3 (the method already existed without param compressionLevel)
      • setPageContent

        public void setPageContent​(int pageNum,
                                   byte[] content,
                                   int compressionLevel,
                                   boolean killOldXRefRecursively)
        Sets the contents of the page.
        Parameters:
        content - the new page content
        pageNum - the page number. 1 is the first
        compressionLevel - the compressionLevel
        killOldXRefRecursively - if true, old contents will be deeply removed from the PDF (i.e. if it was an array, all its entries will also be removed). Use with care when a content stream may be reused. If false, old contents will not be removed and will stay in the document unless manually deleted.
        Since:
        5.5.7 (the method already existed without param killOldXRefRecursively)
      • decodeBytes

        public static byte[] decodeBytes​(byte[] b,
                                         PdfDictionary streamDictionary)
                                  throws java.io.IOException
        Decode a byte[] applying the filters specified in the provided dictionary using default filter handlers.
        Parameters:
        b - the bytes to decode
        streamDictionary - the dictionary that contains filter information
        Returns:
        the decoded bytes
        Throws:
        java.io.IOException - if there are any problems decoding the bytes
        Since:
        5.0.4
      • decodeBytes

        public static byte[] decodeBytes​(byte[] b,
                                         PdfDictionary streamDictionary,
                                         java.util.Map<PdfName,​FilterHandlers.FilterHandler> filterHandlers)
                                  throws java.io.IOException
        Decode a byte[] applying the filters specified in the provided dictionary using the provided filter handlers.
        Parameters:
        b - the bytes to decode
        streamDictionary - the dictionary that contains filter information
        filterHandlers - the map used to look up a handler for each type of filter
        Returns:
        the decoded bytes
        Throws:
        java.io.IOException - if there are any problems decoding the bytes
        Since:
        5.0.4
      • getStreamBytes

        public static byte[] getStreamBytes​(PRStream stream,
                                            RandomAccessFileOrArray file)
                                     throws java.io.IOException
        Get the content from a stream applying the required filters.
        Parameters:
        stream - the stream
        file - the location where the stream is
        Returns:
        the stream content
        Throws:
        java.io.IOException - on error
      • getStreamBytes

        public static byte[] getStreamBytes​(PRStream stream)
                                     throws java.io.IOException
        Get the content from a stream applying the required filters.
        Parameters:
        stream - the stream
        Returns:
        the stream content
        Throws:
        java.io.IOException - on error
      • getStreamBytesRaw

        public static byte[] getStreamBytesRaw​(PRStream stream,
                                               RandomAccessFileOrArray file)
                                        throws java.io.IOException
        Get the content from a stream as it is without applying any filter.
        Parameters:
        stream - the stream
        file - the location where the stream is
        Returns:
        the stream content
        Throws:
        java.io.IOException - on error
      • getStreamBytesRaw

        public static byte[] getStreamBytesRaw​(PRStream stream)
                                        throws java.io.IOException
        Get the content from a stream as it is without applying any filter.
        Parameters:
        stream - the stream
        Returns:
        the stream content
        Throws:
        java.io.IOException - on error
      • eliminateSharedStreams

        public void eliminateSharedStreams()
        Eliminates shared streams if they exist.
      • isTampered

        public boolean isTampered()
        Checks if the document was changed.
        Returns:
        true if the document was changed, false otherwise
      • setTampered

        public void setTampered​(boolean tampered)
        Sets the tampered state. A tampered PdfReader cannot be reused in PdfStamper.
        Parameters:
        tampered - the tampered state
      • getMetadata

        public byte[] getMetadata()
                           throws java.io.IOException
        Gets the XML metadata.
        Returns:
        the XML metadata
        Throws:
        java.io.IOException - on error
      • getLastXref

        public long getLastXref()
        Gets the byte address of the last xref table.
        Returns:
        the byte address of the last xref table
      • getXrefSize

        public int getXrefSize()
        Gets the number of xref objects.
        Returns:
        the number of xref objects
      • getEofPos

        public long getEofPos()
        Gets the byte address of the %%EOF marker.
        Returns:
        the byte address of the %%EOF marker
      • getPdfVersion

        public char getPdfVersion()
        Gets the PDF version. Only the last character of the version is returned. For example, version 1.4 is returned as '4'.
        Returns:
        the PDF version
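
        Because only one character comes back, callers usually have to turn it into something readable or comparable themselves. The helper below is a minimal sketch of that step; the class and method names are hypothetical, and it assumes a 1.x header, since the single character cannot express the major version.

```java
final class PdfVersionText {
    /** Turns the last version character into a "1.x" string (assumes a 1.x file). */
    static String describe(char lastVersionChar) {
        if (lastVersionChar < '0' || lastVersionChar > '9') {
            throw new IllegalArgumentException("not a version digit: " + lastVersionChar);
        }
        return "1." + lastVersionChar;
    }

    /** Returns true if the minor version is at least the given value. */
    static boolean atLeast(char lastVersionChar, int minor) {
        return lastVersionChar - '0' >= minor;
    }
}
```

        With a reader at hand, the argument would be the value returned by getPdfVersion().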
      • isEncrypted

        public boolean isEncrypted()
        Returns true if the PDF is encrypted.
        Returns:
        true if the PDF is encrypted
      • getPermissions

        public long getPermissions()
        Gets the encryption permissions. It can be used directly in PdfWriter.setEncryption().
        Returns:
        the encryption permissions
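
        The returned value is a bit field. The sketch below shows one way to test individual permissions, assuming the bit positions of the standard security handler in the PDF specification (bit 3 for printing, bit 4 for modifying contents, and so on). The constants are local to this sketch and hypothetical in name; in real code the corresponding PdfWriter constants would normally be used instead.

```java
import java.util.ArrayList;
import java.util.List;

final class PermissionBits {
    // Bit values assumed from the PDF specification's user access permissions.
    static final int PRINT_DEGRADED = 1 << 2;                         // bit 3
    static final int MODIFY_CONTENTS = 1 << 3;                        // bit 4
    static final int COPY = 1 << 4;                                   // bit 5
    static final int MODIFY_ANNOTATIONS = 1 << 5;                     // bit 6
    static final int FILL_IN = 1 << 8;                                // bit 9
    static final int SCREENREADERS = 1 << 9;                          // bit 10
    static final int ASSEMBLY = 1 << 10;                              // bit 11
    static final int PRINT_HIGH_QUALITY = PRINT_DEGRADED | (1 << 11); // bits 3 and 12

    /** True if every bit of the flag is set in the permissions value. */
    static boolean allows(long permissions, int flag) {
        return (permissions & flag) == flag;
    }

    /** Lists the granted permissions in a readable form. */
    static List<String> describe(long permissions) {
        List<String> out = new ArrayList<>();
        if (allows(permissions, PRINT_HIGH_QUALITY)) {
            out.add("print");
        } else if (allows(permissions, PRINT_DEGRADED)) {
            out.add("print (degraded)");
        }
        if (allows(permissions, MODIFY_CONTENTS)) out.add("modify contents");
        if (allows(permissions, COPY)) out.add("copy");
        if (allows(permissions, MODIFY_ANNOTATIONS)) out.add("modify annotations");
        if (allows(permissions, FILL_IN)) out.add("fill in forms");
        if (allows(permissions, SCREENREADERS)) out.add("screen readers");
        if (allows(permissions, ASSEMBLY)) out.add("assembly");
        return out;
    }
}
```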
      • is128Key

        public boolean is128Key()
        Returns true if the PDF has a 128 bit key encryption.
        Returns:
        true if the PDF has a 128 bit key encryption
      • getTrailer

        public PdfDictionary getTrailer()
        Gets the trailer dictionary.
        Returns:
        the trailer dictionary
      • equalsn

        static boolean equalsn​(byte[] a1,
                               byte[] a2)
      • getFontName

        static java.lang.String getFontName​(PdfDictionary dic)
      • getSubsetPrefix

        static java.lang.String getSubsetPrefix​(PdfDictionary dic)
      • shuffleSubsetNames

        public int shuffleSubsetNames()
        Finds all the font subsets and changes the prefixes to some random values.
        Returns:
        the number of font subsets altered
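
        For orientation, a subset font name in PDF carries a tag of six uppercase letters followed by a plus sign, for example ABCDEF+Helvetica. The sketch below shows what replacing such a prefix with random letters could look like. It is an illustration only, with hypothetical class and method names, and makes no claim about how this method chooses its values.

```java
import java.util.Random;

final class SubsetPrefix {
    /** True if the name starts with six uppercase letters and a '+'. */
    static boolean hasSubsetPrefix(String fontName) {
        if (fontName == null || fontName.length() < 8 || fontName.charAt(6) != '+') {
            return false;
        }
        for (int i = 0; i < 6; i++) {
            char c = fontName.charAt(i);
            if (c < 'A' || c > 'Z') return false;
        }
        return true;
    }

    /** Builds a random tag such as "QXJZPA+". */
    static String randomPrefix(Random rnd) {
        StringBuilder sb = new StringBuilder(7);
        for (int i = 0; i < 6; i++) {
            sb.append((char) ('A' + rnd.nextInt(26)));
        }
        return sb.append('+').toString();
    }

    /** Replaces an existing subset tag; names without a tag are returned unchanged. */
    static String reshuffle(String fontName, Random rnd) {
        return hasSubsetPrefix(fontName)
                ? randomPrefix(rnd) + fontName.substring(7)
                : fontName;
    }
}
```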
      • createFakeFontSubsets

        public int createFakeFontSubsets()
        Finds all the fonts that are embedded but not subset and marks them as subset.
        Returns:
        the number of fonts altered
      • getNamedDestination

        public java.util.HashMap<java.lang.Object,​PdfObject> getNamedDestination()
        Gets all the named destinations as a HashMap. The key is the name and the value is the destinations array.
        Returns:
        all the named destinations
      • getNamedDestination

        public java.util.HashMap<java.lang.Object,​PdfObject> getNamedDestination​(boolean keepNames)
        Gets all the named destinations as a HashMap. The key is the name and the value is the destinations array.
        Parameters:
        keepNames - true if you want the keys to be real PdfNames instead of Strings
        Returns:
        all the named destinations
        Since:
        2.1.6
      • getNamedDestinationFromNames

        public java.util.HashMap<java.lang.String,​PdfObject> getNamedDestinationFromNames()
        Gets the named destinations from the /Dests key in the catalog as a HashMap. The key is the name and the value is the destinations array.
        Returns:
        the named destinations
        Since:
        5.0.1 (generic type in signature)
      • getNamedDestinationFromNames

        public java.util.HashMap<java.lang.Object,​PdfObject> getNamedDestinationFromNames​(boolean keepNames)
        Gets the named destinations from the /Dests key in the catalog as a HashMap. The key is the name and the value is the destinations array.
        Parameters:
        keepNames - true if you want the keys to be real PdfNames instead of Strings
        Returns:
        the named destinations
        Since:
        2.1.6
      • getNamedDestinationFromStrings

        public java.util.HashMap<java.lang.String,​PdfObject> getNamedDestinationFromStrings()
        Gets the named destinations from the /Names key in the catalog as a HashMap. The key is the name and the value is the destinations array.
        Returns:
        the named destinations
      • removeFields

        public void removeFields()
        Removes all the fields from the document.
      • removeAnnotations

        public void removeAnnotations()
        Removes all the annotations and fields from the document.
      • getLinks

        public java.util.ArrayList<PdfAnnotation.PdfImportedLink> getLinks​(int page)
        Retrieves links for a certain page.
        Parameters:
        page - the page to inspect
        Returns:
        a list of links
      • iterateBookmarks

        private void iterateBookmarks​(PdfObject outlineRef,
                                      java.util.HashMap<java.lang.Object,​PdfObject> names)
      • makeRemoteNamedDestinationsLocal

        public void makeRemoteNamedDestinationsLocal()
        Replaces remote named links with local destinations that have the same name.
        Since:
        5.0
      • convertNamedDestination

        private boolean convertNamedDestination​(PdfObject obj,
                                                java.util.HashMap<java.lang.Object,​PdfObject> names)
        Replaces a remote named destination (GoToR) with a local named destination if there is a corresponding name.
        Parameters:
        obj - an annotation that needs to be screened for links to external named destinations.
        names - a map with names of local named destinations
        Since:
        iText 5.0
      • consolidateNamedDestinations

        public void consolidateNamedDestinations()
        Replaces all the local named links with the actual destinations.
      • replaceNamedDestination

        private boolean replaceNamedDestination​(PdfObject obj,
                                                java.util.HashMap<java.lang.Object,​PdfObject> names)
      • close

        public void close()
        Closes the reader and any underlying stream or data source used to create the reader.
      • removeUnusedNode

        protected void removeUnusedNode​(PdfObject obj,
                                        boolean[] hits)
      • removeUnusedObjects

        public int removeUnusedObjects()
        Removes all the unreachable objects.
        Returns:
        the number of indirect objects removed
      • getAcroFields

        public AcroFields getAcroFields()
        Gets a read-only version of AcroFields.
        Returns:
        a read-only version of AcroFields
      • getJavaScript

        public java.lang.String getJavaScript​(RandomAccessFileOrArray file)
                                       throws java.io.IOException
        Gets the global document JavaScript.
        Parameters:
        file - the document file
        Returns:
        the global document JavaScript
        Throws:
        java.io.IOException - on error
      • getJavaScript

        public java.lang.String getJavaScript()
                                       throws java.io.IOException
        Gets the global document JavaScript.
        Returns:
        the global document JavaScript
        Throws:
        java.io.IOException - on error
      • selectPages

        public void selectPages​(java.lang.String ranges)
        Selects the pages to keep in the document. The pages are described as ranges. The page ordering can be changed but no page repetitions are allowed. Note that it may be very slow in partial mode.
        Parameters:
        ranges - the comma separated ranges as described in SequenceList
      • selectPages

        public void selectPages​(java.util.List<java.lang.Integer> pagesToKeep)
        Selects the pages to keep in the document. The pages are described as a List of Integer. The page ordering can be changed but no page repetitions are allowed. Note that it may be very slow in partial mode.
        Parameters:
        pagesToKeep - the pages to keep in the document
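
        Since the list may reorder pages but must not repeat them, a caller may want to check it before handing it over. The sketch below is one such caller-side check with hypothetical names; the page count is passed in as a plain int. It is deliberately stricter than the description above, which does not say how a repeated or out-of-range page is treated.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class PageSelection {
    /**
     * Returns a copy of the list after checking that every page number is
     * within 1..numberOfPages and that no page appears twice.
     */
    static List<Integer> validated(List<Integer> pagesToKeep, int numberOfPages) {
        Set<Integer> seen = new HashSet<>();
        for (Integer page : pagesToKeep) {
            if (page == null || page < 1 || page > numberOfPages) {
                throw new IllegalArgumentException("page out of range: " + page);
            }
            if (!seen.add(page)) {
                throw new IllegalArgumentException("page repeated: " + page);
            }
        }
        return new ArrayList<>(pagesToKeep);
    }
}
```

        The order of the input is preserved, so a list such as 3, 1, 2 still expresses a reordering.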
      • selectPages

        protected void selectPages​(java.util.List<java.lang.Integer> pagesToKeep,
                                   boolean removeUnused)
        Selects the pages to keep in the document. The pages are described as a List of Integer. The page ordering can be changed but no page repetitions are allowed. Note that it may be very slow in partial mode.
        Parameters:
        pagesToKeep - the pages to keep in the document
        removeUnused - indicates whether to remove unused objects; see removeUnusedObjects()
      • getSimpleViewerPreferences

        public int getSimpleViewerPreferences()
        Returns a bitset representing the PageMode and PageLayout viewer preferences. Doesn't return any information about the ViewerPreferences dictionary.
        Returns:
        an int that contains the Viewer Preferences.
      • isAppendable

        public boolean isAppendable()
        Getter for property appendable.
        Returns:
        Value of property appendable.
      • setAppendable

        public void setAppendable​(boolean appendable)
        Setter for property appendable.
        Parameters:
        appendable - New value of property appendable.
      • isNewXrefType

        public boolean isNewXrefType()
        Getter for property newXrefType.
        Returns:
        Value of property newXrefType.
      • getFileLength

        public long getFileLength()
        Getter for property fileLength.
        Returns:
        Value of property fileLength.
      • isHybridXref

        public boolean isHybridXref()
        Getter for property hybridXref.
        Returns:
        Value of property hybridXref.
      • hasUsageRights

        public boolean hasUsageRights()
        Checks if this PDF has usage rights enabled.
        Returns:
        true if usage rights are present; false otherwise
      • removeUsageRights

        public void removeUsageRights()
        Removes any usage rights that this PDF may have. Only Adobe can grant usage rights and any PDF modification with iText will invalidate them. Invalidated usage rights may confuse Acrobat and it's advisable to remove them altogether.
      • getCertificationLevel

        public int getCertificationLevel()
        Gets the certification level for this document. The return values can be PdfSignatureAppearance.NOT_CERTIFIED, PdfSignatureAppearance.CERTIFIED_NO_CHANGES_ALLOWED, PdfSignatureAppearance.CERTIFIED_FORM_FILLING and PdfSignatureAppearance.CERTIFIED_FORM_FILLING_AND_ANNOTATIONS.

        No signature validation is made, use the methods available for that in AcroFields.

        Returns:
        the certification level for this document
      • isOpenedWithFullPermissions

        public final boolean isOpenedWithFullPermissions()
        Checks if the document was opened with the owner password so that the end application can decide what level of access restrictions to apply. If the document is not encrypted it will return true.
        Returns:
        true if the document was opened with the owner password or if it's not encrypted, false if the document was opened with the user password
      • getCryptoMode

        public int getCryptoMode()
        Returns:
        the crypto mode, or -1 if none
      • isMetadataEncrypted

        public boolean isMetadataEncrypted()
        Returns:
        true if the metadata is encrypted.
      • computeUserPassword

        public byte[] computeUserPassword()
        Computes the user password if the standard encryption handler is used with the Standard40, Standard128 or AES128 encryption algorithm.
        Returns:
        the user password, or null if the standard encryption handler was not used, if it was used with the AES256 encryption algorithm, or if the owner password was not used to open the document.