PDFExplorer Component

The PDFExplorer component provides access to the low-level PDF document structure.

Syntax

nsoftware.SecurePDF.PDFExplorer

Remarks

The PDFExplorer component can be used to inspect the internals of a PDF document and make changes to the low-level document structure.

Object Types and Document Structure

The PDF specification defines eight object types:

  • Name
  • String
  • Real
  • Integer
  • Boolean
  • Array
  • Dictionary
  • Stream
In PDFExplorer, name, string, real, integer, and boolean objects are categorized as "primitive" objects, and array, dictionary, and stream objects are categorized as "container" objects.

Before accessing individual objects with the component, it is important to understand how they are structured in the document. PDFExplorer aims to distinguish between the logical and physical representations of objects.

The logical representation is that a PDF document is a tree of objects that can be traversed to extract data. For example, every document contains a document catalog that references a next-level object /Pages, which in turn references individual pages via a /Kids array. So to get a page, you would first look for the /Root object in the document trailer, then proceed to its /Pages element, and then work with the /Kids array.

Then, there is the physical structure that consists of all the objects that constitute the document. Every object is recorded as either:

  • A direct (in-place) object (e.g., /Numbers [1 2 3 888]),
  • An indirect (numbered) object, or
  • A reference to an indirect object (e.g., /Numbers 8 0 R).
The way the objects are physically stored is generally independent from their logical structure. If you are looking for a page, it is of little importance whether each object that you need to traverse to reach it is stored in-place, in one of the indirect objects, or in a compressed object stream.

Note that most heavy objects (such as streams and dictionaries) are recorded in PDF files as indirect objects, with other objects referencing them. An indirect object is a global object that is uniquely identified by its object number followed by its generation number (e.g., 1 0 obj).
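
The independence of the logical tree from physical storage can be illustrated with a small toy model. This is only a sketch: the `Ref` and `ToyPdf` types below are hypothetical and are not part of the PDFExplorer API, and the model ignores generation numbers when looking up an object.

```csharp
using System.Collections.Generic;

// Toy model for illustration only; these types are hypothetical and are
// not part of the PDFExplorer API.
public sealed class Ref
{
    public int Number { get; }
    public int Generation { get; }

    public Ref(int number, int generation)
    {
        Number = number;
        Generation = generation;
    }
}

public static class ToyPdf
{
    // A Ref is looked up by object number in the table of numbered objects
    // (the generation number is ignored in this sketch); any other value is
    // already a direct (in-place) object and is returned unchanged.
    public static object Resolve(object value, IDictionary<int, object> numbered)
    {
        while (value is Ref reference)
        {
            value = numbered[reference.Number];
        }
        return value;
    }

    // Walks a chain of dictionary keys, resolving references along the way.
    public static object Walk(object start, IDictionary<int, object> numbered, params string[] keys)
    {
        object current = Resolve(start, numbered);
        foreach (string key in keys)
        {
            var dictionary = (IDictionary<string, object>)current;
            current = Resolve(dictionary[key], numbered);
        }
        return current;
    }
}
```

With this model, walking the keys Root, Pages, and Count from a trailer dictionary yields the same value whether /Root is stored as a reference or as an in-place dictionary, which is the behavior the paragraph above describes.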

Navigating the Document

To navigate the object tree, first provide the input document as a file (InputFile), byte array (InputData), or stream (SetInputStream) and call the Open method. This method will populate the RootObjects collection with the existing objects in the document trailer, as the trailer is considered to be the root of the logical object tree. The keys in the document trailer will typically be /Size, /Info, /Root, /ID, and, for encrypted documents, /Encrypt.

These objects can then be used as a starting point for navigating the document tree, which is done using the Select method. This method and others use the following syntax for specifying objects in the document:

  • Slashes separate levels of hierarchy, like in file paths.
  • The "root" slash (/) points to the document trailer dictionary.
  • A path that does not start with a slash specifies an indirect object in the list of global numbered objects.
  • The asterisk character (*) specifies all objects at the provided path.
Examples:

Consider the following PDF document:

%PDF-1.4
%cmmt
1 0 obj
<< /Type /Catalog
/Pages 2 0 R
>>
endobj

2 0 obj
<< /Type /Pages
/Kids [ 3 0 R ]
/Count 1
>>
endobj

3 0 obj
<< /Type /Page
/Parent 2 0 R
/MediaBox [ 0  0  612  792 ]
/Resources << /ProcSet 4 0 R >>
>>
endobj

4 0 obj
[ /PDF ]
endobj

xref
0 5
0000000000 65535 f
0000000015 00000 n
0000000065 00000 n
0000000125 00000 n
0000000234 00000 n
trailer
<< /Size 5
/Root 1 0 R
>>
startxref
259
%%EOF

Select would return the following results for the respective paths:

  • / - a dictionary object that corresponds to the trailer dictionary.
  • /Root - a dictionary object that corresponds to the dictionary at 1 0 obj, with its Disposition field set to Reference (as this is a reference to an indirect object).
  • /Size - an integer object whose Value is 5 and Disposition is Direct (as this is a direct, in-place object).
  • /Root/Type - a name object whose Value is Catalog (Disposition = Direct).
  • /Root/Pages - a dictionary object that corresponds to the dictionary at 2 0 obj (Disposition = Reference).
  • /Root/Pages/Kids - an array object (Disposition = Direct).
  • /Root/Pages/Kids[0] - a dictionary object that corresponds to the dictionary at 3 0 obj (Disposition = Reference).
  • /Root/Pages/Kids[0]/MediaBox - an array object with four integer elements (Disposition = Direct).
  • /Root/Pages/Kids[0]/MediaBox[2] - an integer object whose Value is 612 (Disposition = Direct).
  • 3 0 obj - a dictionary object that corresponds to the dictionary at 3 0 obj (Disposition = Indirect).
  • 3 0 obj/Type - a name object whose Value is Page (Disposition = Direct).
  • 3 0 obj/Parent - a dictionary object that corresponds to the dictionary at 2 0 obj (Disposition = Reference).
Once Select returns, the selected object(s) will be available in the SelectedObjects collection.
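
To make the path syntax above more concrete, the following sketch splits a path into its starting point (the trailer or a numbered object) and its remaining segments. The `PathSketch` class is a hypothetical helper written for illustration; it is not part of the component, it assumes keys do not themselves contain slashes or brackets, and it does not claim to reproduce how Select parses paths internally.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; not part of the PDFExplorer API.
public static class PathSketch
{
    // Returns the starting point ("/" for the trailer, or e.g. "3 0 obj")
    // and the list of segments that follow it. An array index such as
    // "Kids[0]" is emitted as two segments: "Kids" and "[0]".
    public static (string Start, List<string> Segments) Split(string path)
    {
        if (path == null) throw new ArgumentNullException(nameof(path));

        string start;
        string rest;
        if (path.StartsWith("/", StringComparison.Ordinal))
        {
            // A leading slash points to the document trailer dictionary.
            start = "/";
            rest = path.Substring(1);
        }
        else
        {
            // Otherwise the first level names an indirect (numbered) object.
            int slash = path.IndexOf('/');
            start = slash < 0 ? path : path.Substring(0, slash);
            rest = slash < 0 ? "" : path.Substring(slash + 1);
        }

        var segments = new List<string>();
        foreach (string part in rest.Split(new[] { '/' }, StringSplitOptions.RemoveEmptyEntries))
        {
            int bracket = part.IndexOf('[');
            if (bracket > 0 && part.EndsWith("]", StringComparison.Ordinal))
            {
                segments.Add(part.Substring(0, bracket)); // dictionary key
                segments.Add(part.Substring(bracket));    // array index, e.g. "[0]"
            }
            else
            {
                segments.Add(part);
            }
        }
        return (start, segments);
    }
}
```

For example, splitting "3 0 obj/Type" gives the starting point "3 0 obj" and the single segment "Type", while "/Root/Pages/Kids[0]" starts at the trailer and continues through Root, Pages, Kids, and [0].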

Adding and Modifying Objects

The sections below explain how to add and modify each type of object. Each of the Add* methods returns the path of the newly added object in the document, making it easy to retrieve the corresponding PDFObject later using the Select method. The object's value can then be adjusted to ensure the PDF document meets your requirements.

Primitive Objects

A primitive object is a non-container object that represents a name, string, real (double), integer, or boolean value. Primitive objects are typically stored in-place and referenced directly. Use the AddPrimitive method to add a direct primitive object and the AddObject method (with the Indirect parameter set to true) to add an indirect primitive object:

// Adding a direct string object to the /Info dictionary
string stringPath = pdfexplorer.AddPrimitive("/Info", "Creator", "Microsoft Word");

// Adding an indirect boolean object to the root
string booleanPath = pdfexplorer.AddObject("", 5, "", "true", true);

5 0 obj
<<
...
/Creator (Microsoft Word)
>>
endobj

...

6 0 obj
true
endobj

The value of a primitive object can then be modified if desired:

pdfexplorer.Select(stringPath, true);
pdfexplorer.SelectedObjects[0].Value = "nsoftware.SecurePDF";

pdfexplorer.Select(booleanPath, true);
pdfexplorer.SelectedObjects[0].Value = "false";

5 0 obj
<<
...
/Creator (nsoftware.SecurePDF)
>>
endobj

...

6 0 obj
false
endobj

Array and Dictionary Objects

Unlike primitives, arrays and dictionaries are objects that contain other objects. Elements within array objects are arranged sequentially and have implicit zero-based indices, whereas dictionary objects contain named key-value pairs that are unordered. Use the AddContainer method to add a direct or indirect array or dictionary object:

// Adding a direct array object to the first page's /Page dictionary
string arrayPath = pdfexplorer.AddContainer("/Root/Pages/Kids[0]", "CropBox", false, false);

// Adding an indirect dictionary object to the root
string dictPath = pdfexplorer.AddContainer("", "", true, true);

3 0 obj
<< /Type /Page
...
/CropBox [
]>>
endobj

...

7 0 obj
<<
>>
endobj

An array or dictionary object can then be modified by adding elements to it. The example below populates the /CropBox array with four integer objects and adds a /Type key to the newly created dictionary.

string cropBox0Path = pdfexplorer.AddPrimitive(arrayPath, "", "0");
string cropBox1Path = pdfexplorer.AddPrimitive(arrayPath, "", "0");
string cropBox2Path = pdfexplorer.AddPrimitive(arrayPath, "", "612");
string cropBox3Path = pdfexplorer.AddPrimitive(arrayPath, "", "792");

string typePath = pdfexplorer.AddPrimitive(dictPath, "Type", "/SampleType");

3 0 obj
<< /Type /Page
...
/CropBox [
0
0
612
792
]>>
endobj

...

7 0 obj
<<
/Type /SampleType
>>
endobj

Stream Objects

A stream object is a compound object consisting of a dictionary and a sequence of bytes. Stream objects are always indirect and are used to store data such as images, fonts, and other resources. Use the AddStream method to add a stream object:

// Adding a stream object to the root
byte[] image1Data = File.ReadAllBytes("image1.png");
string streamPath = pdfexplorer.AddStream("", "", image1Data);

8 0 obj
<<
/Length 6317
>>stream
... % binary data for image1.png
endstream
endobj

To modify a stream object, use the SetObjectData or SetObjectStream method:

byte[] image2Data = File.ReadAllBytes("image2.png");
pdfexplorer.SetObjectData(streamPath, image2Data);
// or
pdfexplorer.SetObjectStream(streamPath, new MemoryStream(image2Data));

8 0 obj
<<
/Length 197
>>stream
... % binary data for image2.png
endstream
endobj

Object References

An (indirect) object reference is a reference to an indirect object from another object. Its syntax consists of the destination object's object number, its generation number, and R (e.g., 1 0 R). Use the AddReference method to add a reference to an existing object:

// Creating a reference to the stream at 8 0 obj and adding it to the dictionary at 7 0 obj
string path = pdfexplorer.AddReference("7 0 obj", "Image", "8 0 obj");

7 0 obj
<<
/Image 8 0 R
/Type /SampleType
>>
endobj

The contents of the destination object can be modified using the path returned by AddReference in the same way as any other indirect object - the reference will remain intact because the object and generation numbers of the destination object will not be affected.

Removing Objects

The RemoveObject method can be used to remove an object from the document. This method invalidates the former path of the removed object. However, if the removed object was an indirect object, any references to it are not removed.

pdfexplorer.RemoveObject("7 0 obj/Image");

7 0 obj
<<
/Type /SampleType
>>
endobj

When finished adding, modifying, or removing objects, call the Close method to close the document and save the changes to either OutputFile, OutputData, or the stream set in SetOutputStream.

Property List


The following is the full list of the properties of the component with short descriptions.

InputData        A byte array containing the PDF document to process.
InputFile        The PDF file to process.
OutputData       A byte array containing the PDF document after processing.
OutputFile       The path to a local file where the output will be written.
Overwrite        Whether or not the component should overwrite files.
RootObjects      A collection of all the root objects contained in the document.
SelectedObjects  A collection of objects that match the current selection.

Method List


The following is the full list of the methods of the component with short descriptions.

AddContainer     Adds a dictionary or array object to the document.
AddObject        Inserts an object into the document.
AddOpaque        Adds an opaque piece of PDF to the document.
AddPrimitive     Adds a primitive object to the document.
AddReference     Adds an object reference to the document.
AddStream        Adds a stream object to the document.
Close            Closes the opened document.
Config           Sets or retrieves a configuration setting.
CreateNew        Creates a new PDF document.
GetObjectData    Returns the content of a stream object.
GetObjectStream  Writes the unparsed and uninterpreted content of a stream object to a stream.
Open             Opens the document for processing.
RemoveObject     Removes an object from the document.
Reset            Resets the component.
Select           Selects an object or multiple objects from the document.
SetInputStream   Sets the stream containing the PDF document to process.
SetObjectData    Sets the content of a stream object.
SetObjectStream  Sets the content of a stream object from a stream.
SetOutputStream  Sets the stream to write the processed document to.

Event List


The following is the full list of the events fired by the component with short descriptions.

Error  Fired when information is available about errors during data delivery.
Log    Fired once for each log message.

Config Settings


The following is a list of config settings for the component with short descriptions.

CloseInputStreamAfterProcessing   Whether to close the input stream after processing.
CloseOutputStreamAfterProcessing  Whether to close the output stream after processing.
LogLevel                          The level of detail that is logged.
OwnerPassword                     The owner password to decrypt the document with.
SaveChanges                       Whether to save changes made to the document.
StringEncoding                    The encoding to use for string objects.
TempPath                          The location where temporary files are stored.
BuildInfo                         Information about the product's build.
GUIAvailable                      Whether or not a message loop is available for processing events.
LicenseInfo                       Information about the current license.
MaskSensitiveData                 Whether sensitive data is masked in log messages.
UseInternalSecurityAPI            Whether or not to use the system security libraries or an internal implementation.

InputData Property (PDFExplorer Component)

A byte array containing the PDF document to process.

Syntax

public byte[] InputData { get; set; }
Public Property InputData As Byte()

Remarks

This property is used to assign a byte array containing the PDF document to be processed.

This property is not available at design time.

InputFile Property (PDFExplorer Component)

The PDF file to process.

Syntax

public string InputFile { get; set; }
Public Property InputFile As String

Default Value

""

Remarks

This property is used to provide a path to the PDF document to be processed.

OutputData Property (PDFExplorer Component)

A byte array containing the PDF document after processing.

Syntax

public byte[] OutputData { get; }
Public ReadOnly Property OutputData As Byte()

Remarks

This property is used to read the byte array containing the produced output after the operation has completed. It will only be set if neither an output file nor an output stream has been assigned via OutputFile and SetOutputStream, respectively.

This property is read-only and not available at design time.

OutputFile Property (PDFExplorer Component)

The path to a local file where the output will be written.

Syntax

public string OutputFile { get; set; }
Public Property OutputFile As String

Default Value

""

Remarks

This property is used to provide a path where the resulting PDF document will be saved after the operation has completed.

Overwrite Property (PDFExplorer Component)

Whether or not the component should overwrite files.

Syntax

public bool Overwrite { get; set; }
Public Property Overwrite As Boolean

Default Value

False

Remarks

This property indicates whether or not the component will overwrite OutputFile, OutputData, or the stream set in SetOutputStream. If set to false, an error will be thrown whenever OutputFile, OutputData, or the stream set in SetOutputStream exists before an operation.

RootObjects Property (PDFExplorer Component)

A collection of all the root objects contained in the document.

Syntax

public PDFObjectList RootObjects { get; }
Public ReadOnly Property RootObjects As PDFObjectList

Remarks

This property is used to access the list of root objects of the document object tree.

The logical structure of the document (the "root") starts at the document trailer. The trailer contains such entries as /Info and /Root, which provide a pathway for accessing deeper objects such as pages, forms, and signatures.

This property is read-only and not available at design time.

Please refer to the PDFObject type for a complete list of fields.

SelectedObjects Property (PDFExplorer Component)

A collection of objects that match the current selection.

Syntax

public PDFObjectList SelectedObjects { get; }
Public ReadOnly Property SelectedObjects As PDFObjectList

Remarks

This property is used to access the list of objects that match the selection criteria specified in the Select method.

This property is read-only and not available at design time.

Please refer to the PDFObject type for a complete list of fields.

AddContainer Method (PDFExplorer Component)

Adds a dictionary or array object to the document.

Syntax

public string AddContainer(string basePath, string objectName, bool dictionary, bool indirect);

Async Version
public async Task<string> AddContainer(string basePath, string objectName, bool dictionary, bool indirect);
public async Task<string> AddContainer(string basePath, string objectName, bool dictionary, bool indirect, CancellationToken cancellationToken);
Public Function AddContainer(ByVal BasePath As String, ByVal ObjectName As String, ByVal Dictionary As Boolean, ByVal Indirect As Boolean) As String

Async Version
Public Function AddContainer(ByVal BasePath As String, ByVal ObjectName As String, ByVal Dictionary As Boolean, ByVal Indirect As Boolean) As Task(Of String)
Public Function AddContainer(ByVal BasePath As String, ByVal ObjectName As String, ByVal Dictionary As Boolean, ByVal Indirect As Boolean, cancellationToken As CancellationToken) As Task(Of String)

Remarks

This method is used to add a new dictionary or array object to the document at BasePath. If adding to an existing dictionary, set ObjectName to the key that the new object will be added or referenced under.

The Dictionary parameter specifies whether to create a dictionary object.

The Indirect parameter specifies whether to add the dictionary or array to the indirect (numbered) object list and reference it from BasePath instead of creating an in-place object.

This method returns the path of the new object in the document.

Please see Navigating the Document for more details about object paths.

AddObject Method (PDFExplorer Component)

Inserts an object into the document.

Syntax

public string AddObject(string basePath, int objectType, string objectName, string value, bool indirect);

Async Version
public async Task<string> AddObject(string basePath, int objectType, string objectName, string value, bool indirect);
public async Task<string> AddObject(string basePath, int objectType, string objectName, string value, bool indirect, CancellationToken cancellationToken);
Public Function AddObject(ByVal BasePath As String, ByVal ObjectType As Integer, ByVal ObjectName As String, ByVal Value As String, ByVal Indirect As Boolean) As String

Async Version
Public Function AddObject(ByVal BasePath As String, ByVal ObjectType As Integer, ByVal ObjectName As String, ByVal Value As String, ByVal Indirect As Boolean) As Task(Of String)
Public Function AddObject(ByVal BasePath As String, ByVal ObjectType As Integer, ByVal ObjectName As String, ByVal Value As String, ByVal Indirect As Boolean, cancellationToken As CancellationToken) As Task(Of String)

Remarks

This method is used to add a new object with value Value to the document at BasePath. If adding to an existing dictionary, set ObjectName to the key that the new object will be added or referenced under.

The ObjectType parameter specifies the type of the object and can be one of the following values:

  • 1 (Name)
  • 2 (String)
  • 3 (Real)
  • 4 (Integer)
  • 5 (Boolean)
  • 6 (Array)
  • 7 (Dictionary)
  • 8 (Stream)
The Indirect parameter specifies whether to add the object to the indirect (numbered) object list and reference it from BasePath instead of creating an in-place object.

NOTE: This method can be particularly useful to add a primitive object (name, string, real, integer, or boolean) to the list of indirect (numbered) objects.
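
Because ObjectType is passed as a plain integer, calling code may be easier to read if the values listed above are given names. The enum below is a sketch of that idea; `PdfObjectType` and `PdfObjectTypes` are hypothetical names introduced here and are not types provided by the component.

```csharp
// Hypothetical names for the documented ObjectType values; these types
// are not part of the PDFExplorer API.
public enum PdfObjectType
{
    Name = 1,
    String = 2,
    Real = 3,
    Integer = 4,
    Boolean = 5,
    Array = 6,
    Dictionary = 7,
    Stream = 8
}

public static class PdfObjectTypes
{
    // Name through Boolean are the "primitive" types; Array, Dictionary,
    // and Stream are the "container" types.
    public static bool IsPrimitive(PdfObjectType type)
    {
        return type >= PdfObjectType.Name && type <= PdfObjectType.Boolean;
    }
}
```

With such an enum in place, a call like pdfexplorer.AddObject("", 5, "", "true", true) could be written as pdfexplorer.AddObject("", (int)PdfObjectType.Boolean, "", "true", true).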

This method returns the path of the new object in the document.

Please see Navigating the Document for more details about object paths.

AddOpaque Method (PDFExplorer Component)

Adds an opaque piece of PDF to the document.

Syntax

public string AddOpaque(string basePath, string objectName, string value);

Async Version
public async Task<string> AddOpaque(string basePath, string objectName, string value);
public async Task<string> AddOpaque(string basePath, string objectName, string value, CancellationToken cancellationToken);
Public Function AddOpaque(ByVal BasePath As String, ByVal ObjectName As String, ByVal Value As String) As String

Async Version
Public Function AddOpaque(ByVal BasePath As String, ByVal ObjectName As String, ByVal Value As String) As Task(Of String)
Public Function AddOpaque(ByVal BasePath As String, ByVal ObjectName As String, ByVal Value As String, cancellationToken As CancellationToken) As Task(Of String)

Remarks

This method is used to add an uninterpreted string of PDF objects with value Value to the document at BasePath. If adding to an existing dictionary, set ObjectName to the key that the new object will be added under.

Example:

pdfexplorer.InputFile = "input.pdf";
pdfexplorer.OutputFile = "modified.pdf";
pdfexplorer.Open();

string value = "<< /Producer (Secure PDF)\r\n" +
               "/CreationDate (D:20250725102001Z00'00')\r\n" +
               "/ModDate (D:20250725102001Z00'00')\r\n" +
               "/Author (Edvard Grieg)\r\n" +
               "/Title (In the Hall of the Mountain King)\r\n" +
               ">>";
string path = pdfexplorer.AddOpaque("/", "Info", value);

pdfexplorer.Close();

This method returns the path of the new object in the document.

Please see Navigating the Document for more details about object paths.

AddPrimitive Method (PDFExplorer Component)

Adds a primitive object to the document.

Syntax

public string AddPrimitive(string basePath, string objectName, string value);

Async Version
public async Task<string> AddPrimitive(string basePath, string objectName, string value);
public async Task<string> AddPrimitive(string basePath, string objectName, string value, CancellationToken cancellationToken);
Public Function AddPrimitive(ByVal BasePath As String, ByVal ObjectName As String, ByVal Value As String) As String

Async Version
Public Function AddPrimitive(ByVal BasePath As String, ByVal ObjectName As String, ByVal Value As String) As Task(Of String)
Public Function AddPrimitive(ByVal BasePath As String, ByVal ObjectName As String, ByVal Value As String, cancellationToken As CancellationToken) As Task(Of String)

Remarks

This method is used to add a new name, string, real (double), integer, or boolean object with value Value to the document at BasePath. If adding to an existing dictionary, set ObjectName to the key that the new object will be added under.

The component will automatically determine the type of the object based on the Value parameter.

Examples:

pdfexplorer.InputFile = "input.pdf";
pdfexplorer.OutputFile = "modified.pdf";
pdfexplorer.Open();

// Adding a name object to the dictionary at 3 0 obj
string namePath = pdfexplorer.AddPrimitive("3 0 obj", "Type", "/Font");

// Adding a string object to the dictionary at 1 0 obj
string stringPath = pdfexplorer.AddPrimitive("1 0 obj", "Name", "John Doe");

// Adding a real object to an array
string realPath = pdfexplorer.AddPrimitive("5 0 obj/Rect", "", "100.5");

// Adding an integer object to an array
string integerPath = pdfexplorer.AddPrimitive("/Root/Pages/Kids[0]/MediaBox", "", "792");

// Adding a boolean object to a dictionary
string booleanPath = pdfexplorer.AddPrimitive("/Root/AcroForm", "NeedAppearances", "true");

pdfexplorer.Close();

This method returns the path of the new object in the document.

Please see Navigating the Document for more details about object paths.
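
The exact rules the component uses to determine the type from Value are not spelled out here. The sketch below shows one plausible reading that is consistent with the examples above (a leading slash for names, true/false for booleans, numeric text for integers and reals, and everything else as a string). `PrimitiveGuess` is a hypothetical helper, and the real component may apply different rules.

```csharp
using System;
using System.Globalization;

// Hypothetical helper; an assumption about type detection, not the
// component's actual logic.
public static class PrimitiveGuess
{
    public static string Classify(string value)
    {
        if (value == null) throw new ArgumentNullException(nameof(value));

        // Names are written with a leading slash, e.g. "/Font".
        if (value.StartsWith("/", StringComparison.Ordinal)) return "Name";

        // PDF boolean keywords.
        if (value == "true" || value == "false") return "Boolean";

        // Whole numbers, with an optional sign.
        if (long.TryParse(value, NumberStyles.AllowLeadingSign,
                          CultureInfo.InvariantCulture, out _)) return "Integer";

        // Numbers with a decimal point, parsed culture-independently.
        if (double.TryParse(value, NumberStyles.AllowLeadingSign | NumberStyles.AllowDecimalPoint,
                            CultureInfo.InvariantCulture, out _)) return "Real";

        // Anything else is treated as a string.
        return "String";
    }
}
```

Under these assumed rules, a value that should be stored as a string but looks like a number (for example "792") would be classified as an integer, so the AddObject method with an explicit ObjectType may be the safer choice in such cases.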

AddReference Method (PDFExplorer Component)

Adds an object reference to the document.

Syntax

public string AddReference(string basePath, string objectName, string refPath);

Async Version
public async Task<string> AddReference(string basePath, string objectName, string refPath);
public async Task<string> AddReference(string basePath, string objectName, string refPath, CancellationToken cancellationToken);
Public Function AddReference(ByVal BasePath As String, ByVal ObjectName As String, ByVal RefPath As String) As String

Async Version
Public Function AddReference(ByVal BasePath As String, ByVal ObjectName As String, ByVal RefPath As String) As Task(Of String)
Public Function AddReference(ByVal BasePath As String, ByVal ObjectName As String, ByVal RefPath As String, cancellationToken As CancellationToken) As Task(Of String)

Remarks

This method is used to create a new reference to an existing object, such as a page dictionary, at BasePath. If adding to an existing dictionary, set ObjectName to the key that the new reference will be added under.

The RefPath parameter specifies the destination object and must point to one of the indirect (numbered) objects.

This method returns the path of the new object in the document.

Please see Navigating the Document for more details about object paths.

AddStream Method (PDFExplorer Component)

Adds a stream object to the document.

Syntax

public string AddStream(string basePath, string objectName, byte[] value);

Async Version
public async Task<string> AddStream(string basePath, string objectName, byte[] value);
public async Task<string> AddStream(string basePath, string objectName, byte[] value, CancellationToken cancellationToken);
Public Function AddStream(ByVal BasePath As String, ByVal ObjectName As String, ByVal Value As Byte()) As String

Async Version
Public Function AddStream(ByVal BasePath As String, ByVal ObjectName As String, ByVal Value As Byte()) As Task(Of String)
Public Function AddStream(ByVal BasePath As String, ByVal ObjectName As String, ByVal Value As Byte(), cancellationToken As CancellationToken) As Task(Of String)

Remarks

This method is used to create a new stream object with value Value at BasePath. If adding to an existing dictionary, set ObjectName to the key that the new stream will be referenced under.

NOTE: Stream objects are always indirect (i.e., part of the numbered object list).

This method returns the path of the new object in the document.

Please see Navigating the Document for more details about object paths.

Close Method (PDFExplorer Component)

Closes the opened document.

Syntax

public void Close();

Async Version
public async Task Close();
public async Task Close(CancellationToken cancellationToken);
Public Sub Close()

Async Version
Public Function Close() As Task
Public Function Close(cancellationToken As CancellationToken) As Task

Remarks

This method is used to close the previously opened document. It should always be preceded by a call to the Open method.

Example:

component.InputFile = "input.pdf";
component.Open();
// Some operation
component.Close();

If any changes are made to the document, they will be saved automatically to OutputFile, OutputData, or the stream set in SetOutputStream when this method is called. To configure this saving behavior, set the SaveChanges configuration setting.

Config Method (PDFExplorer Component)

Sets or retrieves a configuration setting.

Syntax

public string Config(string configurationString);

Async Version
public async Task<string> Config(string configurationString);
public async Task<string> Config(string configurationString, CancellationToken cancellationToken);
Public Function Config(ByVal ConfigurationString As String) As String

Async Version
Public Function Config(ByVal ConfigurationString As String) As Task(Of String)
Public Function Config(ByVal ConfigurationString As String, cancellationToken As CancellationToken) As Task(Of String)

Remarks

Config is a generic method available in every component. It is used to set and retrieve configuration settings for the component.

These settings are similar in functionality to properties, but they are rarely used. In order to avoid "polluting" the property namespace of the component, access to these internal properties is provided through the Config method.

To set a configuration setting named PROPERTY, you must call Config("PROPERTY=VALUE"), where VALUE is the value of the setting expressed as a string. For boolean values, use the strings "True", "False", "0", "1", "Yes", or "No" (case does not matter).

To read (query) the value of a configuration setting, you must call Config("PROPERTY"). The value will be returned as a string.
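
The string conventions described above can be wrapped in small helpers. The following is a sketch only: `ConfigStrings` is a hypothetical class, not part of the component, and it simply builds the "PROPERTY=VALUE" form and interprets the boolean strings listed above.

```csharp
using System;

// Hypothetical helper for illustration; not part of the PDFExplorer API.
public static class ConfigStrings
{
    // Builds the "PROPERTY=VALUE" string expected when setting a value.
    public static string Set(string name, string value)
    {
        if (name == null) throw new ArgumentNullException(nameof(name));
        if (value == null) throw new ArgumentNullException(nameof(value));
        return name + "=" + value;
    }

    // Interprets the documented boolean strings, ignoring case.
    public static bool ParseBool(string text)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));
        switch (text.Trim().ToLowerInvariant())
        {
            case "true":
            case "1":
            case "yes":
                return true;
            case "false":
            case "0":
            case "no":
                return false;
            default:
                throw new FormatException("Not a recognized boolean string: " + text);
        }
    }
}
```

For instance, pdfexplorer.Config(ConfigStrings.Set("MaskSensitiveData", "True")) would pass the string "MaskSensitiveData=True" to the component, and ConfigStrings.ParseBool could be applied to the string returned by pdfexplorer.Config("MaskSensitiveData"), assuming that setting is returned in one of the boolean forms listed above.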

CreateNew Method (PDFExplorer Component)

Creates a new PDF document.

Syntax

public void CreateNew();

Async Version
public async Task CreateNew();
public async Task CreateNew(CancellationToken cancellationToken);
Public Sub CreateNew()

Async Version
Public Function CreateNew() As Task
Public Function CreateNew(cancellationToken As CancellationToken) As Task

Remarks

This method is used to create a blank PDF document with one empty page. Having created the baseline document, use the component's methods (such as AddStream) to add objects to it.

GetObjectData Method (PDFExplorer Component)

Returns the content of a stream object.

Syntax

public byte[] GetObjectData(string path);

Async Version
public async Task<byte[]> GetObjectData(string path);
public async Task<byte[]> GetObjectData(string path, CancellationToken cancellationToken);
Public Function GetObjectData(ByVal Path As String) As Byte()

Async Version
Public Function GetObjectData(ByVal Path As String) As Task(Of Byte())
Public Function GetObjectData(ByVal Path As String, cancellationToken As CancellationToken) As Task(Of Byte())

Remarks

This method is used to retrieve the content of the PDF stream object at Path.

Please see Navigating the Document for more details about object paths.

GetObjectStream Method (PDFExplorer Component)

Writes the unparsed and uninterpreted content of a stream object to a stream.

Syntax

public void GetObjectStream(string path, System.IO.Stream outputStream);

Async Version
public async Task GetObjectStream(string path, System.IO.Stream outputStream);
public async Task GetObjectStream(string path, System.IO.Stream outputStream, CancellationToken cancellationToken);
Public Sub GetObjectStream(ByVal Path As String, ByVal OutputStream As System.IO.Stream)

Async Version
Public Function GetObjectStream(ByVal Path As String, ByVal OutputStream As System.IO.Stream) As Task
Public Function GetObjectStream(ByVal Path As String, ByVal OutputStream As System.IO.Stream, cancellationToken As CancellationToken) As Task

Remarks

This method is used to extract the content of the PDF stream object at Path to a stream.

Please see Navigating the Document for more details about object paths.

Open Method (PDFExplorer Component)

Opens the document for processing.

Syntax

public void Open();

Async Version
public async Task Open();
public async Task Open(CancellationToken cancellationToken);
Public Sub Open()

Async Version
Public Function Open() As Task
Public Function Open(cancellationToken As CancellationToken) As Task

Remarks

This method is used to open the document specified in InputFile, InputData, or SetInputStream before performing some operation on it, such as accessing or modifying individual PDF objects. When finished, call Close to complete or discard the operation.

It is recommended to use this method (alongside Close) when performing multiple operations on the document at once.

NOTE: This method will populate the RootObjects collection with the keys found in the document trailer dictionary.

RemoveObject Method (PDFExplorer Component)

Removes an object from the document.

Syntax

public void RemoveObject(string path);

Async Version
public async Task RemoveObject(string path);
public async Task RemoveObject(string path, CancellationToken cancellationToken);
Public Sub RemoveObject(ByVal Path As String)

Async Version
Public Function RemoveObject(ByVal Path As String) As Task
Public Function RemoveObject(ByVal Path As String, cancellationToken As CancellationToken) As Task

Remarks

This method is used to remove the object at Path from the document.

Note the following peculiarities of the PDF format:

  • Certain objects ("indirect objects") are global, numbered objects that can be referenced from other objects in the document. To remove an indirect object, all the references to it must be removed first, followed by the object itself.
  • Indirect objects may have more than one reference. Removing such an object may inadvertently invalidate other references in the document.

Please see Navigating the Document for more details about object paths.

Reset Method (PDFExplorer Component)

Resets the component.

Syntax

public void Reset();

Async Version
public async Task Reset();
public async Task Reset(CancellationToken cancellationToken);
Public Sub Reset()

Async Version
Public Function Reset() As Task
Public Function Reset(cancellationToken As CancellationToken) As Task

Remarks

This method is used to reset the component's properties and configuration settings to their default values.

Select Method (PDFExplorer Component)

Selects an object or multiple objects from the document.

Syntax

public void Select(string filter, bool clearExistingSelection);

Async Version
public async Task Select(string filter, bool clearExistingSelection);
public async Task Select(string filter, bool clearExistingSelection, CancellationToken cancellationToken);
Public Sub Select(ByVal Filter As String, ByVal ClearExistingSelection As Boolean)

Async Version
Public Function Select(ByVal Filter As String, ByVal ClearExistingSelection As Boolean) As Task
Public Function Select(ByVal Filter As String, ByVal ClearExistingSelection As Boolean, cancellationToken As CancellationToken) As Task

Remarks

This method is used to select objects from the document using an XPath-like language. When the method completes, the SelectedObjects collection contains the objects whose paths match the Filter parameter.

The ClearExistingSelection parameter specifies whether SelectedObjects will be cleared before performing the select operation.

NOTE: A stream is a compound object consisting of a dictionary and data, so selecting a stream object selects its dictionary. Use the GetObjectData or GetObjectStream methods to extract the content of stream objects.

Please see Navigating the Document for more details about object paths.
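As a rough illustration, the sketch below selects a single object by path and prints a few of its fields. The filter /Root/Pages is the example path used elsewhere in this documentation. The file name is a placeholder, and the sketch assumes that SelectedObjects can be enumerated as PDFObject items.

```csharp
using System;
using nsoftware.SecurePDF;

class SelectSketch
{
    static void Main()
    {
        var explorer = new PDFExplorer();
        explorer.InputFile = "input.pdf"; // placeholder file name

        explorer.Open();
        try
        {
            // Passing true clears any earlier selection before this one is applied.
            explorer.Select("/Root/Pages", true);

            // Assumption: the collection enumerates PDFObject items.
            foreach (PDFObject obj in explorer.SelectedObjects)
            {
                Console.WriteLine($"{obj.Path}: type={obj.ObjectType}, elements={obj.ElementCount}");
            }
        }
        finally
        {
            explorer.Close();
        }
    }
}
```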

SetInputStream Method (PDFExplorer Component)

Sets the stream containing the PDF document to process.

Syntax

public void SetInputStream(System.IO.Stream inputStream);

Async Version
public async Task SetInputStream(System.IO.Stream inputStream);
public async Task SetInputStream(System.IO.Stream inputStream, CancellationToken cancellationToken);
Public Sub SetInputStream(ByVal InputStream As System.IO.Stream)

Async Version
Public Function SetInputStream(ByVal InputStream As System.IO.Stream) As Task
Public Function SetInputStream(ByVal InputStream As System.IO.Stream, cancellationToken As CancellationToken) As Task

Remarks

This method is used to set the stream from which the component will read the PDF document to be processed. If an input stream is set before the component attempts to perform operations on the document, the component will read the data from the input stream instead of from the InputFile or InputData properties.

NOTE: It may be useful to additionally set the CloseInputStreamAfterProcessing configuration setting to true when using input streams.

SetObjectData Method (PDFExplorer Component)

Sets the content of a stream object.

Syntax

public void SetObjectData(string path, byte[] value);

Async Version
public async Task SetObjectData(string path, byte[] value);
public async Task SetObjectData(string path, byte[] value, CancellationToken cancellationToken);
Public Sub SetObjectData(ByVal Path As String, ByVal Value As Byte())

Async Version
Public Function SetObjectData(ByVal Path As String, ByVal Value As Byte()) As Task
Public Function SetObjectData(ByVal Path As String, ByVal Value As Byte(), cancellationToken As CancellationToken) As Task

Remarks

This method is used to set the content of the PDF stream object at Path. The Value parameter specifies the data of the stream.

Please see Navigating the Document for more details about object paths.

SetObjectStream Method (PDFExplorer Component)

Sets the content of a stream object from a stream.

Syntax

public void SetObjectStream(string path, System.IO.Stream inputStream);

Async Version
public async Task SetObjectStream(string path, System.IO.Stream inputStream);
public async Task SetObjectStream(string path, System.IO.Stream inputStream, CancellationToken cancellationToken);
Public Sub SetObjectStream(ByVal Path As String, ByVal InputStream As System.IO.Stream)

Async Version
Public Function SetObjectStream(ByVal Path As String, ByVal InputStream As System.IO.Stream) As Task
Public Function SetObjectStream(ByVal Path As String, ByVal InputStream As System.IO.Stream, cancellationToken As CancellationToken) As Task

Remarks

This method is used to set the content of the PDF stream object at Path, providing the data in a stream.

Please see Navigating the Document for more details about object paths.

SetOutputStream Method (PDFExplorer Component)

Sets the stream to write the processed document to.

Syntax

public void SetOutputStream(System.IO.Stream outputStream);

Async Version
public async Task SetOutputStream(System.IO.Stream outputStream);
public async Task SetOutputStream(System.IO.Stream outputStream, CancellationToken cancellationToken);
Public Sub SetOutputStream(ByVal OutputStream As System.IO.Stream)

Async Version
Public Function SetOutputStream(ByVal OutputStream As System.IO.Stream) As Task
Public Function SetOutputStream(ByVal OutputStream As System.IO.Stream, cancellationToken As CancellationToken) As Task

Remarks

This method is used to set the stream to which the component will write the resulting PDF document. If an output stream is set before the component attempts to perform operations on the document, the component will write the data to the output stream instead of writing to OutputFile or populating OutputData.

NOTE: It may be useful to additionally set the CloseOutputStreamAfterProcessing configuration setting to true when using output streams.
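The following sketch shows one possible way to combine SetInputStream and SetOutputStream. The file names are placeholders, and the "Name=Value" argument format for the Config method is an assumption that is not stated in this section. Here both Close...AfterProcessing settings are set to false because the using blocks own the streams; set them to true if the component should close the streams itself.

```csharp
using System.IO;
using nsoftware.SecurePDF;

class StreamSketch
{
    static void Main()
    {
        using (FileStream input = File.OpenRead("input.pdf"))   // placeholder file name
        using (FileStream output = File.Create("output.pdf"))   // placeholder file name
        {
            var explorer = new PDFExplorer();

            // Assumption: Config accepts a "Name=Value" string.
            explorer.Config("CloseInputStreamAfterProcessing=false");
            explorer.Config("CloseOutputStreamAfterProcessing=false");

            // Streams set here take precedence over InputFile/InputData
            // and OutputFile/OutputData.
            explorer.SetInputStream(input);
            explorer.SetOutputStream(output);

            explorer.Open();
            // ... inspect or modify objects here ...
            explorer.Close();
        }
    }
}
```

With the default SaveChanges value, the output stream is only written if the document was modified.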

Error Event (PDFExplorer Component)

Fired when information is available about errors during data delivery.

Syntax

public event OnErrorHandler OnError;

public delegate void OnErrorHandler(object sender, PDFExplorerErrorEventArgs e);

public class PDFExplorerErrorEventArgs : EventArgs {
  public int ErrorCode { get; }
  public string Description { get; }
}
Public Event OnError As OnErrorHandler

Public Delegate Sub OnErrorHandler(sender As Object, e As PDFExplorerErrorEventArgs)

Public Class PDFExplorerErrorEventArgs
  Inherits EventArgs
  Public ReadOnly Property ErrorCode As Integer
  Public ReadOnly Property Description As String
End Class

Remarks

The Error event is fired in case of exceptional conditions during processing. Normally, the component reports errors by throwing an exception.

The ErrorCode parameter contains an error code, and the Description parameter contains a textual description of the error. For a list of valid error codes and their descriptions, please refer to the Error Codes section.
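A handler for this event could be attached as in the sketch below, which relies only on the OnError event and the PDFExplorerErrorEventArgs members shown in the syntax above. The file name is a placeholder, and catching the general Exception type is a simplification, because the component's specific exception type is not described in this section.

```csharp
using System;
using nsoftware.SecurePDF;

class ErrorEventSketch
{
    static void Main()
    {
        var explorer = new PDFExplorer();

        // Report the error code and description whenever the event fires.
        explorer.OnError += (sender, e) =>
            Console.Error.WriteLine($"PDFExplorer error {e.ErrorCode}: {e.Description}");

        explorer.InputFile = "input.pdf"; // placeholder file name
        try
        {
            explorer.Open();
            explorer.Close();
        }
        catch (Exception ex)
        {
            // Errors are normally surfaced as exceptions as well.
            Console.Error.WriteLine(ex.Message);
        }
    }
}
```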

Log Event (PDFExplorer Component)

Fired once for each log message.

Syntax

public event OnLogHandler OnLog;

public delegate void OnLogHandler(object sender, PDFExplorerLogEventArgs e);

public class PDFExplorerLogEventArgs : EventArgs {
  public int LogLevel { get; }
  public string Message { get; }
  public string LogType { get; }
}
Public Event OnLog As OnLogHandler

Public Delegate Sub OnLogHandler(sender As Object, e As PDFExplorerLogEventArgs)

Public Class PDFExplorerLogEventArgs
  Inherits EventArgs
  Public ReadOnly Property LogLevel As Integer
  Public ReadOnly Property Message As String
  Public ReadOnly Property LogType As String
End Class

Remarks

This event is fired once for each log message generated by the component. The verbosity is controlled by the LogLevel configuration setting.

The LogLevel parameter indicates the detail level of the message. Possible values are:

0 (None) No messages are logged.
1 (Info - default) Informational events such as the basics of the chain validation procedure are logged.
2 (Verbose) Detailed data such as HTTP requests are logged.
3 (Debug) Debug data including the full chain validation procedure are logged.

The Message parameter is the log message.

The LogType parameter identifies the type of log entry. Possible values are:

  • CertValidator
  • Font
  • HTTP
  • PDFInvalidSignature
  • PDFRevocationInfo
  • Timestamp
  • TSL
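As an illustration, the sketch below raises the verbosity and prints each log message together with its level and type. The file name is a placeholder, and the "Name=Value" argument format for the Config method is an assumption.

```csharp
using System;
using nsoftware.SecurePDF;

class LogEventSketch
{
    static void Main()
    {
        var explorer = new PDFExplorer();

        // Assumption: Config accepts a "Name=Value" string. 2 = Verbose.
        explorer.Config("LogLevel=2");

        // Print every log message with its detail level and entry type.
        explorer.OnLog += (sender, e) =>
            Console.WriteLine($"[{e.LogLevel}] {e.LogType}: {e.Message}");

        explorer.InputFile = "input.pdf"; // placeholder file name
        explorer.Open();
        explorer.Close();
    }
}
```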

PDFObject Type

A single PDF object.

Remarks

This type provides access to the properties of an individual document object.

The following fields are available:

Fields

Container
bool (read-only)

Default: False

Whether the object is a container for other objects (i.e., a dictionary, array, or stream).

Disposition
int (read-only)

Default: 0

The method by which the object is addressed in the document.

Possible values are:

0 (Direct - default) The object is recorded in-place.
1 (Reference) The object is recorded as a reference to an indirect object.
2 (Indirect) The object is an indirect object.

Example:

5 0 obj
<<
  /KeyM (Electricity)
>>
endobj

...

<<
  /KeyA (Some Value)
  /KeyB << /X /Y >>
  /KeyC 5 0 R
>>

  • The value of /KeyA (Some Value) is a direct string object.
  • The value of /X (/Y) is a direct name object.
  • The value of /KeyB is a direct dictionary object.
  • The value of /KeyC is a reference to the indirect dictionary object 5 0 obj.
  • The value of 5 0 obj is an indirect dictionary object.

ElementCount
int (read-only)

Default: 0

The number of sub-elements in the object, such as the keys of a dictionary or the elements of an array.

GenNumber
int (read-only)

Default: 0

The generation number of the indirect (top-level) object.

Keys
string (read-only)

Default: ""

A CRLF-separated list of the keys of the dictionary or indices of the array.
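Because the list is CRLF-separated, it can be split into individual entries with standard string handling. The helper below is a hypothetical illustration and is not part of the component. The sample key names in the usage note are made up.

```csharp
using System;

static class KeysHelper
{
    // Splits a CRLF-separated Keys value into its individual entries.
    // Returns an empty array for a null or empty value.
    public static string[] SplitKeys(string keys)
    {
        if (string.IsNullOrEmpty(keys))
        {
            return new string[0];
        }
        return keys.Split(new[] { "\r\n" }, StringSplitOptions.RemoveEmptyEntries);
    }
}
```

For example, KeysHelper.SplitKeys("Type\r\nPages\r\nCount") yields three entries.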

ObjectNumber
int (read-only)

Default: 0

The object number of the indirect (top-level) object.

ObjectType
int (read-only)

Default: 0

The type of the object.

Possible values are:

0 (Undefined - default)
1 (Name)
2 (String)
3 (Real)
4 (Integer)
5 (Boolean)
6 (Array)
7 (Dictionary)
8 (Stream)
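When logging or debugging, it can help to translate the numeric ObjectType and Disposition values into the names listed above. The helper class below is a hypothetical sketch and is not part of the component. It assumes the values are available as plain integers.

```csharp
using System;

static class PdfObjectInfo
{
    // Maps an ObjectType value (0-8) to the name used in this documentation.
    public static string ObjectTypeName(int objectType)
    {
        switch (objectType)
        {
            case 0: return "Undefined";
            case 1: return "Name";
            case 2: return "String";
            case 3: return "Real";
            case 4: return "Integer";
            case 5: return "Boolean";
            case 6: return "Array";
            case 7: return "Dictionary";
            case 8: return "Stream";
            default: throw new ArgumentOutOfRangeException(nameof(objectType));
        }
    }

    // Arrays (6), dictionaries (7), and streams (8) are the container types.
    public static bool IsContainerType(int objectType)
    {
        return objectType >= 6 && objectType <= 8;
    }

    // Maps a Disposition value (0-2) to the name used in this documentation.
    public static string DispositionName(int disposition)
    {
        switch (disposition)
        {
            case 0: return "Direct";
            case 1: return "Reference";
            case 2: return "Indirect";
            default: throw new ArgumentOutOfRangeException(nameof(disposition));
        }
    }
}
```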

Offset
long (read-only)

Default: 0

The start offset of the object, in bytes, from the beginning of the PDF document.

Path
string (read-only)

Default: ""

The path to the object, for example /Root/Pages.

Size
long (read-only)

Default: 0

The physical length of the object in bytes.

Value
string

Default: ""

The value of the object.

NOTE: This field only applies to primitive objects (strings, names, integers, reals, and booleans). To access and modify contents of complex objects such as streams, use the GetObjectData and SetObjectData methods.

Constructors

public PDFObject();
Public PDFObject()
public PDFObject(int objectType, string path);
Public PDFObject(ByVal ObjectType As Integer, ByVal Path As String)

Config Settings (PDFExplorer Component)

The component accepts one or more of the following configuration settings. Configuration settings are similar in functionality to properties, but they are rarely used. In order to avoid "polluting" the property namespace of the component, access to these internal properties is provided through the Config method.

PDFExplorer Config Settings

CloseInputStreamAfterProcessing:   Whether to close the input stream after processing.

This setting determines whether the input stream specified in SetInputStream will be closed after processing is complete. The default value is true.

CloseOutputStreamAfterProcessing:   Whether to close the output stream after processing.

This setting determines whether the output stream specified in SetOutputStream will be closed after processing is complete. The default value is true.

LogLevel:   The level of detail that is logged.

This setting controls the level of detail that is logged through the Log event. Possible values are:

0 (None) No messages are logged.
1 (Info - default) Informational events such as the basics of the chain validation procedure are logged.
2 (Verbose) Detailed data such as HTTP requests are logged.
3 (Debug) Debug data including the full chain validation procedure are logged.

OwnerPassword:   The owner password to decrypt the document with.

This setting is used to provide the document owner password for decryption. Though it may be different from Password, most implementations use the same value for both.

SaveChanges:   Whether to save changes made to the document.

This setting specifies whether and how changes made to the PDF document will be saved when the Close method is called. Possible values are:

0 Discard all changes.
1 Save the document to OutputFile, OutputData, or the stream set in SetOutputStream, even if it has not been modified.
2 (default) Save the document to OutputFile, OutputData, or the stream set in SetOutputStream, but only if it has been modified.
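For read-only inspection, the setting could be used as in the following sketch, which discards any changes when Close is called. The file name is a placeholder, and the "Name=Value" argument format for the Config method is an assumption.

```csharp
using nsoftware.SecurePDF;

class SaveChangesSketch
{
    static void Main()
    {
        var explorer = new PDFExplorer();
        explorer.InputFile = "input.pdf"; // placeholder file name

        // Assumption: Config accepts a "Name=Value" string. 0 = discard all changes.
        explorer.Config("SaveChanges=0");

        explorer.Open();
        // ... inspect objects here ...
        explorer.Close(); // nothing is written, because SaveChanges is 0
    }
}
```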

StringEncoding:   The encoding to use for string objects.

This setting specifies how the component will encode strings when processing string objects. Possible values are:

Auto (default) Encode the string as a hex string if no human-readable text is identified; otherwise, encode it as a literal string.
Hex Encode the string as a hex string (e.g., hex:48656C6C6F20776F726C6421).
Binary Encode the string as a literal string, converting to human-readable text when possible (e.g., Hello world!).

TempPath:   The location where temporary files are stored.

This setting specifies an absolute path to the location on disk where temporary files are stored. It can be useful to reduce memory usage.

Base Config Settings

BuildInfo:   Information about the product's build.

When queried, this setting will return a string containing information about the product's build.

GUIAvailable:   Whether or not a message loop is available for processing events.

In a GUI-based application, long-running blocking operations may cause the application to stop responding to input until the operation returns. The component will attempt to discover whether or not the application has a message loop and, if one is discovered, it will process events in that message loop during any such blocking operation.

In some non-GUI applications, an invalid message loop may be discovered that will result in errant behavior. In these cases, setting GUIAvailable to false will ensure that the component does not attempt to process external events.

LicenseInfo:   Information about the current license.

When queried, this setting will return a string containing information about the license this instance of a component is using. It will return the following information:

  • Product: The product the license is for.
  • Product Key: The key the license was generated from.
  • License Source: Where the license was found (e.g., RuntimeLicense, License File).
  • License Type: The type of license installed (e.g., Royalty Free, Single Server).
  • Last Valid Build: The last valid build number for which the license will work.

MaskSensitiveData:   Whether sensitive data is masked in log messages.

In certain circumstances it may be beneficial to mask sensitive data, like passwords, in log messages. Set this to true to mask sensitive data. The default is true.

UseInternalSecurityAPI:   Whether to use the system security libraries or an internal implementation.

When set to false, the component will use the system security libraries to perform cryptographic functions where applicable. In this case, calls to unmanaged code will be made, which is not desirable in certain environments. To use a completely managed internal implementation instead of the system security libraries, set this setting to true.

On Windows, this setting is set to false by default. On Linux/macOS, this setting is set to true by default.

NOTE: This setting is static. The value set is applicable to all components used in the application.

When this setting is true, the product's system dynamic link library (DLL) is no longer required as a reference, as all unmanaged code is stored in that file.

Trappable Errors (PDFExplorer Component)

PDFExplorer Errors

1301   Invalid path.
1302   Unsupported object type.
1304   Object with this name already exists.
1307   Cannot add direct object to root.
1308   Cannot add reference to root.

PDF Errors

804   PDF decompression failed.
805   Cannot add entry to cross-reference table.
806   Unsupported field size.
807   Unsupported Encoding filter.
808   Unsupported predictor algorithm.
809   Unsupported document version.
812   Cannot read PDF file stream.
813   Cannot write to PDF file stream.
814   OutputFile already exists and Overwrite is false.
815   Invalid parameter.
817   Bad cross-reference entry.
818   Invalid object or generation number.
819   Invalid object stream.
820   Invalid stream dictionary.
821   Invalid AcroForm entry.
822   Invalid Root entry.
823   Invalid annotation.
824   The input document is empty.
826   OpenType font error. The error description contains the detailed message.
828   Invalid CMS data. The error description contains the detailed message.
835   Cannot change decryption mode for opened document.
836   Unsupported Date string.
838   Cryptographic error. The error description contains the detailed message.
840   DecryptionCert error. The error description contains the detailed message.
841   Encryption failed. The error description contains the detailed message.
842   No proper certificate for encryption found.
846   Unsupported revision.
847   Unsupported security handler SubFilter.
848   Failed to verify permissions.
849   Invalid password.
850   Invalid password information.
852   Unsupported encryption algorithm.
859   Cannot encrypt encrypted document.
864   Cannot modify document after signature update.
868   Cannot encrypt or decrypt object.
869   Invalid security handler information.
870   Invalid encrypted data.
871   Invalid block cipher padding.
872   Failed to reload signature.
873   Object is not encrypted.
874   Unexpected cipher information.
877   Invalid document. Bad document catalog.
878   Invalid document Id.
880   Invalid document. Invalid requirements dictionary.
881   Invalid linearization dictionary.
882   Invalid signature information.
883   Unsupported document format.
890   Unsupported feature.
891   Internal error. The error description contains the detailed message.
892   Unsupported color.
893   This operation is not supported for this PDF/A level.
894   Interactive features (Action) are not supported by PDF/A. Set EnforcePDFA to false or clear the Action property of the field.
895   Font file not found.

Parsing Errors

1001   Bad object.
1002   Bad document trailer.
1003   Illegal stream dictionary.
1004   Illegal string.
1005   Indirect object expected.
1007   Invalid reference.
1008   Invalid reference table.
1009   Invalid stream data.
1010   Unexpected character.
1011   Unexpected EOF.
1012   Unexpected indirect object in cross-reference table.
1013   RDF object not found.
1014   Invalid RDF object.
1015   Cannot create element with unknown prefix.
1021   Invalid type in Root object list.
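
When handling errors in application code, it may be convenient to group error codes by the three tables above. The helper below is a hypothetical sketch. It assumes that each category stays within the numeric range spanned by the codes listed here, which this documentation does not guarantee.

```csharp
static class ErrorCategories
{
    // Returns the table an error code belongs to, based on the ranges
    // spanned by the codes listed in this section (an assumption).
    public static string Categorize(int code)
    {
        if (code >= 804 && code <= 895) return "PDF";
        if (code >= 1001 && code <= 1021) return "Parsing";
        if (code >= 1301 && code <= 1308) return "PDFExplorer";
        return "Unknown";
    }
}
```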