IPWorks Cloud 2020 .NET Edition

HadoopDFS Component

The HadoopDFS component provides easy access to files stored in HDFS clusters.

Syntax

nsoftware.IPWorksCloud.Hadoopdfs

Remarks

The HadoopDFS component offers an easy-to-use API compatible with any Hadoop distributed file system (HDFS) cluster that exposes Hadoop's standard WebHDFS REST API. Capabilities include uploading and downloading files, strong encryption support, creating folders, file manipulation and organization, and more.

Authentication

First, set the URL property to the base WebHDFS URL of the server (see URL for more details).

Depending on how the server is configured, one of several authentication mechanisms might be required, or the server might not require authentication at all. Refer to the AuthMechanism property for more information about configuring the component to authenticate correctly.

Addressing Resources

HDFS addresses resources (files, directories, and symlinks) using Linux-style absolute paths. Unless otherwise specified, the component always works in terms of absolute paths, and will always prepend a forward slash (/) to any path passed to it that does not already start with one.
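The normalization rule described above can be sketched as a small helper. This is only an illustration of the documented behavior, not the component's actual implementation; the HdfsPaths class and its ToAbsolute method are hypothetical names introduced here.

```csharp
using System;

public static class HdfsPaths
{
    // Hypothetical helper (not part of the component): mirrors the documented
    // rule of prepending a forward slash to any path that lacks one.
    public static string ToAbsolute(string path)
    {
        if (path == null) throw new ArgumentNullException(nameof(path));
        return path.StartsWith("/", StringComparison.Ordinal) ? path : "/" + path;
    }
}
```

Under this sketch, "work_files/cats" becomes "/work_files/cats", while "/MyFile.zip" is returned unchanged.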

Listing Directory Contents

ListResources lists resources (files, directories, and symlinks) within the specified directory. Calling this method will fire the ResourceList event once for each resource, and will also populate the Resources collection.

// ResourceList event handler.
hdfs.OnResourceList += (s, e) => {
  Console.WriteLine(e.Name);
};

hdfs.ListResources("/work_files/serious_business/cats");

for (int i = 0; i < hdfs.Resources.Count; i++) {
  // Process resources here.
}

Downloading Files

The DownloadFile method downloads files.

If a stream has been specified using SetDownloadStream, the file data will be sent through it. If a stream is not specified, and LocalFile is set, the file will be saved to the specified location; otherwise, the file data will be held by ResourceData.

To download and decrypt an encrypted file, set EncryptionAlgorithm and EncryptionPassword before calling this method.

Download Notes

In the simplest case, downloading a file looks like this:

hdfs.LocalFile = "../MyFile.zip";
hdfs.DownloadFile(hdfs.Resources[0].Path);

Resuming Downloads

The component also supports resuming failed downloads by using the StartByte property. If a download is interrupted, set StartByte to the appropriate offset before calling this method to resume the download.

string downloadFile = "../MyFile.zip";
hdfs.LocalFile = downloadFile;
hdfs.DownloadFile(hdfs.Resources[0].Path);

//The transfer is interrupted and DownloadFile() above fails. Later, resume the download:

//Get the size of the partially downloaded file
hdfs.StartByte = new FileInfo(downloadFile).Length;
hdfs.DownloadFile(hdfs.Resources[0].Path);
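Note that FileInfo.Length throws a FileNotFoundException when the file does not exist, which can happen if the first attempt failed before any data was written. A minimal sketch of a guard follows; the ResumeOffsets class and GetStartByte method are hypothetical helpers, not part of the component.

```csharp
using System;
using System.IO;

public static class ResumeOffsets
{
    // Hypothetical helper: returns the number of bytes already on disk,
    // or 0 when no partial file exists (that is, start from the beginning).
    public static long GetStartByte(string localPath)
    {
        if (string.IsNullOrEmpty(localPath))
            throw new ArgumentException("A local path is required.", nameof(localPath));

        var info = new FileInfo(localPath);
        return info.Exists ? info.Length : 0L;
    }
}
```

Assuming the same hdfs instance and downloadFile variable as in the example above, the resume step could then be written as hdfs.StartByte = ResumeOffsets.GetStartByte(downloadFile); before calling DownloadFile again.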

Resuming Encrypted File Downloads

Resuming encrypted file downloads is only supported when LocalFile was set in the initial download attempt.

If LocalFile is set when beginning an encrypted download, the component creates a temporary file in TempPath to hold the encrypted data until the download is complete. If the download is interrupted, DownloadTempFile will be populated with the path of the temporary file that holds the partial data.

To resume, DownloadTempFile must be populated, along with StartByte, to allow the remainder of the encrypted data to be downloaded. Once the encrypted data is downloaded it will be decrypted and written to LocalFile.

hdfs.LocalFile = "../MyFile.zip";
hdfs.EncryptionPassword = "password";
hdfs.DownloadFile(hdfs.Resources[0].Path);

//The transfer is interrupted and DownloadFile() above fails. Later, resume the download:

//Get the size of the partially downloaded temp file
hdfs.StartByte = new FileInfo(hdfs.Config("DownloadTempFile")).Length;
hdfs.DownloadFile(hdfs.Resources[0].Path);

Uploading Files

The UploadFile method uploads new files.

If SetUploadStream has been used to set an upload stream, it takes priority as the file data source. Otherwise, if LocalFile is set, the file is uploaded from the specified path; if LocalFile is not set, the data in ResourceData is used.

To encrypt the file before uploading it, set EncryptionAlgorithm and EncryptionPassword.

hdfs.LocalFile = "../MyFile.zip";
hdfs.UploadFile("/MyFile.zip");
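The order in which the upload data source is chosen can be summarized with a short decision sketch. It only restates the priority rules described in this section; the UploadSource enum and UploadSources.Choose method are hypothetical and do not exist in the component.

```csharp
using System;
using System.IO;

// Hypothetical names, used only to illustrate the documented priority order.
public enum UploadSource { Stream, LocalFile, ResourceData }

public static class UploadSources
{
    public static UploadSource Choose(Stream uploadStream, string localFile)
    {
        // 1. A stream set with SetUploadStream takes priority.
        if (uploadStream != null) return UploadSource.Stream;
        // 2. Otherwise the file at LocalFile is used, if set.
        if (!string.IsNullOrEmpty(localFile)) return UploadSource.LocalFile;
        // 3. Otherwise the data held in ResourceData is used.
        return UploadSource.ResourceData;
    }
}
```

In practice this means that setting LocalFile has no effect on an upload while an upload stream is also set.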

Additional Functionality

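The HadoopDFS component offers advanced functionality beyond simple uploads and downloads. For instance, the Method List below includes AppendFile, TruncateFile, JoinFileBlocks, SetPermission, SetOwner, and DoCustomOp (for executing arbitrary WebHDFS operations).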

Property List


The following is the full list of the properties of the component with short descriptions.

AuthMechanism: The authentication mechanism to use when connecting to the server.
Authorization: OAuth 2.0 Authorization Token.
DirSummary: Directory content summary information.
EncryptionAlgorithm: The encryption algorithm.
EncryptionPassword: The encryption password.
Firewall: A set of properties related to firewall access.
Idle: The current status of the component.
LocalFile: The location of the local file.
LocalHost: The name of the local host or user-assigned IP interface through which connections are initiated or accepted.
OtherHeaders: Other headers as determined by the user (optional).
Overwrite: Whether to overwrite the local or remote file.
ParsedHeaders: Collection of headers returned from the last request.
Password: The password to use for authentication.
Proxy: A set of properties related to proxy access.
QueryParams: Additional query parameters to be included in the request.
ReadBytes: The number of bytes to read when downloading a file.
ResourceData: The data that was downloaded, or that should be uploaded.
Resources: A collection of resources.
SSLAcceptServerCert: Instructs the component to unconditionally accept the server certificate that matches the supplied certificate.
SSLCert: The certificate to be used during SSL negotiation.
SSLServerCert: The server certificate for the last established connection.
StartByte: The byte offset from which to start downloading a file.
Timeout: A timeout for the component.
URL: The URL of the Hadoop WebHDFS server.
User: The user name to use for authentication.

Method List


The following is the full list of the methods of the component with short descriptions.

AddQueryParam: Adds a query parameter to the QueryParams properties.
AppendFile: Appends data to an existing file.
Config: Sets or retrieves a configuration setting.
DeleteResource: Deletes a resource.
DoCustomOp: Executes an arbitrary WebHDFS operation.
DownloadFile: Downloads a file.
GetDirSummary: Gets a content summary for a directory.
GetResourceInfo: Gets information about a specific resource.
Interrupt: Interrupts the current method.
JoinFileBlocks: Joins multiple files' blocks together into one file.
ListResources: Lists resources in a given directory.
MakeDirectory: Makes a directory.
MoveResource: Moves a resource.
Reset: Resets the component to its initial state.
SetDownloadStream: Sets the stream to which downloaded data will be written.
SetFileReplication: Sets the replication factor for a file.
SetOwner: Sets a resource's owner and/or group.
SetPermission: Assigns the given permission to a resource.
SetTimes: Sets a resource's modification and/or access times.
SetUploadStream: Sets the stream from which data is read when uploading.
TruncateFile: Truncates a file to a given size.
UploadFile: Uploads a file.

Event List


The following is the full list of the events fired by the component with short descriptions.

EndTransfer: Fired when a document finishes transferring.
Error: Information about errors during data delivery.
Header: Fired every time a header line comes in.
Log: Fires once for each log message.
Progress: Fires during an upload or download to indicate transfer progress.
ResourceList: Fires once for each resource returned when listing resources.
SSLServerAuthentication: Fired after the server presents its certificate to the client.
SSLStatus: Shows the progress of the secure connection.
StartTransfer: Fired when a document starts transferring (after the headers).
Transfer: Fired while a document transfers (delivers document).

Configuration Settings


The following is a list of configuration settings for the component with short descriptions.

CreatePermission: The permission to assign when creating resources.
DownloadTempFile: The temporary file used when downloading encrypted data.
EncryptionIV: The initialization vector to be used for encryption/decryption.
EncryptionKey: The key to use during encryption/decryption.
HomeDir: Can be queried to obtain the current user's home directory path.
ProgressAbsolute: Whether the component should track transfer progress absolutely.
ProgressStep: How often the progress event should be fired, in terms of percentage.
RawRequest: Returns the data that was sent to the server.
RawResponse: Returns the data that was received from the server.
RecursiveDelete: Whether to recursively delete non-empty directories.
TempPath: The path to the directory where temporary files are created.
XChildCount: The number of child elements of the current element.
XChildName[i]: The name of the child element.
XChildXText[i]: The inner text of the child element.
XElement: The name of the current element.
XParent: The parent of the current element.
XPath: Provides a way to point to a specific element in the returned XML or JSON response.
XSubTree: A snapshot of the current element in the document.
XText: The text of the current element.
AcceptEncoding: Used to tell the server which types of content encodings the client supports.
AllowHTTPCompression: This property enables HTTP compression for receiving data.
AllowHTTPFallback: Whether HTTP/2 connections are permitted to fall back to HTTP/1.1.
AllowNTLMFallback: Whether to allow fallback from Negotiate to NTLM when authenticating.
Append: Whether to append data to LocalFile.
Authorization: The Authorization string to be sent to the server.
BytesTransferred: Contains the number of bytes transferred in the response data.
ChunkSize: Specifies the chunk size in bytes when using chunked encoding.
CompressHTTPRequest: Set to true to compress the body of a PUT or POST request.
EncodeURL: If set to true the URL will be encoded by the component.
FollowRedirects: Determines what happens when the server issues a redirect.
GetOn302Redirect: If set to true the component will perform a GET on the new location.
HTTP2HeadersWithoutIndexing: HTTP2 headers that should not update the dynamic header table with incremental indexing.
HTTPVersion: The version of HTTP used by the component.
IfModifiedSince: A date determining the maximum age of the desired document.
KeepAlive: Determines whether the HTTP connection is closed after completion of the request.
KerberosSPN: The Service Principal Name for the Kerberos Domain Controller.
LogLevel: The level of detail that is logged.
MaxHeaders: Instructs the component to save the specified number of headers returned by the server after a Header event has been fired.
MaxHTTPCookies: Instructs the component to save the specified number of cookies returned by the server when a SetCookie event is fired.
MaxRedirectAttempts: Limits the number of redirects that are followed in a request.
NegotiatedHTTPVersion: The negotiated HTTP version.
OtherHeaders: Other headers as determined by the user (optional).
ProxyAuthorization: The authorization string to be sent to the proxy server.
ProxyAuthScheme: The authorization scheme to be used for the proxy.
ProxyPassword: A password if authentication is to be used for the proxy.
ProxyPort: Port for the proxy server (default 80).
ProxyServer: Name or IP address of a proxy server (optional).
ProxyUser: A user name if authentication is to be used for the proxy.
SentHeaders: The full set of headers as sent by the client.
StatusLine: The first line of the last response from the server.
TransferredData: The contents of the last response from the server.
TransferredDataLimit: The maximum number of incoming bytes to be stored by the component.
TransferredHeaders: The full set of headers as received from the server.
TransferredRequest: The full request as sent by the client.
UseChunkedEncoding: Enables or disables HTTP chunked encoding for transfers.
UseIDNs: Whether to encode hostnames to internationalized domain names.
UsePlatformDeflate: Whether to use the platform implementation to decompress compressed responses.
UsePlatformHTTPClient: Whether or not to use the platform HTTP client.
UserAgent: Information about the user agent (browser).
CloseStreamAfterTransfer: If true, the component will close the upload or download stream after the transfer.
ConnectionTimeout: Sets a separate timeout value for establishing a connection.
FirewallAutoDetect: Tells the component whether or not to automatically detect and use firewall system settings, if available.
FirewallHost: Name or IP address of firewall (optional).
FirewallListener: If true, the component binds to a SOCKS firewall as a server (IPPort only).
FirewallPassword: Password to be used if authentication is to be used when connecting through the firewall.
FirewallPort: The TCP port for the FirewallHost.
FirewallType: Determines the type of firewall to connect through.
FirewallUser: A user name if authentication is to be used when connecting through a firewall.
KeepAliveInterval: The retry interval, in milliseconds, to be used when a TCP keep-alive packet is sent and no response is received.
KeepAliveTime: The inactivity time in milliseconds before a TCP keep-alive packet is sent.
Linger: When set to True, connections are terminated gracefully.
LingerTime: Time in seconds to have the connection linger.
LocalHost: The name of the local host through which connections are initiated or accepted.
LocalPort: The port in the local host where the component binds.
MaxLineLength: The maximum amount of data to accumulate when no EOL is found.
MaxTransferRate: The transfer rate limit in bytes per second.
ProxyExceptionsList: A semicolon-separated list of hosts and IPs to bypass when using a proxy.
TCPKeepAlive: Determines whether or not the keep alive socket option is enabled.
TcpNoDelay: Whether or not to delay when sending packets.
UseIPv6: Whether to use IPv6.
UseNTLMv2: Whether to use NTLM V2.
CACertFilePaths: The paths to CA certificate files when using Mono on Unix/Linux.
LogSSLPackets: Controls whether SSL packets are logged when using the internal security API.
ReuseSSLSession: Determines if the SSL session is reused.
SSLCACerts: A newline-separated list of CA certificates to use during SSL client authentication.
SSLCheckCRL: Whether to check the Certificate Revocation List for the server certificate.
SSLCipherStrength: The minimum cipher strength used for bulk encryption.
SSLEnabledCipherSuites: The cipher suite to be used in an SSL negotiation.
SSLEnabledProtocols: Used to enable/disable the supported security protocols.
SSLEnableRenegotiation: Whether the renegotiation_info SSL extension is supported.
SSLIncludeCertChain: Whether the entire certificate chain is included in the SSLServerAuthentication event.
SSLNegotiatedCipher: Returns the negotiated ciphersuite.
SSLNegotiatedCipherStrength: Returns the negotiated ciphersuite strength.
SSLNegotiatedCipherSuite: Returns the negotiated ciphersuite.
SSLNegotiatedKeyExchange: Returns the negotiated key exchange algorithm.
SSLNegotiatedKeyExchangeStrength: Returns the negotiated key exchange algorithm strength.
SSLNegotiatedVersion: Returns the negotiated protocol version.
SSLProvider: The name of the security provider to use.
SSLSecurityFlags: Flags that control certificate verification.
SSLServerCACerts: A newline-separated list of CA certificates to use during SSL server certificate validation.
TLS12SignatureAlgorithms: Defines the allowed TLS 1.2 signature algorithms when UseInternalSecurityAPI is True.
TLS12SupportedGroups: The supported groups for ECC.
TLS13KeyShareGroups: The groups for which to pregenerate key shares.
TLS13SignatureAlgorithms: The allowed certificate signature algorithms.
TLS13SupportedGroups: The supported groups for (EC)DHE key exchange.
AbsoluteTimeout: Determines whether timeouts are inactivity timeouts or absolute timeouts.
FirewallData: Used to send extra data to the firewall.
InBufferSize: The size in bytes of the incoming queue of the socket.
OutBufferSize: The size in bytes of the outgoing queue of the socket.
BuildInfo: Information about the product's build.
GUIAvailable: Tells the component whether or not a message loop is available for processing events.
LicenseInfo: Information about the current license.
UseInternalSecurityAPI: Tells the component whether to use the system security libraries or an internal implementation.

Copyright (c) 2022 /n software inc. - All rights reserved.
IPWorks Cloud 2020 .NET Edition - Version 20.0 [Build 8265]