Azure Blob Storage
Files.com's integration with Azure Blob Storage lets you work with files in an Azure Blob Storage container in several different ways.
Files.com's Remote Server Mount feature gives you the ability to connect a specific folder on Files.com to the remote server in real time.
That folder then becomes a client, or window, accessing the files stored in your remote server or cloud.
Once you configure a Mount, any operation you perform on or inside that folder will act directly on the remote in real time. Whether you are dropping a file into that folder, deleting a file, creating a subfolder, or performing any other file/folder operations your Files.com user has permissions for, those operations will "pass through" to the remote in real time.
This powerful feature enables a wide variety of use cases, such as accessing files on a counterparty's (client's or vendor's) cloud without provisioning access for individual users, reducing storage costs by leveraging on-premise or bulk storage solutions, enabling applications to access third-party clouds via the Files.com API, FTP, SFTP, or Files.com Apps, and many more.
Alternatively, Files.com's Remote Server Sync feature gives you the ability to push or pull files to or from remote servers. This means that the files will exist in both places at the end of the sync process.
A remote sync can be a "push", where files from your Files.com site are transferred to the remote server, a "pull" where files are transferred from the remote server to your Files.com site, or a two-way "sync" where files that are new or changed in either location are pushed and pulled to maintain a synchronized state between the folder on your Files.com site and that on the remote server.
Add Azure Blob Storage as a Remote Server
Create a new Remote Server in your Files.com site using the Azure Blob Storage server type.
You must provide an Internal name for this connection. If you're managing multiple remote servers, make the name clear enough to easily identify this particular connection.
You must provide the required Authentication information.
Once Azure Blob Storage has been added as a Remote Server, you can integrate it with Files.com as either a Remote Server Mount or Remote Server Sync.
Authentication Information
Unlike Amazon S3, Azure Blob container names are not globally unique, so we need to know both the Account and the Container name in order to connect to your Blob storage. Files.com can authenticate to Azure Blob Storage using either an Access Key or a Shared Access Signature (SAS) Token.
The following items are required for connecting Files.com to Azure Blob Storage:
Account - The name of your Azure Storage Account, as shown in your Microsoft Azure web portal > Home > Storage Accounts page.
Container - The name of your Azure Container, as shown in your Microsoft Azure web portal > Home > Storage Accounts > selected storage account > Containers page.
Use Hierarchical Namespace (Azure Data Lake Storage Gen2) - Select this option if your Azure Container has been configured for Data Lake Storage by having its Hierarchical Namespace option enabled.
Test Path for Bucket/Container - An optional field for the full path of the container. This field is useful when the credentials provided do not have root access to the remote bucket/container.
Access Key or Shared Access Signature (SAS) Token - The Access Key, or SAS Token, for the selected Azure storage account, as shown in your Microsoft Azure web portal > Home > Storage Accounts > selected storage account > Access Keys, or Microsoft Azure web portal > Home > Storage Accounts > selected storage account > Shared Access Signature page.
Files.com does not currently provide for pass-through authentication to Azure Blob Storage via Azure AD if you are also using Azure AD with Files.com. However, we would love to learn more about the use-case of any customer that might be interested in such a capability.
Once your Remote Server is added, you need to integrate it with Files.com as either a Remote Server Mount or a Remote Server Sync.
Access Key versus Shared Access Signature (SAS) Token
Both the Access Key and the Shared Access Signature (SAS) Token provide secure authentication and authorization to Azure. The method you choose depends on which one best fits your requirements. Please consult with your security team to determine which method will best fit your needs.
Generally speaking, the Access Key provides a global, root-like, permission to your Azure Blob. It should be the preferred method when your Blob will only be used by Files.com and doesn't have to share access permissions with other users or solutions.
The Shared Access Signature (SAS) Token provides a restricted, user-like, permission to your Azure Blob. It should be the preferred method when your Blob will be shared by multiple users or solutions. The Shared Access Signature (SAS) Token can more granularly limit access to specific parts of your Blob, allowing better segregation of access to data.
Whether you choose to use an Access Key or a SAS Token, it should be long lived. All connections and functionality to Azure will stop working when the Key or Token expires. If an expiration date is applied to a Key or Token then you will need to replace that Key or Token each time it expires. Do not use Keys or Tokens containing expiration dates unless you are willing to accept downtime at expiration time and you are prepared to manually replace the Key or Token each time it expires.
If in doubt, we recommend using a Shared Access Signature (SAS) Token due to its more granular security controls.
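Because an expired Key or Token stops all connections, it can help to check a SAS Token's expiry before entering it. The following is a minimal sketch, assuming the token carries its expiry in the standard `se` (signed expiry) query field as a UTC timestamp. The function names and the sample token are hypothetical and are not part of Files.com or any Azure SDK.

```python
from datetime import datetime, timezone
from typing import Optional
from urllib.parse import parse_qs


def sas_token_expiry(sas_token: str) -> Optional[datetime]:
    """Return the expiry time found in the token's `se` field, or None if absent."""
    params = parse_qs(sas_token.lstrip("?"))
    values = params.get("se")
    if not values:
        return None  # no `se` field present in this token
    text = values[0]
    if text.endswith("Z"):
        # Older Python versions do not accept a trailing "Z" in fromisoformat().
        text = text[:-1] + "+00:00"
    expiry = datetime.fromisoformat(text)
    if expiry.tzinfo is None:
        # Assumption: timestamps without an offset are treated as UTC.
        expiry = expiry.replace(tzinfo=timezone.utc)
    return expiry


def days_until_expiry(sas_token: str, now: Optional[datetime] = None) -> Optional[float]:
    """Return the number of days left before the token expires, or None if unknown."""
    expiry = sas_token_expiry(sas_token)
    if expiry is None:
        return None
    now = now or datetime.now(timezone.utc)
    return (expiry - now).total_seconds() / 86400


# Hypothetical token for illustration only; the signature is not real.
sample_token = "?sv=2022-11-02&ss=b&srt=sco&sp=rwdlac&se=2030-01-01T00:00:00Z&sig=EXAMPLE"
print(sas_token_expiry(sample_token))
```

A negative result from `days_until_expiry` means the token has already expired and would need to be replaced before connections can resume.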
Add Remote Server Mount
Remote Server Mounts are created by mounting them onto an empty folder in Files.com. This folder should not be the Root of your site, although that is supported if you need it.
Add Remote Server Sync
After creating the Remote Server, you can use it to perform Remote Syncs between your remote server and Files.com.
Re-authenticating
Checksums on Azure Blob Storage
You can enable calculation of a file integrity checksum for each file uploaded using Files.com. This allows you to easily prove that the file was uploaded correctly and matches the original source file. When two files have the same checksum, it is nearly certain that those files are identical.
The MD5 checksum that Files.com provides is the actual MD5 checksum represented as a sequence of 32 hexadecimal digits. If you are comparing MD5s provided by Files.com with MD5 checksums provided by Azure, you'll find them to be different, because Azure displays MD5 checksums as a base64-encoded byte array.
In order to compare the MD5 checksum from Files.com with the MD5 checksum from Azure, you'll first need to convert the checksum from Files.com to a byte array, and then base64 encode that. The result should then match the checksum from Azure.
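The conversion described above can be sketched in a few lines of Python using only the standard library. The function names here are illustrative, not part of any Files.com or Azure tooling. The sample digest is the MD5 of an empty file.

```python
import base64


def hex_md5_to_base64(hex_digest: str) -> str:
    """Convert a 32-digit hexadecimal MD5 (Files.com style) to base64 (Azure style)."""
    raw = bytes.fromhex(hex_digest.strip())  # decode hex into the raw digest bytes
    if len(raw) != 16:
        raise ValueError("an MD5 digest must be exactly 16 bytes")
    return base64.b64encode(raw).decode("ascii")


def base64_md5_to_hex(b64_digest: str) -> str:
    """Convert a base64-encoded MD5 (Azure style) to hexadecimal (Files.com style)."""
    return base64.b64decode(b64_digest).hex()


# MD5 of an empty file, in both representations.
print(hex_md5_to_base64("d41d8cd98f00b204e9800998ecf8427e"))  # 1B2M2Y8AsgTpgAmY7PhCfg==
print(base64_md5_to_hex("1B2M2Y8AsgTpgAmY7PhCfg=="))  # d41d8cd98f00b204e9800998ecf8427e
```

Converting in either direction works. Comparing the hexadecimal forms in lowercase avoids false mismatches caused by letter case.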
Case Sensitivity
Be aware of case sensitivity differences when copying, moving, or syncing files and folders between Azure Blob Storage and other storage locations. Azure Blob Storage is a case-sensitive system, whereas other systems may not be. This can cause files to be overwritten, and folders to have their contents merged, if their names match when case is ignored.
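Before copying or syncing to a case-insensitive destination, it may be worth scanning the list of paths for names that differ only by case. The sketch below is one hypothetical way to do that. It assumes you already have the paths as plain strings with "/" separators. The function name is illustrative.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def find_case_collisions(paths: Iterable[str]) -> Dict[str, List[str]]:
    """Return files and folders whose names differ only by letter case.

    Every path is expanded into its parent folders, so that folder-level
    collisions (which would merge contents) are reported as well.
    """
    spellings = defaultdict(set)
    for path in paths:
        parts = path.strip("/").split("/")
        for depth in range(1, len(parts) + 1):
            prefix = "/".join(parts[:depth])
            spellings[prefix.casefold()].add(prefix)
    # Keep only names that were seen with more than one spelling.
    return {key: sorted(names) for key, names in spellings.items() if len(names) > 1}


paths = ["Reports/Q1.csv", "reports/q1.csv", "reports/summary.txt", "data/file.bin"]
for key, names in find_case_collisions(paths).items():
    print(key, "->", names)
```

In this example both the folder pair (`Reports`, `reports`) and the file pair (`Reports/Q1.csv`, `reports/q1.csv`) are reported, since either could be merged or overwritten on a case-insensitive system.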
Empty Folders/Directories
Azure Blob Storage is not a hierarchical file system and does not use directories (folders) to organize files. Files and data are stored as Binary Large Objects (BLOBs or blobs) but are presented with the illusion of a hierarchical file system.
This becomes most apparent when creating, syncing, or uploading an empty folder to Azure Blob Storage.
Azure Blob Storage will represent an empty folder as a zero-byte file of the same name.
Azure will manage these zero-byte files, and their corresponding empty folders, itself. However, these zero-byte files may be visible to other programs, applications, and services that use the Azure Blob. They should be considered a "normal" side effect of using Blob storage.
Files.com follows the same conventions used by other software to emulate folders on these non-hierarchical file systems. We aim to interoperate using as many reasonable conventions, standards, and best practices, as possible.
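To illustrate the general idea of emulating folders on a flat namespace, the sketch below derives a one-level folder listing from a list of blob names by splitting on "/". This is a simplified illustration of the common prefix convention, not the actual implementation used by Files.com or Azure. The function name and sample names are hypothetical.

```python
from typing import Iterable, List, Tuple


def list_level(blob_names: Iterable[str], prefix: str = "") -> Tuple[List[str], List[str]]:
    """Return (folders, files) that appear directly under the given prefix."""
    folders, files = set(), set()
    for name in blob_names:
        if not name.startswith(prefix):
            continue
        rest = name[len(prefix):]
        if not rest:
            continue
        head, separator, _ = rest.partition("/")
        if separator:
            folders.add(head)  # something exists below this name, so show it as a folder
        else:
            files.add(head)  # no further "/" means it is shown as a file
    return sorted(folders), sorted(files)


names = [
    "invoices/2024/jan.pdf",
    "invoices/2024/feb.pdf",
    "invoices/readme.txt",
    "logo.png",
]
print(list_level(names))               # (['invoices'], ['logo.png'])
print(list_level(names, "invoices/"))  # (['2024'], ['readme.txt'])
```

Under this scheme a folder exists only because some blob name contains it as a prefix. An empty folder has no such blob, which is why a placeholder object is needed to represent it.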
Hierarchical Namespace
An Azure Blob container can be configured for use as a Data Lake by enabling the Hierarchical Namespace option for the container. When connecting to a container that has been configured with Hierarchical Namespace, make sure to select the Use Hierarchical Namespace (Azure Data Lake Storage Gen2) option when configuring the Remote Server.
If you try to connect to an Azure Blob container that has not been enabled with the Hierarchical Namespace option, and have selected the Use Hierarchical Namespace (Azure Data Lake Storage Gen2) option, then you may see errors when attempting to delete folders contained within the Azure Blob, such as: "Cannot delete a directory. You may need to enable 'azure_blob_storage_hierarchical_namespace' setting on your remote server if you have the hierarchical namespace feature enabled on your azure storage account."
OneLake
Integration with Microsoft OneLake is not currently supported.
Although OneLake provides APIs that are compatible with Azure Data Lake Storage (ADLS) Gen2, there are differences that prevent our Azure Blob Storage integration from working with it.
Please contact us if you'd like us to implement an integration with OneLake.