Metadata in Box provides an easy way to add context to unstructured data stored in Box and presents that context in an easily readable way alongside the content itself. Not only does metadata make it easier to search and filter for the files most relevant to end users from a legal, sales, or creative perspective, it also adds an extra layer of information that can be used while navigating content in the Box web application and beyond.
There are four main types of metadata to consider when you are thinking about what your application does and the best way to utilize the data: global (or properties) metadata, metadata templates, classification metadata, and Skills metadata. We’ll cover the first two in depth in this article. To learn more about classification, see the Integrating with Box Shield article; to learn more about Skills metadata, see the Utilizing Box Skills article within this section.
Types of Metadata
Metadata within Box can follow one of two main approaches: loose key-value pairs that can be applied by anyone, or a structured metadata template with predefined keys and value types that is applied to content.
We refer to the first type as “global” metadata. These values can be found within our web application when previewing content. If you open the “Metadata” tab in the sidebar of the preview pane, you can view existing metadata on a piece of content or choose to add your own. In the “Add Metadata” dialog, the first option is “Custom Metadata”, which allows a user to define their own keys and values to attach to the content. Because these data points are not defined by the admins of the Box enterprise, they offer only limited filtering capabilities. Any values added through this method are scanned and indexed in our text search index, which you can learn more about under the “Search Indexing” section.
The second type of metadata refers to predefined metadata templates that can be applied to content. A template is a set of defined keys and expected value types, generally created by the admin or co-admin of the Box enterprise for its users. Because templates are available to all users by default (unless the template is marked as hidden for automation use cases), we currently do not allow individual users to create templates or modify existing ones. There are five different value types available when defining a template, which you can read more about in our developer documentation. The value of metadata templates comes from searching for relevant content based not only on the search index described above, but also on the template itself and the ranges of values it contains. For instance, if you need to see contracts expiring in the next month, or find content whose values fall within a specific range, metadata templates make it much easier to locate. Additionally, you can leverage these templates when searching through the Box web application or through a custom API-based user interface in your application. You can learn more about API-based metadata search in our developer documentation.
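As a rough illustration of the custom key-value approach described above, the sketch below builds (but does not send) a request that attaches free-form properties to a file. It assumes the `global/properties` metadata endpoint of the Box API; the file ID, token, and key names are placeholders chosen for this example, and the helper name `build_custom_metadata_request` is hypothetical.

```python
import json
import urllib.request

API_BASE = "https://api.box.com/2.0"


def build_custom_metadata_request(file_id: str, access_token: str, values: dict) -> urllib.request.Request:
    """Build a POST request that attaches free-form key/value pairs to a file.

    Assumption: custom ("global properties") metadata lives under
    /files/{file_id}/metadata/global/properties and takes string values.
    """
    url = f"{API_BASE}/files/{file_id}/metadata/global/properties"
    return urllib.request.Request(
        url,
        data=json.dumps(values).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder file ID, token, and keys, for illustration only.
request = build_custom_metadata_request(
    "12345", "ACCESS_TOKEN", {"reviewer": "legal", "status": "draft"}
)
print(request.get_method(), request.full_url)
```

With a real token and file ID in place, the request could be sent with `urllib.request.urlopen(request)`.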
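To make the range-based search idea concrete, here is a minimal sketch of a metadata-filtered search URL. It assumes the `mdfilters` parameter of the Box search endpoint; the template key `contract` and field key `expirationDate` are hypothetical names invented for this example, not templates that exist in any given enterprise.

```python
import json
from urllib.parse import urlencode

SEARCH_URL = "https://api.box.com/2.0/search"


def build_mdfilters(template_key: str, filters: dict, scope: str = "enterprise") -> list:
    """Return the mdfilters structure: one filter object naming a template and its field constraints."""
    return [{"templateKey": template_key, "scope": scope, "filters": filters}]


def build_search_url(md_filters: list) -> str:
    """Serialize the filters as JSON and URL-encode them into the search query string."""
    return f"{SEARCH_URL}?{urlencode({'mdfilters': json.dumps(md_filters)})}"


# Hypothetical template and field: contracts expiring during January 2024.
md_filters = build_mdfilters(
    "contract",
    {"expirationDate": {"gt": "2024-01-01T00:00:00Z", "lt": "2024-02-01T00:00:00Z"}},
)
url = build_search_url(md_filters)
print(url)
```

The resulting URL would be requested with a `GET` and a bearer token; only the construction of the filter is shown here.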
Integrating with Metadata
Because global metadata is available to any user and does not require any special permissions, the rest of this guide focuses on using metadata templates programmatically.
Application Permissions
- If you are using a JWT-based authentication mechanism, make sure that your application is set up with the following settings:
  - App Access Level: App + Enterprise Access. Because you will be performing actions such as applying metadata on behalf of enterprise users, you need to be able to access the named users in the enterprise.
  - Application Scopes: these are the minimum permissions required to get user information and apply metadata to files:
    - Read all files and folders stored in Box
    - Write all files and folders stored in Box
    - Manage Users
    - Advanced Features
  - Issuing the API Call: select either “Making API calls using the as-user header” or “Generate user access tokens.” Because Box checks permissions based on the user issuing the API call, you need to impersonate a Box user when working with their content. These two settings are the two ways to meet this requirement.
- For OAuth 2.0-based authentication, you need the following settings:
  - Application Scopes: these are the minimum permissions required to get user information and perform metadata actions on their behalf:
    - Read all files and folders stored in Box
    - Write all files and folders stored in Box
    - Manage Users
    - Advanced Features
    - Making API calls using the as-user header
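The as-user option mentioned in the settings above comes down to one extra request header. The sketch below shows one way to assemble the headers, assuming the `As-User` header carries the ID of the user being impersonated; the helper name and the token and user ID values are placeholders for illustration.

```python
from typing import Optional


def build_headers(access_token: str, as_user_id: Optional[str] = None) -> dict:
    """Return request headers, optionally impersonating a managed user.

    Assumption: the application has the as-user setting enabled, so Box
    evaluates permissions as the user named in the As-User header.
    """
    headers = {"Authorization": f"Bearer {access_token}"}
    if as_user_id is not None:
        headers["As-User"] = str(as_user_id)
    return headers


# Placeholder token and user ID.
service_headers = build_headers("ACCESS_TOKEN")
user_headers = build_headers("ACCESS_TOKEN", as_user_id="987654")
print(user_headers)
```

Keeping impersonation in a single helper makes it harder to accidentally issue a call as the wrong user.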
User Permissions
- For JWT-based applications, a user is automatically created during authentication that mirrors your application scopes, so no further work is required.
- For OAuth 2.0-based applications, make sure that the main Admin of the Box enterprise authenticates into the application if you want to create or modify metadata templates. As mentioned before, because only admins or co-admins can create or modify metadata templates, you need elevated privileges to change them. In some cases, you may want a dedicated Co-Admin-level service account for your application. Because Box takes the lower of the permission sets between the Application Scopes and the Co-Admin permissions, the following permissions need to be enabled:
  - Users and Groups: Manage users
  - Files and Folders: View users’ content, Edit users’ content, Log into users’ accounts
  - Metadata: Create and edit metadata templates for your company
Once you have both your application and your authenticating user set up correctly, you can start utilizing Box Metadata.
API Endpoints
There are two primary API endpoints that you can use to see whether the enterprise already has metadata templates that you can import into your application. A set of global metadata templates is available to any Box enterprise with metadata enabled. Neither you nor enterprise admins can modify these templates, but they are available to be used. You can retrieve them by making an API call to get Global templates.
For enterprise-specific metadata templates, you can retrieve the current set by calling List Metadata Templates with a token from the automation user or an admin/co-admin. This returns the full schema of the metadata templates in the enterprise.
If your application needs to create a new metadata template, our API also supports creating templates and updating existing ones. Note that a template you create is available to all users in the enterprise by default, so make sure that the template name and its keys can be understood by end users. If you would prefer the metadata to be applied to files and/or folders without being visible to or editable by end users, you can mark your metadata template as “Hidden” when creating it. The template and its key-value pairs remain fully functional through the API, but users will not see them in the Box web application.
Which metadata type to use depends heavily on the workflows that your application supports. If you are extracting information from rich media like videos, Skills Cards metadata may work best. For contract or legal information extraction, a well-defined metadata template at the enterprise level can help with discoverability of data through metadata filters. Finally, if you would just like to add context to files on an ad hoc basis, global (properties) metadata may be sufficient. The decision is yours, but if you have any questions, feel free to reach out to our Partner team at integrate@box.com for guidance on your integration.
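The two listing calls described above can be sketched as follows. This assumes the `/metadata_templates/global` and `/metadata_templates/enterprise` paths and a response whose `entries` array holds template objects with `templateKey` and `fields`; the sample response is made up for illustration and trimmed to the fields used here.

```python
API_BASE = "https://api.box.com/2.0"


def templates_url(scope: str) -> str:
    """Return the listing URL for 'global' or 'enterprise' metadata templates."""
    if scope not in ("global", "enterprise"):
        raise ValueError("scope must be 'global' or 'enterprise'")
    return f"{API_BASE}/metadata_templates/{scope}"


def summarize_templates(response: dict) -> dict:
    """Map each templateKey to its list of (field key, field type) pairs."""
    return {
        template["templateKey"]: [
            (field["key"], field["type"]) for field in template.get("fields", [])
        ]
        for template in response.get("entries", [])
    }


# Illustrative response body with a hypothetical "contract" template.
sample_response = {
    "entries": [
        {
            "templateKey": "contract",
            "scope": "enterprise_12345",
            "displayName": "Contract",
            "hidden": False,
            "fields": [
                {"key": "expirationDate", "type": "date", "displayName": "Expiration Date"},
                {"key": "counterparty", "type": "string", "displayName": "Counterparty"},
            ],
        }
    ]
}

print(templates_url("enterprise"))
print(summarize_templates(sample_response))
```

A summary like this gives an application a quick view of which keys and value types it can rely on before it applies or queries a template.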
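As a sketch of what creating a hidden template might look like, the example below assembles a request body for the template-creation endpoint. It assumes the body is posted to `/metadata_templates/schema` with `scope`, `displayName`, `templateKey`, `hidden`, and `fields` properties; the template and field names are hypothetical, and the helper `build_template_payload` is not part of any Box SDK.

```python
import json

CREATE_TEMPLATE_URL = "https://api.box.com/2.0/metadata_templates/schema"


def build_template_payload(display_name: str, template_key: str, fields: list, hidden: bool = False) -> dict:
    """Assemble the JSON body for creating an enterprise metadata template.

    `fields` is a list of (type, key, display name) tuples.
    """
    return {
        "scope": "enterprise",
        "displayName": display_name,
        "templateKey": template_key,
        "hidden": hidden,  # True keeps the template out of the Box web application
        "fields": [
            {"type": field_type, "key": key, "displayName": label}
            for field_type, key, label in fields
        ],
    }


# Hypothetical template used only by automation, so it is marked hidden.
payload = build_template_payload(
    "Contract",
    "contract",
    [
        ("date", "expirationDate", "Expiration Date"),
        ("string", "counterparty", "Counterparty"),
    ],
    hidden=True,
)
body = json.dumps(payload)
print(body)
```

The serialized `body` would be sent as a `POST` to `CREATE_TEMPLATE_URL` with a bearer token from an admin, co-admin, or automation user and a `Content-Type: application/json` header.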