What is Search?
Encrypting data has always come with one costly tradeoff: it makes searching over your data very complicated. Basis Theory’s search feature aims to make that as simple and painless as possible, without any need to detokenize or decrypt your data.
How It Works
When the Token is created the data is securely indexed in several data patterns using blind indexes. Combining the blind indexes with the existing metadata on the tokens allows you to search your entire Vault with a simple query. You can search over the following fields:
data
type
metadata.[key]
created_at
Currently, Basis Theory only supports searching the data
for the social_security_number
and employer_id_number
Token Types. The data is indexed in several patterns, allowing for flexible searching. For instance, you can search for a social security number by the full number with or without dashes (123-45-6789
and 123456789
), and only the last four (6789
).
Permissions
In order to search Tokens using the API, you must use an Application with at least one read
permission. The search results are filtered based on the permissions associated with the Application; you cannot search for tokens you do not have permission to access. Searching the type
, metadata
, and created_at
fields does not require any additional permissions; however, in order to search the data
field, the Application’s permissions must allow unrestricted access to plaintext Token data.
Query Syntax
Basis Theory uses a Lucene-based query syntax to power the search engine. Searching is case-insensitive, however only exact matches are currently supported. Wildcard searches are not available at this time. The search query is a string with one or more terms in the format field:value
. For instance, if you want to find all social security numbers, you would query:
type:social_security_number
To search Token data
on Types that support it, you can search for the indexed data patterns. To search for Tokens containing the data 123-45-6789
, you would query:
data:123-45-6789
If you are searching over data that contains multiple words or spaces, you can wrap the search value in quotes:
data:"data containing multiple words"
Searching Tokens using metadata is supported as well. Metadata search terms use dot notation for key in the form of metadata.key:value
. For example, to search for Tokens having the metadata { customer_id: "123456" }
, query for:
metadata.customer_id:123456
Date range searches are supported using the Lucene bracketed range syntax. [START_DATE TO END_DATE]
denotes an inclusive range and {START_DATE TO END_DATE}
denotes an exclusive range. Values are formatted as a string in ISO 8601 format and can either represent a date or date and time in UTC. For example, to search for tokens that were created in the year 2021, you can query:
created_at:[2021-01-01 TO 2021-12-31T23:59:59Z]
To search a range without a start or end date, use the wildcard *
in place of the start or end date, for example:
created_at:{* TO 2022-01-01}
Multiple terms may be combined using the AND
and OR
operators and grouped using parentheses. For example:
(type:social_security_number AND metadata.user_id:1234) OR data:111-11-1111
Only the Lucene query operators described above are supported at this time. If you would like to have support for any additional Lucene features, please let us know.
FAQ
What is the difference between Search and Fingerprints?
Fingerprints are a measure of uniqueness, not a representation of the underlying data. Fingerprints can be used to locate duplicate data, for instance, but do not allow you to find specific data. You cannot find a card number ending in 4242
with fingerprints.
How does searching Tokens affect my Monthly Active Token usage?
Each Token that matches the search query made via the API and is returned in the result set becomes an Active Token for that month. Searching through the Portal as a logged in user will not affect MAT usage.