Document Search Response Schema

The Archer Evolv Compliance “Document Search API” retrieves document-type-specific attributes for filtered content.

The tables below list the document-level attributes that this method can return, grouped by the meta category of each document type.

The latest JSON schema can be viewed and downloaded through this link.

06/30/2025 – moved the format of this document from a JSON view to a table view
03/13/2024 – added available attributes for different Document Types
02/20/2024 – added agency_update, agenda_rule, applicability_date, audit_entries, authors, citation_ids, comment, content_authorization, content_qualification, eitl_labels, enforcement, implementation_date, language, languages, languages_of_language_related_docs, mainstream_news, obligations, pipeline_status, presentation_id, presidential_document, proposed_rule, publisher, whitepaper
05/18/2023 – added “concept_tags, has_unofficial_publication_date, hidden, translated, user_has_access” attributes
04/13/2022 – deprecated “bookmarked” attribute
02/09/2022 – clarified attribute descriptions for “docket_ids” and “dockets”
09/01/2021 – added attribute for “deprecated”
08/24/2021 – added attribute for “cai_category_name”

(last updated 06/30/25)

All Properties

agencies	Agency object	Source (agency) from which a specific document originated.
agency_ids	long	Unique identifier of the Source (agency) within Compliance.ai
agency_update	Agency Update object
agenda_rule	Agenda Rule object
DEPRECATED - alt_summaries	Alt Summaries object	Unique identifier of the Source (agency) within Compliance.ai
applicability_date	date	The applicability date for the entity.
audit_entries	Audit Entry Object
authors	Authors Object
DEPRECATED - bookmarked	boolean	Specifies whether the document has been bookmarked or not
cai_category_id	long	Specifies the ID for the type of document (category) as seen in the Compliance.ai platform. A full list of document categories can be retrieved from the Document Type endpoint available through Compliance.ai API.
cai_category_name	keyword	Specifies the name for the type of document (category) as seen in the Compliance.ai platform. cai_category_name will always be returned in the results with cai_category_id. A full list of document categories can be retrieved from the Document Type endpoint available through Compliance.ai API.
category	keyword	Specifies the type of document (category) Compliance.ai scraped from the source itself.
cfr_parts	keyword	Lists the portions of CFR referenced in the document
children	Children object	List of children of the current document along with associated data for the child document
citation_ids	Citations object	The schema for storing citation identifiers.
cited_associations	Cited Association object	List of Regulations, Acts, Business names and Concepts associated with this citation
clean_citations	keyword	Citations as listed by the source
comment	Comments object
complianceai_id	keyword	Unique document identifier assigned by Compliance.ai.
concept_tags	long	Concepts that documents are tagged with
content_authorization	boolean	Specifies whether the content is privacy restricted.
content_qualification	long	The percentage of content qualification.
created_at	date	Date and time the document was added to Compliance.ai
deprecated	boolean	Determines if the document has been deprecated from the Compliance.ai platform. Document is deprecated if the attribute is included in the response and is marked as true. Deprecated documents will not be returned in results of a request with filters, but they are still accessible if directly filtered with doc_id in the request.
docket_ids	keyword	List of unique Docket_ids associated with this document. The dockets attribute will have more details for each docket_id.
dockets	Dockets object	Listing of Docket_id along with more detailed information related to each docket_id. Currently, docket_id is the only sub-attribute available.
document_location	Document Location object	A hierarchical ordering of the document's parent-child relationships. This is only available for the following document types: 'Admin Code', 'CFR', 'State Code', 'Statute', 'US Code', 'US Public Law', 'Admin Code Navigation', 'CFR Navigation', 'State Code Navigation', 'Statute Navigation', 'US Code Navigation', 'US Public Law Navigation'
document_version_docket_id	keyword	A grouping of all the versions of a document.
document_version_latest	boolean	Specifies whether the document is the latest version.
eitl_labels	EITL Labels object	Schema for EITL labels.
enforcement	Enforcements object	Schema for enforcement actions.
flagged	boolean	Specifies if the document has been flagged by Compliance.ai.
DEPRECATED - full_path	keyword	URL for the location of the document on the source site
full_text	string	Full text of the document
full_text_hash	keyword	Compliance.ai Hashed reference to the document's text
full_xml_hash	keyword	Compliance.ai Hashed reference to the document's xml
has_comments	Has Comments object	Determines if there are comments associated with a regulatory document within Compliance.ai
DEPRECATED - has_obligations	boolean	This field is no longer in use
has_sentences	boolean	Specifies whether the document has been split into sentences.
has_unofficial_publication_date	boolean	Specifies whether the document has an unofficial publication date
hidden	boolean	Specifies whether the document is hidden or not
id	long	Unique identifier of the document within Compliance.ai
implementation_date	date	The implementation date for the entity.
DEPRECATED - important_dates	Important Dates object	Important regulatory dates extracted from the document
DEPRECATED - incoming_citation_ids	long	This is no longer in use
jurisdiction	keyword	The Jurisdiction associated with this document (US, UK, etc.)
language	keyword	The language of the document.
languages	Language object	Information about the detected languages.
languages_of_language_related_docs	Language Related Docs object	Information about the detected languages of language-related documents.
mainstream_news	Mainstream News object	Mainstream news entry.
INTERNAL - meta_table	keyword	An internal reference used by compliance.ai.
obligations	Obligations object	Obligations associated with a specific entity.
official_id	keyword	A unique document identifier within compliance.ai.
parent	Parent object	List of parents of the current document along with associated data for each parent document
pdf_hash	keyword	Compliance.ai Hashed reference to the document's pdf
pdf_url	keyword	The document attribute pdf_url stores a direct link to the file from the source, and will be null if not available. Currently supports .pdf and .doc urls.
INTERNAL - pipeline_status	keyword	An internal reference used by compliance.ai.
INTERNAL - presentation_id	keyword	The ID of the presentation.
presidential_document	Presidential Documents object	Information about a presidential document.
proposed_rule	Proposed_Rule object	Information about a proposed rule.
INTERNAL - provenance	keyword	An internal reference used by compliance.ai.
publication_date	date	Publication date of the document
publisher	keyword	The publisher of the document.
related_documents	Related Document object	Pre-defined relationships between documents
rule	Rule object	If the document is a regulatory rule, provides all the key date information about the rule
sentence_main	Sentence Main object	A list of the sentences, along with associated data, that make up the document.
INTERNAL - spider_name	keyword	Compliance.ai data collector that fetched this document
summaries	Summaries object	Summary created by Compliance.ai based on this document
summary_text	string	summary text created for the document
DEPRECATED - times_cited	long
title	string	Document title
topics	Topics object
DEPRECATED - total_citation_count	long	Deprecated - Total citation count
translated	boolean	Specifies whether the document is translated or not
DEPRECATED - unique_citation_count	long
updated_at	date	Last time this document entry (and associated meta-data) has been updated by Compliance.ai
web_url	keyword	Document URL directly from the source. If available, it points directly to the document on the source. If not available, it points to the source URL.
whitepaper	Whitepaper object	Details about a whitepaper document.
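
The deprecated, pdf_url, and web_url rows above imply some client-side handling. The following is a minimal Python sketch, assuming each search result has already been parsed from JSON into a dict keyed by the attribute names in this table. The helper names is_deprecated and best_link are hypothetical and not part of the API, and preferring pdf_url over web_url is an illustrative choice rather than documented behavior.

```python
from typing import Any, Dict, Optional


def is_deprecated(doc: Dict[str, Any]) -> bool:
    """A document counts as deprecated only if the attribute is present and true."""
    return doc.get("deprecated") is True


def best_link(doc: Dict[str, Any]) -> Optional[str]:
    """Prefer the direct file link; fall back to web_url when pdf_url is null."""
    return doc.get("pdf_url") or doc.get("web_url")


# Hypothetical sample documents, for illustration only.
docs = [
    {"id": 1, "pdf_url": None, "web_url": "https://example.org/a"},
    {"id": 2, "deprecated": True, "web_url": "https://example.org/b"},
]
active = [d for d in docs if not is_deprecated(d)]
```

Because the attribute may be absent from a response, the check uses a lookup with a default instead of indexing the key directly.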

Properties Common To All Document Types

agencies	Agency object	Source (agency) from which a specific document originated.
DEPRECATED - alt_summaries	Alt Summaries object	Unique identifier of the Source (agency) within Compliance.ai
DEPRECATED - bookmarked	boolean	Specifies whether the document has been bookmarked or not
cai_category_id	long	Specifies the ID for the type of document (category) as seen in the Compliance.ai platform. A full list of document categories can be retrieved from the Document Type endpoint available through Compliance.ai API.
category	keyword	Specifies the name for the type of document (category) as seen in the Compliance.ai platform. cai_category_name will always be returned in the results with cai_category_id. A full list of document categories can be retrieved from the Document Type endpoint available through Compliance.ai API.
cfr_parts	keyword	Lists the portions of CFR referenced in the document
complianceai_id	keyword	Unique document identifier assigned by Compliance.ai.
concept_tags	long	Concepts that documents are tagged with
created_at	date	Date and time the document was added to Compliance.ai
deprecated	boolean	Determines if the document has been deprecated from the Compliance.ai platform. Document is deprecated if the attribute is included in the response and is marked as true. Deprecated documents will not be returned in results of a request with filters, but they are still accessible if directly filtered with doc_id in the request.
dockets	Dockets object	List of unique Docket_ids associated with this document. The dockets attribute will have more details for each docket_id.
document_version_latest	boolean	Specifies whether the document is the latest version.
flagged	boolean	Specifies if the document has been flagged by Compliance.ai.
full_xml_hash	keyword	Compliance.ai Hashed reference to the document's xml
has_sentences	boolean	Specifies whether the document has been split into sentences.
has_unofficial_publication_date	boolean	Specifies whether the document has an unofficial publication date
hidden	boolean	Specifies whether the document is hidden or not
id	long	Unique identifier of the document within Compliance.ai
DEPRECATED - important_dates	Important Dates object	Important regulatory dates extracted from the document
jurisdiction	keyword	The Jurisdiction associated with this document (US, UK, etc.)
languages	Language object	Information about the detected languages.
official_id	keyword	A unique document identifier within compliance.ai.
parent	Parent object	List of parents of the current document along with associated data for the child document
pdf_hash	keyword	Compliance.ai Hashed reference to the document's pdf
pdf_url	keyword	The document attribute pdf_url stores a direct link to the file from the source, and will be null if not available. Currently supports .pdf and .doc urls.
INTERNAL - pipeline_status	keyword	An internal reference used by compliance.ai.
INTERNAL - provenance	keyword	An internal reference used by compliance.ai.
publication_date	date	Publication date of the document
related_documents	Related Document object	Pre-defined relationships between documents
INTERNAL - spider_name	keyword	Compliance.ai data collector that fetched this document
summaries	Summaries object	Summary created by Compliance.ai based on this document
summary_text	string	summary text created for the document
title	string	Document title
topics	Topics object
translated	boolean	Specifies whether the document is translated or not
updated_at	date	Last time this document entry (and associated meta-data) has been updated by Compliance.ai
user_has_access
web_url	keyword	Document URL directly from the source. If available, it points directly to the document on the source. If not available, it points to the source URL.

Additional Properties For Agency Update Type

citation_ids	Citations object	The schema for storing citation identifiers.
has_obligations	boolean	This field is no longer in use
INTERNAL - presentation_id	keyword	The ID of the presentation.
presidential_document	Presidential Documents object	Information about a presidential document.
sentence_main	Sentence Main object	A list of the sentences, along with associated data, that make up the document.

Additional Properties For Enforcements Type

enforcement	Enforcements object	Schema for enforcement actions.

Additional Properties For Legislation Type

content_authorization	Content Authorization object	Specifies whether the content is privacy restricted.
languages_of_language_related_docs	Language Related Docs object	Information about the detected languages of language-related documents.
INTERNAL - provenance	keyword	An internal reference used by compliance.ai.

Additional Properties For News Type

content_qualification	Content Qualification object	The percentage of content qualification.
mainstream_news	Mainstream News object	Mainstream news entry.

Additional Properties For Notices Type

citation_ids	Citations object	The schema for storing citation identifiers.
has_obligations	boolean	This field is no longer in use
has_sentences	boolean	Specifies whether the document has been split into sentences.
rule	Rule object	If the document is a regulatory rule, provides all the key date information about the rule
sentence_main	Sentence Main object	A list of the sentences, along with associated data, that make up the document.

Additional Properties For Regulations Type

children	Children object	List of children of the current document along with associated data for the child document
document_version_docket_id	keyword	A grouping of all the versions of a document.
document_version_latest	boolean	Specifies whether the document is the latest version.
has_obligations	boolean	This field is no longer in use
has_sentences	boolean	Specifies whether the document has been split into sentences.
official_id	keyword	A unique document identifier within compliance.ai.
proposed_rule	Proposed Rule object	Information about a proposed rule.
rule	Rule object	If the document is a regulatory rule, provides all the key date information about the rule
sentence_main	Sentence Main object	A list of the sentences, along with associated data, that make up the document.

Additional Properties For Regulatory Filings Type

languages_of_language_related_docs	Language Related Docs object	Information about the detected languages of language-related documents.

Additional Properties For Research Papers Type

eitl_labels	EITL Labels object	Schema for EITL labels.
mainstream_news	Mainstream News object	Mainstream news entry.
INTERNAL - presentation_id	keyword	The ID of the presentation.
publisher	keyword	The publisher of the document.
whitepaper	Whitepaper object	Details about a whitepaper document.

Agency Object

active	boolean	Determines if the source is still actively used by Compliance.ai
blacklisted	boolean	Determines if the source is blacklisted by Compliance.ai
description	string	Provides a description of the source (agency) from which this document item originated
id	long	Unique identifier of the Source (agency) within Compliance.ai
jurisdiction	keyword	The specific regulatory jurisdiction for each source
name	string	The extended name of the source from which the documents originated
parent_id	long	If the agency is related to (is a sub-set of) another source, uniquely identifies the parent agency in Compliance.ai
short_name	string	Abbreviated name for the source in Compliance.ai
slug	keyword	Slug used by Compliance.ai for the source
times_cited	long	Determines the number of times the agency was cited across compliance.ai
type	keyword	Type of agency document
url	keyword	The Url for the Source (agency) website. In some cases, the url given may include .json which refers to our internal link because the agency is included in an API
words	Word object	Associated departments and synonyms for an agency

Word Object

agency_id	long	Unique identifier of the Source (agency) within Compliance.ai
departments	array string	An instance for the departments under an agency
subdepartments	array string	An instance for the subdepartments under an agency
synonyms	array string	Various names or synonyms associated with the source from which the document originated.

Agency Update

id	long	The unique identifier for the agency update.
is_sro	boolean	Indicates whether the agency update is an SRO (Self-Regulatory Organization).

Agenda Rule

citations	agenda rule citation object
doc_id	integer64	The unique identifier for the document.
id	string	The unique identifier for the agenda rule.
major	boolean	Indicates whether the agenda rule is major.
metadata	agenda rule metadata object	Metadata for the Agenda Rule
priority	string	The priority of the agenda rule.
stage	string	The stage of the agenda rule.
status	string	The status of the agenda rule.

Agenda Rule Citation

cfr	keyword	The citation from the Code of Federal Regulations (CFR).
usc	keyword	The citation from the United States Code (USC).

Agenda Rule Metadata

history	agenda rule metadata history object	History for the rule
xml	string	The XML representation of the document.

Agenda Rule Metadata History

parent_id	long	The parent document's unique identifier.
pub_date	date	The publication date of the document.

Audit Entry

id	long	The unique identifier for the audit entry.
incoming_pipeline_status	keyword	An internal reference used by compliance.ai.
notes	keyword	Any additional notes related to the audit entry.
process	keyword	The process associated with the audit entry.
real_created_at	date	The real creation timestamp of the audit entry.
resulting_pipeline_status	keyword	The status of the resulting pipeline.

Author

id	long	Id of the author
name	string	Name of the author

CFR Parts

cite	keyword	Each citation referenced in the document

Document Children

id	long	Identifier
title	string	Document Title, as observed in the document
category	keyword	Category
pub_date	date	Publication date
web_url	keyword	The URL to the Web version
pdf_url	keyword	The URL to the PDF version
summary_text	keyword	Summary, as observed in the document
agency_ids	long	The unique identifier(s) for the agency or agencies
citation_type	keyword	This field is not in use. If the document has a parent document, citation_type will have the value parent
official_id	keyword	the official id
jurisdiction	keyword	The specific regulatory jurisdiction
parent_id	long	Id of the parent
effective_on	date	Effective Date
comments_on	date	Comment Date

Citation Ids

citation_id	long	The unique identifier for a citation

Cited Association

act_ids	long	List of Acts associated with this citation
bank_ids	long	List of Business names associated with this citation
citation_ids	long	list of documents associated with this document
concept_ids	long	Concepts associated with this citation
named_regulation_ids	long	Each ID represents a Regulation associated with this citation

Comment

agency_ids	long	The unique identifier(s) for the agency or agencies associated with the comment.
comment_api_id	keyword	The API-specific identifier for the comment.
comment_title	string	The title of the comment.
commented_doc_id	long	The unique identifier for the document being commented on.
doc_id	long	The unique identifier for the comment.
docket_ids	keyword	The identifier(s) for the docket(s) associated with the comment.
id	long	The unique identifier for the comment.
organization	string	The organization associated with the comment.
pdf_url	keyword	The URL to the PDF version of the comment.
submitter_name	string	The name of the person or entity submitting the comment.
web_url	keyword	The URL to the web version of the comment.

Content Authorization

privacy_restricted	boolean	Specifies whether the content is privacy restricted.

Content Qualification

content_percentage	long	The percentage of content qualification.

Docket

docket_id	keyword	Value of the Docket_Id

Document Location

doc_id	long	The unique identifier for the document
title	string	The document title

EITL Label

id	long	The unique identifier for the EITL label.
name	string	The name of the EITL label

Has Comments

comments_count	long	If the document has comments, determines the number of comments available
last_comment_date	string	If the document has comments, provides the date of the last comment collected by Compliance.ai

Important Dates

date	date	Each date that's been extracted (Comments close date, effective date, etc.)
label	keyword	Label for the date
snippet	keyword	Snippet related to the date

Languages

high_confidence	boolean	Indicates whether the language detection is of high confidence.
id	long	The unique identifier for the language.
name	string	The name of the language.

Languages of Language Related Docs

Mainstream News Details

document_id_external	string	The external document ID.

Mainstream News Source CAI Categories

description	string	Description of the CAI category.
doc_meta_category_id	long	The unique identifier for the CAI category.
id	long	The unique identifier for the CAI category.
is_archived	boolean	Indicates whether the CAI category is archived.
name	string	The name of the CAI category.
surface_in_filter	boolean	Indicates whether the CAI category appears in filters.
updated_at	date	The date of the last update of the CAI category.

Mainstream News Source

cai_categories	Mainstream News Source CAI Categories	CAI categories associated with the news source.
created_at	date	The creation date of the news source.
id	integer 64	The unique identifier for the news source.
logo_content_type	string	The content type of the news source's logo.
logo_hash	string	The hash of the news source's logo.
logo_url	string	The URL of the news source's logo.
name	keyword	The name of the news source.
premium_content_provider	boolean	Indicates whether the news source is a premium content provider.
scraped_cai_categories	Mainstream News Source CAI Categories	Scraped CAI categories associated with the news source.
updated_at	date	The date of the last update of the scraped CAI category.

Mainstream News

details	Mainstream News Details	Details about the mainstream news document.
doc_id	long	The unique identifier for the mainstream news document.
document_external_id	string	The external document ID.
has_real_full_text	boolean	Indicates whether the document has real full text available.
id	long	The unique identifier for the mainstream news entry.
image_content_type	keyword	The content type of the image associated with the news.
image_hash	keyword	The hash of the image associated with the news.
image_url	keyword	The URL of the image associated with the news.
news_source	Mainstream News Source	Mainstream news entry.
news_source_id	long	The unique identifier for the news source.

Document Parent

agency_ids	long	Agency Ids
category	string	Category
citation_type	string	This field is not in use. If the document has a parent document, citation_type will have the value parent
id	long	Id of the parent
jurisdiction	keyword	Jurisdiction of the parent
official_id	string	Official Id
pdf_url	string	The URL to the PDF version
presentation_id	string	Internal Compliance.ai reference
pub_date	string	Publication date of the document
summary_text	string	Summary, as observed in the document
title	string	Document Title, as observed in the document
web_url	string	Document URL directly from the source. If available, it points directly to the document on the source. If not available, it points to the source URL.

President

identifier	keyword	The identifier of the president.
name	keyword	The name of the president.

Presidential Document

citation	keyword	The citation of the presidential document.
doc_id	long	The unique identifier for the presidential document.
end_page	long	The end page of the presidential document.
id	long	The unique identifier for the presidential document.
notes	keyword	Notes related to the presidential document.
order_number	long	The order number of the presidential document.
president	President	Information about the president related to the presidential document.
signed_on	date	The date when the presidential document was signed.
start_page	long	The start page of the presidential document.
subtype	keyword	The subtype of the presidential document.
volume	long	The volume of the presidential document.

Proposed Rule

comments_count	long	The count of comments on the proposed rule.
id	long	The unique identifier for the proposed rule.
last_comment_date	string	The date of the last comment on the proposed rule.

Rule

comments_close_on	date	Date when the comment period closes
effective_on	date	Date when the rule will be effective
id	long	Unique ID of the internal rule record in Compliance.ai
significant	boolean	Indicates if the rule is significant

Prediction Annotation Types

obligation_probability	float	The highest probability, across the document's sentences, that a sentence is an obligation. Note: when the has_obligations parameter is used in the search request, the results surface any document that has sentences with obligation_probability higher than .1. Compliance.ai treats a probability above .9 as a high-confidence obligation and a probability between .1 and .89 as a likely obligation.
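
The thresholds described above can be sketched in Python as a small bucketing function. This is an illustration under stated assumptions: the function name and the returned labels are hypothetical, and because the description leaves the range between .89 and .9 unspecified, this sketch treats everything from .1 up to and including .9 as "likely".

```python
def obligation_confidence(probability: float) -> str:
    """Bucket a document-level obligation_probability using the documented thresholds."""
    if probability > 0.9:
        # "More than .9" is described as a high-confidence obligation.
        return "high_confidence"
    if probability >= 0.1:
        # Assumption: the .1 boundary and the .89 to .9 gap both count as "likely".
        return "likely"
    return "below_threshold"
```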

Sentence Data

sentence_version	long	Version of the sentences from the document. Note - a document may have multiple versions of sentences
sentences	array of Sentence	Sentences

Sentence

id	long	Unique ID of the sentence in a document in Compliance.ai
obligation_group_id	long	Obligation group the obligation currently belongs to. This field is currently not in use
obligation_probability	float	Probability assigned to a sentence for it being an Obligation. 0-1 for model prediction, 1.1 for positive human annotation and -0.1 for negative human annotation
sentence	string	A sentence of the document
sentence_para_id	long	The paragraph this sentence belongs to
sentence_type	string	The type of sentence in the document - can be paragraph or header
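
Since the sentence-level obligation_probability mixes model scores (0-1) with the marker values 1.1 and -0.1 for human annotations, a client may want to separate the two before comparing scores. The following Python sketch shows one way to do that. The function name and return labels are hypothetical, and the small tolerance is an assumption to absorb floating-point rounding in the serialized value.

```python
import math


def annotation_source(probability: float) -> str:
    """Tell model predictions apart from the human-annotation marker values."""
    if math.isclose(probability, 1.1, abs_tol=1e-6):
        return "human_positive"
    if math.isclose(probability, -0.1, abs_tol=1e-6):
        return "human_negative"
    if 0.0 <= probability <= 1.0:
        return "model"
    # Anything else is outside the documented ranges.
    return "unknown"
```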

Sentence Main

current_sentence_version	long	Current version of the sentences used for the document. Note - a document may have multiple versions of sentences
prediction_annotation_types	prediction annotation type	The types of predictions/annotations Compliance.ai makes on a document's sentences. Currently, only obligations are available.
sentence_data	Sentence Data	Sentence Data
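
The Sentence Main, Sentence Data, and Sentence tables nest inside one another, so reading the current sentences means matching current_sentence_version against sentence_version. The Python sketch below illustrates that lookup. It assumes sentence_data arrives as a list of Sentence Data entries (a single object is tolerated as well); the function name and the sample payload are hypothetical.

```python
from typing import Any, Dict, List


def current_sentences(sentence_main: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return the Sentence entries whose version matches current_sentence_version."""
    current = sentence_main.get("current_sentence_version")
    data = sentence_main.get("sentence_data") or []
    if isinstance(data, dict):
        # Tolerate a single Sentence Data object as well as a list of them.
        data = [data]
    for entry in data:
        if entry.get("sentence_version") == current:
            return entry.get("sentences") or []
    return []


# Hypothetical payload shaped after the tables above.
sample = {
    "current_sentence_version": 2,
    "sentence_data": [
        {"sentence_version": 1, "sentences": [{"id": 10, "sentence": "Old text."}]},
        {"sentence_version": 2, "sentences": [{"id": 11, "sentence": "New text."}]},
    ],
}
texts = [s["sentence"] for s in current_sentences(sample)]
```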

Summary

clean_full_text_hash	keyword	Compliance.ai uses this to clean up the full_text before extracting the summary
doc_id	long	Document id
highlighted_pdf_hash	keyword	The Highlighted_pdf_hash Schema
id	long	The full_text_hash Schema
original_full_text_hash	keyword	The full_text_hash Schema
original_pdf_hash	keyword	The full_text_hash Schema
summary_date	date	Date Compliance.ai created the automatic summary
summary_metadata		The Original_pdf_hash Schema
summary_sentences	string	Collection of sentences auto-extracted as summaries
summary_text	string	Summary text
type	keyword	Length type of auto-summary (Long, Med, Short)
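
Because each Summary entry carries a length type, a client can choose which summary to display. The Python sketch below is one possible approach, assuming summaries is a list of Summary objects and that the type values are spelled as in the table (Long, Med, Short); the exact casing returned by the API is not confirmed here, so the comparison ignores case. The function name is hypothetical.

```python
from typing import Any, Dict, List, Optional


def pick_summary(summaries: List[Dict[str, Any]], preferred: str = "Short") -> Optional[str]:
    """Return summary_text for the preferred length type, else the first available one."""
    wanted = preferred.lower()
    for summary in summaries:
        if str(summary.get("type", "")).lower() == wanted and summary.get("summary_text"):
            return summary["summary_text"]
    # Fall back to any summary that has text.
    for summary in summaries:
        if summary.get("summary_text"):
            return summary["summary_text"]
    return None
```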

Topics

id	long	Topic Id
judge_count	long	Number of times document has been judged by Compliance.ai experts post-model-training
model_probability	float	Probability assigned for topic classification by Compliance.ai experts post-model-training
name	keyword	Topic name(s) associated with document by Compliance.ai
positive_judgments	integer	Number of times document has been positively judged by Compliance.ai experts post-model-training

Whitepaper Metadata Byline

auth_path	keyword	Path of the author
author	keyword	Author of the whitepaper
firm	keyword	Firm associated with the whitepaper
firm_path	keyword	Path of the firm associated with the whitepaper

Whitepaper Metadata

authors	keyword	Authors of the whitepaper.
byline	Whitepaper Metadata Byline	Information about the author.
topics	keyword	Topics covered in the whitepaper.

Whitepaper

doc_id	long	The unique identifier for the whitepaper document.
id	long	The unique identifier for the whitepaper.
metadata	Whitepaper Metadata	Metadata information about the whitepaper.

Alt Summaries

doc_id	long
summary_type	keyword
details	string
summary	string
machine_summary	string
machine_sentences	long
violation	keyword
respondent	keyword
action_type	action_type
title_words	keyword
filed_on	date

Enforcement

id	long	The unique identifier for the enforcement action.
doc_id	long	The unique identifier for the enforcement document.
filed_on	date	The date of enforcement activity.
activity_on	date	The date of enforcement activity.
bank_ids	long	The unique identifier for the bank(s) involved in enforcement.
entity_respondent	string	The entity respondent involved in the enforcement action.
individual_respondent	string	The individual respondent involved in the enforcement action.
respondents	string	The respondents involved in the enforcement action.
monetary_penalty	float	The monetary penalty associated with the enforcement action.
violation	string	The violation in the enforcement action.
violated_rules	string	The rules violated in the enforcement action.
metadata	Enforcement Metadata object	Metadata related to the enforcement action.

Enforcement Metadata

termination_date	keyword	The termination date of the enforcement action.
date_of_initial_action	keyword	The date of the initial enforcement action.

Related Document

documents_group_by	string	Currently not in use; created to dynamically group multiple related documents
related_doc_ids	array long	Compliance.ai Document IDs of related documents
related_type	string	The type of relationship between documents. Language links the same document in different languages, Premium Content links a regular doc to a premium doc, and Document is the reverse of Premium Content, linking a premium doc to a regular doc
related_docs	related_docs	Compliance.ai documents related to the current document, along with associated data
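
The related_type values can be used to group related document IDs, for example to find every translation of a document. The Python sketch below assumes related_documents is a list of entries with the attributes in the rows above; that list shape and the function name are assumptions for illustration.

```python
from collections import defaultdict
from typing import Any, Dict, List


def related_ids_by_type(related_documents: List[Dict[str, Any]]) -> Dict[str, List[int]]:
    """Group related_doc_ids under their related_type."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for relation in related_documents:
        related_type = relation.get("related_type", "unknown")
        grouped[related_type].extend(relation.get("related_doc_ids") or [])
    return dict(grouped)
```

With the result in hand, the IDs under the Language key would be the same document in other languages, according to the related_type description.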

Related Docs

doc_id	integer	Identifier of the related doc
title	string	Title of the related doc
category	string	Specifies the type of document (category) Compliance.ai scraped from the source itself.
language	keyword	Specifies the language of the document Compliance.ai scraped from the source itself.
content_provider	keyword	Specifies the content provider of the document Compliance.ai scraped from the source itself.