Text fields
Bilby Quant Data provides the text of each document in both its original language (typically Chinese) and in English translation, so you can work with whichever version suits your analysis.
Document titles
title
- Type: String
- Description: The title of the document in its original language.
- Nullable: Yes
title_en
- Type: String
- Description: The title of the document translated into English.
- Nullable: Yes
Subheadings
subhead
- Type: String
- Description: The subheading of the document in its original language.
- Nullable: Yes
- Note: Not all documents have subheadings.
subhead_en
- Type: String
- Description: The subheading of the document translated into English.
- Nullable: Yes
Body text
body
- Type: String
- Description: The complete body text of the document in its original language.
- Nullable: Yes
- Note: This is the full text content of the document, which can be substantial for longer policy documents.
body_en
- Type: String
- Description: The complete body text of the document translated into English.
- Nullable: Yes
Summaries
summary
- Type: String
- Description: A summary of the document in its original language.
- Nullable: Yes
translated_summary
- Type: String
- Description: A summary of the document translated into English.
- Nullable: Yes
- Note: This provides a concise overview of the document's content without requiring you to process the full body text.
Working with text fields
Language considerations
- Original language fields contain text as published by the source, typically in Chinese.
- English translation fields (*_en and translated_summary) are machine-generated translations optimised for accuracy and readability.
- For most analytical purposes, the English fields provide sufficient quality for text mining, NLP, and semantic analysis.
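One practical detail when switching between languages is that the English summary field is named translated_summary rather than summary_en, so a simple "_en" suffix rule does not cover every field. A minimal sketch of one way to handle this is shown below. It assumes each document is available as a plain Python dict keyed by the field names listed above; the helper names (ORIGINAL_TO_ENGLISH, text_fields, select_text) and the sample record are hypothetical and not part of Bilby Quant Data.

```python
# Mapping from each original-language field to its English counterpart.
# Field names come from the reference above; note the irregular
# translated_summary (there is no summary_en field).
ORIGINAL_TO_ENGLISH = {
    "title": "title_en",
    "subhead": "subhead_en",
    "body": "body_en",
    "summary": "translated_summary",
}


def text_fields(language="en"):
    """Return the text field names for "en" or "original"."""
    if language == "en":
        return list(ORIGINAL_TO_ENGLISH.values())
    if language == "original":
        return list(ORIGINAL_TO_ENGLISH.keys())
    raise ValueError(f"language must be 'en' or 'original', got {language!r}")


def select_text(document, language="en"):
    """Return the document's text in one language, keyed by the
    original field names so downstream code is language-agnostic.
    Fields that are absent from the document come back as None."""
    if language not in ("en", "original"):
        raise ValueError(f"language must be 'en' or 'original', got {language!r}")
    selected = {}
    for original, english in ORIGINAL_TO_ENGLISH.items():
        source = english if language == "en" else original
        selected[original] = document.get(source)
    return selected


# Hypothetical record, for illustration only.
doc = {
    "title": "关于促进消费的通知",
    "title_en": "Notice on Promoting Consumption",
    "summary": None,
    "translated_summary": "A short English summary.",
}

english = select_text(doc, "en")
print(english["title"])    # Notice on Promoting Consumption
print(english["summary"])  # A short English summary.
```

Keying the result by the original field names is a design choice for this sketch: the same downstream code can then run on either language without knowing which suffix convention applies.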
Field availability
Not all documents will have content in every text field:
- Some documents may lack subheadings
- Summaries may not be available for all documents
- In rare cases, body text or titles may be missing due to source formatting issues
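Because every text field is nullable, code that reads these fields should tolerate missing values rather than assume a string is present. The sketch below shows one possible fallback pattern: take the first field in a preference list that actually has content. It assumes documents are plain Python dicts, and it treats whitespace-only strings as missing, which is an assumption of this example rather than documented behaviour. The helper names (is_missing, first_available) and the sample record are hypothetical.

```python
def is_missing(value):
    """Treat None and empty or whitespace-only strings as missing.
    (The whitespace rule is an assumption of this sketch.)"""
    return value is None or not str(value).strip()


def first_available(document, fields):
    """Return (field_name, text) for the first field in `fields`
    that has content, or (None, None) if none of them do."""
    for name in fields:
        value = document.get(name)
        if not is_missing(value):
            return name, value
    return None, None


# Hypothetical record with no summary and an effectively empty body.
doc = {
    "title_en": "Notice on Promoting Consumption",
    "subhead_en": None,
    "body_en": "   ",
    "translated_summary": None,
}

# Prefer the summary, then the full body, then the title.
field, text = first_available(doc, ["translated_summary", "body_en", "title_en"])
print(field)  # title_en
print(text)   # Notice on Promoting Consumption
```

Returning the field name alongside the text lets you record which source was used, which is useful when a title stands in for a missing summary or body.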