SEC Reference Data Fields vs. openfunds Data Model
1. Overview of SEC Structured Data Sources
The SEC provides four distinct structured data sources that contain reference data for US-registered funds. Each covers different aspects:
| Source | Form | Content | Format | Granularity |
| --- | --- | --- | --- | --- |
| Series/Class CSV | — | Identity & identifiers | CSV/XML | Trust → Series → Class |
| XBRL Risk/Return | N-1A (485BPOS, 497K) | Prospectus-derived structured data | XBRL → flat files | Series & Class level |
| N-PORT Data Sets | NPORT-P | Portfolio holdings & fund financials | XML → flat files | Series & Holding level |
| Submissions API | — | Filing history metadata | JSON | Entity (CIK) level |
2. Complete Field Inventory by SEC Source
2.1 Series/Class Reference CSV
This is the identity backbone — maps the hierarchy of trust → fund → share class.
| Field | Description | openfunds Equivalent |
| --- | --- | --- |
| Reporting File Number | 811-XXXXX Investment Co. Act number | — (no direct equivalent) |
| CIK Number | 10-digit SEC entity identifier | — (SEC-specific) |
| Entity Name | Trust/investment company name | OFST005010 Umbrella Name |
| Entity Org Type | Organization type code | OFST160100 Legal Form |
| Series ID | S###### fund series identifier | — (SEC-specific) |
| Series Name | Fund name | OFST010110 Legal Fund Name Only |
| Class ID | C###### share class identifier | — (SEC-specific) |
| Class Name | Share class name (e.g. "Admiral Shares") | OFST020060 Full Share Class Name |
| Class Ticker | Exchange ticker symbol | OFST020020 Bloomberg Code (partial) |
| Address_1, Address_2, City, State, Zip Code | Registrant address | — |
Coverage: ~15,000+ investment company trusts, ~50,000+ series, ~100,000+ classes.
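
The trust → series → class hierarchy can be rebuilt from the flat file by grouping rows on the CIK and Series ID columns. The following is a minimal sketch, assuming the CSV header names match the field names in the table above exactly; the sample rows, the identifiers in them, and the function name `build_hierarchy` are hypothetical.

```python
import csv
import io

# Hypothetical sample rows; the real file has more columns and far more rows.
SAMPLE_CSV = """CIK Number,Entity Name,Series ID,Series Name,Class ID,Class Name,Class Ticker
0000000001,Example Trust,S000000001,Example Growth Fund,C000000001,Investor Shares,EXGIX
0000000001,Example Trust,S000000001,Example Growth Fund,C000000002,Admiral Shares,EXGAX
0000000001,Example Trust,S000000002,Example Bond Fund,C000000003,Investor Shares,
"""


def build_hierarchy(csv_text):
    """Group flat rows into a nested trust -> series -> class structure."""
    trusts = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        # One entry per trust, keyed by CIK.
        trust = trusts.setdefault(
            row["CIK Number"], {"name": row["Entity Name"], "series": {}}
        )
        # One entry per fund series within the trust.
        series = trust["series"].setdefault(
            row["Series ID"], {"name": row["Series Name"], "classes": []}
        )
        # Each row describes exactly one share class; empty tickers become None.
        series["classes"].append(
            {
                "class_id": row["Class ID"],
                "name": row["Class Name"],
                "ticker": row["Class Ticker"] or None,
            }
        )
    return trusts


hierarchy = build_hierarchy(SAMPLE_CSV)
```

Keying on the SEC identifiers rather than on names keeps the grouping stable when a trust or fund is renamed between filings.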
2.2 XBRL Risk/Return Summary (from Prospectus — the richest source)
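This dataset is extracted from prospectus XBRL filings and is the closest to what openfunds covers. It contains the prospectus items that are tagged as structured data: objectives and strategies, fees and expenses, expense examples, performance, risk disclosures, and portfolio turnover.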
A. Fund Identity & Structure
| XBRL Element | Description | Data Type | openfunds Equivalent |
| --- | --- | --- | --- |
| RiskReturnHeading | Prospectus section heading | Text | — |
| ObjectiveHeading | Heading of objectives section | Text | — |
| ObjectivePrimaryTextBlock | Investment objective narrative | Text Block | OFST010300 Investment Objective |
| ObjectiveSecondaryTextBlock | Additional objective detail | Text Block | OFST010300 Investment Objective |
| StrategyHeading | Heading of strategy section | Text | — |
| StrategyNarrativeTextBlock | Principal investment strategies | Text Block | — (no single openfunds equivalent) |
B. Fee & Expense Data (Shareholder Fees — paid directly by investor)
| XBRL Element | Description | Data Type | openfunds Equivalent |
| --- | --- | --- | --- |
| MaximumSalesChargeImposedOnPurchasesOverOfferingPrice | Front-end load | Ratio | OFST451320 Max Subscription Fee In Favour Of Distributor |
| MaximumDeferredSalesChargeOverOfferingPrice | Back-end load (CDSC) | Ratio | OFST451391 Contingent Deferred Sales Charge Exit Fee |
| MaximumDeferredSalesChargeOverOther | CDSC on other basis | Ratio | OFST451392 Contingent Deferred Sales Charge Upfront Fee |
| MaximumSalesChargeOnReinvestedDividendsAndDistributions | Load on reinvested dividends | Ratio | — |
| RedemptionFeeOverRedemption | Redemption fee (% of amount) | Ratio | OFST451440 Max Redemption Fee In Favour Of Fund |
| RedemptionFee | Redemption fee (flat $) | Monetary | OFST451439 Min Redemption Fee In Favour Of Fund |
| ExchangeFeeOverRedemption | Exchange fee (% of amount) | Ratio | — |
| ExchangeFee | Exchange fee (flat $) | Monetary | — |
| MaximumAccountFeeOverAssets | Account maintenance fee (%) | Ratio | — |
| MaximumAccountFee | Account maintenance fee ($) | Monetary | — |
| MaximumCumulativeSalesChargeOverOfferingPrice | Cumulative max sales charge | Ratio | — |
C. Annual Fund Operating Expenses (ongoing costs deducted from fund assets)
| XBRL Element | Description | Data Type | openfunds Equivalent |
| --- | --- | --- | --- |
| ManagementFeesOverAssets | Management fee | Ratio | OFST452010 Management Fee Maximum |
| DistributionAndService12b1FeesOverAssets | 12b-1 distribution fee | Ratio | OFST454165 Distribution Fee Maximum |
| Component1OtherExpensesOverAssets | Other expense component 1 | Ratio | — |
| Component2OtherExpensesOverAssets | Other expense component 2 | Ratio | — |
| Component3OtherExpensesOverAssets | Other expense component 3 | Ratio | — |
| OtherExpensesOverAssets | Total other expenses | Ratio | — |
| AcquiredFundFeesAndExpensesOverAssets | Acquired fund fees (fund-of-funds) | Ratio | — |
| ExpensesOverAssets | Total Annual Fund Operating Expenses | Ratio | OFST452100 TER Excluding Performance Fee |
| FeeWaiverOrReimbursementOverAssets | Fee waiver/reimbursement | Ratio | — |
| NetExpensesOverAssets | Net expenses after waivers | Ratio | OFST452200 Ongoing Charges |
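
The operating-expense mapping above can be applied mechanically with a lookup table. This is a minimal sketch, not a complete converter: the element-to-OF-ID pairs are copied from the table, while the function name `to_openfunds`, the sample values, and the decision to pass ratios through unchanged (no percent/decimal conversion) are assumptions made for illustration.

```python
# Element -> OF-ID pairs taken from the operating-expense table above.
XBRL_TO_OPENFUNDS = {
    "ManagementFeesOverAssets": "OFST452010",
    "DistributionAndService12b1FeesOverAssets": "OFST454165",
    "ExpensesOverAssets": "OFST452100",
    "NetExpensesOverAssets": "OFST452200",
}


def to_openfunds(xbrl_facts):
    """Split XBRL facts into openfunds-keyed values and unmapped element names."""
    mapped, unmapped = {}, []
    for element, value in xbrl_facts.items():
        of_id = XBRL_TO_OPENFUNDS.get(element)
        if of_id is None:
            # No openfunds equivalent in the table; keep the name for review.
            unmapped.append(element)
        else:
            mapped[of_id] = value
    return mapped, unmapped


# Hypothetical facts for one share class, expressed as decimal ratios.
facts = {
    "ManagementFeesOverAssets": 0.0050,
    "DistributionAndService12b1FeesOverAssets": 0.0025,
    "OtherExpensesOverAssets": 0.0010,
    "ExpensesOverAssets": 0.0085,
    "FeeWaiverOrReimbursementOverAssets": -0.0005,
    "NetExpensesOverAssets": 0.0080,
}
mapped, unmapped = to_openfunds(facts)
```

Returning the unmapped element names alongside the mapped values makes the coverage gaps (rows marked "—" in the table) visible instead of silently dropping them.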
D. Expense Example (hypothetical cost projections)
| XBRL Element | Description | Data Type | openfunds Equivalent |
| --- | --- | --- | --- |
| ExpenseExampleYear01 | Cost for $10K after 1 year | Monetary | — |
| ExpenseExampleYear03 | Cost for $10K after 3 years | Monetary | — |
| ExpenseExampleYear05 | Cost for $10K after 5 years | Monetary | — |
| ExpenseExampleYear10 | Cost for $10K after 10 years | Monetary | — |
| ExpenseExampleNoRedemptionYear01 | Cost if no redemption, 1 year | Monetary | — |
| ExpenseExampleNoRedemptionYear03 | Cost if no redemption, 3 years | Monetary | — |
| ExpenseExampleNoRedemptionYear05 | Cost if no redemption, 5 years | Monetary | — |
| ExpenseExampleNoRedemptionYear10 | Cost if no redemption, 10 years | Monetary | — |
E. Performance Data
| XBRL Element | Description | Data Type | openfunds Equivalent |
| --- | --- | --- | --- |
| AnnualReturn[YYYY] | Annual return for calendar year | Ratio | OFDY025000-range (Performance data) |
| BarChartHighestQuarterlyReturn | Best quarter return | Ratio | — |
| BarChartLowestQuarterlyReturn | Worst quarter return | Ratio | — |
| BarChartHighestQuarterlyReturnDate | Date of best quarter | Date | — |
| BarChartLowestQuarterlyReturnDate | Date of worst quarter | Date | — |
| AverageAnnualReturnYear01 | Average annual return, 1 year | Ratio | OFDY025000-range |
| AverageAnnualReturnYear05 | Average annual return, 5 years | Ratio | OFDY025000-range |
| AverageAnnualReturnYear10 | Average annual return, 10 years | Ratio | OFDY025000-range |
| AverageAnnualReturnSinceInception | Return since inception | Ratio | — |
| AverageAnnualReturnInceptionDate | Inception date | Date | OFST020560 Share Class Launch Date |
| Performance dimensions: Before Taxes, After Taxes on Distributions, After Taxes on Distributions and Sales | | | — |
F. Risk Disclosures
| XBRL Element | Description | Data Type | openfunds Equivalent |
| --- | --- | --- | --- |
| RiskHeading | Risk section heading | Text | — |
| RiskNarrativeTextBlock | Principal risks narrative | Text Block | — |
| RiskLoseMoney | "You may lose money" statement | String | — |
| RiskMoneyMarketFundMayImposeFeesOrSuspendSales | MMF gate/fee risk | Boolean | — |
| RiskMoneyMarketFundPriceFluctuates | MMF NAV fluctuation risk | Boolean | — |
| BarChartAndPerformanceTableHeading | Performance section heading | Text | — |
| PerformanceNarrativeTextBlock | Performance context narrative | Text Block | — |
G. Portfolio Turnover
| XBRL Element | Description | Data Type | openfunds Equivalent |
| --- | --- | --- | --- |
| PortfolioTurnoverHeading | Section heading | Text | — |
| PortfolioTurnoverTextBlock | Turnover narrative | Text Block | — |
| PortfolioTurnoverRate | Turnover rate (%) | Ratio | OFRE000025-range (Fund Ratios) |
2.3 N-PORT Data Sets (Portfolio Holdings — quarterly)
This provides dynamic portfolio data not typically in openfunds static fields.
A. Fund-Level Information (FUND_REPORTED_INFO)
| Field | Description | openfunds Equivalent |
| --- | --- | --- |
| SERIES_NAME | Fund name | OFST010110 Legal Fund Name Only |
| SERIES_ID | SEC series identifier | — |
| SERIES_LEI | LEI of the fund series | OFST010030 LEI Of Fund |
| TOTAL_ASSETS | Total assets (USD) | OFDY000010-range (AuM/TNA) |
| TOTAL_LIABILITIES | Total liabilities | — |
| NET_ASSETS | Net assets (TNA) | OFDY000010-range |
| SALES_FLOW_MON1/2/3 | Monthly inflows | — |
| REDEMPTION_FLOW_MON1/2/3 | Monthly outflows | — |
| Credit spread sensitivities (3m,1y,5y,10y,30y) | Risk measures | — |
B. Interest Rate Risk (INTEREST_RATE_RISK)
| Field | Description | openfunds Equivalent |
| --- | --- | --- |
| CURRENCY_CODE | Currency of exposure | OFST010410 Fund Currency |
| INTRST_RATE_CHANGE_*_DV01 | DV01 by maturity bucket | — |
| INTRST_RATE_CHANGE_*_DV100 | Impact of 100bp shift | — |
C. Monthly Returns (MONTHLY_TOTAL_RETURN)
| Field | Description | openfunds Equivalent |
| --- | --- | --- |
| CLASS_ID | Share class identifier | — |
| MONTHLY_TOTAL_RETURN1/2/3 | Monthly returns per class | OFDY025000-range |
D. Portfolio Holdings (FUND_REPORTED_HOLDING)
| Field | Description | openfunds Equivalent |
| --- | --- | --- |
| ISSUER_NAME | Holding issuer name | OFPH-range (Portfolio Holdings) |
| ISSUER_LEI | LEI of issuer | OFPH-range |
| ISSUER_TITLE | Security title/description | OFPH-range |
| ISSUER_CUSIP | CUSIP of holding | OFPH-range |
| BALANCE | Position size | OFPH-range |
| UNIT | Shares/principal/other | OFPH-range |
| CURRENCY_CODE | Currency of holding | OFPH-range |
| CURRENCY_VALUE | Value in reporting currency | OFPH-range |
| EXCHANGE_RATE | FX rate applied | — |
| PERCENTAGE | % of net assets | OFPH-range |
| PAYOFF_PROFILE | Long/Short/N/A | OFPH-range |
| ASSET_CAT | Asset type classification | OFST350000 MiFID Securities Classification (concept) |
| ISSUER_TYPE | Corporate/Government/etc. | — |
| INVESTMENT_COUNTRY | Country of issuer (ISO) | OFPH-range |
| IS_RESTRICTED_SECURITY | Restricted security flag | — |
| FAIR_VALUE_LEVEL | Fair value hierarchy (1/2/3) | — |
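
Because each holding row carries both ASSET_CAT and PERCENTAGE, a fund-level asset-class breakdown can be derived by summing the holding weights per category. The sketch below assumes rows have already been loaded as dictionaries keyed by the column names above; the sample rows, the category codes in them, and the function name `weight_by_asset_cat` are hypothetical placeholders.

```python
from collections import defaultdict

# Hypothetical holdings of one fund; codes and weights are placeholders.
holdings = [
    {"ISSUER_NAME": "Example Corp", "ASSET_CAT": "EC", "PERCENTAGE": 40.0},
    {"ISSUER_NAME": "Example Treasury", "ASSET_CAT": "DBT", "PERCENTAGE": 35.0},
    {"ISSUER_NAME": "Another Corp", "ASSET_CAT": "EC", "PERCENTAGE": 25.0},
]


def weight_by_asset_cat(rows):
    """Sum the % of net assets per ASSET_CAT value."""
    totals = defaultdict(float)
    for row in rows:
        # PERCENTAGE may arrive as text when read from a flat file.
        totals[row["ASSET_CAT"]] += float(row["PERCENTAGE"])
    return dict(totals)


breakdown = weight_by_asset_cat(holdings)
```

The totals need not sum to 100, since leverage, short positions, and cash can push the reported percentages above or below net assets.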
E. Holding Identifiers (IDENTIFIERS)
| Field | Description | openfunds Equivalent |
| --- | --- | --- |
| IDENTIFIER_ISIN | ISIN | OFST020000 ISIN |
| IDENTIFIER_TICKER | Ticker | — |
| OTHER_IDENTIFIER | SEDOL, etc. | OFST020040 SEDOL |
3. Mapping: What openfunds Fields CAN Be Found in SEC Data?
Fully Available (structured, machine-readable)
| openfunds Category | openfunds OF-ID | openfunds Field | SEC Source | SEC Field |
| --- | --- | --- | --- | --- |
| Key Fact: Company | OFST001000 | Fund Group Name | Series/Class CSV | Entity Name |
| Key Fact: Umbrella | OFST005010 | Umbrella Name | Series/Class CSV | Entity Name |
| Key Fact: Fund | OFST010030 | LEI Of Fund | N-PORT | SERIES_LEI |
| | OFST010110 | Legal Fund Name Only | Series/Class CSV + N-PORT | Series Name |
| | OFST010300 | Investment Objective | XBRL R/R | ObjectivePrimaryTextBlock |
| | OFST010410 | Fund Currency | N-PORT | CURRENCY_CODE (inferred) |
| Key Fact: Share Class | OFST020000 | ISIN | N-PORT Holdings | IDENTIFIER_ISIN |
| | OFST020005 | CUSIP | N-PORT Holdings | ISSUER_CUSIP |
| | OFST020040 | SEDOL | N-PORT Holdings | OTHER_IDENTIFIER |
| | OFST020060 | Full Share Class Name | Series/Class CSV | Class Name |
| Classification | OFST350000 | Securities Classification | N-PORT | ASSET_CAT |
| Fees | OFST451320 | Max Subscription Fee (Distributor) | XBRL R/R | MaximumSalesChargeImposedOnPurchasesOverOfferingPrice |
| | OFST451391 | CDSC Exit Fee | XBRL R/R | MaximumDeferredSalesChargeOverOfferingPrice |
| | OFST451440 | Max Redemption Fee | XBRL R/R | RedemptionFeeOverRedemption |
| | OFST452010 | Management Fee Maximum | XBRL R/R | ManagementFeesOverAssets |
| | OFST452100 | TER Excl. Performance Fee | XBRL R/R | ExpensesOverAssets |
| | OFST452200 | Ongoing Charges | XBRL R/R | NetExpensesOverAssets |
| | OFST454165 | Distribution Fee Maximum | XBRL R/R | DistributionAndService12b1FeesOverAssets |
| Performance | OFDY025xxx | Return periods | XBRL R/R | AverageAnnualReturnYear01/05/10 |
| Dynamic: AuM | OFDY000xxx | TNA / AuM | N-PORT | NET_ASSETS, TOTAL_ASSETS |
Note: the N-PORT identifier fields (IDENTIFIER_ISIN, ISSUER_CUSIP, OTHER_IDENTIFIER) describe the securities a fund holds, not the reporting fund's own share classes. They supply share-class identifiers only for funds whose shares appear as a holding in another fund's N-PORT filing.
Partially Available (derivable from prospectus text, not structured)
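These fields are not exposed as dedicated values in the SEC structured datasets. Most appear only in the prospectus text and would need to be extracted by an LLM, which is exactly the use case; a few can be inferred from the form type, filing metadata, or other structured fields: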
| openfunds OF-ID | openfunds Field | Where in Prospectus |
| --- | --- | --- |
| OFST010420 | Open-ended Or Closed-ended Fund Structure | Registration form type implies this (N-1A = open-end) |
| OFST010440 | Fiscal Year End | Mentioned in prospectus text, in Submissions JSON |
| OFST010500 | Is Fund Of Funds | Inferred from AcquiredFundFeesAndExpensesOverAssets > 0 |
| OFST010580 | Is ETF | Inferred from form type or share class structure |
| OFST010720 | Is Passive Fund | Strategy narrative mentions "index" tracking |
| OFST010730 | Management Approach Type | Strategy narrative (active/passive/enhanced) |
| OFST020300 | Valuation Frequency | Prospectus "Pricing of Fund Shares" section |
| OFST020400 | Distribution Policy | Prospectus "Dividends and Distributions" section |
| OFST020540 | Share Class Currency | Inferred; US funds typically USD |
| OFST020558 | Subscription Period Start Date | Only for closed-end or interval funds |
| OFST400xxx | Minimum Investment | Prospectus "Purchase and Sale of Fund Shares" |
| OFST451027 | Has Performance Fee | Prospectus fee table |
| OFST451100 | Hurdle Rate | Prospectus fee table |
| OFST013000 | Prospectus Date | Filing date in submissions API |
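
Three of the inference rules in the table above are simple enough to express directly. The following is a rough sketch under stated assumptions: the function name `infer_flags`, the output value `"open-ended"`, and the plain keyword test for "index" are illustrative placeholders, and the keyword test in particular is a crude heuristic rather than a reliable classifier.

```python
def infer_flags(form_type, xbrl_facts, strategy_text):
    """Apply three of the table's inference rules; None means 'cannot tell'."""
    acquired = xbrl_facts.get("AcquiredFundFeesAndExpensesOverAssets")
    return {
        # N-1A is the registration form for open-end funds.
        "OFST010420": "open-ended" if form_type == "N-1A" else None,
        # A positive acquired-fund fee suggests a fund-of-funds structure.
        "OFST010500": (acquired > 0) if acquired is not None else None,
        # Crude heuristic: the strategy narrative mentions an index.
        "OFST010720": "index" in strategy_text.lower(),
    }


# Hypothetical inputs for a single fund.
flags = infer_flags(
    "N-1A",
    {"AcquiredFundFeesAndExpensesOverAssets": 0.0012},
    "The Fund seeks to track the performance of a broad market Index.",
)
```

Returning `None` instead of `False` when the evidence is missing keeps "not a fund of funds" distinct from "no data", which matters if these flags are later used as training labels.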
NOT Available in SEC Data
These openfunds fields are European/international-specific or distribution-channel-specific and have no SEC equivalent:
| openfunds Category | Examples |
| --- | --- |
| UCITS/AIFMD fields | OFST160100 Legal Form (SICAV/FCP), OFST011200 Is UCITS With Leveraged Benchmark |
| European regulatory | OFST350100 EFAMA EFC Category, OFST010075 CSSF Code |
| Distribution-specific | OFST453151 Is Trailer Fee Clean, OFST451305 Applied Subscription Fee |
| MiFID/PRIIPs/KID | OFEM-range (MiFID Template), OFEP-range (PRIIPs Template) |
| ESG/Sustainability | OFST820xxx, OFEE-range (EU sustainability regulation specific) |
| Country registrations | OFST6000XX (country-specific registration fields) |
| Solvency II | OFST500xxx |
| Swiss/German/UK specific | OFST700xxx |
4. Summary: Is Asset Class, Currencies, Fees, Risk Data in the SEC Dataset?
| Data Category | In SEC Structured Data? | Source | Notes |
| --- | --- | --- | --- |
| Asset Class | YES | N-PORT ASSET_CAT field | Values: equity, debt, derivative, etc. |
| Currencies | YES | N-PORT CURRENCY_CODE per holding; interest rate risk by currency | Per-holding currency + fund-level |
| Fees (sales loads) | YES | XBRL R/R | Front-end load, back-end load, redemption fee |
| Fees (operating expenses) | YES | XBRL R/R | Management fee, 12b-1, TER, net expense ratio |
| Risk data (narrative) | YES | XBRL R/R | Principal risks text block |
| Risk data (quantitative) | YES | N-PORT | DV01, credit spread sensitivity, VaR |
| Performance | YES | XBRL R/R + N-PORT | Annual returns, avg annual returns, monthly returns |
| Investment Objective | YES | XBRL R/R | Full text of objective |
| Strategy | YES | XBRL R/R | Full text of principal strategies |
| Portfolio Turnover | YES | XBRL R/R | Turnover rate |
| Portfolio Holdings | YES | N-PORT | Security-level: name, CUSIP, ISIN, country, asset type, value |
| Country of Issuer | YES | N-PORT INVESTMENT_COUNTRY | ISO country code per holding |
| Minimum Investment | PARTIAL | In prospectus text, not structured | LLM extraction target |
| Distribution Policy | PARTIAL | In prospectus text, not structured | LLM extraction target |
| ESG/Sustainability | NO | Not in SEC structured data | European regulation specific |
| UCITS Classification | NO | N/A for US funds | European regulation specific |
5. Implication for LLM Training Dataset
The SEC's structured data sets, together with the prospectus filings they are derived from, provide a solid foundation for an LLM training dataset:
Ground Truth (structured) — available directly from SEC:
- Fee tables (management fee, expense ratio, loads, 12b-1)
- Performance data (1yr, 5yr, 10yr returns)
- Investment objective text
- Principal risks text
- Portfolio turnover rate
- Total net assets
- Fund/class identifiers (CIK, Series ID, Class ID, Ticker, CUSIP)
Extraction Targets (unstructured), to be extracted from the prospectus text:
- Minimum initial investment amounts
- Distribution frequency and policy
- Share class currency
- Open/closed-end structure
- Active vs. passive management
- Benchmark index name
- Tax status information
- Purchase/redemption cut-off times
- Settlement cycle
This creates a natural supervised learning setup: the XBRL structured data serves as labels/ground truth, and the prospectus HTML/text serves as input, enabling the LLM to learn the mapping from legal language to structured reference data.
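
The pairing described above can be sketched as one record per fund: the prospectus text as input and a chosen subset of XBRL facts as labels. This is an illustrative sketch only; the record layout, the function name `make_training_record`, the sample text, and the sample values are assumptions, not a prescribed dataset format.

```python
import json


def make_training_record(prospectus_text, xbrl_facts, label_fields):
    """Pair prospectus text (input) with selected XBRL facts (labels)."""
    # Keep only the requested labels that are actually present in the filing.
    labels = {name: xbrl_facts[name] for name in label_fields if name in xbrl_facts}
    return {"input": prospectus_text, "labels": labels}


# Hypothetical example for a single fund.
record = make_training_record(
    "The Fund's management fee is 0.50% of average daily net assets...",
    {"ManagementFeesOverAssets": 0.0050, "PortfolioTurnoverRate": 0.32},
    ["ManagementFeesOverAssets", "NetExpensesOverAssets"],
)

# One JSON object per line (JSONL) is a common layout for training files.
jsonl_line = json.dumps(record, sort_keys=True)
```

Labels that are absent from a filing are omitted rather than set to a default, so that a missing XBRL tag is not mistaken for a true value of zero.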