Data Sources and Data Sets
This document lists all data sources used in the codebase, the specific databases/tables accessed, and the subscriptions required to run the code successfully.
Data Source Summary Table
| Data Source | Database/Table(s) | Subscription/Access Required |
|---|---|---|
| He, Kelly, Manela | (public URLs) | None (public) |
| Ken French Library | (public URLs) | None (public) |
| NYU Call Report | (public URLs) | None (public) |
| Fed Yield Curve | (public URLs) | None (public) |
| Open Source Bond Asset Pricing | (public URLs) | None (public) |
| WRDS Compustat | comp.funda | WRDS + Compustat North America |
| WRDS Bond Returns | wrdsapps_bondret.bondret | WRDS + Bond Returns |
| WRDS CRSP | crsp.stksecurityinfohdr, crsp.msf_v2, crsp.ccmxpf_linktable | WRDS + CRSP (Stock, Link) |
| WRDS CRSP Treasury | crspm.tfz_dly, crspm.tfz_iss | WRDS + CRSP US Treasury |
| WRDS Markit CDS | markit.CDS{year}, markit_red.redobllookup, markit.redent | WRDS + Markit CDS |
| WRDS Mergent FISD | (via mapping to Markit RED codes) | WRDS + Mergent FISD |
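The `{year}` placeholder in `markit.CDS{year}` suggests one table per calendar year. A minimal sketch of expanding that pattern into concrete table names is shown here; the function name `markit_cds_tables` and the year range are illustrative assumptions, not taken from the codebase.

```python
def markit_cds_tables(start_year: int, end_year: int) -> list[str]:
    """Return one Markit CDS table name per year, inclusive of both ends."""
    if start_year > end_year:
        raise ValueError("start_year must not exceed end_year")
    return [f"markit.CDS{year}" for year in range(start_year, end_year + 1)]


# Hypothetical year range, chosen only for illustration.
tables = markit_cds_tables(2019, 2021)
print(tables)
```

Each generated name could then be substituted into a query that is run once per year and concatenated, which is one common way to handle yearly-partitioned tables.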
Data Modules
Data modules are subfolders of the src directory. Each contains a set of scripts that generate one or more related datasets, grouped together because they draw on the same data sources. The data modules, such as cds_bond_basis, are listed here along with the data sources each one uses.
- corp_bond_returns: Calculates returns for corporate bonds, sovereign bonds, and treasury bonds. Uses Open Source Bond Asset Pricing and WRDS Bond Returns.
- cds_bond_basis: Uses Open Source Bond Asset Pricing and WRDS Markit CDS.
- fed_yield_curve: Uses Fed Yield Curve data downloaded from the US Federal Reserve Board of Governors website.
- foreign_exchange: Uses …
- he_kelly_manela: Uses “He, Kelly, Manela Test Portfolios and Factors” data downloaded from Asaf Manela’s website.
- ken_french_data_library: Pulls data from the Ken French Data Library, using the Pandas Data Reader Python package.
- nyu_call_report:
- wrds_bank_premium:
- wrds_crsp_compustat:
- wrds_markit:
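As an illustration of the ken_french_data_library approach, the following is a minimal sketch of pulling one dataset with the pandas-datareader package. The function name `fetch_fama_french`, the default dataset name, and the start date are assumptions made for this example; the module's actual code may choose different datasets and options.

```python
def fetch_fama_french(dataset: str = "F-F_Research_Data_Factors",
                      start: str = "2000-01-01"):
    """Download one Ken French dataset.

    Returns a dict whose integer keys map to DataFrames (one per table
    in the source file) and whose 'DESCR' key holds a text description.
    """
    # Imported lazily so that defining the function needs neither the
    # package nor a network connection; calling it needs both.
    import pandas_datareader.data as web

    return web.DataReader(dataset, "famafrench", start=start)
```

Calling `fetch_fama_french()[0]` would return the first table in the file as a DataFrame; the `'DESCR'` entry describes what each table contains.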
Misc
- CRSP Returns (with and without dividends)
- Compustat Panel Data
- Fama-French Portfolio Returns
- Treasury Yield Curve
- Bank Call Report Data
- He, Kelly, Manela Test Portfolios
  We can just use the test portfolios as downloaded from their website, though it would be nice to use data up until the present day.
  - Equity CLOSE
  - [.] Treasury Securities
  - [.] Corporate Bonds
  - [.] Sovereign Bonds
  - [.] Options
  - [.] Foreign Exchange
  - [.] Commodities CLOSE
  - [.] CDS CLOSE
- He, Kelly, Manela Disaggregated Data
  - [.] Equity
  - [.] Treasury Securities
  - [.] Corporate Bonds
  - Sovereign Bonds
  - [.] Options
  - Foreign Exchange
  - Commodities
  - [.] CDS
- Segmented Arbitrage, Arbitrage Spreads
  - CIP
  - [.] Box Spread
  - [.] Equity Spot-Futures
  - [.] Treasury Spot-Futures
  - [.] Treasury Swap
  - [.] TIPS-Treasury
  - [.] CDS-Bond Basis
Notes
WRDS is not a monolithic subscription. Each dataset listed above typically requires a separate subscription through the WRDS platform.
Public data sources (for example, the Ken French Data Library and the Fed Yield Curve) do not require paid subscriptions.
Access credentials for WRDS are configured in environment variables. Please set up the `.env` file following the `.env.example` file.
Specify which subscriptions you have access to in the `subscriptions.toml` file.