Best 13 Free Financial Datasets for Machine Learning [Updated]
Alexandra Quinn | February 17, 2024
Financial services companies are leveraging data and machine learning to mitigate risks like fraud and cyber threats and to provide a modern customer experience. Doing so helps them comply with regulations, optimize their trading and answer their customers’ needs. In today’s competitive digital world, these changes are essential for staying relevant and efficient.
How can financial services companies build, expand and optimize their use of data and ML? Open and free financial and economic datasets are an essential starting point for data scientists and engineers who are developing and training ML models for finance. But sadly, they can be hard to come by. Here are 13 excellent open financial and economic datasets and data sources for machine learning.
1. Data.gov
A US government website hosted by the General Services Administration Technology Transformation Service, data.gov provides a catalog of government data in open, machine-readable formats. To find finance-related datasets, search for relevant keywords, e.g., “credit card”, to get a list of the available datasets.
Get the datasets here
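The keyword search described above can also be scripted. The following is a minimal sketch, assuming the data.gov catalog exposes the standard CKAN package_search action at catalog.data.gov; the helper names (build_search_url, dataset_titles, search_datasets) are made up for illustration and are not part of any official client.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the catalog serves the standard CKAN "package_search" action here.
CKAN_SEARCH_URL = "https://catalog.data.gov/api/3/action/package_search"


def build_search_url(keyword, rows=5):
    """Return a package_search URL for a free-text keyword."""
    return CKAN_SEARCH_URL + "?" + urlencode({"q": keyword, "rows": rows})


def dataset_titles(payload):
    """Extract dataset titles from a decoded package_search response."""
    return [item["title"] for item in payload["result"]["results"]]


def search_datasets(keyword, rows=5):
    """Query the catalog over the network and return matching dataset titles."""
    with urlopen(build_search_url(keyword, rows), timeout=30) as response:
        return dataset_titles(json.load(response))
```

Calling search_datasets("credit card") should return the titles of the first few matching datasets, provided the endpoint is reachable and behaves as assumed.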
2. Data.gov.in
Nation-wide data sets from across India, intended to make Indian government-owned shareable data accessible in human and machine readable formats. There are 2,394 Resources in 484 catalogs related to finance, covering topics like consumer price indexes, GDP estimates, prices and more.
Get the datasets here
3. data.europa.eu
data.europa.eu is the official portal for European data. It boasts nearly 1.5 million datasets, 67,758 of them related to finance and the economy. These include information about budgets, monthly wages, occupations and a lot more.
Get the datasets here
4. Global Financial Data (GFD)
An extensive database of current and historical financial data, providing updated information alongside data from hundreds of years ago. The database covers topics like market indicators, exchange rates, commodities, incomes and more. Datasets are free but require logging in to the site.
Get the datasets here
5. International Monetary Fund (IMF)
The International Monetary Fund website provides access to macroeconomic and financial data from around the world. Datasets cover a wide variety of topics, including the external sector, the fiscal sector, the financial sector, the real sector, gender and international outlooks.
Get the datasets here
6. World Bank Open Data
The World Bank provides access to open global development data across 5,437 datasets. “Open Finances” includes data about loans, financial reporting, procurement, projects and more. The data is intended to be easy to download, filter and slice and dice, so it can be easily consumed.
Get the datasets here
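Beyond downloading files, World Bank data can be pulled programmatically. The sketch below assumes the World Bank Indicators API (version 2) and its JSON response shape of a two-element list, metadata first and rows second; the function names are illustrative, and the parser assumes an annual indicator whose dates are plain years.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the World Bank Indicators API v2 base URL.
BASE_URL = "https://api.worldbank.org/v2"


def build_indicator_url(country, indicator, per_page=100):
    """Return the URL for one indicator in one country, requested as JSON."""
    query = urlencode({"format": "json", "per_page": per_page})
    return f"{BASE_URL}/country/{country}/indicator/{indicator}?{query}"


def parse_series(payload):
    """Turn a [metadata, rows] response into {year: value}, skipping gaps."""
    if len(payload) < 2 or payload[1] is None:
        return {}
    return {
        int(row["date"]): row["value"]
        for row in payload[1]
        if row["value"] is not None
    }


def fetch_series(country, indicator):
    """Download one annual indicator series over the network."""
    with urlopen(build_indicator_url(country, indicator), timeout=30) as response:
        return parse_series(json.load(response))
```

For example, fetch_series("IN", "NY.GDP.MKTP.CD") would request India's GDP in current US dollars, assuming the API is reachable and the response matches the shape above.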
7. Nasdaq Data Link
Nasdaq’s source for financial, economic and alternative datasets. It is a comprehensive repository covering equities, currencies, interest rates, options, indexes, mutual funds, real estate and a whole lot more! Note that some of the datasets are free and some require a paid license. Nasdaq Data Link is considered to be very reliable: Nasdaq promises to share only datasets that have passed its curation and quality process and have gone through its own data engineering system.
Get the datasets here
8. Kapsarc.org
KAPSARC provides 1,371 (and counting) datasets related to energy. The datasets are organized according to themes, like energy supply types, energy use types, economy, trade and others. Data scientists and engineers can also filter by country, including Saudi Arabia, Bahrain, UAE, China, the US and others. Finally, users can easily see which datasets have been recently updated and which are the most popular.
Get the datasets here
9. Eurostat Easy Comext
Datasets on international trade and manufactured goods production going all the way back to 1988. The portal itself requires some getting used to but holds valuable data from across the EU.
Get the datasets here
10. The World Bank
A comprehensive database of datasets with financial system characteristics for 214 economies. Data is organized annually and was last updated in November 2021. Datasets include information on both financial institutions and financial markets and measure depth, access, efficiency and stability of financial systems.
Get the datasets here
11. American Economic Association
Macroeconomic data from across the US covering aspects like employment, economic output, budget, economic trade and more. The AEA is not a database of its own, but rather hosts links to resources with the data, like the National Bureau of Economic Research or the Bureau of Labor Statistics.
Get the datasets here
12. OECD Public Finance Dataset
A breakdown of public expenditure and revenues in the OECD to support analysis of growth and income inequality. This dataset was built by combining a few sources that provide detailed data.
Get the datasets here
13. Kaggle
Kaggle, an online community of data scientists and ML practitioners (and a Google subsidiary), is also a platform for publishing and finding datasets. There are at least 5K finance-related datasets on Kaggle, covering a wide variety of topics. From finance complaints to loans to stocks to presidential finances - Kaggle has it all. There are some caveats though. First, the datasets’ quality is unclear, though users can sort by usability, votes and hotness. Second, search is based on free text, requiring users to either know exactly what they are looking for or to explore different options. (It’s a Google subsidiary after all).
Get the datasets here
[Updated 23 February 2024] More Free Financial Datasets
Financial Statement Data Sets
Datasets containing information extracted from EX-101 attachments, which were submitted to the US Securities and Exchange Commission between 2009 and 2023. EX-101 attachments contain financial information about a company’s performance.
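As a rough sketch of working with these files, the example below assumes each quarterly download contains tab-delimited sub.txt (one row per submission) and num.txt (one row per reported value) that share an adsh accession key, and that the column names used here (name, tag, ddate, value) match the files. The helper names are hypothetical, and the column names should be checked against the documentation that ships with the data.

```python
import csv
from io import StringIO


def read_tsv(handle):
    """Read a tab-delimited file object into a list of dict rows."""
    return list(csv.DictReader(handle, delimiter="\t"))


def values_for_tag(sub_rows, num_rows, tag):
    """Join values to company names via 'adsh' and keep one XBRL tag.

    Returns (company name, period end date, value) tuples.
    Column names are assumptions about the file layout.
    """
    names = {row["adsh"]: row["name"] for row in sub_rows}
    return [
        (names[row["adsh"]], row["ddate"], float(row["value"]))
        for row in num_rows
        if row["tag"] == tag and row["value"] and row["adsh"] in names
    ]


# Tiny in-memory stand-ins for sub.txt and num.txt (made-up sample values).
sub_rows = read_tsv(StringIO("adsh\tname\n0001\tEXAMPLE CORP\n"))
num_rows = read_tsv(StringIO(
    "adsh\ttag\tddate\tvalue\n"
    "0001\tRevenues\t20221231\t1500.0\n"
    "0001\tAssets\t20221231\t9000.0\n"
))
revenues = values_for_tag(sub_rows, num_rows, "Revenues")
```

With real files, the StringIO objects would be replaced by open("sub.txt", encoding="utf-8") and open("num.txt", encoding="utf-8").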
US Economic Census
US economy statistics at the national, state and local levels. Topics include businesses, health, poverty, international trade, research and more.
NationMaster
Global statistics across 300 sectors, including financial services, manufacturing and construction. Explore data like construction output in Germany, material productivity in Switzerland, insurance premiums in Honduras, and much more.
City-Data.com
Data profiles for every city in the United States, including information on income, unemployment, living costs, house value and more.
UK Data Service
Datasets covering the UK’s economy, population and social research. Some of the recent datasets include:
- Quarterly Labour Force Survey, January-March 2023: Teaching Dataset
- International Social Survey Programme, 2020: Attitudes to the Environment
- Cost of Living Crisis: Impact on Schools, 2023
Alpha Vantage
APIs providing real-time and historical stock market data. They also include forex, commodity and crypto data feeds.
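To illustrate how such an API is typically consumed, here is a minimal sketch that assumes Alpha Vantage's TIME_SERIES_DAILY query function and its JSON layout, with a "Time Series (Daily)" object keyed by date and a "4. close" field per day. The function names are illustrative, and a personal API key is required for real requests.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: Alpha Vantage's single query endpoint.
QUERY_URL = "https://www.alphavantage.co/query"


def build_daily_url(symbol, api_key):
    """Return the request URL for a symbol's daily price series."""
    params = {"function": "TIME_SERIES_DAILY", "symbol": symbol, "apikey": api_key}
    return QUERY_URL + "?" + urlencode(params)


def closing_prices(payload):
    """Map each date to its closing price, ordered oldest to newest."""
    series = payload.get("Time Series (Daily)", {})
    return {day: float(bar["4. close"]) for day, bar in sorted(series.items())}


def fetch_closing_prices(symbol, api_key):
    """Download and parse daily closing prices over the network."""
    with urlopen(build_daily_url(symbol, api_key), timeout=30) as response:
        return closing_prices(json.load(response))
```

An empty dictionary from closing_prices usually means the response carried an error or rate-limit message instead of price data, so callers may want to inspect the raw payload in that case.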
The Future of Financial Data for Machine Learning
Financial services institutions are in the midst of a digital transformation revolution, enabling them to leverage data with ML to help personalize customer services, mitigate fraud risk and improve their operational efficiency. ML can help banks, insurance companies, fintech organizations (and more) transform their bottom line and differentiate themselves for customers while ensuring compliance.
Payoneer, for example, uses ML to detect fraud within complex networks so they can prevent it before it materializes. To learn more about ML and financial services, click here.