I have had the recent pleasure of interacting with the US Treasury API, which can be found on the Fiscal Data website.

After finding an API that met the conditions for a sufficient analysis, my next goal was to find useful endpoints (the parts of the API that serve the actual datasets we plan to use). I ended up finding three I was interested in. The first is Treasury securities interest rates over time, the second is the US balance sheet, updated at the end of each fiscal year, and the last is the US national debt, broken down by where the debt comes from (that is, how much money is owed to each group of creditors, such as the United States population or other nations).

After determining my data sources, I start the vignette by telling users which R packages they need to install and load (if they haven’t already) so they can follow the analysis in the later examples. If these packages are not installed and loaded at the start of the R session, the data-loading functions will not work correctly, and you may not be able to reproduce the exploratory data analysis exactly as it appears in the vignette. Two of the packages are recommended rather than required: gridExtra (which arranges multiple plots in a tidy layout) and gganimate (which creates animated plots that change over a particular variable, most often time). If you would rather use different plots or arrange them another way, feel free to skip these two.
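A quick way to check the package requirement before starting is sketched below. Only gridExtra and gganimate are named in this post; the `core_pkgs` entries are assumptions for illustration, and `missing_packages()` is a hypothetical helper, not a function from the vignette.

```r
# Assumed core packages (illustrative only) and the two optional packages
# that the post actually names.
core_pkgs     <- c("httr", "jsonlite", "dplyr", "ggplot2")  # assumption
optional_pkgs <- c("gridExtra", "gganimate")                # named in the post

# Hypothetical helper: return the entries of `pkgs` that are not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_packages(c(core_pkgs, optional_pkgs))
if (length(to_install) > 0) {
  message("Not installed: ", paste(to_install, collapse = ", "),
          ". Install them with install.packages().")
}
```

Reporting the missing packages rather than installing them automatically leaves the choice, especially for the optional plotting packages, with the reader.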

In the next section of my interaction with the API, I write functions that let users pull exactly the data they want. For example, a user can filter the Treasury securities interest rates by date to look only at data since the pandemic started, which we will take to be March 11, 2020, when the WHO declared COVID-19 a pandemic, to see whether rates have recovered since then and how long it took. We could do the same with the balance sheet to see whether our liabilities increased, given the perceived notion that spending rose to combat the pandemic, or with the national debt to see whether the rate of debt growth skyrocketed during this time. I provide helper functions along the way so users know which functions to call and which arguments to pass to get the dataset they want.
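To make the date-filtering idea concrete, here is a minimal sketch of a URL builder. The base URL, the endpoint path, and the `filter` and `page[size]` query parameters reflect my understanding of the Fiscal Data API and should be checked against its documentation; `build_treasury_url()` is a hypothetical name, not one of the vignette's functions.

```r
# Base URL and endpoint path as I understand them from the Fiscal Data
# documentation (assumptions; verify before relying on them).
base_url       <- "https://api.fiscaldata.treasury.gov/services/api/fiscal_service"
rates_endpoint <- "/v2/accounting/od/avg_interest_rates"

# Hypothetical helper: build a request URL that keeps records dated on or
# after `start_date` and, optionally, on or before `end_date`.
build_treasury_url <- function(endpoint, start_date = NULL, end_date = NULL,
                               page_size = 1000) {
  filters <- c(
    if (!is.null(start_date)) paste0("record_date:gte:", format(as.Date(start_date))),
    if (!is.null(end_date))   paste0("record_date:lte:", format(as.Date(end_date)))
  )
  query <- c(
    if (length(filters) > 0) paste0("filter=", paste(filters, collapse = ",")),
    paste0("page[size]=", as.integer(page_size))
  )
  paste0(base_url, endpoint, "?", paste(query, collapse = "&"))
}

# Interest rates since the WHO declared COVID-19 a pandemic.
covid_url <- build_treasury_url(rates_endpoint, start_date = "2020-03-11")
```

Passing `covid_url` to a JSON reader such as `jsonlite::fromJSON()` should then return the filtered records, assuming the response has the shape described in the API documentation.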

Once we have the data, as data scientists we should always start with exploratory data analysis, which in industry you might hear called simply “EDA”. I also added one dataset from outside the API: inflation rates in the United States over time, which I scraped directly from World Data’s website and compared against the interest rates offered by the United States Treasury. As part of my exploratory data analysis, there were three questions I wanted to look at:

  • Which securities are the best to invest in? How do they perform against inflation?
  • Does the fiscal year’s balance sheet position, measured as assets minus liabilities (known as net assets, or as an asset deficit if negative), have an impact on interest rates?
  • Does the month have an impact on the interest rates offered? Is it better to buy securities in January than July for example?

To answer these questions, I produced a number of contingency tables, numeric summaries, and plots. Here are my major takeaways:

  • Question 1: Which securities are the best to invest in? How do they perform against inflation?
    • Answer: Among marketable securities, Treasury Bonds and Federal Financing Bank securities are good options. If we are fortunate enough to be able to invest in non-marketable securities, the best ones against inflation are Domestic Series and Foreign Series securities. For most regular investors interested in investing in the United States Treasury, I would advise taking a look at Treasury Bonds and Federal Financing Bank securities. We also noticed that as the inflation rate gets lower, the net rate gain tends to look better, since most security interest rates do not fluctuate the way inflation rates do. Even though we did not look at stocks as another form of investing, when inflation is higher it may be worth looking at those instead, as US Treasury securities tend to be more favorable when inflation rates are lower.
  • Question 2: Does the fiscal year’s balance sheet position, measured as assets minus liabilities (known as net assets, or as an asset deficit if negative), have an impact on interest rates?
    • Answer: This is hard to tell. The plots show the deficit increasing at a faster rate while our net rate tends to decrease. Is this entirely because of the deficit? Probably not, but there is a moderate correlation, which, before any more in-depth analysis, suggests that some relationship is present (and a negative one, meaning that as the deficit grows, the net rate on our investment falls).
  • Question 3: Does the month have an impact on the interest rates offered?
    • Answer: The month did not seem to have an effect on the interest rates offered. This suggests that any time is a good time to invest if you feel you are being offered a good interest rate (or rate of return) on your investment.
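The net rate and month comparisons behind these takeaways can be sketched in a few lines of base R. The numbers below are made up purely for illustration (they are not Treasury or inflation data), and the column names are assumptions modeled on the API's field names rather than the vignette's actual code.

```r
# Illustrative values only; not real Treasury or inflation figures.
rates <- data.frame(
  record_date   = as.Date(c("2019-01-31", "2019-07-31", "2020-01-31", "2020-07-31")),
  security_desc = c("Treasury Bonds", "Treasury Bonds", "Treasury Bills", "Treasury Bills"),
  avg_interest_rate_amt = c(3.0, 3.1, 1.5, 0.5)
)
inflation <- data.frame(year = c(2019, 2020), inflation_rate = c(1.8, 1.2))

# Join on calendar year, then subtract inflation to get the net rate gain.
rates$year <- as.numeric(format(rates$record_date, "%Y"))
combined <- merge(rates, inflation, by = "year")
combined$net_rate <- combined$avg_interest_rate_amt - combined$inflation_rate

# Numeric summary: mean net rate for each security type.
net_by_security <- aggregate(net_rate ~ security_desc, data = combined, FUN = mean)

# Contingency table: month against whether the rate beat inflation.
combined$month <- format(combined$record_date, "%m")
month_table <- table(month = combined$month,
                     beats_inflation = combined$net_rate > 0)
```

A positive `net_rate` means the security's interest rate outpaced inflation for that year; the contingency table is one simple way to check whether that outcome depends on the month.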

On top of answering some questions I found interesting about the United States Federal Treasury, I learned some programming skills. The first was a better understanding of how to access data from an API directly. For most of my professional and academic career, I have been given csv files (or some other type of delimited file) and performed the analysis straight from those. I also used web scraping tools to access a comparison dataset, which is something else I have limited experience with. On top of that, I learned how to write functions that return the exact data we want rather than having to call group_by(), summarize(), filter(), or mutate() each time, which I found very neat and which makes the workflow much more user friendly. Lastly, from a graphing standpoint, I learned how to create separate legends for different facet plots using gridExtra and how to create animated plots that show changes over time in one tidy plot using gganimate. These are all skills that I believe will enhance my development as a data scientist.

In a project as intensive as this, there are always things I wish I could have done differently. Finding an API was difficult, as many required some sort of payment to access their information. On top of that, the “free” (I should say “freemium”) APIs only let you access a limited amount of data before making you pay for the full set. I feel there is not much I could have done to avoid this problem. The other major problem was having to go back and update my self-created functions whenever I realized there was a case I had not covered. This was a bit annoying because the formatting of the API's data was tough to figure out, so I had to test my functions constantly to make sure I had done everything I could to make them easier for users. I wish I could have handled this (for example, making sure the data was not all character when it was obviously numeric) in a much more efficient manner. This issue took the vast majority of my time on this project. Now that I know how to write these functions, it should not take nearly as long in the future, since this time I had to look up a lot of details (like using the suppressWarnings() function to help convert columns to the correct type, something I did not know before).
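The type-conversion problem described above can be sketched as follows. This is a minimal illustration, not the vignette's implementation: `convert_numeric_columns()` is a hypothetical helper, and treating the strings "null" and "" as missing values is an assumption about how the API reports gaps.

```r
# Hypothetical helper: convert a character column to numeric when every
# non-missing value parses as a number; leave all other columns untouched.
convert_numeric_columns <- function(df) {
  for (col in names(df)) {
    if (!is.character(df[[col]])) next
    values <- df[[col]]
    values[values %in% c("null", "")] <- NA   # assumed missing-value markers
    # as.numeric() warns on unparseable text; the NA check below handles
    # that case, so the warning itself is just noise.
    parsed <- suppressWarnings(as.numeric(values))
    if (!all(is.na(values)) && all(is.na(parsed) == is.na(values))) {
      df[[col]] <- parsed
    }
  }
  df
}

raw <- data.frame(
  record_date           = c("2020-03-31", "2020-04-30"),
  avg_interest_rate_amt = c("1.234", "null"),
  stringsAsFactors = FALSE
)
clean <- convert_numeric_columns(raw)
```

Here `record_date` stays character because its values do not parse as numbers, while `avg_interest_rate_amt` becomes numeric with the "null" entry turned into `NA`.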

If you are interested in the data itself or in the process behind this analysis, feel free to check out my vignette interacting with the US Treasury API. You can also find all the code and files needed for this piece in the corresponding GitHub repository. As stated at the end of my vignette, if you want to share any feedback or comments, feel free to connect with me on LinkedIn or contact me via email.

