Chapter 8 - Revealing change
Using example pictures (drawn or found), explain the following terms used in analyzing time series data. Please watch the video lecture in this module before working on this task.
Seasonality is a common characteristic of time series. It can appear in two forms: additive and multiplicative. Professor Nikolaos Kourentzes developed a webpage where you can practice identifying and differentiating between the two forms. Please practice until you reach more than 92% accuracy on at least 20 examples, then submit a screenshot showing your accuracy and sample count.
In Google Sheets (or Microsoft Excel), perform time series decomposition for the “NY-city-chickenpox-1931-to-1972” data below and obtain the trend, seasonality, and noise components. Validate your plots using the plot below.
Submit a link to your Google Sheet (or your Excel file).
Optionally, you can also upload your data to “planetcalc” to validate your curves.
Hint (for Google Sheets):
=SPARKLINE(C9:C506, {"charttype","line";"linewidth",1; "color", "black"; "lastcolor", "blue"})
Data: NY-city-chickenpox-1931-to-1972.xlsx
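If you want to cross-check your spreadsheet columns, the steps of a classical additive decomposition can be sketched in a few lines of Python. This is a minimal sketch, assuming monthly data and a 12-month seasonal period; the function names and the synthetic `demo` series are illustrative placeholders, not the chickenpox data, and your spreadsheet may handle the series edges differently.

```python
import math

def centered_moving_average(values, period=12):
    """Trend estimate: centered moving average (a 2 x period MA when period is even)."""
    n = len(values)
    half = period // 2
    trend = [None] * n  # the first and last `half` points have no full window
    for i in range(half, n - half):
        window = values[i - half:i + half + 1]
        if period % 2 == 0:
            # Even period: the two end points of the window get half weight.
            total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        else:
            total = sum(window)
        trend[i] = total / period
    return trend

def decompose_additive(values, period=12):
    """Split values into trend + seasonal + noise (additive model)."""
    trend = centered_moving_average(values, period)
    detrended = [v - t if t is not None else None for v, t in zip(values, trend)]
    # Average the detrended values for each position in the cycle.
    means = []
    for k in range(period):
        same_pos = [d for i, d in enumerate(detrended) if d is not None and i % period == k]
        means.append(sum(same_pos) / len(same_pos))
    # Center the seasonal component so it sums to zero over one cycle.
    offset = sum(means) / period
    means = [m - offset for m in means]
    seasonal = [means[i % period] for i in range(len(values))]
    noise = [v - t - s if t is not None else None
             for v, t, s in zip(values, trend, seasonal)]
    return trend, seasonal, noise

# Synthetic placeholder series: linear trend plus a 12-month sine wave, no noise.
demo = [100 + 2 * i + 10 * math.sin(2 * math.pi * i / 12) for i in range(48)]
trend, seasonal, noise = decompose_additive(demo, period=12)
```

On the synthetic series the noise column comes out essentially zero, which is a quick sanity check before running the same steps on real data.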
Perform time series decomposition (assuming a multiplicative model) for the dataset below. Upload your final plots and all intermediate results. Assume a seasonality window of 12 months.
Data: multiplicative-seasonality.csv
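Under a multiplicative model the components are multiplied rather than added, so the detrending and de-seasonalizing steps become divisions. The following is a minimal Python sketch of the ratio-to-moving-average approach, assuming an even seasonal period such as 12; the function name and the placeholder series are hypothetical and are not taken from multiplicative-seasonality.csv.

```python
def decompose_multiplicative(values, period=12):
    """Split values into trend * seasonal * noise (assumes an even period)."""
    n = len(values)
    half = period // 2
    # Step 1: trend via a centered 2 x period moving average.
    trend = [None] * n
    for i in range(half, n - half):
        w = values[i - half:i + half + 1]
        trend[i] = (0.5 * w[0] + sum(w[1:-1]) + 0.5 * w[-1]) / period
    # Step 2: ratio of each observation to the trend.
    ratios = [v / t if t is not None else None for v, t in zip(values, trend)]
    # Step 3: average the ratios per position in the cycle.
    indices = []
    for k in range(period):
        same_pos = [r for i, r in enumerate(ratios) if r is not None and i % period == k]
        indices.append(sum(same_pos) / len(same_pos))
    # Step 4: normalize so the seasonal indices average to 1.
    mean_index = sum(indices) / period
    indices = [x / mean_index for x in indices]
    seasonal = [indices[i % period] for i in range(n)]
    # Step 5: whatever is left after dividing out trend and seasonality.
    noise = [v / (t * s) if t is not None else None
             for v, t, s in zip(values, trend, seasonal)]
    return trend, seasonal, noise

# Placeholder series: a rising level scaled by made-up monthly factors.
factors = [0.8, 0.9, 1.0, 1.1, 1.2, 1.3, 1.2, 1.1, 1.0, 0.9, 0.8, 0.7]
series = [(200 + 5 * i) * factors[i % 12] for i in range(48)]
trend_m, seasonal_m, noise_m = decompose_multiplicative(series, period=12)
```

Compared with the additive case, a noise value near 1 (not near 0) means the trend and seasonal components explain the observation well.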
For two S&P 500 companies of your choice, download the open, high, low, and close price data for the last 15 or 30 days using Google Finance in Google Sheets. For these two companies, design OHLC charts and answer the following questions:
a) Which company’s trend is more volatile?
b) Which company’s trend shows more indecision?
c) How many upward trends are there in the first and second company’s trends?
d) Which company’s trend has more significant momentum if any?
Also, draw candlestick charts for the same data.
Hint 1 (for cleaning dates):
=TEXT(A2, "MM/DD/YYYY")
Hint 2 (for downloading data):
=GOOGLEFINANCE("NFLX", "all", Today()-15, Today(), "DAILY")
Hint 3 (for quick plotting): Use Plotly (https://chart-studio.plotly.com/) to draw OHLC charts quickly.
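To support your reading of the charts with numbers, you could compute a few simple summaries from the downloaded rows. The sketch below is only one possible approach: the bars are made-up values, and the two proxies (average high-low range for volatility, small candle bodies for indecision) and the 10% threshold are assumptions, not definitions from the course material.

```python
# Hypothetical daily bars as (open, high, low, close); replace with your own data.
bars = [
    (100.0, 104.0, 99.0, 103.0),
    (103.0, 105.0, 101.0, 103.2),
    (103.2, 103.5, 98.0, 99.0),
    (99.0, 102.0, 98.5, 101.5),
]

def average_range_pct(bars):
    """Mean of (high - low) / open in percent: a rough volatility proxy."""
    return 100 * sum((h - l) / o for o, h, l, c in bars) / len(bars)

def indecision_days(bars, body_share=0.1):
    """Count bars whose body |close - open| is at most body_share of the high-low range."""
    return sum(
        1 for o, h, l, c in bars
        if h > l and abs(c - o) <= body_share * (h - l)
    )

volatility = average_range_pct(bars)
indecision = indecision_days(bars)
```

Running the same two functions on both companies gives comparable numbers, which you can then check against what the OHLC and candlestick charts show.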
The table below shows Netflix’s closing price for ten days. Because it keeps a zero baseline, the accompanying line chart fails to show the dip clearly. Redesign the line chart so that the dip is clear.
Date | Closing price |
---|---|
10/03/2022 | 235.44 |
10/04/2022 | 239.04 |
10/05/2022 | 240.74 |
10/06/2022 | 236.73 |
10/07/2022 | 240.02 |
10/10/2022 | 224.75 |
10/11/2022 | 229.98 |
10/12/2022 | 214.29 |
10/13/2022 | 220.87 |
10/14/2022 | 232.51 |
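One option for the redesign is to fit the y-axis to the data range instead of starting at zero. The following Python sketch, using the values from the table above, computes padded axis limits and the size of the dip; the 5% padding is an arbitrary assumption, and the variable names are illustrative.

```python
dates = ["10/03/2022", "10/04/2022", "10/05/2022", "10/06/2022", "10/07/2022",
         "10/10/2022", "10/11/2022", "10/12/2022", "10/13/2022", "10/14/2022"]
closes = [235.44, 239.04, 240.74, 236.73, 240.02,
          224.75, 229.98, 214.29, 220.87, 232.51]

low, high = min(closes), max(closes)

# Pad the data range by 5% on each side so the line does not touch the frame.
pad = 0.05 * (high - low)
y_min, y_max = low - pad, high + pad

# Date of the lowest close and the peak-to-trough change in percent.
trough_date = dates[closes.index(low)]
peak_to_trough_pct = (low / high - 1) * 100

print(f"y-axis: {y_min:.2f} to {y_max:.2f}")
print(f"lowest close on {trough_date}, {peak_to_trough_pct:.1f}% below the highest close")
```

If you truncate the axis this way, label the axis range clearly so readers do not mistake the dip for a collapse toward zero.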
Reproduce Figure 8.13 in the TTA book. Please note that your plot may look slightly different.
Hint: After extracting month and year as additional columns, sort the data by month followed by the year.
Data: SocialSecurityAffiliationsSpain.csv
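The ordering described in the hint can be sketched in Python as a sort on a two-part key. This is a minimal sketch with made-up records; the tuple layout `(year, month, value)` is an assumption, since the actual column names in the CSV are not shown here.

```python
# Hypothetical records as (year, month, value); replace with rows from the CSV.
records = [
    (2020, 1, 18.9),
    (2019, 2, 18.6),
    (2019, 1, 18.4),
    (2020, 2, 19.1),
]

# Sort by month first, then by year, so all Januaries come together,
# followed by all Februaries, and so on.
by_month_then_year = sorted(records, key=lambda r: (r[1], r[0]))

for year, month, value in by_month_then_year:
    print(month, year, value)
```

With the rows in this order, each month forms one contiguous block whose points run across the years.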
The data graphic below compares the closing prices of Verizon, Intel, Cisco, Twitter, and Coca-Cola (companies with comparable closing prices) and is cluttered (i.e., ineffective for decoding).
One way to de-clutter this data graphic is to remove the legend and place colored labels next to the curves (colored to match the line colors). However, even the technique of labeling next to the lines using the same color fails when we have too many companies to compare. Select five to ten companies of your choice with a similar closing price and compare their trend over the last year. Please note that a figure like the one above would NOT work. It can clutter quickly as the number of companies grows.
The worldometer website shows the world population over the last ~70 years. Design:
a) a standard line diagram in linear scale showing the growth,
b) the same line diagram with a logarithmic scale for the population count, and
c) a change-rate diagram showing population change.
Submit the three plots you designed.
In addition, discuss which of the three plots is the most informative/revealing.
You will probably have all the points above the “1” bar in the change-rate plot. What does this indicate?
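A change-rate series can be computed as the ratio of each value to the previous one. The sketch below assumes that definition (your course material may define the rate differently) and uses made-up numbers, not the worldometer figures.

```python
# Made-up yearly counts, for illustration only.
population = [100.0, 110.0, 121.0, 130.0]

# Change rate: each value divided by the previous one.
# The first year has no predecessor, so the result has one fewer entry.
change_rate = [curr / prev for prev, curr in zip(population, population[1:])]

for rate in change_rate:
    print(f"{rate:.3f}")
```

Plotting `change_rate` against the year, with a horizontal reference line at 1, gives the change-rate diagram asked for in part c).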
Toward the end of the chapter “Sparklines: Intense word-sized graphics,” Edward Tufte writes the following:
A good system for evidence display should be centered on evidence, not on a collection of application programs, each devoted to a single mode of information.
Based on your reading of the rest of the chapter, in at least 250 words, explain why Tufte is an extreme proponent of sparklines.
Write a short and meaningful paragraph of your own, with at least two sparklines embedded within your text lines. You are welcome to choose your area and data of interest.
Redesign Figure 8.31 in TTA with time dimension in the x-axis. Please prepare a table estimating the values from the existing chart to create your chart.