20-Nov

I’m excited to share the next phase of our journey in analyzing the BPDA economic indicators dataset. We’re now stepping into the realm of Time Series Analysis. Before diving into how we’re applying it to our project, I want to talk about what Time Series Analysis is, how it is used, why it’s important, and where it’s typically used.

Time Series Analysis is a statistical technique that deals with time-ordered data points. Think of it like a movie, where each frame (data point) is a snapshot in time, and together, they tell a story. This analysis helps us understand the underlying structure and trends in data that change over time.

So, why use Time Series Analysis? It’s all about the ‘when.’ By understanding when things happen and how patterns repeat over days, months, or years, we can get a clearer picture of trends and seasonal effects. This is crucial in fields like economics, where understanding cycles – like when people spend more, when tourism peaks, or when job markets fluctuate – can inform better decisions.

How do we use Time Series Analysis? It’s all about examining the data in the context of time. We look for trends (where something is consistently going up or down), seasonality (patterns that repeat at regular intervals, like increased hotel bookings in summer), and irregularities or outliers. By applying statistical models, we can forecast future trends based on historical patterns, which is invaluable in planning and decision-making.
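
To make trend and seasonality concrete, here is a minimal sketch of a classical decomposition using only the Python standard library. The series is synthetic (a made-up upward trend plus a 12-month cycle), not BPDA data, and the function names are illustrative assumptions rather than code from this project.

```python
import math
from statistics import mean

# Synthetic monthly series (illustrative, not BPDA data):
# a steady upward trend plus a 12-month seasonal cycle.
series = [100 + 0.5 * t + 10 * math.sin(2 * math.pi * t / 12) for t in range(48)]

def centered_moving_average(values, window=12):
    """Estimate the trend with a centered moving average over one full cycle."""
    half = window // 2
    trend = [None] * len(values)  # the edges have no full window, so they stay None
    for i in range(half, len(values) - half):
        # Half weight on the two end points keeps an even-length window centered.
        total = (0.5 * values[i - half]
                 + sum(values[i - half + 1:i + half])
                 + 0.5 * values[i + half])
        trend[i] = total / window
    return trend

def seasonal_effects(values, trend, period=12):
    """Average the detrended values by position within the cycle."""
    buckets = [[] for _ in range(period)]
    for i, (value, level) in enumerate(zip(values, trend)):
        if level is not None:
            buckets[i % period].append(value - level)
    return [mean(bucket) for bucket in buckets]

trend = centered_moving_average(series)
seasonal = seasonal_effects(series, trend)
peak_position = seasonal.index(max(seasonal))  # position in the cycle with the largest seasonal lift

print("Trend at month 20:", round(trend[20], 2))
print("Peak position in the 12-month cycle:", peak_position)
```

The moving average smooths out one full year at a time, so what remains after subtracting it is mostly the repeating seasonal pattern plus noise.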

Where is Time Series Analysis used? Everywhere from stock markets, where it predicts stock trends, to meteorology for weather forecasting. In business, it helps in sales forecasting, resource allocation, and marketing strategy. It’s a powerful tool for understanding past behavior and predicting future trends.

Our next update will delve into how we’re incorporating Time Series Analysis into our project. We’ll explore how Boston’s economy has changed over time, looking at patterns in employment, housing prices, travel trends, and more. This will help us understand not just where the economy has been, but where it might be heading.

Stay tuned for a more detailed look at Time Series Analysis in action!

17-Nov

I’ve got some exciting news about our project, where we’re looking at Boston’s economic data from 2013 to 2019. I just finished the first part of our work: Descriptive Statistics. This gives us a good starting idea of things like jobs, house prices, and how busy airports and hotels were in Boston.

I have attached a snippet of the Python code which I used.

As the output is very large, I have attached the document containing the output.

Descriptive_Statistics

Here’s some of what I found out: On average, about 3,940 international flights came in and out of Logan Airport every month. This number even went up to 5,260 flights in busy times. This tells us that lots of people were coming to Boston from other countries. The hotels were pretty busy too, with about 82% of rooms filled on average, and the price for a night’s stay was around $244.

When we look at jobs, there were usually about 356,000 jobs in Boston, and the unemployment rate (the share of the labor force without a job) was around 4%. This changes a bit over time, so it’s something we might want to look into more. Also, houses in Boston could be quite pricey. In some months the median house price was as high as $517,750.

This is just the start, and there’s a lot more to find out. I’m going to keep looking at the data to see what else we can learn about how Boston’s economy was doing.

15-Nov

In this update, I am going to explain the analytical techniques that I am thinking of using to analyze the Boston data. To start with, I am thinking of applying Descriptive Statistics to lay the groundwork. This will involve calculating means, medians, and standard deviations for key indicators like employment rates, housing prices, and passenger traffic. This approach will provide us with a basic yet essential understanding of the overall economic trends in Boston.
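
As a rough illustration of the descriptive statistics step, the sketch below computes the mean, median, and standard deviation for one indicator with Python's built-in `statistics` module. The occupancy values are hypothetical and the `describe` helper is an assumed name for this example only.

```python
from statistics import mean, median, stdev

# Hypothetical monthly hotel occupancy rates (illustrative values only).
hotel_occup_rate = [0.62, 0.71, 0.80, 0.88, 0.91, 0.93,
                    0.90, 0.89, 0.92, 0.90, 0.81, 0.66]

def describe(values):
    """Return basic summary statistics for one indicator."""
    return {
        "count": len(values),
        "mean": mean(values),
        "median": median(values),
        "std": stdev(values),  # sample standard deviation (divides by n - 1)
        "min": min(values),
        "max": max(values),
    }

summary = describe(hotel_occup_rate)
for name, value in summary.items():
    print(f"{name:>6}: {value:.3f}")
```

Comparing the mean with the median is a quick check for skew: if a few slow winter months pull the mean well below the median, the distribution is not symmetric.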

Next, I am planning to conduct a Time Series Analysis. This analysis is crucial as it will help us identify and understand patterns over time, such as seasonal changes in hotel occupancy or fluctuations in housing prices. I’m also considering using Correlation Analysis to explore potential relationships between different economic variables, like the impact of international flights on local tourism.
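
For the correlation step, a minimal sketch of the Pearson correlation coefficient is shown below. The two series are hypothetical stand-ins for international flights and hotel occupancy, not values from the dataset, and `pearson` is a helper written just for this example.

```python
from math import sqrt
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length (at least 2 points)")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical monthly values (illustrative, not taken from the dataset).
intl_flights = [3100, 3300, 3900, 4200, 4800, 5100]
hotel_occup = [0.70, 0.74, 0.82, 0.85, 0.90, 0.92]

r = pearson(intl_flights, hotel_occup)
print(f"Correlation between flights and occupancy: {r:.3f}")
```

A value near 1 means the two series rise and fall together. It does not show that one causes the other, since both could simply follow the same summer peak.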

Regression Analysis is another tool I intend to use. It will be instrumental in examining the relationships between various factors, such as how employment rates might influence real estate prices. Unlike a correlation, a regression estimates how much one indicator changes when another changes, which is useful for data-driven decision-making.
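
A simple linear regression can be sketched in a few lines. The example below fits a straight line by ordinary least squares; the jobs and price figures are hypothetical and were constructed to lie exactly on a line so the result is easy to check by hand. Real data would be noisier, and the helper names are assumptions for this illustration.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept

def r_squared(xs, ys, slope, intercept):
    """Share of the variation in y that the fitted line explains."""
    my = mean(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Hypothetical data: total jobs (thousands) and median price (thousands of dollars).
jobs = [330, 340, 350, 360, 370, 380]
price = [400, 420, 440, 460, 480, 500]

slope, intercept = fit_line(jobs, price)
fit_quality = r_squared(jobs, price, slope, intercept)
predicted_price = intercept + slope * 390  # prediction for 390 thousand jobs

print(f"slope={slope:.2f}, intercept={intercept:.2f}, R^2={fit_quality:.3f}")
```

The slope reads as "change in price per additional thousand jobs" under the assumptions of the toy data.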

Forecasting future trends is also on my agenda. I am going to use Forecasting Models, particularly SARIMA, to predict future movements in economic indicators based on historical data. This predictive modeling could be highly beneficial for strategic planning.
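
Before fitting SARIMA, it helps to have a simple baseline to compare against. The sketch below is not SARIMA. It is a seasonal naive forecast, which predicts each month with the value from the same month one year earlier. The data is synthetic and the function names are assumptions for this example.

```python
def seasonal_naive_forecast(history, horizon, period=12):
    """Forecast each future step with the value from one seasonal cycle earlier."""
    if len(history) < period:
        raise ValueError("need at least one full seasonal cycle of history")
    last_cycle = history[-period:]
    return [last_cycle[h % period] for h in range(horizon)]

def mean_absolute_error(actual, predicted):
    """Average size of the forecast errors, in the units of the series."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Synthetic series: the same seasonal pattern each year, shifted up by 2 per year.
pattern = [60, 62, 70, 80, 88, 92, 93, 91, 90, 89, 78, 64]
series = [value + 2 * year for year in range(3) for value in pattern]

train, holdout = series[:24], series[24:]  # keep the last year for evaluation
forecast = seasonal_naive_forecast(train, horizon=12)
mae = mean_absolute_error(holdout, forecast)

print("First three forecasts:", forecast[:3])
print("Mean absolute error on the held-out year:", mae)
```

The baseline misses the yearly upward shift, so its error equals that shift. A seasonal model with a trend component should be able to do better on data like this.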

In addition to these, I’m looking at performing Comparative Analysis to compare economic indicators across different years. This will enable us to see the effects of specific events or policy changes on Boston’s economy over time. Cluster Analysis will also be employed to group similar time periods based on economic characteristics, providing a clearer picture of economic cycles.
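
One simple form of the comparative analysis is to average an indicator by year and look at the year-over-year change. The sketch below does this for hypothetical unemployment records; the values and helper names are illustrative assumptions.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (year, month, unemployment rate) records, illustrative only.
records = [
    (2013, 1, 0.070), (2013, 7, 0.066),
    (2014, 1, 0.060), (2014, 7, 0.056),
    (2015, 1, 0.050), (2015, 7, 0.046),
]

def yearly_means(rows):
    """Average the indicator within each calendar year."""
    by_year = defaultdict(list)
    for year, _month, value in rows:
        by_year[year].append(value)
    return {year: mean(values) for year, values in sorted(by_year.items())}

def year_over_year_change(means):
    """Difference between each year's average and the previous year's."""
    years = sorted(means)
    return {year: means[year] - means[prev] for prev, year in zip(years, years[1:])}

averages = yearly_means(records)
changes = year_over_year_change(averages)

for year, change in changes.items():
    print(f"{year}: {change:+.3f} vs previous year")
```

A year whose change stands out from its neighbors is a natural place to look for a specific event or policy change.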

Finally, I am planning to utilize Visualization techniques to represent our data graphically. This will make it easier to interpret and present our findings, making the data more accessible to everyone.

As we move forward with the project, I might add more ways to analyze the data depending on what we need and what we find out. This means we can change how we look at the data to make sure we’re getting the most important and useful information.

In the updates that are coming up, I’ll talk about each way of analyzing the data, one by one. We’ll go through them carefully and discuss what they show us about how Boston’s economy works. Keep an eye out for these updates – they’ll give us a lot of good information about each method and what it tells us.

13-Nov

I have looked into the economic indicators data file from https://data.boston.gov/. This dataset is a historical record of economic indicators tracked monthly from January 2013 to December 2019 by the Boston Planning & Development Agency (BPDA). The BPDA is responsible for planning and fostering inclusive growth in the City of Boston. It gathered and examined diverse economic data on areas like employment, housing, travel, and real estate development.

The statistical description of the data in “economic-indicators.csv” provides valuable insights into various economic indicators over a specific time period. Here’s a summary of the key statistics for each column:

  1. Year (2013-2019): Data spans from 2013 to 2019.
  2. Month (1-12): Includes all months from January (1) to December (12).
  3. logan_passengers (1,878,731 – 4,120,937): Range of passengers at Logan, with an average of about 3,015,647.
  4. logan_intl_flights (2,587 – 5,260): International flights range, averaging around 3,940.
  5. hotel_occup_rate (0.572 – 0.931): Hotel occupancy rates vary from 57.2% to 93.1%, with an average of 81.8%.
  6. hotel_avg_daily_rate ($157.89 – $337.92): The average daily hotel rate ranges from $157.89 to $337.92.
  7. total_jobs (322,957 – 392,536): Total jobs recorded range from 322,957 to 392,536.
  8. unemp_rate (0.02 – 0.07): Unemployment rate varies from 2% to 7%.
  9. labor_force_part_rate (0.626 – 0.676): Labor force participation rate ranges from 62.6% to 67.6%.
  10. pipeline_unit (-54 – 2,026): Units in the development pipeline range from -54 to 2,026. The negative minimum is unusual and worth checking.
  11. pipeline_total_dev_cost ($0 – $2.76B): Total development cost of projects in the pipeline ranges up to $2.76 billion.
  12. pipeline_sqft (0 – 4,714,445): Square footage of projects in the development pipeline.
  13. pipeline_const_jobs (0 – 3,976): Construction jobs tied to pipeline projects range from 0 to 3,976.
  14. foreclosure_pet (0 – 69): Foreclosure petitions vary from 0 to 69.
  15. foreclosure_deeds (0 – 17): The number of foreclosure deeds ranges from 0 to 17.
  16. med_housing_price ($0 – $517,750): Median housing prices reach $517,750. The $0 minimum probably marks months with no recorded price rather than a real value.
  17. housing_sales_vol (0 – 2,508): Housing sales volume varies significantly.
  18. new_housing_const_permits (0 – 897): New housing construction permits range from 0 to 897.
  19. new-affordable_housing_permits (0 – 232): New affordable housing permits vary from 0 to 232.
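
Several columns above have a minimum of 0 or a negative value, which may indicate missing or corrected entries. Here is a minimal sketch of how such values could be counted before any analysis. The inline CSV is a tiny made-up stand-in for the real file, and `flag_suspect_values` is a hypothetical helper.

```python
import csv
import io

# A tiny stand-in for economic-indicators.csv (values are illustrative).
SAMPLE_CSV = """Year,Month,med_housing_price,pipeline_unit
2013,1,380000,120
2013,2,0,-54
2013,3,395000,0
"""

def flag_suspect_values(csv_file, columns):
    """Count zeros and negatives per column so they can be reviewed by hand."""
    counts = {col: {"zero": 0, "negative": 0} for col in columns}
    for row in csv.DictReader(csv_file):
        for col in columns:
            value = float(row[col])
            if value == 0:
                counts[col]["zero"] += 1
            elif value < 0:
                counts[col]["negative"] += 1
    return counts

report = flag_suspect_values(
    io.StringIO(SAMPLE_CSV), ["med_housing_price", "pipeline_unit"]
)
for column, found in report.items():
    print(column, found)
```

To run the same check on the real file, pass an open file handle for "economic-indicators.csv" instead of the `io.StringIO` object.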

In the next update, I will discuss the statistical analyses that can be applied to this dataset.

10-Nov

As discussed in the previous update, I have applied the decision tree to my project.

The code goes like this:

The output that we got is:

In our decision tree analysis, we focused on predicting individual behaviors in varied incident scenarios. These ranged from situations where individuals didn’t flee to those where they fled by car, on foot, or through other means. Our model achieved an accuracy of approximately 67%, meaning it predicted the correct category in roughly two out of every three cases. That is a reasonable starting point, but overall accuracy does not show which categories the model handles well and which it struggles with.

However, a closer inspection of the confusion matrix revealed certain misclassifications. While the model successfully identified 37 instances where individuals did not flee and accurately predicted 676 fleeing incidents, it also misclassified several cases. Notably, there were 125 instances where the model inaccurately predicted fleeing when it didn’t occur, 136 cases were wrongly classified as fleeing by car, and 33 cases were incorrectly predicted as fleeing on foot. These misclassifications point to specific areas in the model that require refinement to enhance its predictive accuracy and reliability.
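
To show how accuracy and per-class results are read off a confusion matrix, here is a minimal sketch. The counts are hypothetical and are not the matrix from this model. They were chosen only so the arithmetic is easy to follow.

```python
# Rows are the true classes, columns are the predicted classes.
# Counts are hypothetical, chosen only to illustrate the arithmetic.
labels = ["not fleeing", "car", "foot", "other"]
confusion = [
    [85, 8, 5, 2],
    [12, 20, 6, 2],
    [10, 5, 25, 0],
    [6, 2, 2, 10],
]

def accuracy(matrix):
    """Share of all cases that fall on the diagonal (correct predictions)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

def recall_per_class(matrix):
    """For each true class, the share of its cases the model got right."""
    return [row[i] / sum(row) for i, row in enumerate(matrix)]

overall = accuracy(confusion)
recalls = recall_per_class(confusion)

print(f"accuracy: {overall:.3f}")
for label, value in zip(labels, recalls):
    print(f"recall for {label}: {value:.3f}")
```

In this toy matrix the overall accuracy looks fine, yet the smaller classes are recognized only about half the time. That is the kind of gap a single accuracy number can hide.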