Tools Utilized: Oracle Data Modeler & SQL
Project Background
In my Business Data Management class, the final project was designed to assess our understanding of the course material through a comprehensive, hands-on application. The project involved several key components:
Entity-Relationship Diagram (ERD): I was tasked with creating an ERD based on the Northampton County Assessment website. This involved identifying and modeling the various entities and relationships within the real estate data.
Data Population: Once the ERD was established, I populated the tables with relevant data, simulating a real-world dataset to reflect actual real estate properties, transactions, and assessments.
SQL Queries: The final step was to write SQL queries to address specific business problems. These queries were designed to extract meaningful insights and answer critical questions related to real estate sales and assessments.
Through this project, I gained valuable experience in database design, data manipulation, and analytical querying, which are essential skills for managing and interpreting business data.
Project Scope
Problem Statement
#1: Targeted Marketing for Real Estate Sales
In the competitive real estate market, it is crucial for realtors to target their marketing efforts effectively to maximize sales opportunities. To achieve this goal, there is a need to identify and understand high-value demographic segments within different geographic areas. Real estate professionals are particularly interested in the relationship between age, income, and property sale prices, as these factors often play a significant role in shaping purchasing decisions and property preferences. The output of the query will provide valuable insight to the marketing department especially as the department considers new strategies to attract potential buyers from these lucrative segments.
#2: Impact of Zip Code on Property Sale Prices
In the realm of real estate market analysis and property valuation, stakeholders seek to uncover patterns and trends that influence property sale prices. One critical factor often under examination is the geographic location, represented by zip codes, and its correlation with sale prices. Understanding how zip codes influence property values can provide valuable insights for various stakeholders, including real estate agents, property developers, and urban planners, aiding in strategic decision-making processes.
Entity-Relationship Diagram (ERD)

Problem #1: Targeted Marketing for Real Estate Sales
SQL:

Query 1: Display data ordered by age

Query 2: Segment by age and display averages for income, sale price, rooms, and square feet

Analysis:
The findings of Query 1 present a comprehensive view of owner demographics (owner id and age), income distributions (income), and property characteristics (land use code and sale price). A land use code of “001” is representative of a luxury single-family residence, whereas a code of “002” is representative of an affordable single-family residence. Organized in ascending order by age and income, this highlights the significance of these factors in real estate purchase decisions.
Query 2 further refines the data, arranging results in ascending order based on age segments. This segmentation in 10-year increments allows for a clearer understanding of trends across different age groups regarding counts of luxury and affordable residences, average incomes, and average sale prices.
The results of these queries reveal several key insights that the marketing department should consider when devising new strategies to attract potential buyers. Younger buyers, particularly those in their 20s, exhibit lower incomes, limiting their purchasing power to properties categorized as “002” or “Affordable.” The average income for this age group is $41,500, with occasional outliers earning significantly more.
As age brackets increase, ownership of both “002” (Affordable) and “001” (Luxury) properties becomes more prevalent. Notably, luxury residences are primarily owned by individuals in their 30s and above. This demographic shift is accompanied by an increase in average income and property sale prices. For instance, in the 30-39 segment, the average income is $92,857.14, with an average sale price of $109,714.29. Similarly, in the 40-49 segment, the average income rises to $97,777,78 with a corresponding increase in average sale price.
This upward trend in income and property ownership among older age groups can be attributed to career advancement and corresponding salary increases. Notably, individuals in their 20s typically do not own or reside in luxury residences, indicating a progression towards higher-end properties as individuals age and accumulate wealth.
Given this information, the marketing department can tailor their campaigns to target high-income buyers in their 30s or older for luxury estates. Conversely, more affordable homes should be marketed toward younger buyers who are at the beginning of their careers. Additionally, the data highlights an emerging trend of individuals in their late 30s and up investing in second homes, presenting an opportunity for marketers to capitalize on this trend by making second homes more appealing, whether for vacation purposes or rental investments.
Problem #2: Impact of Zip Code on Property Sale Prices
SQL:

Query 1: Display the average sale price for each zip code

Query 2: Display zip codes with the highest total sale price

Analysis:
Query 1 showcases the average sale prices associated with different zip codes in descending order. Query 2 reveals the zip codes with the highest total sale prices in descending order.
These insights shed light on the impact of zip codes on property values, offering valuable guidance to stakeholders involved in strategic decision-making. For instance, the zip code 94101 in California emerges as the most expensive with an average sale price of $148,000. Following closely is 10001 in New York, with an average of 146,666.67, 90001 in California at $138,000, and so forth. This trend persists in the total sale prices, reinforcing the notion of higher living expenses in California (zip codes 94101 and 90001) and New York (10001) compared to other states collected.
By analyzing the average and total sale prices, stakeholders can discern underlying market trends and dynamics within specific geographic areas. The higher prices in California can suggest strong demand or desirable features that command premium prices. In turn, they can tailor their offerings or investment strategies to align with market demand.
