Data Critique

data description

The dataset we chose records federal government education funding and how it is allocated. It covers 1980 to 2019, painting a full picture of spending allocations for 17 elementary and secondary educational programs over about four decades in the United States. The data was obtained from the U.S. Department of Education website, which originally also included postsecondary funding for those years. Because the dataset is produced by the U.S. federal government using taxpayer funds, its priorities reflect federal administrative and political objectives rather than independent or academic research goals. The raw data is separated into the Presidential Budget and Appropriation Bills (the presidential budget is the proposal that outlines the executive branch’s funding plan, and the appropriation bills are the binding laws that legally authorize agencies to spend the money). The raw data is also separated into discretionary and mandatory spending: discretionary spending is adjustable funding that Congress approves through the appropriations process, informed by the President’s priorities, whereas mandatory spending is required by existing laws (such as Social Security and Medicare) and needs no new Congressional vote. One supplement, the 2009 Recovery Act, did not fit this format because it was not part of the yearly Presidential or Appropriation allocations, so we moved it into its own spreadsheet.


For our purposes, we cleaned the data to include only the elementary and secondary budgets, as these cover the information relevant to the scope of our project. Notably, many years show $0 for specific programs, which could indicate that the programs did not exist that year or could reflect Presidential and Congressional priorities at the time. This dataset is useful for identifying long-term trends in federal education funding and examining how national priorities evolve over time. However, it provides no information on how these federal funds are distributed to states, districts, or individual schools, so it does not capture the downstream effects of these funding decisions or how they affect different populations within the United States. As a result, it cannot be used to evaluate whether increases or decreases in federal funding have led to measurable improvement. Since a large portion of K-12 education funding in the United States comes from state and local governments, looking solely at federal allocations risks overstating the role of the federal government while obscuring the mechanisms that can produce inequality at the local level.

data critique

This dataset also does not include any demographic data such as race, income, or language status, making it impossible to analyze which populations benefit most or least from federal education funding. Within the dataset itself, one broad program category is simply labeled “Other.” Despite being one of the highest-funded categories throughout the dataset, it is not disaggregated or explained in further detail. Without any specification, it is extremely difficult to interpret where a substantial portion of the money is actually directed. This lack of transparency makes large funding allocations effectively invisible and obscures distinctions that would be important for understanding funding priorities.

Since the data was produced by the U.S. Department of Education, it is shaped by the frameworks of the government. While it is an official dataset, it cannot be assumed to be neutral or trustworthy, especially during the Trump administration, when federal datasets have been either completely removed or altered to align with the administration’s political priorities.

Continuing the critique, because the dataset is organized around official budget categories and program names, it mainly shows how the government classifies and tracks money rather than how education is actually delivered to and experienced by U.S. students and teachers. The structure of the dataset emphasizes what can be quantified and numerically documented, while excluding issues that are harder to measure because of their subjective nature, including classroom environment, teacher support, and the well-being of students. These factors would help shape our narrative. Relying on this as our only dataset would be a weak choice, because it remains unclear how these funding decisions are made or how the people directly affected by them feel about the results. We also cannot gauge whether money is used effectively or reaches the schools and students most in need. Therefore, although the dataset is useful for surveying federal funding patterns across schools in America, it offers only a partial view of education and is helpful only when combined with other sources that let us understand the narrative behind the numbers.

data description

The second dataset we analyzed comes from the National Center for Education Statistics’ Common Core of Data (CCD), specifically the fiscal survey of local education-funding agencies. Unlike the federal budget dataset above, which captures national-level educational funding allocations, the CCD dataset tracks revenue and expenditure data reported by states and school districts. We concatenated the yearly files to span multiple decades and kept detailed breakdowns of revenue sources, including federal, state, and local funding; this concatenation required selecting essential columns, which we determined to be the summary columns for specific categories.
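The concatenation step described above can be sketched roughly as follows. This is a minimal illustration, not our actual cleaning script: the column names (FEDERAL_REV, STATE_REV, LOCAL_REV, DETAIL_A) and all values are hypothetical placeholders, and the per-year tables are assumed to already be loaded as lists of rows.

```python
def concat_years(tables_by_year, keep_columns):
    """Stack per-year tables into one list of rows.

    Only the columns in keep_columns are retained. A column that a given
    year did not report is filled with None instead of being dropped.
    """
    combined = []
    for year in sorted(tables_by_year):
        for row in tables_by_year[year]:
            slim = {col: row.get(col) for col in keep_columns}
            slim["YEAR"] = year  # record which yearly file the row came from
            combined.append(slim)
    return combined


# Hypothetical two-year example with made-up values and column names.
tables = {
    1995: [{"STATE": "CA", "FEDERAL_REV": 10, "STATE_REV": 50, "DETAIL_A": 3}],
    2005: [{"STATE": "CA", "FEDERAL_REV": 15, "STATE_REV": 60, "LOCAL_REV": 40}],
}
panel = concat_years(tables, ["STATE", "FEDERAL_REV", "STATE_REV", "LOCAL_REV"])
for row in panel:
    print(row)
```

Keeping a YEAR field on every row makes it possible to trace each value back to the survey year it came from, which matters because the reporting categories are not identical from year to year.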


Although the CCD dataset appears more granular and focused in a local sense, it is still structured through federally standardized reporting requirements. The states submit the financial information using accounting categories that are determined by NCES. This standardization allows for cross-state comparison, but it also imposes a uniform framework onto educational systems that can at their core vary widely in structure, governance, and funding mechanisms. The dataset therefore reflects not only local educational finance realities but also federal priorities in how education finance should be categorized and measured.

data critique

One limitation of the dataset is the instability of categories across time, along with NA values for local data that were not collected or reported. We kept the NA values for consistency and standardized the category names, with the column names listed in the appendix. For example, ‘Local Revenues Subtotal’ was originally titled STR1N200 but was renamed LOCALREVENUESSUBTOTAL for our dataset. These shifts in columns across the years reveal that data categories are not fixed reflections of reality but are instead administrative constructs that change with what the government deems ‘necessary’ to review. What appears to be a continuous time series is, in practice, a series of evolving accounting systems stitched together across the selected years.
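A renaming step of this kind can be sketched with a simple crosswalk from raw survey codes to standardized names. Only the STR1N200 to LOCALREVENUESSUBTOTAL pair comes from the text above; the second code, the STATEREVENUESSUBTOTAL column, and the values are hypothetical, and the function is an illustration rather than the exact procedure we followed.

```python
# Crosswalk from raw survey codes to standardized column names.
CROSSWALK = {
    "STR1N200": "LOCALREVENUESSUBTOTAL",   # pair taken from the text
    "LOCREV_OLD": "LOCALREVENUESSUBTOTAL",  # hypothetical code from another year
}


def standardize(row, crosswalk, columns):
    """Rename raw codes and return a row with exactly the given columns.

    Columns that were not collected or reported stay as None (NA),
    so missing values are kept rather than imputed or dropped.
    """
    renamed = {crosswalk.get(key, key): value for key, value in row.items()}
    return {col: renamed.get(col) for col in columns}


columns = ["LOCALREVENUESSUBTOTAL", "STATEREVENUESSUBTOTAL"]
row_a = standardize({"STR1N200": 100}, CROSSWALK, columns)
row_b = standardize({"LOCREV_OLD": 80, "STATEREVENUESSUBTOTAL": 120}, CROSSWALK, columns)
print(row_a)
print(row_b)
```

Writing the crosswalk down explicitly also documents each judgment call about which older category counts as "the same" as a newer one, which is exactly where the stitched-together nature of the time series enters.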

Additionally, while the CCD dataset provides detailed fiscal information, it captures only the numbers associated with education and remains largely silent on the lived experience of education. Revenue totals and expenditure categories do not indicate how funds are distributed within districts, how equitably resources reach different student populations, or whether spending translates into improved educational outcomes. Demographic data is also not integrated into the fiscal dataset in a way that allows for direct equity analysis without merging external datasets and looking at societal data from each year. As a result, the dataset privileges financial accounting over social impact across the entire time frame.

Ultimately, while the CCD dataset is valuable for analyzing patterns in educational finance and comparing state-level structures to those at the federal level, it remains limited in its ability to capture how funding disparities affect students and teachers in real life. Similar to the federal funding dataset, it provides an important but partial view of education that is shaped by pre-structured accounting logics rather than by true educational experience.

data description

This dataset contains fiscal and revenue data for California public school districts, compiled and published by the California Department of Education (CDE). The data originates from the Standardized Account Code Structure (SACS), which requires districts to report detailed financial information including revenue sources, expenditures, and program allocations. The dataset organizes financial data by district and fiscal year, allowing users to observe how much funding districts receive from federal, state, and local sources. At first glance, the dataset appears to show which school districts receive the most funding and how education funding is distributed across the state. However, this interpretation is incomplete without additional context.

Because the data primarily reports total revenue and categorical funding amounts, it can easily be interpreted as a direct indicator of how well-funded a district is. In reality, these totals largely reflect district size and enrollment rather than the level of resources available to each student. Larger districts such as Los Angeles Unified School District naturally appear to receive far greater amounts of funding simply because they serve significantly more students. Without converting the data into per-pupil funding measures, comparisons across districts risk overstating the relative advantage of large districts and obscuring the actual distribution of resources among students.

The dataset is useful for identifying the structure of school funding in California, particularly the relative contributions of federal, state, and local revenue streams. It reveals how districts rely on different funding sources and how categorical programs (such as Title I grants, special education funding, and other targeted allocations) are distributed. However, the dataset cannot reveal how these funds translate into educational opportunities or student experiences. It does not include information about student enrollment by subgroup, the number of eligible students for federal programs, classroom resources, teacher staffing levels, or school infrastructure conditions. Without this contextual information, it is impossible to determine whether districts receiving larger allocations are actually able to provide more educational resources to students.

data critique

Additionally, the dataset cannot capture how funding is experienced at the school or classroom level. Financial data is reported at the district scale, which can obscure significant disparities within districts themselves. Large urban districts often contain schools serving communities with vastly different socioeconomic conditions, yet these differences are not visible within aggregated district-level financial reporting. As a result, the dataset cannot fully represent inequalities in educational access or resource distribution within districts.

The underlying data is generated through administrative fiscal reporting required by the California Department of Education. School districts submit financial data annually through standardized accounting systems designed to ensure consistency and comparability across districts. While this structure allows researchers to analyze statewide funding patterns, it also reflects the assumptions embedded within fiscal reporting systems. Funding is categorized into predefined revenue streams and expenditure categories, which shape how education funding is represented in the data. Programs and needs that do not fit neatly within these categories may be underrepresented or difficult to identify.

Ultimately, the dataset defines educational funding primarily through budgetary totals and categorical allocations, reducing complex educational realities to fiscal line items. When viewed without contextual data—such as enrollment, demographic composition, or program eligibility—it risks presenting funding magnitude as a proxy for educational opportunity. This framing can obscure important questions about how resources are distributed across students, how funding formulas operate in practice, and how fiscal allocations translate into lived educational experiences within schools. While the dataset provides valuable insight into the structure of public education finance, it must be interpreted alongside additional demographic, geographic, and programmatic data to meaningfully assess equity and access within California’s education system.

data description

This dataset contains financial information reported by California local educational agencies (LEAs), including school districts, county offices of education, charter schools, and joint powers agencies. The data is compiled and published by the California Department of Education (CDE) through its financial reporting system, which collects annual fiscal reports detailing revenues, expenditures, and fund balances. These reports are submitted by local agencies and standardized through the Standardized Account Code Structure (SACS), a statewide accounting framework designed to ensure consistency in financial reporting across California’s education system. The resulting datasets allow users to analyze education funding patterns across districts and over time.

At first glance, the dataset appears to reveal how much money California school districts receive and spend, potentially suggesting which districts are better funded or have greater financial resources. However, this interpretation can be misleading without additional context. Because the dataset reports total revenues and expenditures at the district level, it largely reflects the size of the district and the number of students served rather than the actual level of resources available to each student. Large districts such as Los Angeles Unified naturally report far larger budgets simply because they serve more students. Without adjusting the data to a per-pupil basis, comparisons between districts risk overstating the financial advantage of larger districts and obscuring meaningful differences in educational investment per student.
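To make the per-pupil adjustment concrete, here is a minimal sketch. The district names, revenue totals, and enrollment counts are invented for illustration, and enrollment is assumed to come from a separate source that would have to be merged in.

```python
def per_pupil(total_revenue, enrollment):
    """Return revenue per student, or None when enrollment is missing or zero."""
    if not enrollment:
        return None
    return total_revenue / enrollment


# Hypothetical districts; all figures are made up for illustration.
districts = [
    {"name": "Large District", "revenue": 9_000_000_000, "enrollment": 600_000},
    {"name": "Small District", "revenue": 40_000_000, "enrollment": 2_000},
]

for d in districts:
    d["per_pupil"] = per_pupil(d["revenue"], d["enrollment"])

by_total = max(districts, key=lambda d: d["revenue"])["name"]
by_pupil = max(districts, key=lambda d: d["per_pupil"])["name"]
print("Highest total revenue:", by_total)
print("Highest per-pupil revenue:", by_pupil)
```

In this made-up example the larger district has the bigger budget, but the smaller district has more revenue per student, so the ranking reverses once enrollment is taken into account.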

The dataset is useful for illustrating the structure of public education finance in California. It allows researchers to examine how funding is distributed across major categories such as federal grants, state aid, and local revenue sources. It can also reveal how districts allocate funds across different programs and accounting categories within their budgets. However, the dataset cannot explain how these financial allocations translate into educational opportunities or classroom experiences. It does not include data on student enrollment by subgroup, the proportion of eligible students served by federal programs, staffing levels, facility conditions, or classroom resources. Without this contextual information, it is difficult to determine whether districts receiving larger allocations are actually providing greater educational support to students.

data critique

Additionally, the dataset cannot reveal inequalities within districts themselves. Financial reporting occurs at the district or agency level, which aggregates spending across all schools within that jurisdiction. Large districts often include schools serving communities with widely different socioeconomic conditions, yet these internal disparities are not visible in the aggregated financial totals. As a result, the dataset may obscure variations in resource distribution among schools or neighborhoods within the same district.

The underlying data is produced through administrative financial reporting requirements. Each year, local educational agencies submit fiscal reports to the California Department of Education, which collects and processes the data through its Financial Accountability and Information Services office. The adoption of the Standardized Account Code Structure (SACS) was intended to promote consistency in how districts record and categorize financial information across the state. While this standardization enables large-scale analysis and comparison across districts, it also shapes how education funding is represented. The predefined accounting categories determine how revenues and expenditures are recorded, which may simplify complex financial decisions into standardized fiscal classifications.

Ultimately, the dataset defines educational investment primarily in terms of budgetary totals and categorical accounting codes, reducing complex educational systems to financial entries in a ledger. When interpreted without additional demographic, geographic, and programmatic data, the dataset may encourage users to equate larger spending totals with stronger educational investment or better educational outcomes. In reality, fiscal magnitude alone cannot capture whether funding is distributed equitably, whether students have meaningful access to educational resources, or whether financial allocations translate into improved learning conditions. To fully understand how education funding affects students, the dataset must be interpreted alongside enrollment data, demographic information, program participation rates, and indicators of educational quality and outcomes.

data description

This dataset presents statistics on poverty status in the United States drawn from the American Community Survey (ACS), specifically Table S1701: Poverty Status in the Past 12 Months. The data is produced by the U.S. Census Bureau and released as part of the ACS 1-year estimates for 2024. The table reports the number and percentage of individuals living below the federal poverty threshold across demographic categories such as age, sex, employment status, and family composition.

At first glance, the dataset appears to provide a straightforward measurement of poverty levels across geographic areas and demographic groups. The presence of both total counts and percentages may suggest that it directly reflects economic hardship and inequality in the population. However, interpreting the dataset as a complete representation of poverty can be misleading without additional context. The ACS measures poverty using federally defined income thresholds, which classify individuals as living above or below the poverty line based solely on reported household income relative to family size. While this provides a standardized national benchmark, it does not account for regional differences in cost of living, housing prices, or local economic conditions. As a result, the same income level may represent very different living standards depending on where a person lives.

The dataset is useful for identifying broad demographic patterns of poverty, such as differences across age groups, household types, and employment categories. It can also support comparisons across states, counties, or other geographic units where ACS data is available. However, the dataset cannot capture the full economic realities experienced by households. It does not include information on wealth, debt, housing stability, access to social services, or informal economic support networks. Poverty status is determined only by annual income relative to federal thresholds, meaning that households experiencing temporary income changes may appear economically stable or unstable in ways that do not reflect their long-term financial conditions.

data critique

Additionally, the dataset cannot fully represent material hardship or economic vulnerability beyond the poverty line. Many households whose income falls slightly above the official poverty threshold may still struggle to afford housing, healthcare, childcare, or food. These households are sometimes described as “near-poor” or “economically insecure,” yet they are categorized in the dataset as not living in poverty. Because of this binary classification, the dataset may underestimate the number of households experiencing economic strain.
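One way to look past the binary classification is to compute an income-to-poverty ratio and sort households into bands. The sketch below is illustrative only: the threshold value is a made-up placeholder rather than an official figure, and the 1.5 cutoff for "near poverty" is an assumption chosen for the example, not a definition taken from the dataset.

```python
def poverty_band(income, threshold, near_poor_cutoff=1.5):
    """Classify a household by its income-to-poverty ratio.

    threshold is the poverty threshold for the household's size;
    near_poor_cutoff is an assumed multiple of that threshold.
    """
    ratio = income / threshold
    if ratio < 1.0:
        return "below poverty"
    if ratio < near_poor_cutoff:
        return "near poverty"
    return "above near-poverty cutoff"


HYPOTHETICAL_THRESHOLD = 30_000  # placeholder, not an official threshold

for income in (29_000, 31_000, 60_000):
    print(income, poverty_band(income, HYPOTHETICAL_THRESHOLD))
```

A household earning 31,000 against this placeholder threshold counts as "not in poverty" under the binary measure, while the banded view flags it as near poverty.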

The underlying data is generated through the American Community Survey, an ongoing nationwide survey conducted by the U.S. Census Bureau that collects demographic, social, and economic information from a sample of U.S. households each year. The ACS produces estimates rather than exact counts, meaning that all values include margins of error reflecting sampling variability. The one-year ACS estimates are particularly useful for analyzing recent trends but are typically limited to areas with larger populations due to statistical reliability requirements.
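Because every ACS value carries a margin of error, an apparent gap between two estimates may not be meaningful. The sketch below shows the usual check, assuming the published margins of error are at the 90 percent confidence level (the Census Bureau's standard for ACS tables) and that the two estimates are independent. The poverty rates and margins used here are invented for illustration.

```python
import math

Z_90 = 1.645  # z-value for a 90 percent confidence level


def standard_error(moe):
    """Convert a 90 percent margin of error into a standard error."""
    return moe / Z_90


def differ_significantly(est_a, moe_a, est_b, moe_b):
    """Test whether two independent estimates differ at the 90 percent level."""
    se_diff = math.sqrt(standard_error(moe_a) ** 2 + standard_error(moe_b) ** 2)
    if se_diff == 0:
        return est_a != est_b
    return abs(est_a - est_b) / se_diff > Z_90


# Hypothetical poverty rates (percent) with made-up margins of error.
print(differ_significantly(12.0, 1.5, 13.0, 1.5))
print(differ_significantly(10.0, 0.5, 14.0, 0.5))
```

In the first made-up comparison the one-point gap is smaller than the combined uncertainty, so the two rates cannot be distinguished; in the second, the gap is large relative to the margins of error.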

Ultimately, this dataset defines poverty primarily through income thresholds and statistical estimates, reducing complex economic experiences to categorical classifications of “above” or “below” the poverty line. When interpreted without additional context—such as cost-of-living adjustments, wealth data, or indicators of material hardship—the dataset risks presenting poverty as a fixed numerical boundary rather than a multifaceted social condition. While the dataset provides valuable insight into demographic patterns of income poverty in the United States, it must be used alongside other measures of economic well-being to fully understand the distribution and lived experience of poverty.