
April 2026 Updates

Mitch Phillipson May 01, 2026


The WiNDC Household build is complete and will generate data for all years from 2011 onward. I’ve also been comparing the Julia-generated data to the GAMS-generated data. So far, the two agree to within five percent on 99.9% of the data points. I’m investigating the remaining 0.1%, which differ by more than five percent, to understand why. Overall, I’m pleased with the progress and the level of agreement between the two implementations.
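As a rough illustration of this kind of check, the sketch below flags data points whose relative difference exceeds five percent. The values are made up, and measuring the difference relative to the GAMS value is an assumption; the actual comparison may define the difference another way.

```julia
# Hypothetical values; the real comparison runs over the full Julia and GAMS outputs.
julia_values = [100.0, 250.0, 80.0, 40.0]
gams_values  = [101.0, 249.0, 90.0, 40.0]

# Relative difference, measured against the GAMS value (an assumption).
# A GAMS value of zero would need separate handling.
relative_difference(j, g) = abs(j - g) / abs(g)

tolerance = 0.05
within = relative_difference.(julia_values, gams_values) .<= tolerance

share_within = count(within) / length(within)  # fraction of points that agree
flagged = findall(.!within)                    # indices of points to investigate
```

In this toy data only the third point falls outside the tolerance, so it is the one that would be investigated.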

Overview of YAML Files

To make the data easier to generate when new data is released, I’ve put all the configuration into a YAML file. YAML stands for “YAML Ain’t Markup Language” and is a human-readable data serialization format.

At a high level, YAML is designed to describe structured data in a way that is easy for people to read and edit. It is commonly used for configuration files because it is much less cluttered than formats such as XML, while still being expressive enough to represent nested objects, lists, strings, numbers, and boolean values.

The most important idea in YAML is that structure is defined by indentation. A set of key: value pairs represents a mapping, which is similar to a dictionary in Python. A list is written with leading dashes, and nested content is created by indenting underneath a key or list item. Because whitespace carries meaning, consistent indentation is essential.

YAML supports a few basic kinds of values:

  • scalars, such as strings, integers, floating point values, and booleans
  • mappings, which associate keys with values
  • sequences, which are ordered lists of values

It also supports comments using the # symbol, which makes it useful for documenting configuration choices directly inside the file. For longer text, YAML can represent multi-line strings in a readable way, and it can be used to organize repeated or hierarchical settings without much visual noise.

For example, a small YAML file might look like this:

dataset:
    year: 2024
    region: USA
    include_households: true

sectors:
    - agriculture
    - manufacturing
    - services

In this example, dataset is a mapping with several fields, while sectors is a sequence. That combination of named settings and lists is typical of how YAML is used in practice.
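To see how these structures look once loaded, here is a minimal sketch that parses the same example in Julia. It assumes the third-party YAML.jl package is installed; whether WiNDCHousehold reads its configuration the same way is not something this post states.

```julia
using YAML  # third-party package, assumed installed: import Pkg; Pkg.add("YAML")

text = """
dataset:
    year: 2024
    region: USA
    include_households: true

sectors:
    - agriculture
    - manufacturing
    - services
"""

# Mappings load as dictionaries and sequences load as vectors.
config = YAML.load(text)

year = config["dataset"]["year"]                              # integer scalar
include_households = config["dataset"]["include_households"]  # boolean scalar
sectors = config["sectors"]                                   # ordered list
```

Nested keys are reached by indexing one level at a time, which mirrors the indentation in the file.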

In short, YAML is best thought of as a clean, indentation-based language for describing data. It is not a programming language, but it is a convenient way to store parameters, options, and metadata in a form that both people and software can work with easily.

Updating the Household YAML File

To build the data you need three files:

- here
- here
- here - Contains `capital_tax_rates`, `labor_tax_rates`, `income_elasticities` and `windc_pce_share`

Extract the zip files and note where the extracted files are located. The household.yaml file will need to be updated in a few places to point at each of these files.

The first section of the household.yaml file is the metadata section:

metadata:
  title: Household Data Configuration
  description: Configuration file for household data sources
  census_api_key: census_api_key_here
  bea_api_key: bea_api_key_here
  save_data: true
  maps:
    state_map:
    windc_naics_map:
  years:
    - 2024
    - 2023
    - 2022
    - 2021
    - 2020
    - 2019
    - 2018
    - 2017
    - 2014
    - 2011

You will need a Census API key and a BEA API key to access the data. You can obtain these keys from the respective websites. You can adjust which years of data you want to generate by modifying the years list.
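Before running a build, it can help to confirm that the placeholder keys were actually replaced. The sketch below is one way to do that with the third-party YAML.jl package. The helper `missing_api_keys` is hypothetical and not part of WiNDCHousehold, and the inline text stands in for your real household.yaml.

```julia
using YAML  # third-party package, assumed installed

# Hypothetical helper: list the API key fields in the metadata section
# that are absent, empty, or still hold the placeholder text.
function missing_api_keys(config)
    meta = config["metadata"]
    placeholders = Dict(
        "census_api_key" => "census_api_key_here",
        "bea_api_key" => "bea_api_key_here",
    )
    return sort([
        key for (key, placeholder) in placeholders
        if get(meta, key, nothing) in (nothing, "", placeholder)
    ])
end

# Inline stand-in for YAML.load_file("path/to/household.yaml").
# "abc123" is a made-up key value.
config = YAML.load("""
metadata:
  census_api_key: census_api_key_here
  bea_api_key: abc123
""")

println(missing_api_keys(config))
```

An empty result means both keys have been filled in; it does not confirm that the keys are valid.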

The maps section contains paths to the mapping files. If these are empty, which they are by default, the provided mapping files from the GitHub repository are used. If you have custom mapping files, you can specify their paths here.

The next section of the YAML file is data. Be sure to update the paths to the state table and the capital_tax_rates, labor_tax_rates, income_elasticities and windc_pce_share files. Any field marked api: true does not need an updated path, since that data is pulled directly from the API using the provided keys.

The final section details some magic numbers from specific government sources. They are included here so that they can be easily updated when new data is released.

Building the Data

To build the data, first make sure you have a Julia environment set up and add the WiNDCHousehold package. I also recommend adding DataFrames and MPSGE for working with the data.
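A minimal sketch of that setup using Julia's built-in package manager follows. The project directory name is hypothetical, and installing WiNDCHousehold by name assumes it is available from the registry your Julia installation uses; if it is not, install it from its repository instead.

```julia
using Pkg

# Hypothetical project directory; any name works.
Pkg.activate("windc-household")

# Recommended companions for working with the data.
Pkg.add(["DataFrames", "MPSGE"])

# Assumption: the package can be installed by name. If this fails,
# use Pkg.add(url = "...") with the address of the package repository.
Pkg.add("WiNDCHousehold")
```

Keeping the packages in their own project means the build does not depend on whatever is installed in your default environment.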

Finally, the code to build the data and run the model is as follows. Be sure to update the hh_path variable to point at your updated household.yaml file.

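using WiNDCHousehold
using WiNDCHousehold.WiNDCContainer
using DataFrames
using MPSGE

# Path to your updated configuration file
hh_path = raw"update/me/to/point/at/household.yaml"

# Read the configuration and return the state table and the raw household data
state_table, HH_Raw_Data = WiNDCHousehold.household_raw_data(hh_path)

# Combine the state table and the raw data into the household table
HH = WiNDCHousehold.build_household_table(state_table, HH_Raw_Data)

# Build the MPSGE model. With an iteration limit of 0 the solver does not
# iterate, so this only checks the model at its starting values.
M = household_model(HH);
MPSGE.solve!(M, cumulative_iteration_limit = 0)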