Dataset Matching Toolkit

Last Updated

The Dataset Matching Toolkit provides a blueprint for matching datasets, from drafting a proposal through completing the match. It covers considerations for data use agreements, instructions for data preparation in SAS, R, and Excel, options for matching software, sample matching code for SAS and R that includes several approaches to ‘fuzzy’ matching, and datasets to consider. The toolkit is intended to be useful at every step of the process, from brainstorming to troubleshooting code. It also offers options for those who are not comfortable with coding, including instructions for preparing data in Excel and software options for matching outside of SAS or R.

If you have suggestions for updating the toolkit or questions about matching that it does not answer, please send them to

Download the Toolkit