Record matching for census purposes in the Netherlands
Eric Schulte Nordholt
Senior researcher and project leader of the Census
Statistics Netherlands, Division Social and Spatial Statistics, Department Support and Development, Section Research and Development
ESLE@CBS.NL

Contents
History of the Dutch Census
Data sources
Micro linkage
Micro integration
Social Statistical Database
Estimation aspects
Statistical confidentiality
Conclusions

History of the Dutch Census
TRADITIONAL CENSUS
  Ministry of Home Affairs: 1829, 1839, 1849, 1859, 1869, 1879 and 1889
  Statistics Netherlands: 1899, 1909, 1920, 1930, 1947, 1960 and 1971
  Unwillingness (nonresponse) and reduction of costs → no more traditional censuses
ALTERNATIVE: VIRTUAL CENSUS
  1981 and 1991: Population Register and surveys development
  1990s: more registers → 2001: integrated set of registers and surveys, SSD

Data sources
Registers:
  Population Register (PR), 16 million records: demographic variables (sex, age, household status, etc.)
  Jobs file: employees, 6.5 million records, and self-employed persons, 790 thousand records: dates of job, branch of economic activity
  Fiscal administration (FIBASE): jobs, 7.2 million records, and pensions and life insurance benefits, 2.7 million records
  Social security administrations, 2 million records: auxiliary information for the integration process
Surveys:
  Survey on Employment and Earnings (SEE), 3 million records: working hours, place of work
  Labour Force Survey (LFS), 2 years, 230 thousand records: education, occupation, (economic) activity

Matching procedure
Matching of registers and datasets to a self-developed Central Matching File
Records are identified by a surrogate identifier (RIN)
One unique table RIN-Social Security Number
Minimal set of identifying variables
Every step in the process is a deterministic match

Statistics Netherlands’ backbone of persons

Matching procedure
Step 1: Social security number matching, with a check on date of birth and sex
  A valid match when at most one of the variables year, month, day of birth and sex differs
else
Step 2: Matching using other variables such as postal code, house number, date of birth and sex
  All keys must match
else
Step 3: Match on social security number with no check on other variables
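
The three-step cascade above can be sketched in Python as follows. This is a minimal illustration, not the production procedure of Statistics Netherlands: the `Person` fields, the function names and the requirement that step 2 finds exactly one backbone record are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Person:
    # Hypothetical minimal set of identifying variables.
    ssn: Optional[str]
    birth_year: int
    birth_month: int
    birth_day: int
    sex: str
    postal_code: str
    house_number: str


def count_differences(a: Person, b: Person) -> int:
    """Count how many of year, month, day of birth and sex differ."""
    pairs = [
        (a.birth_year, b.birth_year),
        (a.birth_month, b.birth_month),
        (a.birth_day, b.birth_day),
        (a.sex, b.sex),
    ]
    return sum(x != y for x, y in pairs)


def address_key(p: Person) -> tuple:
    """Key used in step 2: every component must be equal."""
    return (p.postal_code, p.house_number,
            p.birth_year, p.birth_month, p.birth_day, p.sex)


def match_record(record: Person,
                 backbone: List[Person]) -> Tuple[Optional[Person], Optional[str]]:
    """Return (matched backbone person, name of the step), or (None, None)."""
    by_ssn = {p.ssn: p for p in backbone if p.ssn}
    candidate = by_ssn.get(record.ssn) if record.ssn else None

    # Step 1: SSN match, valid when at most one check variable differs.
    if candidate is not None and count_differences(record, candidate) <= 1:
        return candidate, "ssn_checked"

    # Step 2: all other keys must match (assumption: exactly one hit).
    hits = [p for p in backbone if address_key(p) == address_key(record)]
    if len(hits) == 1:
        return hits[0], "address_keys"

    # Step 3: fall back to the SSN without any check on other variables.
    if candidate is not None:
        return candidate, "ssn_unchecked"

    return None, None
```

Each step is deterministic: a record either satisfies the rule of a step or falls through to the next one, and the returned step name records how the match was made.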

[Figure: data flow in the production environment of Statistics Netherlands (SN). Registers, surveys and a selection from the Municipal Population Register enter with a direct identifier. A de-identification table replaces the direct identifier with the surrogate identifier (RIN). The resulting de-identified micro data (for example RIN with job income, jobs, education, social security; RIN with year and month of birth, sex, municipality, marital status) pass through preparation and documentation to the Social Statistical Database and Micro data Services.]

Micro integration (1)
The aim of micro integration is:
  To check the linked data and modify incorrect records,
  In such a way that the results that are to be published are of higher quality than the original sources

Micro integration (2)
To fulfil this demand an integrated process is executed, consisting of:
  data editing,
  derivation of statistical variables, and
  imputation

Micro integration (3)
Constraints and limitations:
  Only variables that are to be published are micro integrated
  Identity rules are essential, e.g. the same variable in two sources or a relationship between two or more variables in one or more sources
  No mass imputation
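
An identity rule of the first kind (the same variable in two sources) might be checked as in the sketch below. The data layout, the function name and the conflict resolution by a fixed source priority are assumptions made for illustration; the slides do not state how conflicts are actually resolved.

```python
from typing import Dict, List, Tuple


def apply_identity_rule(linked: Dict[int, Dict[str, dict]],
                        variable: str,
                        priority: List[str]) -> Tuple[Dict[int, object], List[int]]:
    """Check that `variable` agrees across sources for each RIN.

    `linked` maps RIN -> source name -> record (hypothetical layout).
    Returns the integrated value per RIN and the RINs where sources disagree.
    Conflicts are resolved by taking the first source in `priority`
    that has a value (an assumption, not the documented method).
    """
    integrated = {}
    conflicts = []
    for rin, by_source in linked.items():
        # Collect the non-missing values of the variable per source.
        values = {
            source: by_source[source][variable]
            for source in priority
            if source in by_source and by_source[source].get(variable) is not None
        }
        if len(set(values.values())) > 1:
            conflicts.append(rin)
        integrated[rin] = next((values[s] for s in priority if s in values), None)
    return integrated, conflicts
```

The list of conflicting RINs is returned separately so that corrections stay traceable and no value is filled in for records that have no source at all, in line with the "no mass imputation" constraint.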

Social Statistical Database (SSD)
Set of integrated microdata files with coherent and detailed demographic and socio-economic data on persons, households, jobs and benefits
No remaining internal conflicting information
SSD set:
  Population Register (backbone)
  Integrated jobs file
  Integrated file of (social and other) benefits
  Surveys, e.g. LFS
Combining element: RIN-person

Core and satellites (1)
[Diagram: the SSD core surrounded by satellites]

Core and satellites (2)
Core:
  contains only basic register data
  contains the most important demographic and socio-economic information
  contains only information that is used in at least two satellites

Core and satellites (3)
Satellites are produced in two steps:
  Copying and derivation of the relevant information from the core SSD
  Adding the special information on a specific subject from registers and surveys
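
The two production steps can be sketched as below. The dictionary layout keyed by RIN, the function name `build_satellite` and the example variable names are hypothetical and chosen only to make the steps concrete.

```python
from typing import Callable, Dict, Iterable, List, Optional


def build_satellite(core: Dict[int, dict],
                    core_fields: List[str],
                    subject_sources: Iterable[Dict[int, dict]],
                    derive: Optional[Callable[[dict], dict]] = None) -> Dict[int, dict]:
    """Build a satellite from the core plus subject-specific sources.

    All inputs map RIN -> record (hypothetical layout).
    """
    satellite = {}

    # Step 1: copy the relevant core variables and derive new ones.
    for rin, record in core.items():
        row = {field: record[field] for field in core_fields}
        if derive is not None:
            row.update(derive(record))
        satellite[rin] = row

    # Step 2: add subject-specific information, linked on RIN.
    # Records whose RIN is not in the core are ignored.
    for source in subject_sources:
        for rin, extra in source.items():
            if rin in satellite:
                satellite[rin].update(extra)

    return satellite
```

A possible call, with made-up data: `build_satellite(core, ["sex", "birth_year"], [lfs], derive=lambda r: {"age_2001": 2001 - r["birth_year"]})`, where `core` and `lfs` are dictionaries keyed by RIN.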

Conclusions SSD
The SSD reduces the administrative burden
The SSD increases:
  The efficiency of statistics production
  The accuracy of statistical outputs
  The relevance of social statistics
  The possibilities for social policy research

Estimation aspects
Surveys are samples from the population
If surveys are enriched with register information, estimates of the register part of the enriched survey will lead to inconsistencies with the counts from the entire register
Statistics Netherlands developed the method of consistent and repeated weighting to solve these inconsistencies
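
The inconsistency and its repair can be illustrated with a single calibration step. The sketch below is a simplified post-stratification, not the full repeated weighting method: it rescales survey weights so that the weighted totals of one register variable equal the register counts. The function name and data layout are assumptions.

```python
from collections import defaultdict
from typing import Dict, List


def calibrate_weights(sample: List[dict],
                      weights: List[float],
                      register_counts: Dict[str, float],
                      key: str) -> List[float]:
    """Rescale weights so weighted totals per category of `key`
    match the counts from the entire register."""
    # Weighted survey total per category of the register variable.
    weighted_totals = defaultdict(float)
    for record, weight in zip(sample, weights):
        weighted_totals[record[key]] += weight

    # Multiply each weight by register count / weighted survey total.
    return [
        weight * register_counts[record[key]] / weighted_totals[record[key]]
        for record, weight in zip(sample, weights)
    ]
```

After this step, any table estimated from the survey that includes the calibrated register variable reproduces the register margin for that variable. Roughly speaking, repeated weighting applies such a reweighting again for each table to be estimated, using the margins already fixed by the register and by earlier tables.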

Statistical confidentiality
[Diagram: administrative sources (IDs, variables, characteristics) and household surveys (IDs, variables) are linked to the PERSONS BACKBONE through identifiers (PINs, sex, date of birth, address)]
PERSONS BACKBONE: full coverage of all persons as from 1995
IDs in sources are replaced by random Record Identification Numbers (RINs)
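
Replacing source identifiers by random RINs might look like the sketch below. The field names, the nine-digit RIN range and the function names are assumptions for illustration; the slides only state that the RINs are random and that one table links RIN to the social security number.

```python
import secrets
from typing import Dict, Iterable, List


def build_rin_table(ssns: Iterable[str]) -> Dict[str, int]:
    """Assign every distinct SSN a random, unique RIN."""
    table: Dict[str, int] = {}
    used = set()
    for ssn in ssns:
        if ssn in table:
            continue
        rin = secrets.randbelow(10**9)  # assumed RIN range
        while rin in used:              # redraw on a collision
            rin = secrets.randbelow(10**9)
        used.add(rin)
        table[ssn] = rin
    return table


def de_identify(records: List[dict],
                table: Dict[str, int],
                direct_identifiers: Iterable[str] = ("ssn", "address")) -> List[dict]:
    """Drop direct identifiers and attach the RIN instead."""
    drop = set(direct_identifiers)
    output = []
    for record in records:
        clean = {k: v for k, v in record.items() if k not in drop}
        clean["rin"] = table[record["ssn"]]
        output.append(clean)
    return output
```

Because the same table is applied to every source, a person receives the same RIN in each de-identified file, so the files remain linkable without exposing the original identifier.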

Conclusions
Matching is relatively cheap
Matching is relatively quick (short production time)
Micro integration remains important
The SSD has found its place in the organisation
The repeated weighting method guarantees consistent estimates
Statistical confidentiality aspects have become important

Thank you for your attention!
Time for questions and discussion

