Synthetic Data Engine

Fill in Missing Data with AI

Intelligently fill gaps in your data while preserving the relationships between variables.

Preserves Relationships: Filled values respect the patterns in your existing data
Handles Any Pattern: Random gaps, systematic gaps, or a mix of both
Quality Report: See exactly how well each variable was filled
Easy Export: Download your complete dataset as CSV or Excel

How it works:

Upload your data file (with missing values)
Review which variables have gaps and how many
Choose which gaps to fill and run the AI
Check the quality report and download your complete data
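
The steps above can be sketched in a few lines of Python. This is only an illustration of the idea of filling gaps while respecting a relationship between variables: it uses a simple group-mean fill as a stand-in and is not the model this tool runs. The function name `fill_gaps` and the example columns `region` and `spend` are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def fill_gaps(rows, target, group_by):
    """Fill missing `target` values (None) with the mean of rows in the same group.

    Stand-in technique (group-mean imputation), assumed for illustration only.
    """
    observed = defaultdict(list)
    for row in rows:
        if row[target] is not None:
            observed[row[group_by]].append(row[target])
    # Fallback for groups with no observed values at all.
    overall = mean(v for vals in observed.values() for v in vals)

    filled = []
    for row in rows:
        new_row = dict(row)  # copy so the uploaded data stays untouched
        if new_row[target] is None:
            group_vals = observed.get(row[group_by])
            new_row[target] = mean(group_vals) if group_vals else overall
        filled.append(new_row)
    return filled

# Hypothetical data: "spend" has gaps, "region" is fully observed.
rows = [
    {"region": "north", "spend": 10.0},
    {"region": "north", "spend": 14.0},
    {"region": "north", "spend": None},
    {"region": "south", "spend": 30.0},
    {"region": "south", "spend": None},
]
completed = fill_gaps(rows, target="spend", group_by="region")
```

Filling within groups keeps the link between `region` and `spend` that a single overall mean would flatten, which is the property the "Preserves Relationships" point describes.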

Total Rows

Complete Rows

Rows with Missing Values

Missing Values by Variable

Missingness Patterns

Warning: Variables with more than 30% of their values missing may be filled with lower quality.
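
As a rough sketch of how the summary figures above could be computed, assuming missing values are represented as `None` and the data is a list of dictionaries. The function name `summarize_missing` and the example columns are hypothetical; the 30% threshold comes from the warning above.

```python
from collections import Counter

def summarize_missing(rows, threshold=0.30):
    """Compute row counts, per-variable gaps, patterns, and variables over the threshold."""
    variables = list(rows[0].keys())
    total = len(rows)
    missing_by_var = {v: sum(1 for r in rows if r[v] is None) for v in variables}
    complete = sum(1 for r in rows if all(r[v] is not None for v in variables))
    # A pattern is the set of variables missing together in one row.
    patterns = Counter(tuple(v for v in variables if r[v] is None) for r in rows)
    flagged = [v for v, n in missing_by_var.items() if n / total > threshold]
    return {
        "total_rows": total,
        "complete_rows": complete,
        "rows_with_missing": total - complete,
        "missing_by_variable": missing_by_var,
        "patterns": dict(patterns),
        "over_threshold": flagged,
    }

# Hypothetical example data.
rows = [
    {"age": 34, "income": 52000, "score": 7},
    {"age": None, "income": 48000, "score": 6},
    {"age": 29, "income": None, "score": None},
    {"age": 41, "income": None, "score": 8},
    {"age": 50, "income": 61000, "score": 9},
]
summary = summarize_missing(rows)
```

Here `income` is missing in 2 of 5 rows (40%), so it is the only variable that would trigger the warning.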

Variables to Impute

Values to Fill

Training Rows

Configuration Summary

Data Preview

First 10 rows:

Filling in Your Data

Process Log

Values Imputed

PRISM Score

Grade

Samples

PRISM Quality Framework

View per-variable details

Imputed Data Preview

First 10 rows:

Export Options

Download CSV

Download Excel

Export for Paper

Download Standardised Data

Download PPT Report

What's next? Your complete dataset is ready for deeper analysis. Try Distillation to combine related questions into summary scores, Catalyst to discover what drives your key outcomes, or Segmentation to find distinct groups in your data.

🎯 Welcome to Data Projection

Turn a small survey into population-level estimates using AI.

20x Expansion: Turn 1,000 respondents into 20,000 representative records
Fast Generation: Results ready in minutes
Quality Scored: Every output gets a quality grade so you know what to trust
What-If Scenarios: Test how changing your audience mix affects results

Getting Started:

Upload your survey data (500-2,000 respondents works best)
Upload your population data (the demographics you want to represent)
Link shared variables (demographics like Age, Gender)
Select which survey questions to generate for the full population
Generate data and download results
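
To make the linking step concrete, here is a minimal sketch of one simple way to carry survey answers over to population records through shared variables. It uses donor sampling (each population record borrows answers from a random survey respondent with the same demographics) as a stand-in; the tool's actual generative model is not described here and may work quite differently. The function `project` and all field names are hypothetical.

```python
import random
from collections import defaultdict

def project(survey, population, link_vars, target_vars, seed=0):
    """Attach survey answers to population records that share the linking variables.

    Stand-in technique (donor sampling within demographic cells), for illustration only.
    """
    rng = random.Random(seed)
    donors = defaultdict(list)
    for resp in survey:
        donors[tuple(resp[v] for v in link_vars)].append(resp)

    projected = []
    for person in population:
        pool = donors.get(tuple(person[v] for v in link_vars))
        if not pool:
            continue  # no respondent with matching demographics; skipped in this sketch
        donor = rng.choice(pool)
        record = dict(person)
        record.update({t: donor[t] for t in target_vars})
        projected.append(record)
    return projected

# Hypothetical survey (small) and population (larger) data.
survey = [
    {"age": "18-34", "gender": "F", "satisfied": "yes"},
    {"age": "18-34", "gender": "F", "satisfied": "no"},
    {"age": "35-54", "gender": "M", "satisfied": "yes"},
]
population = [
    {"age": "18-34", "gender": "F"},
    {"age": "18-34", "gender": "F"},
    {"age": "35-54", "gender": "M"},
    {"age": "35-54", "gender": "M"},
]
records = project(survey, population, ["age", "gender"], ["satisfied"])
```

The sketch only works when the values of the linking variables match exactly between the two files, which is why the linking and recoding steps come before generation.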

📐 Sample Size Check

🏷️ Variable Types

Specify the type for each target variable:

💡 Recommendations

🔧 Value Recoding Required

The values in your survey don't exactly match the values in your population data (the universe). Please map each survey value below to its population equivalent so the model can generate accurate synthetic data.
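
A minimal sketch of what this recoding amounts to, assuming the survey and the population (universe) data each hold a list of values for the same variable. The helper names `find_unmatched` and `recode`, and the example gender codes, are hypothetical.

```python
def find_unmatched(survey_values, universe_values):
    """Return survey values that have no exact match in the universe."""
    return sorted(set(survey_values) - set(universe_values))

def recode(values, mapping):
    """Apply the mapping; values without an entry are left unchanged."""
    return [mapping.get(v, v) for v in values]

# Hypothetical example: the survey uses short codes, the universe uses full labels.
survey_gender = ["M", "F", "F", "Male"]
universe_gender = ["Male", "Female"]

unmatched = find_unmatched(survey_gender, universe_gender)
mapping = {"M": "Male", "F": "Female"}
recoded = recode(survey_gender, mapping)
```

After recoding, `find_unmatched(recoded, universe_gender)` returns an empty list, meaning every survey value now lines up with a value in the universe.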

🔗 Map Values

👁️ Preview

Sample of how values will be recoded:

Expected Quality

Expected Time

Output Samples

📋 Generation Summary

🔗 Variable Mappings

Generating Your Data

Progress Log

PRISM Score

Grade

Projected Records

📊 PRISM Quality Framework

📐 Utility & Sample Power

👁️ Data Preview

First 10 rows of your generated data:

What's next? Your expanded dataset is ready for deeper analysis. Try Distillation to create summary scores from related questions, Catalyst to discover what drives your outcomes, or Segmentation to find distinct groups in your expanded data.

🔒 What-If Scenarios Locked

Complete the data generation step first to unlock what-if scenarios.

🔮 What-If Scenarios

Explore "what-if" questions by modifying your audience composition.

💡 How It Works

Your model learned relationships between:

LINKING VARIABLES (demographics you adjust)

Age, Gender, etc.

→

SURVEY VARIABLES (predicted outcomes)

Attitudes, behaviors, etc.

Scenario simulation keeps these learned relationships but changes the INPUT mix. This answers questions like:

"What if our sample was younger?"
"What if we had more female respondents?"
"What if urban residents were overrepresented?"

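The idea of holding the learned relationships fixed while changing the input mix can be sketched as a reweighting. In this simplified illustration, the "learned relationship" is reduced to one predicted outcome per age group, and the scenario changes only the share of each group. All numbers and names (`scenario_estimate`, the group labels, the shares) are made up for the example and are not output from the tool.

```python
def scenario_estimate(group_means, mix):
    """Overall outcome when group-level predictions are weighted by the audience mix."""
    total = sum(mix.values())  # normalise in case shares do not sum to 1
    return sum(group_means[g] * share / total for g, share in mix.items())

# Hypothetical learned relationship: share agreeing with a statement, by age group.
group_means = {"18-34": 0.60, "35-54": 0.40, "55+": 0.20}

# Hypothetical audience compositions: current mix vs. a younger mix.
baseline_mix = {"18-34": 0.25, "35-54": 0.50, "55+": 0.25}
younger_mix = {"18-34": 0.50, "35-54": 0.30, "55+": 0.20}

before = scenario_estimate(group_means, baseline_mix)
after = scenario_estimate(group_means, younger_mix)
```

With these made-up numbers the overall estimate moves from about 0.40 to about 0.46, purely because the mix shifted toward the younger group; the group-level predictions themselves did not change.
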
📊 Your Variables

LINKING (adjustable):

SURVEY (predicted):

🎯 Current Scenario Configuration

📊 What Changed

📈 Before vs After

Download Standardised Data