Who should read Finding and Merging Duplicates?

Sales, revenue, and customer success teams should read Finding and Merging Duplicates when they need a clear, implementation-focused workflow in Lumenbase.

How do I apply Finding and Merging Duplicates in Lumenbase?

Use the step-by-step guidance in Finding and Merging Duplicates and then apply the same flow directly in your workspace with your live records and tasks.

Finding and Merging Duplicates

Data Quality

Keep your CRM clean with duplicate detection, smart merge, and data quality tools.

By Sebastian Streiffert | Published Jan 10, 2026 | Updated May 29, 2026 | 7 min read

The Hidden Cost of Duplicate Data

Duplicate contacts and companies don't just clutter your CRM. They actively undermine your sales process. Reps waste time chasing leads that colleagues have already contacted. Reports show inflated pipeline numbers. Customers contacted through your outreach platform receive duplicate messages and lose confidence in your organization.

The problem compounds over time. Every data import, web form submission, and manual entry creates another opportunity for duplicates to slip in. Without regular cleanup, even well-maintained CRMs accumulate redundant records. See the Data Import guide for best practices on preventing duplicates during import.

What Data Quality includes

Core tools

Duplicate Detection

Automated scanning identifies potential duplicates across contacts, companies, and deals.

Smart Merge

Combine records while preserving the most complete and accurate data from each.

Prevention Rules

Block duplicate creation at the source with configurable matching rules.

Data Quality Score

Track overall database health and identify records needing attention.

How Duplicate Detection Works

Lumenbase uses fuzzy matching algorithms to identify potential duplicates. Unlike simple exact-match searches, fuzzy matching catches variations like "Robert Smith" and "Bob Smith" or "Acme Corp" and "Acme Corporation."
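
To make the idea concrete, here is a minimal Python sketch of fuzzy matching. It is an illustration only, not Lumenbase's actual algorithm: the nickname table, the suffix list, and the function names are all assumptions, and the similarity measure is the standard library's difflib.SequenceMatcher. Note that plain string similarity would not connect "Bob" to "Robert", so the sketch normalizes nicknames first.

```python
from difflib import SequenceMatcher

# Hypothetical lookup tables; the real matching rules are not documented here.
NICKNAMES = {"bob": "robert", "bill": "william", "liz": "elizabeth"}
COMPANY_SUFFIXES = {"corp", "corporation", "inc", "incorporated", "llc", "ltd"}


def normalize_person(name: str) -> str:
    """Lowercase the name and expand known nicknames token by token."""
    return " ".join(NICKNAMES.get(t, t) for t in name.lower().split())


def normalize_company(name: str) -> str:
    """Lowercase, strip basic punctuation, and drop common legal suffixes."""
    tokens = name.lower().replace(".", "").replace(",", "").split()
    return " ".join(t for t in tokens if t not in COMPANY_SUFFIXES)


def similarity(a: str, b: str) -> float:
    """Return a 0.0-1.0 similarity ratio between two strings."""
    return SequenceMatcher(None, a, b).ratio()


person = similarity(normalize_person("Robert Smith"), normalize_person("Bob Smith"))
company = similarity(normalize_company("Acme Corp"), normalize_company("Acme Corporation"))
print(person, company)
```

After normalization both pairs compare as identical strings, while a near miss such as "Jon Smith" and "John Smith" still scores high without being an exact match.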

Matching Criteria

Entity	Primary Match Fields	Secondary Match Fields
Contacts	Email address (strongest)	Name + company, phone number
Companies	Domain/website	Company name, phone number
Deals	Deal name + company	Contact + close date

Email matching is the most reliable signal. Two contacts with identical email addresses are almost certainly the same person. Name matching is trickier - common names like "John Smith" generate many false positives, so the system requires additional matching fields before flagging these as duplicates.
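
The rule described above (an identical email is enough on its own, while a name match needs a supporting field) could be sketched as follows. The dictionary keys and the function name are hypothetical, and exact comparison stands in for whatever fuzzy comparison the product really applies.

```python
def norm(value):
    """Lowercase and trim a field so trivial differences don't block a match."""
    return (value or "").strip().lower()


def is_potential_duplicate(a: dict, b: dict) -> bool:
    """Flag two contact dicts as potential duplicates (illustrative rule only)."""
    # An identical email address is treated as sufficient on its own.
    if norm(a.get("email")) and norm(a.get("email")) == norm(b.get("email")):
        return True
    # A name match alone is not enough; require a supporting field as well.
    if norm(a.get("name")) and norm(a.get("name")) == norm(b.get("name")):
        return any(
            norm(a.get(f)) and norm(a.get(f)) == norm(b.get(f))
            for f in ("company", "phone")
        )
    return False
```

Two "John Smith" records with different emails and no shared company or phone are not flagged under this rule, which is the false-positive guard the paragraph describes.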

Confidence Scoring

Each potential duplicate receives a confidence score from 0 to 100 indicating how likely it is that the records are true duplicates:

Score Range	Confidence	Typical Match
90-100	Very High	Exact email match or exact name + company
70-89	High	Similar name + same domain or phone
50-69	Medium	Similar name + partial field matches
Below 50	Low	Not flagged as potential duplicate
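
If you export scores and want to bucket them the same way, the table above translates directly into a small helper. The function names are assumptions made for this sketch; only the thresholds come from the table.

```python
def confidence_band(score: int) -> str:
    """Map a 0-100 confidence score to the band names used in the table."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= 90:
        return "Very High"
    if score >= 70:
        return "High"
    if score >= 50:
        return "Medium"
    return "Low"


def is_flagged(score: int) -> bool:
    """Scores below 50 are not flagged as potential duplicates."""
    return confidence_band(score) != "Low"
```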

Duplicate detection runs automatically during imports and can be triggered manually from the Data Quality section. Large databases may take several minutes to scan completely.

Finding Duplicates

You can access duplicate detection from two places: the dedicated Data Quality dashboard or directly within any entity list view.

Open Data Quality

Navigate to Company Settings → Data Quality → Duplicates. You'll see a summary of detected duplicates grouped by entity type.

Review Duplicate Groups

Click into any duplicate group to see the matched records side-by-side. Each group shows the matching fields highlighted and the confidence score.

Filter by Confidence

Use the confidence filter to focus on high-confidence matches first. These are almost always true duplicates and can be merged quickly. Lower-confidence matches require more careful review.

Quick Duplicate Check

From any Contacts or Companies list, click the "Find Duplicates" button in the toolbar. This runs a focused scan on just the visible records - useful when you suspect duplicates in a specific segment.

Merging Duplicate Records

Merging combines two or more duplicate records into a single, unified record. The process preserves all associated data - activities, deals, tasks - by re-linking them to the surviving record.

The Merge Process

Select which record becomes the 'primary' (the one that survives)
Review field-by-field which values to keep from each duplicate
Confirm the merge - secondary records are deleted, their data re-linked
All activities, deals, and tasks from deleted records move to the primary
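
Conceptually, the steps above amount to the following. This is a simplified in-memory model with hypothetical names (Record, merge_records, a dict used as the record store), not the product's internal implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    id: str
    fields: dict
    # Linked items grouped by kind, e.g. activities, deals, tasks.
    linked: dict = field(
        default_factory=lambda: {"activities": [], "deals": [], "tasks": []}
    )


def merge_records(store: dict, primary_id: str, secondary_ids: list, chosen: dict) -> Record:
    """Merge secondaries into the primary and delete them from the store."""
    primary = store[primary_id]
    # Apply the field values the reviewer chose to keep.
    primary.fields.update(chosen)
    for sid in secondary_ids:
        secondary = store.pop(sid)  # the secondary record is removed
        # Re-link everything the secondary owned to the surviving record.
        for kind, items in secondary.linked.items():
            primary.linked.setdefault(kind, []).extend(items)
    return primary
```

Because the secondary is removed from the store in the same step that re-links its items, there is nothing left to restore afterwards, which mirrors the permanence of a real merge.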

Field Resolution

When duplicates have different values for the same field, you choose which to keep. The system suggests defaults based on data completeness and recency:

Non-empty values preferred over empty ones
More recently updated values preferred when both exist
Email addresses default to the validated one if available
Manual override available for every field
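
The default rules listed above could be expressed as a short function. The sketch assumes each field value carries an update timestamp and, for emails, a validation flag; the FieldValue type, the function names, and the order in which the rules are checked are illustrative assumptions.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class FieldValue:
    value: Optional[str]
    updated_at: datetime
    validated: bool = False  # only meaningful for email fields


def suggest_default(name: str, a: FieldValue, b: FieldValue) -> Optional[str]:
    """Suggest which of two conflicting values to keep."""
    # Rule 1: a non-empty value wins over an empty one.
    if not a.value or not b.value:
        return a.value or b.value
    # Rule 2: for emails, prefer the validated address if only one is validated.
    if name == "email" and a.validated != b.validated:
        return a.value if a.validated else b.value
    # Rule 3: otherwise prefer the more recently updated value.
    return a.value if a.updated_at >= b.updated_at else b.value


def resolve(name: str, a: FieldValue, b: FieldValue, overrides: Optional[dict] = None) -> Optional[str]:
    """A manual override, when present, always beats the suggested default."""
    if overrides and name in overrides:
        return overrides[name]
    return suggest_default(name, a, b)
```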

Merging is permanent. The secondary record is deleted and cannot be recovered. If you're unsure whether records are truly duplicates, use the "Mark as Not Duplicate" option to exclude them from future scans.

Bulk Merge Operations

When facing hundreds of duplicates - common after a messy data import - merging one at a time isn't practical. Bulk merge lets you resolve multiple duplicate groups at once.

How Bulk Merge Works

Filter duplicates to high-confidence matches (90+) for safety
Select multiple duplicate groups using checkboxes
Click 'Bulk Merge' and choose your resolution strategy
Review the summary showing what will happen to each group
Confirm to process all selected merges

Resolution Strategies

Strategy	Behavior
Keep Oldest	Primary record is the one created first
Keep Newest	Primary record is the most recently created
Keep Most Complete	Primary record has the most filled fields
Keep Most Active	Primary record has the most associated activities
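
As a rough illustration of how each strategy picks the surviving record, consider the sketch below. The Candidate type, the strategy keys, and the tie-breaking behavior (the first record in the group wins a tie) are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Candidate:
    id: str
    created_at: datetime
    fields: dict
    activity_count: int


def filled_fields(c: Candidate) -> int:
    """Count fields that hold a non-empty value."""
    return sum(1 for v in c.fields.values() if v not in (None, ""))


# Each strategy maps a duplicate group to the record that should survive.
STRATEGIES = {
    "keep_oldest": lambda group: min(group, key=lambda c: c.created_at),
    "keep_newest": lambda group: max(group, key=lambda c: c.created_at),
    "keep_most_complete": lambda group: max(group, key=filled_fields),
    "keep_most_active": lambda group: max(group, key=lambda c: c.activity_count),
}


def pick_primary(group: list, strategy: str) -> Candidate:
    """Return the primary record for a group under the named strategy."""
    return STRATEGIES[strategy](group)
```

The same group can yield a different primary under each strategy, which is why reviewing the summary before confirming matters.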

Start with a small batch - say, 10-20 duplicates - to verify the results match your expectations. Once you're confident in the process, scale up to larger batches.
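
If you are preparing a cleanup plan outside the UI, for example from an exported list of duplicate groups, the 90+ filter and small first batch could be sketched like this. The group structure (a dict with a "score" key) and the function names are hypothetical.

```python
def high_confidence(groups: list, threshold: int = 90) -> list:
    """Keep only duplicate groups at or above the confidence threshold."""
    return [g for g in groups if g["score"] >= threshold]


def batches(items: list, size: int = 20):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Example: 45 groups with scores cycling through 60, 85, and 95.
groups = [{"id": i, "score": (60, 85, 95)[i % 3]} for i in range(45)]
safe = high_confidence(groups)
first_batch = next(batches(safe, size=10))
print(len(safe), len(first_batch))
```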

Preventing Future Duplicates

Cleanup is necessary, but prevention is better. Lumenbase offers several tools to stop duplicates before they enter your database.

Import duplicate handling

During CSV or Excel imports, the system automatically flags potential duplicates. You can choose to:

Skip: Don't import records that match existing entries
Update: Merge imported data into existing records
Create Anyway: Import as new records (creates duplicates intentionally)
Review: Pause import to manually decide on each match
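
The four options behave roughly as in the sketch below. It matches on email only and keeps records in a plain list, both simplifications for illustration; the enum and function names are assumptions rather than part of any Lumenbase API.

```python
from enum import Enum


class DuplicateAction(Enum):
    SKIP = "skip"
    UPDATE = "update"
    CREATE_ANYWAY = "create_anyway"
    REVIEW = "review"


def find_match(records: list, email: str):
    """Return the first existing record with the same normalized email, if any."""
    key = email.strip().lower()
    return next((r for r in records if r["email"].strip().lower() == key), None)


def import_rows(rows: list, records: list, action: DuplicateAction) -> list:
    """Apply one duplicate-handling action to every row; return rows held for review."""
    review_queue = []
    for row in rows:
        match = find_match(records, row["email"])
        if match is None or action is DuplicateAction.CREATE_ANYWAY:
            records.append(dict(row))  # new record (possibly a deliberate duplicate)
        elif action is DuplicateAction.UPDATE:
            # Merge non-empty imported values into the existing record.
            match.update({k: v for k, v in row.items() if v})
        elif action is DuplicateAction.REVIEW:
            review_queue.append((row, match))  # decide manually later
        # SKIP: leave the existing record untouched and drop the row.
    return review_queue
```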

Real-time duplicate warnings

When manually creating contacts or companies, the system checks for matches in real-time. If a potential duplicate exists, you'll see a warning with the option to view the existing record instead of creating a new one.

Web form deduplication

Forms connected via the API can be configured to update existing records rather than creating duplicates. This is particularly useful for newsletter signups and event registrations where repeat submissions are common.
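
The update-instead-of-create behavior is often called an upsert. The sketch below shows the idea with an in-memory dictionary keyed by normalized email; it does not call any real Lumenbase endpoint, and the function name and return values are assumptions.

```python
def upsert_submission(contacts: dict, submission: dict) -> str:
    """Update the contact with this email if it exists, otherwise create it."""
    key = submission["email"].strip().lower()
    if key in contacts:
        # Repeat submission: refresh non-empty fields, keep the stored email.
        contacts[key].update(
            {k: v for k, v in submission.items() if v and k != "email"}
        )
        return "updated"
    contacts[key] = dict(submission, email=key)
    return "created"


contacts = {}
first = upsert_submission(contacts, {"email": "Ann@x.com", "event": "Webinar A"})
second = upsert_submission(contacts, {"email": "ann@x.com ", "event": "Webinar B"})
print(first, second, len(contacts))
```

A second signup from the same address updates the existing contact rather than adding a new one, so the contact count stays at one.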

Data Quality Best Practices

Ongoing hygiene

Run duplicate scans monthly, or at least quarterly for smaller databases
Process high-confidence duplicates immediately; they're almost always correct
Review medium-confidence matches carefully; name-only matches often aren't true duplicates
Train your team to check for existing records before creating new ones
Use consistent data entry standards (e.g., always 'Inc.' not 'Incorporated')
Clean up before major imports to avoid multiplying existing duplicates
Consider assigning a 'data steward' responsible for ongoing quality

Data Quality Dashboard: Company Settings → Data Quality

Duplicate Detection: Data Quality → Duplicates tab

Import Settings: Import wizard → Duplicate handling step

Merge History: Data Quality → Merge History
