    Finding and Merging Duplicates

    Data Quality

    Keep your CRM clean with duplicate detection, smart merge, and data quality tools.

    By Sebastian Streiffert · Published Jan 10, 2026 · Updated May 29, 2026 · 7 min read

    The Hidden Cost of Duplicate Data

    Duplicate contacts and companies don't just clutter your CRM. They actively undermine your sales process. Reps waste time chasing leads that colleagues have already contacted. Reports show inflated pipeline numbers. Customers receive duplicate messages from your outreach platform and lose confidence in your organization.

    The problem compounds over time. Every data import, web form submission, and manual entry creates another opportunity for duplicates to slip in. Without regular cleanup, even well-maintained CRMs accumulate redundant records. See the Data Import guide for best practices on preventing duplicates during import.

    What Data Quality includes

    Core tools

    Duplicate Detection

    Automated scanning identifies potential duplicates across contacts, companies, and deals.

    Smart Merge

    Combine records while preserving the most complete and accurate data from each.

    Prevention Rules

    Block duplicate creation at the source with configurable matching rules.

    Data Quality Score

    Track overall database health and identify records needing attention.

    How Duplicate Detection Works

    Lumenbase uses fuzzy matching algorithms to identify potential duplicates. Unlike simple exact-match searches, fuzzy matching catches variations like "Robert Smith" and "Bob Smith" or "Acme Corp" and "Acme Corporation."
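
    To make the idea concrete, here is a minimal sketch of fuzzy matching in Python. It is not Lumenbase's actual algorithm, which this article does not describe. The nickname table, the suffix list, and the function names are all assumptions for illustration. Note that plain string similarity alone would not link "Bob" to "Robert", so the sketch normalizes names first.

```python
from difflib import SequenceMatcher

# Hypothetical lookup tables, for illustration only.
NICKNAMES = {"bob": "robert", "bill": "william", "liz": "elizabeth"}
COMPANY_SUFFIXES = {"corp", "corporation", "inc", "incorporated", "llc", "ltd"}


def normalize_person(name: str) -> str:
    """Lowercase a name and expand known nicknames to a canonical form."""
    words = name.lower().replace(".", "").split()
    return " ".join(NICKNAMES.get(w, w) for w in words)


def normalize_company(name: str) -> str:
    """Lowercase a company name and drop common legal suffixes."""
    words = name.lower().replace(".", "").replace(",", "").split()
    return " ".join(w for w in words if w not in COMPANY_SUFFIXES)


def similarity(a: str, b: str) -> float:
    """Return a 0.0-1.0 similarity ratio between two normalized strings."""
    return SequenceMatcher(None, a, b).ratio()


# After normalization, both pairs from the text compare as identical.
person_score = similarity(normalize_person("Robert Smith"), normalize_person("Bob Smith"))
company_score = similarity(normalize_company("Acme Corp"), normalize_company("Acme Corporation"))
```

    Normalizing before comparing keeps the similarity function simple. Typos such as "Jon Smith" versus "John Smith" still score high without needing a table entry.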

    Matching Criteria

    Entity    | Primary Match Fields      | Secondary Match Fields
    Contacts  | Email address (strongest) | Name + company, phone number
    Companies | Domain/website            | Company name, phone number
    Deals     | Deal name + company       | Contact + close date

    Email matching is the most reliable signal. Two contacts with identical email addresses are almost certainly the same person. Name matching is trickier: common names like "John Smith" generate many false positives, so the system requires additional matching fields before flagging such records as duplicates.
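
    The rule for contacts can be sketched as follows. This is an illustrative reading of the paragraph above, not the product's implementation. The `Contact` class and the helper names are hypothetical, and exact comparison stands in for fuzzy comparison to keep the example short.

```python
from dataclasses import dataclass


@dataclass
class Contact:
    name: str
    email: str = ""
    company: str = ""
    phone: str = ""


def digits(phone: str) -> str:
    """Keep only digits so differently formatted phone numbers compare equal."""
    return "".join(ch for ch in phone if ch.isdigit())


def is_candidate_duplicate(a: Contact, b: Contact) -> bool:
    """Flag a pair if emails match, or if names match AND a second field agrees."""
    # Primary signal: an identical email is enough on its own.
    if a.email and a.email.strip().lower() == b.email.strip().lower():
        return True
    # A matching name alone is not enough (think "John Smith").
    if a.name.strip().lower() != b.name.strip().lower():
        return False
    same_company = bool(a.company) and a.company.strip().lower() == b.company.strip().lower()
    same_phone = bool(digits(a.phone)) and digits(a.phone) == digits(b.phone)
    return same_company or same_phone
```

    Two contacts named "John Smith" at different companies with different phone numbers are not flagged, while the same name at the same company is.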

    Confidence Scoring

    Each potential duplicate receives a confidence score from 0 to 100 indicating how likely it is that the records are true duplicates:

    Score Range | Confidence | Typical Match
    90-100      | Very High  | Exact email match or exact name + company
    70-89       | High       | Similar name + same domain or phone
    50-69       | Medium     | Similar name + partial field matches
    Below 50    | Low        | Not flagged as potential duplicate

    Duplicate detection runs automatically during imports and can be triggered manually from the Data Quality section. Large databases may take several minutes to scan completely.
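
    The score bands in the table translate directly into a small lookup. The function names below are hypothetical. Only the thresholds come from the table.

```python
def confidence_band(score: int) -> str:
    """Map a 0-100 duplicate score to the confidence label from the table."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= 90:
        return "Very High"
    if score >= 70:
        return "High"
    if score >= 50:
        return "Medium"
    return "Low"


def is_flagged(score: int) -> bool:
    """Scores below 50 are not surfaced as potential duplicates."""
    return confidence_band(score) != "Low"
```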

    Finding Duplicates

    You can access duplicate detection from two places: the dedicated Data Quality dashboard or directly within any entity list view.

    1. Open Data Quality

    Navigate to Company Settings → Data Quality → Duplicates. You'll see a summary of detected duplicates grouped by entity type.

    2. Review Duplicate Groups

    Click into any duplicate group to see the matched records side-by-side. Each group shows the matching fields highlighted and the confidence score.

    3. Filter by Confidence

    Use the confidence filter to focus on high-confidence matches first. These are almost always true duplicates and can be merged quickly. Lower-confidence matches require more careful review.

    Quick Duplicate Check

    From any Contacts or Companies list, click the "Find Duplicates" button in the toolbar. This runs a focused scan on just the visible records - useful when you suspect duplicates in a specific segment.

    Merging Duplicate Records

    Merging combines two or more duplicate records into a single, unified record. The process preserves all associated data - activities, deals, tasks - by re-linking them to the surviving record.

    The Merge Process

    1. Select which record becomes the 'primary' (the one that survives)
    2. Review field-by-field which values to keep from each duplicate
    3. Confirm the merge - secondary records are deleted, their data re-linked
    4. All activities, deals, and tasks from deleted records move to the primary
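
    The steps above can be sketched as a single function over in-memory data. This is only a model of the behavior described here: the data shapes (`records` keyed by ID, `linked_items` carrying a `record_id`) and the function name are assumptions, not Lumenbase's data model.

```python
def merge_records(
    records: dict[int, dict],
    linked_items: list[dict],
    primary_id: int,
    secondary_ids: list[int],
    chosen_values: dict,
) -> None:
    """Merge secondary records into the primary, modifying the inputs in place."""
    if primary_id in secondary_ids:
        raise ValueError("the primary record cannot also be a secondary record")

    # Step 2: apply the field values picked during review to the survivor.
    records[primary_id].update(chosen_values)

    # Steps 3-4: re-link activities, deals, and tasks, then delete the rest.
    doomed = set(secondary_ids)
    for item in linked_items:
        if item["record_id"] in doomed:
            item["record_id"] = primary_id
    for record_id in doomed:
        del records[record_id]
```

    Re-linking happens before deletion so that no activity, deal, or task is ever left pointing at a record that no longer exists.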

    Field Resolution

    When duplicates have different values for the same field, you choose which to keep. The system suggests defaults based on data completeness and recency:

    • Non-empty values preferred over empty ones
    • More recently updated values preferred when both exist
    • Email addresses default to the validated one if available
    • Manual override available for every field

    Merging is permanent. Secondary records are deleted and cannot be recovered. If you're unsure whether records are truly duplicates, use the "Mark as Not Duplicate" option to exclude them from future scans.
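
    The suggested defaults for field resolution could be modeled like this. The order in which the rules are applied is an assumption (the article lists them without ranking), and `FieldValue` and `suggest_default` are hypothetical names.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class FieldValue:
    value: str
    updated_at: datetime
    validated: bool = False  # only meaningful for email fields


def suggest_default(field_name: str, a: FieldValue, b: FieldValue) -> FieldValue:
    """Suggest which of two conflicting values to keep; the user may override it."""
    # Non-empty values are preferred over empty ones.
    if bool(a.value) != bool(b.value):
        return a if a.value else b
    # For email, a validated address wins over an unvalidated one.
    if field_name == "email" and a.validated != b.validated:
        return a if a.validated else b
    # Otherwise prefer the more recently updated value.
    return a if a.updated_at >= b.updated_at else b
```

    The function only produces a suggestion. Manual override stays outside it, so the reviewer's choice always has the final say.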

    Bulk Merge Operations

    When facing hundreds of duplicates - common after a messy data import - merging one at a time isn't practical. Bulk merge lets you resolve multiple duplicate groups at once.

    How Bulk Merge Works

    1. Filter duplicates to high-confidence matches (90+) for safety
    2. Select multiple duplicate groups using checkboxes
    3. Click 'Bulk Merge' and choose your resolution strategy
    4. Review the summary showing what will happen to each group
    5. Confirm to process all selected merges

    Resolution Strategies

    Strategy           | Behavior
    Keep Oldest        | Primary record is the one created first
    Keep Newest        | Primary record is the most recently created
    Keep Most Complete | Primary record has the most filled fields
    Keep Most Active   | Primary record has the most associated activities

    Start with a small batch - say, 10-20 duplicates - to verify the results match your expectations. Once you're confident in the process, scale up to larger batches.
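
    Each resolution strategy amounts to a different rule for picking the primary record in a group. A minimal sketch follows, assuming each record is a dict with `created`, `fields`, and `activities` keys. These keys and the strategy identifiers are illustrative, not product names.

```python
from datetime import date


def filled_fields(record: dict) -> int:
    """Count the fields that hold a non-empty value."""
    return sum(1 for value in record["fields"].values() if value not in ("", None))


def pick_primary(group: list[dict], strategy: str) -> dict:
    """Return the record that survives a merge under the given strategy."""
    if strategy == "keep_oldest":
        return min(group, key=lambda r: r["created"])
    if strategy == "keep_newest":
        return max(group, key=lambda r: r["created"])
    if strategy == "keep_most_complete":
        return max(group, key=filled_fields)
    if strategy == "keep_most_active":
        return max(group, key=lambda r: r["activities"])
    raise ValueError(f"unknown strategy: {strategy}")


group = [
    {"id": 1, "created": date(2024, 1, 5), "fields": {"email": "a@x.com", "phone": ""}, "activities": 2},
    {"id": 2, "created": date(2025, 3, 1), "fields": {"email": "a@x.com", "phone": "555-0100", "title": "CTO"}, "activities": 0},
    {"id": 3, "created": date(2024, 9, 12), "fields": {"email": "", "phone": ""}, "activities": 7},
]
```

    On this sample group the four strategies pick records 1, 2, 2, and 3 respectively, which shows why a small trial batch is worth running before choosing one for a large cleanup.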

    Preventing Future Duplicates

    Cleanup is necessary, but prevention is better. Lumenbase offers several tools to stop duplicates before they enter your database.

    Import duplicate handling

    During CSV or Excel imports, the system automatically flags potential duplicates. You can choose to:

    • Skip: Don't import records that match existing entries
    • Update: Merge imported data into existing records
    • Create Anyway: Import as new records (creates duplicates intentionally)
    • Review: Pause import to manually decide on each match
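
    The four options behave roughly as in the sketch below. It matches on email only and treats "Update" as overwriting with non-empty imported values. Both choices are simplifying assumptions, and the function and mode names are hypothetical.

```python
from typing import Optional

MODES = {"skip", "update", "create_anyway", "review"}


def find_match(existing: list[dict], row: dict) -> Optional[dict]:
    """Return the existing record with the same email, ignoring case, if any."""
    key = row["email"].strip().lower()
    for record in existing:
        if record["email"].strip().lower() == key:
            return record
    return None


def import_rows(existing: list[dict], rows: list[dict], mode: str) -> list[dict]:
    """Import rows under one duplicate-handling mode; return rows held for review."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode}")
    held = []
    for row in rows:
        match = find_match(existing, row)
        if match is None or mode == "create_anyway":
            existing.append(dict(row))  # new record, possibly a deliberate duplicate
        elif mode == "update":
            match.update({k: v for k, v in row.items() if v})  # keep old value if new is empty
        elif mode == "review":
            held.append(row)  # leave the decision to a person
        # mode == "skip": a matching row is dropped
    return held
```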

    Real-time duplicate warnings

    When you manually create contacts or companies, the system checks for matches in real time. If a potential duplicate exists, you'll see a warning with the option to view the existing record instead of creating a new one.

    Web form deduplication

    Forms connected via the API can be configured to update existing records rather than creating duplicates. This is particularly useful for newsletter signups and event registrations where repeat submissions are common.

    Data Quality Best Practices

    Ongoing hygiene

    • Run duplicate scans monthly, or at least quarterly for smaller databases
    • Process high-confidence duplicates immediately; they're almost always correct
    • Review medium-confidence matches carefully; name-only matches often aren't true duplicates
    • Train your team to check for existing records before creating new ones
    • Use consistent data entry standards (e.g., always 'Inc.' not 'Incorporated')
    • Clean up before major imports to avoid multiplying existing duplicates
    • Consider assigning a 'data steward' responsible for ongoing quality

    Data Quality Dashboard: Company Settings → Data Quality
    Duplicate Detection: Data Quality → Duplicates tab
    Import Settings: Import wizard → Duplicate handling step
    Merge History: Data Quality → Merge History