BasisFile guide
How to Remove Duplicate Rows from a CSV — Without Excel Formulas
Every analyst has been there: you open a CSV that should have unique rows, but somehow there are 50 duplicates. Excel's 'Remove Duplicates' only catches exact matches — and 'John Smith' vs 'john smith' vs 'JOHN SMITH ' aren't exact matches to Excel. Half an hour later you're still scrolling, eyeballing rows.
How BasisFile fixes this in 30 seconds
1. Drop your CSV file
   Any size, any source. Excel files work too. Nothing to install, no formulas to write.
2. AI scans for exact and near duplicates
   Catches case differences, trailing whitespace, punctuation variants, and obvious typos. You see a preview of the rows it plans to remove before anything is touched.
3. Click 'Clean it'
   Get a deduplicated file in 30 seconds. The original is untouched; you download a clean copy.
Before vs after
Five rows that look unique to Excel, but three of them are duplicates.
Before
| Name | Email | Plan |
|---|---|---|
| John Smith | john@acme.com | Pro |
| john smith | john@acme.com | Pro |
| JOHN SMITH | JOHN@ACME.COM | Pro |
| Jane Doe | jane@acme.com | Free |
| Jane Doe | jane@acme.com | Free |
After BasisFile
| Name | Email | Plan |
|---|---|---|
| John Smith | john@acme.com | Pro |
| Jane Doe | jane@acme.com | Free |
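For readers who want to see the idea in code, here is a minimal Python sketch that reproduces the before/after example above. It is not BasisFile's implementation: the `normalise` and `plan_dedupe` helpers, the column names in the sample, and the choice to strip all punctuation are assumptions made for illustration.

```python
import csv
import io
import re
import string

# Translation table that deletes ASCII punctuation (an illustrative choice;
# it can over-merge values such as "a.b@x.com" and "ab@x.com").
_PUNCT = str.maketrans("", "", string.punctuation)


def normalise(value: str) -> str:
    """Lowercase, drop punctuation, collapse whitespace runs, trim the ends."""
    value = value.lower().translate(_PUNCT)
    return re.sub(r"\s+", " ", value).strip()


def plan_dedupe(rows):
    """Split rows into (keep, remove) without modifying the input.

    The first row seen for each normalised key is kept; later rows with
    the same key are only proposed for removal, so they can be reviewed.
    """
    seen = set()
    keep, remove = [], []
    for row in rows:
        key = tuple(normalise(cell) for cell in row)
        if key in seen:
            remove.append(row)
        else:
            seen.add(key)
            keep.append(row)
    return keep, remove


# Sample data mirroring the "Before" table (third row has a trailing space).
sample = """Name,Email,Plan
John Smith,john@acme.com,Pro
john smith,john@acme.com,Pro
JOHN SMITH ,JOHN@ACME.COM,Pro
Jane Doe,jane@acme.com,Free
Jane Doe,jane@acme.com,Free
"""

reader = csv.reader(io.StringIO(sample))
header = next(reader)
keep, remove = plan_dedupe(list(reader))

print("Proposed removals:", len(remove))
for row in keep:
    print(row)
```

Returning the removals as a separate list, rather than deleting in place, is what makes a preview step possible: nothing is dropped until the proposed rows have been checked.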
Most users go straight to Pro
The free tier is enough for occasional cleanup. If you do this weekly, Pro pays for itself the first Monday.
Pro
- Unlimited file cleans
- Up to 100,000 rows per file
- No watermarks on output
- Save merge & cleaning templates
- Cancel anytime
Frequently asked questions
What counts as a near-duplicate?
Case differences ('JOHN' vs 'john'), trailing or doubled whitespace, punctuation variants ('Inc.' vs 'Inc'), and minor spelling variations. You can tune the strictness — strict mode only removes exact matches after normalisation; loose mode catches typos too.
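As a rough illustration of the difference between the two modes, the sketch below compares two values in Python. It is an assumption-laden stand-in, not how BasisFile scores matches: the `is_duplicate` function, the `mode` names, and the 0.85 threshold are hypothetical, and the similarity measure is the standard library's `difflib.SequenceMatcher`.

```python
import difflib
import re
import string

_PUNCT = str.maketrans("", "", string.punctuation)


def normalise(value: str) -> str:
    """Lowercase, drop punctuation, collapse whitespace runs, trim the ends."""
    return re.sub(r"\s+", " ", value.lower().translate(_PUNCT)).strip()


def is_duplicate(a: str, b: str, mode: str = "strict", threshold: float = 0.85) -> bool:
    """Strict: equal after normalisation. Loose: also accept close spellings.

    The threshold is a hypothetical tuning knob; higher means stricter.
    """
    na, nb = normalise(a), normalise(b)
    if na == nb:
        return True
    if mode == "loose":
        return difflib.SequenceMatcher(None, na, nb).ratio() >= threshold
    return False


print(is_duplicate("Acme Inc.", "ACME  inc"))                  # punctuation, case, spacing
print(is_duplicate("John Smith", "Jonh Smith"))                # typo, strict mode
print(is_duplicate("John Smith", "Jonh Smith", mode="loose"))  # typo, loose mode
```

Any similarity threshold trades missed typos against false merges, which is why reviewing the proposed removals matters more in loose mode.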
Will this affect rows that are intentionally similar but distinct?
You preview the proposed removals before applying. Each suggested duplicate shows the rows it would merge so you can deselect any that should stay.
Can I do this with formulas in Excel?
Yes — with COUNTIF, TRIM, LOWER, and a helper column you can build it. Realistically that's 20 minutes of formula juggling per file. BasisFile does it in 30 seconds, with no formulas to maintain.
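For comparison, the helper-column approach usually means one column holding something like `=LOWER(TRIM(A2))` and a second flagging repeats with a running `COUNTIF` over that column. The Python sketch below mirrors that logic so the mechanics are visible; the cell references and the `flag_duplicates` name are illustrative assumptions, not a prescribed layout.

```python
from collections import Counter


def flag_duplicates(values):
    """Return one True/False flag per value: True if it repeats an earlier one."""
    seen = Counter()
    flags = []
    for value in values:
        # Roughly LOWER(TRIM(A2)): trim the ends, collapse inner spaces, lowercase.
        key = " ".join(value.split()).lower()
        seen[key] += 1
        # Roughly COUNTIF($B$2:B2, B2) > 1: has this key appeared before?
        flags.append(seen[key] > 1)
    return flags


names = ["John Smith", "john smith", "JOHN SMITH ", "Jane Doe", "Jane Doe"]
for name, is_dup in zip(names, flag_duplicates(names)):
    print(repr(name), "duplicate" if is_dup else "keep")
```

Like the formula version, this only catches values that are identical after trimming and lowercasing; typos and punctuation variants pass through unflagged.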
Is my data secure?
Files are processed in encrypted memory and automatically deleted after 24 hours. We never sell or share your data. UK-based, GDPR compliant.
Does it work with Excel files, not just CSV?
Yes. Upload .xlsx or .xls and you get the same dedupe flow. Output can be CSV or Excel.
Stop wasting Mondays on data hygiene
Drop your file, get a clean one back in 30 seconds. No signup needed for the free tier.