Bash - Deleting duplicate records

Wire323 · Dec 4, 2005

I have a text file full of user-submitted email addresses. I want to remove the duplicate records, but it isn't as simple as using "uniq." When I find a dupe I want to remove both of them, not just one. If it's possible I'd also like to create a text file containing all of the email addresses that had duplicates.

Is this possible?

Thanks

Wire323 · Dec 4, 2005

I've changed things slightly. Instead of removing them completely, I'd like to leave one copy and take out only the dupes. I know I can do that with uniq, but how would I know which ones were taken out so I can write them to a file?

Wire323 · Dec 4, 2005

I don't know if this was the best way, but I was able to do it like this:

sort participants | uniq > temp1
sort participants > temp2
comm -1 -3 temp1 temp2 > temp3
sort temp3 | uniq > outputfile
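
The same two results can probably be had without the temp files, since `uniq -d` prints one copy of each repeated line. A rough sketch, assuming one address per line; the `.sample` file names and the addresses are made up for illustration, and the files are written to the current directory:

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical sample data standing in for the real "participants" file.
printf '%s\n' bob@example.com alice@example.com bob@example.com \
    carol@example.com alice@example.com bob@example.com > participants.sample

# Keep exactly one copy of every address.
sort participants.sample | uniq > deduped.sample

# List each address that appeared more than once, one line per address.
sort participants.sample | uniq -d > dupes.sample
```

Here deduped.sample would hold alice, bob and carol once each, and dupes.sample would hold alice and bob, which matches what the comm approach writes to outputfile.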

falko · Dec 4, 2005

Wire323 said:

I don't know if this was the best way

If it works it's ok!

muha · Mar 8, 2006

An old post, but I thought I might add a bit:
To collapse adjacent duplicate lines in <file>, keeping one copy of each:
Code:
$ uniq file
To show only the duplicated (adjacent) lines, once each:
Code:
$ uniq -d file
uniq only compares adjacent lines, so if the file is not sorted yet, sort it first to catch non-consecutive duplicates spread throughout the file:
Code:
$ sort file | uniq
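
For the original question in this thread (dropping every copy of a duplicated address, not just the extras), `uniq -u` on sorted input may be enough. A minimal sketch, assuming one address per line; the `.sample` file names and the addresses are invented for the example, and the files land in the current directory:

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical sample input; the duplicates are deliberately not adjacent.
printf '%s\n' bob@example.com alice@example.com bob@example.com \
    carol@example.com > emails.sample

# Without sorting, uniq never sees the two bob lines side by side,
# so -d reports nothing.
uniq -d emails.sample > adjacent_dupes.sample

# After sorting, -u keeps only addresses that occur exactly once
# (every copy of a duplicated address is dropped) ...
sort emails.sample | uniq -u > singles.sample

# ... and -d lists each duplicated address once.
sort emails.sample | uniq -d > dupes.sample
```

With this input, singles.sample should contain alice and carol, dupes.sample should contain bob, and adjacent_dupes.sample should be empty.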
