I have a text file full of user-submitted email addresses. I want to remove the duplicate records, but it isn't as simple as using "uniq." When I find a dupe I want to remove both of them, not just one. If it's possible I'd also like to create a text file containing all of the email addresses that had duplicates. Is this possible? Thanks
I've changed things slightly. Instead of removing them completely, I'd like to keep one copy and only take the extra dupes out. I know I can do that with uniq, but how would I know which ones were taken out so I can write them to a file?
I don't know if this was the best way, but I was able to do it like this:
Code:
$ sort participants | uniq > temp1
$ sort participants > temp2
$ comm -1 -3 temp1 temp2 > temp3
$ sort temp3 | uniq > outputfile
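For the original requirement (drop both copies of anything duplicated), `uniq -u`, which prints only lines that are not repeated in sorted input, can replace the comm pipeline in one step. A minimal sketch, assuming POSIX uniq and a sample file name of my own choosing:

```shell
# Hypothetical sample data: a@x.com appears twice
printf 'a@x.com\nb@x.com\na@x.com\nc@x.com\n' > participants

# -u keeps only lines that occur exactly once in the sorted input,
# so both copies of any duplicated address are removed
sort participants | uniq -u
# b@x.com
# c@x.com
```

`uniq -d` on the same sorted input would give you the companion file of addresses that had duplicates.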
An old post but heh, thought I might add a bit:

To remove adjacent duplicate lines, keeping one copy of each:
Code:
$ uniq file

To show only the non-unique (repeated) lines, once each:
Code:
$ uniq -d file

Note that uniq only compares adjacent lines, so if the file is not sorted yet, sort it first to catch duplicate lines spread out through the file:
Code:
$ sort file | uniq
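Putting the flags above together for the thread's revised goal (keep one copy of each address, and also record which addresses had duplicates), a sketch assuming POSIX sort/uniq and made-up file names:

```shell
# Hypothetical sample data: a@x.com and b@x.com are duplicated
printf 'a@x.com\nb@x.com\na@x.com\nc@x.com\nb@x.com\nd@x.com\n' > participants

# Deduplicated list: one copy of every address is kept
sort participants | uniq > deduped

# Companion file: one copy of each address that had duplicates,
# i.e. exactly the lines uniq collapsed above
sort participants | uniq -d > dupes

cat dupes
# a@x.com
# b@x.com
```

With GNU or BSD sort, `sort -u participants > deduped` is an equivalent shortcut for the first pipeline.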