Uniq and basic set theory
October 25th, 2006 toydi Posted in sort, Text Manipulation, uniq
Imagine that I have two files:
aquatic – contains a list of aquatic animals
starfish
whale
nemo
crab
dolphin
mammal – contains a list of mammals
whale
RMS
batman
dolphin
scooby-doo
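To follow along, the two files can be recreated as below. This is only a minimal sketch: it assumes a POSIX-style shell, and it adds a check (not part of the original post) that neither file lists an animal twice, since the tricks that follow rely on each file behaving like a set.

```shell
# Recreate the two sample files from the lists above.
printf '%s\n' starfish whale nemo crab dolphin > aquatic
printf '%s\n' whale RMS batman dolphin scooby-doo > mammal

# A set has no repeated members, so "uniq -d" should find nothing here.
dups_aquatic=$(sort aquatic | uniq -d)
dups_mammal=$(sort mammal | uniq -d)
echo "duplicates in aquatic: ${dups_aquatic:-none}"
echo "duplicates in mammal: ${dups_mammal:-none}"
```

If either line reports something other than "none", run that file through sort -u first.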
Given that aquatic and mammal are two different sets (so neither file repeats a line), let’s use sort and uniq to play with a few basic set theory operations:
Union ( A U B – members in either A or B )
aquatic U mammal = {batman, crab, dolphin, nemo, RMS, scooby-doo, starfish, whale}
sort aquatic mammal | uniq
Intersection ( A ∩ B – members in both A and B )
aquatic ∩ mammal = {dolphin, whale}
sort aquatic mammal | uniq -d
Symmetric Difference ( A ^ B – members in A or B but not both )
aquatic ^ mammal = {batman, crab, nemo, RMS, scooby-doo, starfish}
sort aquatic mammal | uniq -u
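All three operations above rest on the same idea: once the two files are sorted together, a member of both sets shows up twice in a row and everything else shows up once. The sketch below uses uniq -c, which prefixes each line with its occurrence count, to make that visible. The awk filters and the variable names are my own illustration, and the whole thing assumes neither file repeats a line internally.

```shell
# Recreate the sample files from the post.
printf '%s\n' starfish whale nemo crab dolphin > aquatic
printf '%s\n' whale RMS batman dolphin scooby-doo > mammal

# Prefix every distinct line with the number of times it occurs.
counts=$(sort aquatic mammal | uniq -c)
printf '%s\n' "$counts"

# uniq    keeps one copy of every line       -> union
# uniq -d keeps lines that occur more than once -> intersection
# uniq -u keeps lines that occur exactly once   -> symmetric difference
both=$(printf '%s\n' "$counts" | awk '$1 == 2 { print $2 }')
once=$(printf '%s\n' "$counts" | awk '$1 == 1 { print $2 }')

# Left unquoted on purpose so each list is printed on a single line.
echo "count 2:" $both
echo "count 1:" $once
```

The "count 2" list matches what uniq -d prints and the "count 1" list matches uniq -u; the order of the names depends on your locale's sort order.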
Relative Complement ( A − B – members in A but not in B )
aquatic − mammal = {crab, nemo, starfish}
sort aquatic mammal | uniq -d | sort aquatic - | uniq -u
“sort aquatic mammal | uniq -d” performs an intersection: aquatic ∩ mammal = {dolphin, whale}. “sort aquatic - | uniq -u” then reads that result from standard input (the “-” argument) and performs a symmetric difference with aquatic: aquatic ^ {dolphin, whale} = {crab, nemo, starfish}.
UPDATED: I found a cleaner, more elegant way to perform relative complement:
sort aquatic mammal mammal | uniq -u
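Why does listing mammal twice work? Every member of mammal then occurs at least twice in the sorted stream, so uniq -u can only keep lines that come from aquatic alone. The sketch below compares this with the longer two-stage pipeline. The scratch file common.txt and the variable names are made up for illustration, and it assumes aquatic has no repeated lines.

```shell
# Recreate the sample files from the post.
printf '%s\n' starfish whale nemo crab dolphin > aquatic
printf '%s\n' whale RMS batman dolphin scooby-doo > mammal

# Two-stage version: save the intersection, then take the symmetric
# difference of aquatic and that intersection.
sort aquatic mammal | uniq -d > common.txt
two_stage=$(sort aquatic common.txt | uniq -u)

# One-stage version. The counts are 1 for "aquatic only",
# 2 for "mammal only" and 3 for "in both".
sort aquatic mammal mammal | uniq -c
one_stage=$(sort aquatic mammal mammal | uniq -u)

echo "$one_stage"
[ "$two_stage" = "$one_stage" ] && echo "both approaches agree"

# Swapping the roles of the files gives mammal − aquatic.
only_mammal=$(sort mammal aquatic aquatic | uniq -u)
echo "$only_mammal"
```

For the sample files, the aquatic − mammal result should be crab, nemo and starfish, and the swapped version should list RMS, batman and scooby-doo in whatever order your locale sorts them.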
“sort -u”: a shorthand for “sort | uniq”
sort -u is equivalent to sort | uniq: both eliminate duplicate elements from a list. Therefore, you may replace:
sort aquatic mammal | uniq
with:
sort -u aquatic mammal
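A quick way to convince yourself that the two forms agree on the sample files is to capture both outputs and compare them. This is just a sketch with variable names of my own choosing. Keep in mind that sort -u only covers the plain uniq case; it has no counterpart for uniq -d or uniq -u, so the other operations still need uniq.

```shell
# Recreate the sample files from the post.
printf '%s\n' starfish whale nemo crab dolphin > aquatic
printf '%s\n' whale RMS batman dolphin scooby-doo > mammal

# Union, written both ways.
with_uniq=$(sort aquatic mammal | uniq)
with_sort_u=$(sort -u aquatic mammal)

if [ "$with_uniq" = "$with_sort_u" ]; then
    echo "same output"
else
    echo "outputs differ"
fi

# The union of the two sample sets has 8 members.
printf '%s\n' "$with_sort_u" | grep -c ''
```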






