Uniq and basic set theory

October 25th, 2006, by toydi. Posted in sort, Text Manipulation, uniq

Imagine that I have two files:

aquatic – contains a list of aquatic animals

starfish
whale
nemo
crab
dolphin

mammal – contains a list of mammals

whale
RMS
batman
dolphin
scooby-doo
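
To follow along, the two files above could be recreated as below. This is only a minimal sketch: the scratch directory from mktemp -d is my own assumption, chosen so that no existing files get overwritten.

```shell
# Work in a throwaway directory (an assumption, not part of the original setup).
cd "$(mktemp -d)" || exit 1

# One member per line, in the same order as the listings above.
printf '%s\n' starfish whale nemo crab dolphin > aquatic
printf '%s\n' whale RMS batman dolphin scooby-doo > mammal

# Each file should report 5 lines.
wc -l aquatic mammal
```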

Treating aquatic and mammal as two sets (each file lists a member only once), let’s use sort and uniq to play with a few basic set theory operations:

Union ( A ∪ B – members in either A or B )

aquatic ∪ mammal = {batman, crab, dolphin, nemo, RMS, scooby-doo, starfish, whale}

sort aquatic mammal | uniq 

Intersection ( A ∩ B – members in both A and B )

aquatic ∩ mammal = {dolphin, whale}

sort aquatic mammal | uniq -d
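
One caveat with the intersection: uniq -d reports any line that occurs more than once, so a duplicate inside a single file looks like a common member. The sketch below shows the problem and one possible workaround, deduplicating each input with sort -u first. The file names a.list and b.list, and the .set copies, are hypothetical and used only for illustration.

```shell
cd "$(mktemp -d)" || exit 1

# Hypothetical inputs: "whale" is listed twice in a.list and never in b.list.
printf '%s\n' starfish whale whale crab > a.list
printf '%s\n' RMS batman dolphin > b.list

# Naive intersection: reports "whale", although b.list does not contain it.
naive=$(sort a.list b.list | uniq -d)

# Deduplicate each file first, then intersect the cleaned copies.
sort -u a.list > a.set
sort -u b.list > b.set
clean=$(sort a.set b.set | uniq -d)

printf 'naive: [%s]\nclean: [%s]\n' "$naive" "$clean"
```

With these inputs the naive result is whale and the cleaned result is empty, which is the correct intersection.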

Symmetric Difference ( A ^ B – members in A or B but not both )

aquatic ^ mammal = {batman, crab, nemo, RMS, scooby-doo, starfish}

sort aquatic mammal | uniq -u
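
The three operations so far can be run side by side. The order of the output depends on the locale, so this sketch pins it to the C locale; that setting and the scratch directory are my own assumptions, not part of the original commands.

```shell
cd "$(mktemp -d)" || exit 1
printf '%s\n' starfish whale nemo crab dolphin > aquatic
printf '%s\n' whale RMS batman dolphin scooby-doo > mammal

# Fix the collation order so the output is the same on every system.
# In the C locale, uppercase "RMS" sorts before the lowercase names.
LC_ALL=C
export LC_ALL

echo "union:";                sort aquatic mammal | uniq
echo "intersection:";         sort aquatic mammal | uniq -d
echo "symmetric difference:"; sort aquatic mammal | uniq -u
```

The members match the sets listed above; only the position of RMS differs, since it comes first in the C locale.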

Relative Complement ( A − B – members in A but not in B )

aquatic − mammal = {crab, nemo, starfish}

sort aquatic mammal | uniq -d | sort aquatic - | uniq -u
  • "sort aquatic mammal | uniq -d" performs an intersection: aquatic ∩ mammal = {dolphin, whale}.
  • "sort aquatic - | uniq -u" reads that intersection from standard input (the "-" argument) and performs a symmetric difference with aquatic: aquatic ^ {dolphin, whale} = {crab, nemo, starfish}.

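UPDATE: I found a cleaner, more elegant way to compute the relative complement. Listing mammal twice guarantees that every member of mammal appears at least twice, so uniq -u keeps only the members found in aquatic alone: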

sort aquatic mammal mammal | uniq -u
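
As a rough check, the two relative complement commands can be compared directly, and the four operations wrapped in small shell functions. The function names below (set_union and so on) are hypothetical, and the sketch assumes neither file contains duplicate lines of its own.

```shell
cd "$(mktemp -d)" || exit 1
printf '%s\n' starfish whale nemo crab dolphin > aquatic
printf '%s\n' whale RMS batman dolphin scooby-doo > mammal

# Hypothetical helpers; each takes two files that hold no duplicate lines.
set_union()        { sort "$1" "$2" | uniq; }
set_intersection() { sort "$1" "$2" | uniq -d; }
set_symdiff()      { sort "$1" "$2" | uniq -u; }
# Listing $2 twice means every member of $2 occurs at least twice,
# so uniq -u can only keep lines that occur in $1 alone.
set_complement()   { sort "$1" "$2" "$2" | uniq -u; }

# The two-stage pipeline, for comparison with the shorter form.
two_stage=$(sort aquatic mammal | uniq -d | sort aquatic - | uniq -u)
one_stage=$(set_complement aquatic mammal)

[ "$two_stage" = "$one_stage" ] && echo "both approaches agree"

# Prints crab, nemo and starfish, one per line.
set_complement aquatic mammal
```

Swapping the arguments, as in set_complement mammal aquatic, gives the members of mammal that are not aquatic.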

“sort -u”: a short-hand for “sort | uniq”

sort -u is equivalent to sort | uniq: both sort the input and drop duplicated lines. It only covers the union case, since the -d and -u filters still need uniq. Therefore, you may replace:

sort aquatic mammal | uniq

with:

sort -u aquatic mammal
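
A quick way to convince yourself that the two forms agree for these files is to compare their output byte for byte. This is only a sketch; the output file names union.long and union.short are hypothetical.

```shell
cd "$(mktemp -d)" || exit 1
printf '%s\n' starfish whale nemo crab dolphin > aquatic
printf '%s\n' whale RMS batman dolphin scooby-doo > mammal

sort aquatic mammal | uniq > union.long
sort -u aquatic mammal     > union.short

# cmp -s prints nothing and succeeds only if the files are byte-identical.
if cmp -s union.long union.short; then
    result="identical"
else
    result="different"
fi
echo "$result"
```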
