What is a Counter() class?
Counter is a class in the collections module, and it is a subclass of dict(). It is imported from the collections module as follows:
from collections import Counter
Counter is a very versatile and useful class in Python.
What does a Counter object do?
A Counter() object takes an iterable or a mapping as its argument and tallies the occurrences of the elements in it. The iterable can be, for example, a string, list, tuple or set; a dict() object is an example of a mapping.
A string as an iterable:
from collections import Counter
string = "this_is_a_counter"
counter = Counter(string)
print(counter)
>>>Counter({'_': 3, 't': 2, 'i': 2, 's': 2, 'h': 1, 'a': 1, 'c': 1, 'o': 1, 'u': 1, 'n': 1, 'e': 1, 'r': 1})
A list as an iterable:
from collections import Counter
list_ = ['a', 'b', 'a', 'x', 'b', 'a', 'x']
counter = Counter(list_)
print(counter)
>>>Counter({'a': 3, 'b': 2, 'x': 2})
A tuple as an iterable:
from collections import Counter
tuple_ = ('a', 'b', 'a', 'x', 'b', 'a', 'x')
counter = Counter(tuple_)
print(counter)
>>>Counter({'a': 3, 'b': 2, 'x': 2})
A set as an iterable:
from collections import Counter
set_ = {'a', 'b', 'a', 'x', 'b', 'a', 'x'}
counter = Counter(set_)
print(counter)
>>>Counter({'a': 1, 'b': 1, 'x': 1})
Here, as you can see, a set() object is still a valid argument to Counter(). However, a set discards duplicates before the counting starts, so every element ends up with a count of 1. Sets are unordered, so the order of the keys in the output may vary.
A mapping object as an argument to the Counter:
from collections import Counter
dict_ = {'a':4, 'b':5}
counter = Counter(dict_)
print(counter)
>>>Counter({'b': 5, 'a': 4})
From keyword arguments:
from collections import Counter
counter = Counter(a=6, b=7, c=8)
print(counter)
>>>Counter({'c': 8, 'b': 7, 'a': 6})
Another way to create a Counter object from iterable objects is:
from collections import Counter
string = "This_is_a_string_object"
counter = Counter()
for val in string:
    counter[val] += 1
print(counter)
>>>Counter({'_': 4, 'i': 3, 's': 3, 't': 2, 'T': 1, 'h': 1, 'a': 1, 'r': 1, 'n': 1, 'g': 1, 'o': 1, 'b': 1, 'j': 1, 'e': 1, 'c': 1})
Here, you can see that counter[val] += 1 works even for an element that has not been seen yet, because the default count of a missing element is zero.
How to leverage the functionality of the dict() for tallying elements in an iterable
In fact, we can use a plain dict() object to tally the occurrences of elements in an iterable, but this requires an additional step: the count of each element has to be initialized to 1 the first time the element is seen.
Look at an example below:
dict_ = dict()
string = "counting_using_dict_object"
for item in string:
    if item not in dict_:
        dict_[item] = 1
    else:
        dict_[item] += 1
print(dict_)
>>>{'c': 3, 'o': 2, 'u': 2, 'n': 3, 't': 3, 'i': 3, 'g': 2, '_': 3, 's': 1, 'd': 1, 'b': 1, 'j': 1, 'e': 1}
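The same tally can be written a little more compactly with dict.get() or collections.defaultdict, both from the standard library. The sketch below is only an illustration (the variable names are made up for this example) that all three approaches arrive at the same counts:

```python
from collections import Counter, defaultdict

string = "counting_using_dict_object"

# dict.get() supplies 0 for keys that are not in the dict yet.
tally_get = {}
for item in string:
    tally_get[item] = tally_get.get(item, 0) + 1

# defaultdict(int) creates a missing key with the value int(), which is 0.
tally_default = defaultdict(int)
for item in string:
    tally_default[item] += 1

# Counter does the same work in a single call.
all_agree = tally_get == dict(tally_default) == dict(Counter(string))
print(all_agree)  # True
```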
In a Counter() object, the count of a missing element is 0 by default, so looking it up does not raise a KeyError as it would with a dict() object. For example, let's look at the count of an item that doesn't exist in the counter:
from collections import Counter
string = "This_is_a_string_object"
counter = Counter()
for val in string:
    counter[val] += 1
print(counter['z'])
>>> 0
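One detail that is easy to miss: looking up a missing key returns 0 but does not add the key to the Counter(). A minimal sketch, with illustrative variable names, that contrasts this with a plain dict():

```python
from collections import Counter

counter = Counter("abca")
plain = dict(counter)

# Looking up a missing key returns 0 and does not insert the key.
missing_count = counter["z"]
key_was_added = "z" in counter

# The same lookup on a plain dict raises KeyError.
try:
    plain["z"]
    dict_raised = False
except KeyError:
    dict_raised = True

print(missing_count, key_was_added, dict_raised)  # 0 False True
```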
Methods implemented by Counter() object
Here, we will discuss the important methods defined in the Counter() class, which are available to every Counter() instance.
1. elements()
elements() is an important method of the Counter() object. It returns an iterator that yields every element of the Counter() instance, each repeated as many times as its count.
For example:
from collections import Counter
string = "This_is_a_string_object"
counter = Counter()
for val in string:
    counter[val] += 1
print(list(counter.elements()))
>>>['T', 'h', 'i', 'i', 'i', 's', 's', 's', '_', '_', '_', '_', 'a', 't', 't', 'r', 'n', 'g', 'o', 'b', 'j', 'e', 'c']
The method Counter().elements() returns an iterator (an itertools.chain object), which can be converted to the container of our choice, for example a list() object.
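It is also worth knowing that elements() skips any element whose count is less than one. A small sketch with made-up counts:

```python
from collections import Counter

counter = Counter(a=2, b=0, c=-1, d=1)

# Only elements with a count of at least one are produced.
expanded = sorted(counter.elements())
print(expanded)  # ['a', 'a', 'd']
```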
2. most_common([n])
This method takes an optional integer argument n and returns a list of (key, count) tuples arranged in descending order of count. The integer specifies that only the first n pairs are returned; if it is omitted, all pairs are returned. You can also pick an individual (key, count) pair by indexing the returned list.
from collections import Counter
string = "This_is_a_string_object"
counter = Counter()
for val in string:
    counter[val] += 1
print(counter.most_common())
print(counter.most_common(3))
print(counter.most_common()[3])
>>>[('_', 4), ('i', 3), ('s', 3), ('t', 2), ('T', 1), ('h', 1), ('a', 1), ('r', 1), ('n', 1), ('g', 1), ('o', 1), ('b', 1), ('j', 1), ('e', 1), ('c', 1)]
[('_', 4), ('i', 3), ('s', 3)]
('t', 2)
The first output is the list of all (key, count) pairs in descending order of count. The second output lists the first three (key, count) pairs. The third output is the (key, count) pair at index 3.
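As an illustration of a typical use, the sketch below counts the words in a short sentence (the sentence is made up for this example) and picks the two most frequent ones:

```python
from collections import Counter

text = "the cat and the dog and the bird"
word_counts = Counter(text.split())

# Elements with equal counts keep the order in which they were first seen.
top_two = word_counts.most_common(2)
print(top_two)  # [('the', 3), ('and', 2)]
```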
3. subtract()
The subtract() method subtracts the counts of one Counter instance from the counts of the Counter it is called on.
from collections import Counter
string1 = "This_is_a_string_object"
string2 = "what_does_counter_do_here"
counter1 = Counter(string1)
counter2 = Counter(string2)
print(counter1)
print(counter2)
print()
counter1.subtract(counter2)
print(counter1)
print(counter2)
>>>Counter({'_': 4, 'i': 3, 's': 3, 't': 2, 'T': 1, 'h': 1, 'a': 1, 'r': 1, 'n': 1, 'g': 1, 'o': 1, 'b': 1, 'j': 1, 'e': 1, 'c': 1})
Counter({'_': 4, 'e': 4, 'o': 3, 'h': 2, 't': 2, 'd': 2, 'r': 2, 'w': 1, 'a': 1, 's': 1, 'c': 1, 'u': 1, 'n': 1})
Counter({'i': 3, 's': 2, 'T': 1, 'g': 1, 'b': 1, 'j': 1, '_': 0, 'a': 0, 't': 0, 'n': 0, 'c': 0, 'h': -1, 'r': -1, 'w': -1, 'u': -1, 'o': -2, 'd': -2, 'e': -3})
Counter({'_': 4, 'e': 4, 'o': 3, 'h': 2, 't': 2, 'd': 2, 'r': 2, 'w': 1, 'a': 1, 's': 1, 'c': 1, 'u': 1, 'n': 1})
Here, counter1 was modified in place and now holds the result of the subtraction, while counter2 is unchanged. Also notice that zero and negative counts have been retained. A negative count can mean one of two things: the element is present only in the second string, or the element occurs more often in the second string than in the first.
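subtract() also accepts a plain iterable or a mapping, and, because it works in place, it returns None. A minimal sketch with hypothetical data:

```python
from collections import Counter

stock = Counter(apple=3, pear=1)

# subtract() modifies the Counter in place and returns None.
result = stock.subtract(["apple", "pear", "pear"])

print(result)  # None
print(stock)   # Counter({'apple': 2, 'pear': -1})
```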
4. total()
The total() method, available from Python 3.10, returns the sum of the counts of all elements in a Counter().
from collections import Counter
string1 = "This_is_a_string_object"
string2 = "what_does_counter_do_here"
counter1 = Counter(string1)
counter2 = Counter(string2)
print(counter1.total())
print(counter2.total())
>>>23
25
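On Python versions older than 3.10, where total() is not available, the same number can be obtained by summing the counts directly. A short sketch:

```python
from collections import Counter

counter = Counter("This_is_a_string_object")

# Summing the counts gives the same number that total() returns.
total_count = sum(counter.values())
print(total_count)  # 23
```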
5. update([iterable-or-mapping])
While update() is a method of the dict() class, the Counter() class implements its own version: it takes an iterable or a mapping and adds the counts to an existing Counter() object instead of replacing them. This method can also take a Counter() object as its argument.
A Counter() instance as argument:
from collections import Counter
string1 = "This_is_a_string_object"
string2 = "what_does_counter_do_here"
counter1 = Counter(string1)
counter2 = Counter(string2)
print(counter1)
print(counter2)
print()
counter1.update(counter2)
print(counter1)
print(counter2)
>>>Counter({'_': 4, 'i': 3, 's': 3, 't': 2, 'T': 1, 'h': 1, 'a': 1, 'r': 1, 'n': 1, 'g': 1, 'o': 1, 'b': 1, 'j': 1, 'e': 1, 'c': 1})
Counter({'_': 4, 'e': 4, 'o': 3, 'h': 2, 't': 2, 'd': 2, 'r': 2, 'w': 1, 'a': 1, 's': 1, 'c': 1, 'u': 1, 'n': 1})
Counter({'_': 8, 'e': 5, 's': 4, 't': 4, 'o': 4, 'h': 3, 'i': 3, 'r': 3, 'a': 2, 'n': 2, 'c': 2, 'd': 2, 'T': 1, 'g': 1, 'b': 1, 'j': 1, 'w': 1, 'u': 1})
Counter({'_': 4, 'e': 4, 'o': 3, 'h': 2, 't': 2, 'd': 2, 'r': 2, 'w': 1, 'a': 1, 's': 1, 'c': 1, 'u': 1, 'n': 1})
A list iterable as argument:
from collections import Counter
string1 = "This_is_a_string_object"
list_ = ['a', 'b', 'c', 'a', 'c']
counter1 = Counter(string1)
print(counter1)
print(list_)
print()
counter1.update(list_)
print(counter1)
>>>Counter({'_': 4, 'i': 3, 's': 3, 't': 2, 'T': 1, 'h': 1, 'a': 1, 'r': 1, 'n': 1, 'g': 1, 'o': 1, 'b': 1, 'j': 1, 'e': 1, 'c': 1})
['a', 'b', 'c', 'a', 'c']
Counter({'_': 4, 'i': 3, 's': 3, 'a': 3, 'c': 3, 't': 2, 'b': 2, 'T': 1, 'h': 1, 'r': 1, 'n': 1, 'g': 1, 'o': 1, 'j': 1, 'e': 1})
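The difference from dict.update() is easy to overlook, so here is a small side-by-side sketch (the names counts and plain are illustrative only):

```python
from collections import Counter

counts = Counter(a=1, b=2)
plain = {"a": 1, "b": 2}

counts.update({"a": 5})  # Counter: the counts are added, 1 + 5
plain.update({"a": 5})   # dict: the old value is overwritten

print(counts["a"], plain["a"])  # 6 5
```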
Counter objects can be compared
Two Counter objects can be compared using the following comparison operators.
Equality check: '==' and '!='
Less than / less than or equal to: '<' / '<='
Greater than / greater than or equal to: '>' / '>='
The ordering operators were added in Python 3.10. Before Python 3.10, Counter(a=1, b=0) == Counter(a=1) returned False, that is, the two Counter objects were not considered equal. In Python 3.10 and later versions, a missing element is treated as an element with a zero count in equality checks, so the comparison returns True.
from collections import Counter
counter1 = Counter(a=5, b=0)
counter2 = Counter(a=5)
print(counter1 == counter2)
>>>True
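The '<=' operator checks whether one Counter() object is included in another, that is, whether every count in the first is less than or equal to the corresponding count in the second. The '<' operator additionally requires the two counters to be different.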
from collections import Counter
counter1 = Counter(a=5, b=6)
counter2 = Counter(a=2, b=4)
print(counter2 <= counter1)
>>>True
Converting a Counter() object to other data structures
A Counter() can be converted to other data structures: list(), tuple(), set(), dict()
Counter() object converted to a list()
from collections import Counter
counter1 = Counter(a=5, b=6)
print(list(counter1))
>>>['a', 'b']
Unlike the Counter().elements(), when a Counter() object is converted to a list() object, only the unique elements are listed.
Counter() object converted to tuple() object.
from collections import Counter
counter1 = Counter(a=5, b=6)
print(tuple(counter1))
>>>('a', 'b')
Counter() object converted to a set() object.
from collections import Counter
counter1 = Counter(a=5, b=6)
print(set(counter1))
>>>{'b', 'a'}
Counter() object converted to dict() object.
from collections import Counter
counter1 = Counter(a=5, b=6)
print(dict(counter1))
>>>{'a': 5, 'b': 6}
Counter() object converted to a list of tuples of (key, count) pairs.
The items() method returns a dict_items view, so it is wrapped in list() to get an actual list.
from collections import Counter
counter1 = Counter(a=5, b=6)
print(list(counter1.items()))
>>>[('a', 5), ('b', 6)]
Counter() object created from a list of tuples of (key, count) pairs
To create a Counter() object from a list of tuples of (key, count) pairs, the list needs to be converted to a dict() object first and then to a Counter() object. Passing the list directly would count each tuple as an element instead.
from collections import Counter
tup_data = [('a', 5), ('b', 6), ('c', 8)]
dict_ = dict(tup_data)
counter = Counter(dict_)
print(dict_)
print(counter)
>>>{'a': 5, 'b': 6, 'c': 8}
Counter({'c': 8, 'b': 6, 'a': 5})
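To see why the intermediate dict() matters, the sketch below (variable names are illustrative) compares passing the list of pairs directly with converting it first:

```python
from collections import Counter

pairs = [('a', 5), ('b', 6), ('c', 8)]

# Passed directly, each tuple is treated as one element to be counted.
counted_tuples = Counter(pairs)
print(counted_tuples[('a', 5)])  # 1

# Converted to a dict first, the second item of each pair becomes the count.
from_mapping = Counter(dict(pairs))
print(from_mapping['a'])  # 5
```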
Finding 'N' least-common elements in a Counter()
As discussed before, most_common([n]) can be called to find the 'N' most commonly occurring keys and their counts. The same method can be used to find the least common keys: all we need to do is slice the returned list from the end, as follows.
from collections import Counter
string = "find_out_the_least_common_keys"
counter = Counter(string)
array = counter.most_common()
n = 3
least_common = array[:-n-1:-1]
print(array)
print(least_common)
>>>[('_', 5), ('o', 3), ('t', 3), ('e', 3), ('n', 2), ('s', 2), ('m', 2), ('f', 1), ('i', 1), ('d', 1), ('u', 1), ('h', 1), ('l', 1), ('a', 1), ('c', 1), ('k', 1), ('y', 1)]
[('y', 1), ('k', 1), ('c', 1)]
Here, the idea is to list the (key, count) pairs in descending order of count and then slice the list from the end: the slice [:-n-1:-1] walks backwards from the last item and stops before the item at index -n-1, which yields the N least common elements.
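An alternative that avoids sorting the whole list is heapq.nsmallest() from the standard library. This is a swapped-in technique and not the slicing approach described above; a minimal sketch follows. When several keys share the lowest count, the keys it picks may differ from those picked by the slice.

```python
import heapq
from collections import Counter

counter = Counter("find_out_the_least_common_keys")
n = 3

# nsmallest() selects the n (key, count) pairs with the lowest counts.
least_common = heapq.nsmallest(n, counter.items(), key=lambda pair: pair[1])
print(least_common)
```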
Mathematical operations with Counter() objects
Mathematical operations can be performed with Counter() objects.
Addition of two Counter() objects
When adding two Counter() objects, the counts of the corresponding elements are added.
from collections import Counter
string = "_find_out_the_least_common_keys_"
string1 = "know_how_the_counts_can_be_added"
counter = Counter(string)
counter1 = Counter(string1)
print(counter.total())
print(counter1.total())
sum_ = counter + counter1
print(f"\n{counter}")
print(f"\n{counter1}\n")
print(f"{sum_.total()}")
print(sum_)
>>>32
32
Counter({'_': 7, 'o': 3, 't': 3, 'e': 3, 'n': 2, 's': 2, 'm': 2, 'f': 1, 'i': 1, 'd': 1, 'u': 1, 'h': 1, 'l': 1, 'a': 1, 'c': 1, 'k': 1, 'y': 1})
Counter({'_': 6, 'n': 3, 'o': 3, 'e': 3, 'd': 3, 'w': 2, 'h': 2, 't': 2, 'c': 2, 'a': 2, 'k': 1, 'u': 1, 's': 1, 'b': 1})
64
Counter({'_': 13, 'o': 6, 'e': 6, 'n': 5, 't': 5, 'd': 4, 'h': 3, 'a': 3, 's': 3, 'c': 3, 'u': 2, 'm': 2, 'k': 2, 'w': 2, 'f': 1, 'i': 1, 'l': 1, 'y': 1, 'b': 1})
Here, an element that is missing from one of the counters is treated as having a zero count in that counter.
When there are negative counts
In the example below, one of the Counter() instances has a negative count.
from collections import Counter
counter = Counter(a=5, b=4, c=-2)
counter1 = Counter(a=5, b=4, c=22)
print(counter.total())
print(counter1.total())
sum_ = counter + counter1
print(f"\n{counter}")
print(f"\n{counter1}\n")
print(f"{sum_.total()}")
print(sum_)
>>>7
31
Counter({'a': 5, 'b': 4, 'c': -2})
Counter({'c': 22, 'a': 5, 'b': 4})
38
Counter({'c': 20, 'a': 10, 'b': 8})
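In the output above the sum for 'c' is still positive, so the key is kept. When a sum is zero or negative, the '+' operator drops the element from the result. A small sketch with made-up counts:

```python
from collections import Counter

left = Counter(a=5, c=-5)
right = Counter(a=1, c=2)

# 'c' sums to -3, so it does not appear in the result.
combined = left + right
print(combined)  # Counter({'a': 6})
```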
Subtraction between two Counter() objects
Unlike the subtract() method, subtraction with the '-' operator returns a new Counter() object and filters out all zero and negative counts from the result.
from collections import Counter
string = "_find_out_the_least_common_keys_"
string1 = "know_how_the_counts_can_be_added"
counter = Counter(string)
counter1 = Counter(string1)
print(counter.total())
print(counter1.total())
diff_ = counter - counter1
print(f"\n{counter}")
print(f"\n{counter1}\n")
print(f"{diff_.total()}")
print(diff_)
>>>32
32
Counter({'_': 7, 'o': 3, 't': 3, 'e': 3, 'n': 2, 's': 2, 'm': 2, 'f': 1, 'i': 1, 'd': 1, 'u': 1, 'h': 1, 'l': 1, 'a': 1, 'c': 1, 'k': 1, 'y': 1})
Counter({'_': 6, 'n': 3, 'o': 3, 'e': 3, 'd': 3, 'w': 2, 'h': 2, 't': 2, 'c': 2, 'a': 2, 'k': 1, 'u': 1, 's': 1, 'b': 1})
9
Counter({'m': 2, '_': 1, 'f': 1, 'i': 1, 't': 1, 'l': 1, 's': 1, 'y': 1})
Here, only positive counts are retained in the resulting Counter() object.
Multiplication (*) and division (/) are not supported for Counter() objects.
from collections import Counter
string = "_find_out_the_least_common_keys_"
string1 = "know_how_the_counts_can_be_added"
counter = Counter(string)
counter1 = Counter(string1)
result = counter / counter1
print(result)
>>>TypeError: unsupported operand type(s) for /: 'Counter' and 'Counter'
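If the counts do need to be scaled, one possible workaround is to build a new Counter() from a dict comprehension. This is only a sketch of one approach, not a built-in feature of Counter():

```python
from collections import Counter

counter = Counter(a=2, b=3)

# Build a new Counter with every count doubled.
doubled = Counter({key: count * 2 for key, count in counter.items()})
print(doubled)  # Counter({'b': 6, 'a': 4})
```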
Finding the minimum of two Counter() objects
We use the '&' operator to find the minimum of the counts of each element in two Counter() objects. If the minimum for an element is zero or negative, that element is omitted from the result, because an element with a zero count is treated as a missing element.
from collections import Counter
counter = Counter(d=6, b=4, c=5)
counter1 = Counter(d=5, b=1, c=2)
minimum_ = counter & counter1
print(minimum_)
>>>Counter({'d': 5, 'c': 2, 'b': 1})
When one of the corresponding elements has a negative count
from collections import Counter
counter = Counter(d=6, b=4, c=-5)
counter1 = Counter(d=5, b=1, c=2)
minimum_ = counter & counter1
print(minimum_)
>>>Counter({'d': 5, 'b': 1})
When one of the corresponding elements has a zero count
from collections import Counter
counter = Counter(d=6, b=4, c=0)
counter1 = Counter(d=5, b=1, c=2)
minimum_ = counter & counter1
print(minimum_)
>>>Counter({'d': 5, 'b': 1})
Finding the largest of the corresponding elements in two Counter() objects
The "|" operator is used to find the larger of the two counts for each element in two Counter() objects. Unlike in the examples involving the "&" operator, an element is kept even when one of its counts is zero or negative, as long as the other count is positive. An element is omitted from the result only when the larger of its two counts is zero or negative.
from collections import Counter
counter = Counter(d=6, b=4, c=5)
counter1 = Counter(d=5, b=1, c=2)
max_ = counter | counter1
print(max_)
>>>Counter({'d': 6, 'c': 5, 'b': 4})
When one of the corresponding elements has a zero count
from collections import Counter
counter = Counter(d=6, b=4, c=0)
counter1 = Counter(d=5, b=1, c=2)
max_ = counter | counter1
print(max_)
>>>Counter({'d': 6, 'b': 4, 'c': 2})
When both corresponding elements have zero counts
from collections import Counter
counter = Counter(d=6, b=4, c=0)
counter1 = Counter(d=5, b=1, c=0)
max_ = counter | counter1
print(max_)
>>>Counter({'d': 6, 'b': 4})
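Another way to read these two operators is as multiset intersection and union. The sketch below applies them to the letters of two arbitrary words chosen for illustration:

```python
from collections import Counter

first = Counter("hello")   # h:1, e:1, l:2, o:1
second = Counter("world")  # w:1, o:1, r:1, l:1, d:1

shared = first & second    # minimum count per element
combined = first | second  # maximum count per element

print(sorted(shared.elements()))    # ['l', 'o']
print(sorted(combined.elements()))  # ['d', 'e', 'h', 'l', 'l', 'o', 'r', 'w']
```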
Unary plus and Unary minus operation on Counter() object
A unary plus (+Counter()) operation adds the Counter() object to a zero-counter, that is, a Counter() object in which every element has a zero count. This eliminates all elements with zero or negative counts.
from collections import Counter
counter = Counter(d=6, b=-4, c=0)
unary_plus = +counter
print(unary_plus)
>>>Counter({'d': 6})
A unary minus (-Counter()) operation subtracts the Counter() object from a zero-counter. This eliminates the positive and zero counts and flips the sign of the negative counts. The original Counter() remains the same.
from collections import Counter
counter = Counter(d=6, b=-4, c=0)
unary_minus = -counter
print(unary_minus)
>>>Counter({'b': 4})
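One place where the two unary operators can be handy is after a subtract() call, to separate what is left over from what is lacking. The scenario and names below are hypothetical and are meant only as a sketch:

```python
from collections import Counter

have = Counter(apple=3, pear=1)
need = Counter(apple=1, pear=4, plum=2)

balance = Counter(have)   # copy, so that 'have' stays untouched
balance.subtract(need)    # apple: 2, pear: -3, plum: -2

surplus = +balance        # keeps only the positive counts
shortfall = -balance      # keeps the negative counts, with the sign flipped

print(surplus)    # Counter({'apple': 2})
print(shortfall)  # Counter({'pear': 3, 'plum': 2})
```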
Thank You!!!