Entropy


Entropy in Information Theory

In information theory, the entropy of a random variable is the average
level of "information", "surprise", or "uncertainty" inherent to the variable's possible outcomes.

Given a discrete random variable X, which takes values from the alphabet 𝒳 and is distributed according to p : 𝒳 → [0, 1], the entropy is

    H(X) = − ∑ p(x) log p(x)

where ∑ denotes the sum over the variable's possible values x ∈ 𝒳.


The choice of base for the logarithm, log, varies for different applications.


  • Base 2 gives units of bits (or "shannons"), 
  • Base e gives "natural units" (nats), and 
  • Base 10 gives units of "dits", "bans", or "hartleys". 
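As an illustration of these units, the entropy of a fair coin can be computed in each base with Python's standard math module (a minimal sketch of the definitions above, not any particular library's API):

```python
from math import log, log2, log10

# Entropy of a fair coin, measured in three different units.
p = [0.5, 0.5]
bits = -sum(x * log2(x) for x in p)   # base 2: bits / shannons
nats = -sum(x * log(x) for x in p)    # base e: nats
dits = -sum(x * log10(x) for x in p)  # base 10: dits / bans / hartleys
print(bits, nats, dits)  # 1 bit = ln(2) ≈ 0.693 nats = log10(2) ≈ 0.301 dits
```

The three results describe the same uncertainty; only the unit changes.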


An equivalent definition of entropy is the expected value of the self-information of a variable.


Entropy in Machine Learning

Entropy is a measure of the uncertainty in a variable.

Entropy is measured in bits. For a variable with two outcomes it is a number between 0 and 1; in general, a variable with n outcomes can have up to log₂ n bits of entropy.

Note: Entropy bits are not the same bits as used in computing terminology.


Entropy is given by the following equation, where n is the number of outcomes and P(xᵢ) is the probability of outcome i:

    H(X) = − ∑ P(xᵢ) log_b P(xᵢ),  summed over i = 1, …, n

Common values for b are 2, e, and 10.

Because the logarithm of a number less than one is negative, the entire sum is negated to return a positive value.
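The equation translates directly into code; here is a minimal Python sketch (the function name `entropy` is our own illustration, not a standard API):

```python
from math import log

def entropy(probs, base=2):
    """Shannon entropy of a discrete distribution given as probabilities."""
    # Outcomes with probability 0 contribute nothing (p·log p -> 0 as p -> 0),
    # so they are skipped; the leading minus sign makes the result positive.
    return -sum(p * log(p, base) for p in probs if p > 0)

print(entropy([0.5, 0.5]))  # fair coin: 1.0 bit
```

Passing a different `base` gives the other units described earlier (e for nats, 10 for dits).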



For example, a single toss of a fair coin has only two outcomes: heads and tails. The probability that the coin will land on heads is 0.5, and the probability that it will land on tails is 0.5. The entropy of the coin toss is equal to the following:

    H = −(0.5 log₂ 0.5 + 0.5 log₂ 0.5) = −(−0.5 − 0.5) = 1 bit
That is, only one bit is required to represent the two equally probable outcomes, heads and tails.


Two tosses of a fair coin can result in four possible outcomes: heads and heads, heads and tails, tails and heads, and tails and tails. The probability of each outcome is 0.5 x 0.5 = 0.25.

The entropy of two tosses is equal to the following:

    H = −(4 × 0.25 log₂ 0.25) = −(4 × 0.25 × −2) = 2 bits
If the coin has the same face on both sides, the variable representing its outcome has 0 bits of entropy; that is, we are always certain of the outcome and the variable will never represent new information.
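Both of these cases, the two-toss joint distribution and the coin with the same face on both sides, can be checked numerically with a short Python sketch (the helper name `entropy_bits` is our own):

```python
from math import log2

def entropy_bits(probs):
    # Zero-probability outcomes contribute nothing (p·log p -> 0 as p -> 0).
    return -sum(p * log2(p) for p in probs if p > 0)

# Two independent fair tosses: HH, HT, TH, TT, each with probability 0.25.
print(entropy_bits([0.25, 0.25, 0.25, 0.25]))  # 2.0 bits

# A coin with the same face on both sides: one certain outcome, zero entropy.
print(entropy_bits([1.0]))
```

Note that two independent fair tosses carry exactly twice the entropy of one toss: entropy is additive for independent variables.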


Entropy can also be represented as a fraction of a bit.


For example, an unfair coin has two different faces, but is weighted such that the faces are not equally likely to land in a toss. Assume that the probability that an unfair coin will land on heads is 0.8, and the probability that it will land on tails is 0.2.


The entropy of a single toss of this coin is equal to the following:

    H = −(0.8 log₂ 0.8 + 0.2 log₂ 0.2) ≈ −(−0.258 − 0.464) ≈ 0.72 bits
The outcome of a single toss of an unfair coin can have a fraction of one bit of entropy. 
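That fractional value can be verified with a short Python calculation (a sketch of the arithmetic above):

```python
from math import log2

# Unfair coin: heads with probability 0.8, tails with probability 0.2.
p_heads, p_tails = 0.8, 0.2
h = -(p_heads * log2(p_heads) + p_tails * log2(p_tails))
print(round(h, 2))  # 0.72 bits, less than the fair coin's 1 bit
```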


For the unfair coin, there are two possible outcomes of the toss, but we are not totally uncertain, since one outcome is more frequent than the other.


Coin     Weight ratio   Probability ratio   Entropy   Certainty
Fair     50 : 50        0.5 : 0.5           1.0       Less certain
Unfair   80 : 20        0.8 : 0.2           0.72      More certain






