Entropy


Entropy in Information Theory

In information theory, the entropy of a random variable is the average
level of "information", "surprise", or "uncertainty" inherent to the variable's possible outcomes.

Given a discrete random variable X, which takes values from the alphabet 𝒳 and is distributed according to p : 𝒳 → [0, 1], the entropy is

    H(X) = − ∑ p(x) log p(x)

where ∑ denotes the sum over the variable's possible values x ∈ 𝒳.


The choice of base for the logarithm, log, varies for different applications.


  • Base 2 gives units of bits (or "shannons"), 
  • Base e gives "natural units" (nats), and 
  • Base 10 gives units of "dits", "bans", or "hartleys". 
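As an illustration of these units, the entropy of a fair coin can be computed in each base with Python's standard math module (a minimal sketch of the definitions above, not any particular library's API):

```python
from math import log, log2, log10

# Entropy of a fair coin, measured in three different units.
p = [0.5, 0.5]
bits = -sum(x * log2(x) for x in p)   # base 2: bits / shannons
nats = -sum(x * log(x) for x in p)    # base e: nats
dits = -sum(x * log10(x) for x in p)  # base 10: dits / bans / hartleys
print(bits, nats, dits)  # 1 bit = ln(2) ≈ 0.693 nats = log10(2) ≈ 0.301 dits
```

The three results describe the same uncertainty; only the unit changes.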


An equivalent definition of entropy is the expected value of the self-information of a variable.


Entropy in Machine Learning

Entropy is a measure of the uncertainty in a variable.

Entropy is measured in bits. For a variable with two outcomes it is a number between 0 and 1; in general, a variable with n outcomes can have up to log₂ n bits of entropy.

Note: Entropy bits are not the same bits as used in computing terminology.


Entropy is given by the following equation, where n is the number of outcomes and P(xᵢ) is the probability of outcome i:

    H(X) = − ∑ P(xᵢ) log_b P(xᵢ),  summed over i = 1, …, n

Common values for b are 2, e, and 10.

Because the logarithm of a number less than one is negative, the entire sum is negated to return a positive value.
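The equation translates directly into code; here is a minimal Python sketch (the function name `entropy` is our own illustration, not a standard API):

```python
from math import log

def entropy(probs, base=2):
    """Shannon entropy of a discrete distribution given as probabilities."""
    # Outcomes with probability 0 contribute nothing (p·log p -> 0 as p -> 0),
    # so they are skipped; the leading minus sign makes the result positive.
    return -sum(p * log(p, base) for p in probs if p > 0)

print(entropy([0.5, 0.5]))  # fair coin: 1.0 bit
```

Passing a different `base` gives the other units described earlier (e for nats, 10 for dits).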



For example, a single toss of a fair coin has only two outcomes: heads and tails. The probability that the coin will land on heads is 0.5, and the probability that it will land on tails is 0.5. The entropy of the coin toss is equal to the following:

    H = −(0.5 log₂ 0.5 + 0.5 log₂ 0.5) = −(−0.5 − 0.5) = 1 bit
That is, only one bit is required to represent the two equally probable outcomes, heads and tails.


Two tosses of a fair coin can result in four possible outcomes: heads and heads, heads and tails, tails and heads, and tails and tails. The probability of each outcome is 0.5 x 0.5 = 0.25.

The entropy of two tosses is equal to the following:

    H = −(4 × 0.25 log₂ 0.25) = −(4 × 0.25 × −2) = 2 bits
If the coin has the same face on both sides, the variable representing its outcome has 0 bits of entropy; that is, we are always certain of the outcome and the variable will never represent new information.
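Both of these cases, the two-toss joint distribution and the coin with the same face on both sides, can be checked numerically with a short Python sketch (the helper name `entropy_bits` is our own):

```python
from math import log2

def entropy_bits(probs):
    # Zero-probability outcomes contribute nothing (p·log p -> 0 as p -> 0).
    return -sum(p * log2(p) for p in probs if p > 0)

# Two independent fair tosses: HH, HT, TH, TT, each with probability 0.25.
print(entropy_bits([0.25, 0.25, 0.25, 0.25]))  # 2.0 bits

# A coin with the same face on both sides: one certain outcome, zero entropy.
print(entropy_bits([1.0]))
```

Note that two independent fair tosses carry exactly twice the entropy of one toss: entropy is additive for independent variables.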


Entropy can also be represented as a fraction of a bit.


For example, an unfair coin has two different faces, but is weighted such that the faces are not equally likely to land in a toss. Assume that the probability that an unfair coin will land on heads is 0.8, and the probability that it will land on tails is 0.2.


The entropy of a single toss of this coin is equal to the following:

    H = −(0.8 log₂ 0.8 + 0.2 log₂ 0.2) ≈ −(−0.258 − 0.464) ≈ 0.72 bits
The outcome of a single toss of an unfair coin can have a fraction of one bit of entropy. 
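That fractional value can be verified with a short Python calculation (a sketch of the arithmetic above):

```python
from math import log2

# Unfair coin: heads with probability 0.8, tails with probability 0.2.
p_heads, p_tails = 0.8, 0.2
h = -(p_heads * log2(p_heads) + p_tails * log2(p_tails))
print(round(h, 2))  # 0.72 bits, less than the fair coin's 1 bit
```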


For the unfair coin, there are two possible outcomes of the toss, but we are not totally uncertain, since one outcome is more frequent than the other.


Coin     Weight ratio   Probability ratio   Entropy   Certainty
Fair     50 : 50        0.5 : 0.5           1.0       Less certain
Unfair   80 : 20        0.8 : 0.2           0.72      More certain






