# Cross Entropy Loss

Let’s say that we have a model that tells us what sort of vehicle is in a picture. For one picture, it outputs the following predictions: one raw score (often called a logit) per vehicle.

Vehicle | Actuals | Prediction |
---|---|---|
`car` | 0 | \(-4.89\) |
`bus` | 1 | \(2.60\) |
`truck` | 0 | \(0.59\) |
`motorbike` | 0 | \(-2.07\) |
`bicycle` | 0 | \(-4.57\) |

Actuals is a one-hot encoded column that marks the correct vehicle in the picture: it is 1 for the correct vehicle (here `bus`) and 0 for every other vehicle.

To convert these predictions into loss, first take the softmax of each prediction.

Vehicle | Actuals | Prediction | Softmax |
---|---|---|---|
`car` | 0 | \(-4.89\) | \(4.88 \cdot 10^{-4}\) |
`bus` | 1 | \(2.60\) | \(0.874\) |
`truck` | 0 | \(0.59\) | \(0.117\) |
`motorbike` | 0 | \(-2.07\) | \(8.19 \cdot 10^{-3}\) |
`bicycle` | 0 | \(-4.57\) | \(6.72 \cdot 10^{-4}\) |
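The softmax of a score \(x_i\) is \(e^{x_i} / \sum_j e^{x_j}\), so the results are positive and sum to 1. Here is a minimal sketch of that step in plain Python. The names `softmax`, `predictions`, and `probabilities` are made up for this example, and subtracting the largest score before exponentiating is a common stability trick assumed here, not something the walkthrough requires.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    # Subtracting the maximum leaves the result unchanged but avoids overflow in exp.
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Raw scores from the Prediction column.
predictions = {"car": -4.89, "bus": 2.60, "truck": 0.59, "motorbike": -2.07, "bicycle": -4.57}

probabilities = dict(zip(predictions, softmax(list(predictions.values()))))
for vehicle, p in probabilities.items():
    print(f"{vehicle:>9}: {p:.3g}")
```

The printed values should line up with the Softmax column, up to rounding.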

Next, take the natural logarithm of each softmax value.

Vehicle | Actuals | Prediction | Softmax | \(\ln(\text{Softmax})\) |
---|---|---|---|---|
`car` | 0 | \(-4.89\) | \(4.88 \cdot 10^{-4}\) | \(-7.63\) |
`bus` | 1 | \(2.60\) | \(0.874\) | \(-0.135\) |
`truck` | 0 | \(0.59\) | \(0.117\) | \(-2.15\) |
`motorbike` | 0 | \(-2.07\) | \(8.19 \cdot 10^{-3}\) | \(-4.81\) |
`bicycle` | 0 | \(-4.57\) | \(6.72 \cdot 10^{-4}\) | \(-7.31\) |
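Since \(\ln(e^{x_i} / \sum_j e^{x_j}) = x_i - \ln \sum_j e^{x_j}\), the logarithm of the softmax can be computed straight from the raw scores without forming the probabilities first. The sketch below assumes that shortcut; `log_softmax`, `scores`, and `log_probs` are illustrative names, not part of any library.

```python
import math

def log_softmax(scores):
    """Natural log of the softmax, computed directly from the raw scores."""
    # ln(sum(exp(s))) evaluated stably by factoring out the largest score.
    highest = max(scores)
    log_total = highest + math.log(sum(math.exp(s - highest) for s in scores))
    return [s - log_total for s in scores]

# Scores in table order: car, bus, truck, motorbike, bicycle.
scores = [-4.89, 2.60, 0.59, -2.07, -4.57]
log_probs = log_softmax(scores)
print([round(lp, 3) for lp in log_probs])
```

Every entry is negative, because every softmax value lies strictly between 0 and 1.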

Multiply each actual by the corresponding logarithm. Because the actuals are one-hot, every row becomes 0 except the one for the correct vehicle.

Vehicle | Actuals | Prediction | Softmax | \(\ln(\text{Softmax})\) | \(\text{Actuals} \cdot \ln(\text{Softmax})\) |
---|---|---|---|---|---|
`car` | 0 | \(-4.89\) | \(4.88 \cdot 10^{-4}\) | \(-7.63\) | \(0\) |
`bus` | 1 | \(2.60\) | \(0.874\) | \(-0.135\) | \(-0.135\) |
`truck` | 0 | \(0.59\) | \(0.117\) | \(-2.15\) | \(0\) |
`motorbike` | 0 | \(-2.07\) | \(8.19 \cdot 10^{-3}\) | \(-4.81\) | \(0\) |
`bicycle` | 0 | \(-4.57\) | \(6.72 \cdot 10^{-4}\) | \(-7.31\) | \(0\) |

Sum the results of the multiplications, then negate the sum so that the loss comes out positive.

\[ -\left(0 + (-0.135) + 0 + 0 + 0\right) = 0.135 \]

And there you have your loss!
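To tie the steps together, here is a minimal end-to-end sketch in plain Python. It assumes the usual convention \(-\sum_i \text{Actuals}_i \cdot \ln(\text{Softmax}_i)\), with the leading minus sign making the loss non-negative. The function name `cross_entropy` and the variable names are invented for this example.

```python
import math

def cross_entropy(actuals, scores):
    """Cross entropy between one-hot actuals and the softmax of raw scores."""
    # ln(softmax(s)) = s - ln(sum(exp(scores))), computed stably.
    highest = max(scores)
    log_total = highest + math.log(sum(math.exp(s - highest) for s in scores))
    # Multiply each actual by its log-probability, sum, then negate.
    return -sum(a * (s - log_total) for a, s in zip(actuals, scores))

# Table order: car, bus, truck, motorbike, bicycle.
actuals = [0, 1, 0, 0, 0]
scores = [-4.89, 2.60, 0.59, -2.07, -4.57]

loss = cross_entropy(actuals, scores)
print(f"{loss:.3f}")
```

A confident, correct prediction drives the loss toward 0, while putting little probability on the correct vehicle makes it grow without bound.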
