
The AI Dictionary © 2023 by Salman Naqvi is licensed under CC BY-NC 4.0

The AI Dictionary

AI terms and jargon simply explained.

I often find explanations online to be more complicated than they need to be. Here, I hope to fix that. New terms will continue to be added over time.

Click terms to view expanded definitions.

Do let me know of any corrections and improvements, and of any terms you would like added!

Accuracy

A type of metric. It is a value that tells us how often a model produces correct predictions. The higher the accuracy, the better.
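
For a rough idea, here is a tiny NumPy sketch with made-up predictions and labels:

    import numpy as np

    predictions = np.array([0, 1, 1, 0, 1])  # made-up predicted classes
    targets = np.array([0, 1, 0, 0, 1])      # made-up true classes

    accuracy = (predictions == targets).mean()  # fraction of correct predictions
    print(accuracy)  # 0.8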

Architecture

The structure of a model that is used as a template, or starting point, for building another model.

Bagging

An ensembling technique. When bagging, each model is trained on a random subset of the rows and a random subset of the columns, sampled with replacement.

Cross Entropy Loss

A technique for calculating the loss for categorical models with multiple categories.
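
As a minimal sketch of the idea, with made-up scores: take the softmax of the model's raw scores, then take the negative log of the probability assigned to the correct category.

    import numpy as np

    scores = np.array([2.0, 0.5, -1.0])  # made-up raw scores for 3 categories
    target = 0                           # index of the correct category

    probs = np.exp(scores) / np.exp(scores).sum()  # softmax
    loss = -np.log(probs[target])                  # cross entropy loss
    print(loss)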

Decision Tree

A type of model that acts like an if-else statement.
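
For illustration only, a hypothetical tree with two splits, written as plain if-else statements:

    def predict_survived(age, fare):
        """A hypothetical decision tree with two splits."""
        if age < 18:
            return True   # first split: children predicted to survive
        elif fare > 50:
            return True   # second split: expensive tickets predicted to survive
        else:
            return False

    print(predict_survived(age=30, fare=10))  # False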

Decoder (Transformers)

A component of a transformer that is used for generating text. An example is the autocomplete feature on a smartphone’s keyboard.

Document

The name given to a piece or collection of text. It can be anything from a single word to a sentence, a paragraph, a page of text, or a full book, and so on. Also referred to as a sequence.

Dot Product

The operation of taking the product of each pair of corresponding elements in two vectors, and summing all the products. Also known as a linear combination.
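
A quick sketch with two made-up vectors:

    import numpy as np

    a = np.array([1, 2, 3])
    b = np.array([4, 5, 6])

    # multiply corresponding elements, then sum the products
    manual = (a * b).sum()       # 1*4 + 2*5 + 3*6 = 32
    print(manual, np.dot(a, b))  # both print 32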

Embedding

A table, or matrix, where each row represents an item and each column describes the items in some way. The real magic of embeddings happens when you combine two embeddings in some way to obtain further information.
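
A made-up sketch: an embedding of three movies, where each column is some descriptive factor. Combining it with a user embedding via a dot product gives a rough preference score for each movie.

    import numpy as np

    # rows: movies; columns: made-up factors (e.g. action, romance)
    movies = np.array([[0.9, 0.1],
                       [0.2, 0.8],
                       [0.5, 0.5]])

    user = np.array([0.7, 0.3])  # made-up user preferences for the same factors

    scores = movies @ user  # dot product of the user with each movie row
    print(scores)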

Encoder (Transformers)

A component of a transformer that is used for “understanding” text. Encoders are typically used for classifying sentences by sentiment and for figuring out which parts of a sentence refer, for example, to a person or a location.

Ensemble

A collection of models whose predictions are averaged to obtain the final prediction.

Error Rate

A type of metric. It is a value that tells us how often a model produces incorrect predictions. The lower the error rate, the better.

Gradient

A numerical value which adjusts the parameters of a model. How much it adjusts them is controlled by the learning rate.

Gradient Accumulation

A technique for running or fitting large models on a not-so-powerful GPU.
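
A rough PyTorch-flavoured sketch (the model, data, and numbers are all made up): gradients from several small batches are added up before a single optimizer step, which mimics training with a larger batch than the GPU can hold at once.

    import torch
    from torch import nn

    model = nn.Linear(10, 1)  # tiny made-up model
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    loss_fn = nn.MSELoss()
    batches = [(torch.randn(8, 10), torch.randn(8, 1)) for _ in range(8)]

    accumulation_steps = 4  # pretend 4 small batches form one "large" batch

    for i, (xb, yb) in enumerate(batches):
        loss = loss_fn(model(xb), yb)
        (loss / accumulation_steps).backward()  # gradients add up across batches

        if (i + 1) % accumulation_steps == 0:
            optimizer.step()       # update the weights once per accumulated batch
            optimizer.zero_grad()  # reset the gradients for the next accumulation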

Gradient Boosting Machine (GBM)

An ensembling technique where, instead of averaging the predictions of all models, each successive model predicts the error of the previous model. The predictions are then summed to obtain the final prediction.
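
A toy sketch of the idea with made-up numbers, where the second "model" happens to predict the first model's errors perfectly:

    import numpy as np

    y = np.array([2.0, 4.0, 6.0, 8.0])  # made-up target values

    pred1 = np.full_like(y, y.mean())  # model 1: predict the mean
    pred2 = y - pred1                  # model 2: predict model 1's errors

    final = pred1 + pred2  # summing the predictions gives the final prediction
    print(final)           # [2. 4. 6. 8.]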

K-Fold Cross Validation

An ensembling technique where each model is trained on a different portion of the dataset. For example, each model is trained on a different 80% of the dataset.
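
A small sketch using scikit-learn's KFold (one possible tool for this) on ten made-up rows; each fold trains on a different 80% of the rows:

    import numpy as np
    from sklearn.model_selection import KFold

    X = np.arange(10)  # ten made-up rows

    for train_idx, valid_idx in KFold(n_splits=5).split(X):
        # each model would be trained on a different 80% of the rows
        print("train on:", train_idx, "validate on:", valid_idx)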

Learning Rate

A numerical value which controls how much the gradients update the parameters of a model.
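
Gradients and the learning rate come together in the basic update step of gradient descent; a one-parameter sketch with made-up numbers:

    weight = 3.0         # a single made-up parameter
    gradient = 0.5       # made-up gradient for that parameter
    learning_rate = 0.1

    # the learning rate scales how much the gradient adjusts the parameter
    weight = weight - learning_rate * gradient
    print(weight)  # 2.95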

Linear Combination

The operation of taking the product of each pair of corresponding elements in two vectors, and summing all the products. Also known as a dot product.

Loss

A measure of the performance of a model. It is used by the model to improve itself. Typically, the lower the loss, the better.

Matrix

A table of values. See also vector.

Mean Absolute Error (MAE)

A type of metric. It is a value that tells us, on average, how close a set of predicted values is to the actual values. The smaller the MAE, the better.

Mean Squared Error (MSE)

A type of metric. It is a value that tells us, on average, how close a set of predicted values is to the actual values. The smaller the MSE, the better.
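
A quick sketch with made-up values, computing both MAE and MSE:

    import numpy as np

    predicted = np.array([3.0, 5.0, 2.5])
    actual = np.array([2.5, 5.0, 4.0])

    mae = np.abs(predicted - actual).mean()   # mean absolute error
    mse = ((predicted - actual) ** 2).mean()  # mean squared error
    print(mae, mse)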

Metric

A measure of performance of a model. It is used by humans to judge the performance of the model.

Model

A mathematical equation that mimics a real-life phenomenon. This equation can be used to predict desired quantities.

Named Entity Recognition (NER)

An NLP classification task where a sentence is broken into its components, and the model attempts to assign each component to a specific entity (e.g., person, place, organization).

Numericalization

A process where numbers are assigned to each token. Occurs after tokenization.
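
A minimal sketch of tokenization followed by numericalization, using a made-up sentence and a vocabulary built on the fly:

    document = "the cat sat on the mat"

    tokens = document.split()  # tokenization: split into words
    vocab = {tok: i for i, tok in enumerate(dict.fromkeys(tokens))}  # word -> number
    numericalized = [vocab[tok] for tok in tokens]  # numericalization

    print(tokens)         # ['the', 'cat', 'sat', 'on', 'the', 'mat']
    print(numericalized)  # [0, 1, 2, 3, 0, 4]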

One Hot Encoding

A data processing technique where each class in a categorical feature is given its own column that contains true and false values.
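
A sketch using pandas (one possible tool for this) on a made-up categorical column:

    import pandas as pd

    df = pd.DataFrame({"color": ["red", "green", "red", "blue"]})

    # each class in the 'color' column gets its own True/False column
    print(pd.get_dummies(df["color"]))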

OneR Classifier

The simplest type of decision tree. The tree only contains a single split.

Random Forest

The name given to a bagged ensemble of decision trees.
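
A short sketch with scikit-learn (one possible tool) and made-up data; each tree in the forest sees a random subset of the rows and columns (bagging), and the trees' predictions are averaged:

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    X = np.random.rand(100, 4)       # made-up rows and columns
    y = (X[:, 0] > 0.5).astype(int)  # made-up target column

    forest = RandomForestClassifier(n_estimators=100).fit(X, y)
    print(forest.predict(X[:5]))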

Root Mean Squared Error (RMSE)

A type of metric. It is a value that tells us, on average, how close a set of predicted values is to the actual values. The smaller the RMSE, the better.

Root Mean Squared Logarithmic Error (RMSLE)

A type of metric. It is a value that tells us, on average, how close a set of predicted values is to the actual values. The smaller the RMSLE, the better.
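
Continuing the made-up values from the MAE/MSE sketch above, RMSE and RMSLE can be computed like so:

    import numpy as np

    predicted = np.array([3.0, 5.0, 2.5])
    actual = np.array([2.5, 5.0, 4.0])

    rmse = np.sqrt(((predicted - actual) ** 2).mean())
    rmsle = np.sqrt(((np.log1p(predicted) - np.log1p(actual)) ** 2).mean())
    print(rmse, rmsle)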

Sequence

The name given to a piece or collection of text. It can be anything from a single word to a sentence, a paragraph, a page of text, or a full book, and so on. Also referred to as a document.

Softmax

A function that converts a set of raw prediction scores into probabilities that sum to 1.
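
A minimal NumPy sketch with made-up scores; the outputs are probabilities that sum to 1:

    import numpy as np

    scores = np.array([2.0, 1.0, 0.1])  # made-up raw prediction scores

    probs = np.exp(scores) / np.exp(scores).sum()  # softmax
    print(probs, probs.sum())  # probabilities that sum to 1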

Tabular Data

Data in the form of a table.

Tabular Model

A model trained on tabular data. It is used to predict a specified column in the data.

Tokenization

Splitting a document into its component words.

Transformer

The name given to a Natural Language Processing (NLP) architecture that, in a nutshell, either fills in the blanks or autocompletes text. Transformers consist of an encoder, a decoder, or both.

Vector

A table of values that has either a single row or a single column. See also matrix.

Weight Decay

A technique for making sure your weights do not grow too large, and in turn overfit the data.
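
In practice this is often just a parameter passed to an optimizer; a PyTorch-flavoured sketch with made-up values:

    import torch
    from torch import nn

    model = nn.Linear(10, 1)  # tiny made-up model

    # weight_decay adds a penalty that keeps the weights from growing too large
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01, weight_decay=0.01)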

Zero-shot

A prefix given to a pretrained model that can be used without finetuning.
