ForBo7 // Salman Naqvi

The AI Dictionary © 2022 by Salman Naqvi is licensed under CC BY-NC 4.0 | This dictionary is currently under soft launch.

The AI Dictionary

I often find explanations online to be more complicated than they need to be. Here, I hope to fix that. New terms will continue to be added over time.

Click terms to jump to other relevant terms and view expanded definitions.

Do let me know of any corrections and improvements!

Accuracy

A type of metric. It is a value that tells us how often a model produces correct predictions. The higher the accuracy, the better.
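As a rough illustration, accuracy can be computed by counting how many predictions match the actual labels. This is only a minimal sketch in Python; the function name and the labels are made up for the example.

```python
def accuracy(predictions, targets):
    """Return the fraction of predictions that match their targets."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    correct = sum(p == t for p, t in zip(predictions, targets))
    return correct / len(targets)


# Made-up labels: 3 of the 4 predictions are correct.
predicted = ["cat", "dog", "cat", "dog"]
actual = ["cat", "dog", "dog", "dog"]
print(accuracy(predicted, actual))  # 0.75
```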

Architecture

A model that is used as a template or a starting point for another model.

Bagging

An ensembling technique. When bagging, each model is trained on a random subset of the rows and a random subset of the columns, sampled with replacement.
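The sampling step can be sketched roughly as follows. This is a minimal Python example, not a full implementation: the function name and the table are made up, and I assume here that rows are drawn with replacement while columns are drawn without repeats.

```python
import random


def bagging_sample(rows, columns, n_columns, rng):
    """Draw the training data for one model in a bagged ensemble."""
    # Rows are drawn with replacement: some repeat, others are left out.
    sampled_rows = [rng.choice(rows) for _ in range(len(rows))]
    # Assumption: columns are drawn without repeats, since a duplicated
    # column would add no new information.
    sampled_columns = rng.sample(columns, n_columns)
    return [{c: row[c] for c in sampled_columns} for row in sampled_rows]


# Made-up table with three columns.
rows = [
    {"age": 22, "fare": 7.25, "class": 3},
    {"age": 38, "fare": 71.28, "class": 1},
    {"age": 26, "fare": 7.93, "class": 3},
    {"age": 35, "fare": 53.10, "class": 1},
]
rng = random.Random(0)  # fixed seed so the sketch is repeatable
for model_number in range(3):
    print(model_number, bagging_sample(rows, ["age", "fare", "class"], 2, rng))
```

Each model in the ensemble would then be trained on its own sample.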

Decision Tree

A type of model that acts like an if-else statement.
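To make the if-else comparison concrete, here is a tiny hand-written tree in Python. The question it answers, the feature names, and the thresholds are all made up; a real decision tree learns its splits from data.

```python
def will_play_outside(is_raining, temperature_c):
    """A hand-written decision tree with two splits."""
    if is_raining:  # first split
        return False
    else:
        if temperature_c >= 15:  # second split
            return True
        else:
            return False


print(will_play_outside(is_raining=False, temperature_c=20))  # True
print(will_play_outside(is_raining=True, temperature_c=20))   # False
```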

Document

The name given to a piece or collection of text. It can be anything from a single word to a sentence, a paragraph, a page of text, a full book, and so on.

Ensemble

A collection of models whose predictions are averaged to obtain the final prediction.
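A minimal sketch of the averaging step in Python. The three "models" below are made-up functions standing in for trained models.

```python
def ensemble_predict(models, x):
    """Average the predictions of every model in the ensemble."""
    predictions = [model(x) for model in models]
    return sum(predictions) / len(predictions)


# Made-up stand-ins for three trained models.
models = [
    lambda x: 2.0 * x,
    lambda x: 2.0 * x + 1.0,
    lambda x: 2.0 * x - 1.0,
]
print(ensemble_predict(models, 3.0))  # (6.0 + 7.0 + 5.0) / 3 = 6.0
```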

Error Rate

A type of metric. It is a value that tells us how often a model produces incorrect predictions. The lower the error rate, the better.
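As a rough Python sketch, the error rate counts the predictions that do not match the actual labels. The function name and labels are made up for the example.

```python
def error_rate(predictions, targets):
    """Return the fraction of predictions that differ from their targets."""
    wrong = sum(p != t for p, t in zip(predictions, targets))
    return wrong / len(targets)


# Made-up labels: 1 of the 4 predictions is wrong.
print(error_rate(["cat", "dog", "cat", "dog"], ["cat", "dog", "dog", "dog"]))  # 0.25
```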

Gradient

A numerical value that tells us in which direction, and how strongly, to adjust a parameter of a model in order to reduce the loss. How much the parameter is actually adjusted is controlled by the learning rate.

Gradient Accumulation

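A technique for running or fitting large models on a not-so-powerful GPU. The gradients from several small batches are added together before the parameters are updated, so the update behaves as if one larger batch had been used.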
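The idea can be sketched without a GPU. In this minimal Python example, I assume a one-parameter model y = w * x with a mean squared error loss; the function names and data are made up. Summing the weighted gradients of small batches gives the same gradient as one large batch.

```python
import math


def batch_gradient(w, batch):
    """Gradient of the mean squared error of y = w * x over one batch."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)


def accumulated_gradient(w, data, micro_batch_size):
    """Build the full-batch gradient from several small batches."""
    total = 0.0
    for start in range(0, len(data), micro_batch_size):
        micro_batch = data[start:start + micro_batch_size]
        # Weight each small batch by its share of the data before adding it.
        total += batch_gradient(w, micro_batch) * len(micro_batch) / len(data)
    return total


# Made-up (x, y) pairs.
data = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.2)]
full = batch_gradient(0.5, data)
accumulated = accumulated_gradient(0.5, data, micro_batch_size=2)
print(math.isclose(full, accumulated))  # True
```

Only one small batch needs to fit in memory at a time, which is where the savings come from.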

Gradient Boosting Machine (GBM)

An ensembling technique where, instead of averaging the predictions of all models, each successive model predicts the error left over by the models before it. The predictions of all the models are then summed to obtain the final prediction.
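Here is a minimal Python sketch of the idea. I assume each model is a one-split regression tree on a single numeric column; the function names and data are made up, and real libraries add many refinements on top of this.

```python
def fit_stump(xs, ys):
    """Fit a one-split tree that predicts the mean of ys on each side of a threshold."""
    best = None
    # The largest x is skipped so the right side is never empty.
    # This sketch needs at least two distinct x values.
    for threshold in sorted(set(xs))[:-1]:
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = sum((y - left_mean) ** 2 for y in left)
        error += sum((y - right_mean) ** 2 for y in right)
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def fit_boosted(xs, ys, n_models):
    """Train each model on the error left over by the models before it."""
    models = []
    residuals = list(ys)
    for _ in range(n_models):
        model = fit_stump(xs, residuals)
        models.append(model)
        residuals = [r - model(x) for x, r in zip(xs, residuals)]
    return models


def boosted_predict(models, x):
    """Sum, rather than average, the predictions of all models."""
    return sum(model(x) for model in models)


# Made-up data.
xs = [1, 2, 3, 4]
ys = [1.0, 2.0, 3.0, 4.0]
for n_models in (1, 5):
    models = fit_boosted(xs, ys, n_models)
    print(n_models, [round(boosted_predict(models, x), 2) for x in xs])
```

With more models, the summed predictions move closer to the training values.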

K-Fold Cross Validation

An ensembling technique where each model is trained on a different portion of the dataset, with every portion being the same fixed percentage. For example, each model is trained on a different 80% of the dataset and validated on the remaining 20%.
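The splitting can be sketched in Python as follows. The function name is made up, and I assume the rows are split in order without shuffling. With 5 folds, each model trains on a different 80% of the rows.

```python
def k_fold_splits(n_rows, k):
    """Yield (train_indices, validation_indices) for each of the k folds."""
    indices = list(range(n_rows))
    fold_size, remainder = divmod(n_rows, k)
    start = 0
    for fold in range(k):
        # The first `remainder` folds take one extra row each.
        stop = start + fold_size + (1 if fold < remainder else 0)
        validation = indices[start:stop]
        train = indices[:start] + indices[stop:]
        yield train, validation
        start = stop


for train, validation in k_fold_splits(n_rows=10, k=5):
    print("train:", train, "validation:", validation)
```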

Learning Rate

A numerical value which controls how much the gradients update the parameters of a model.
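To see the learning rate at work, here is a minimal Python sketch. I assume a made-up loss of (w - 3) ** 2 with a single parameter w, so the best value of w is 3; the function names are hypothetical.

```python
def gradient(w):
    """Gradient of the made-up loss (w - 3) ** 2 with respect to w."""
    return 2 * (w - 3)


def train(w, learning_rate, steps):
    for _ in range(steps):
        # Step against the gradient, scaled by the learning rate.
        w = w - learning_rate * gradient(w)
    return w


print(train(w=0.0, learning_rate=0.1, steps=100))    # ends very close to 3
print(train(w=0.0, learning_rate=0.001, steps=100))  # has moved only part of the way
```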

Loss

A measure of the performance of a model. It is used by the model to improve itself. Typically, the lower the loss, the better.

Mean Absolute Error (MAE)

A type of metric. It is a value that tells us, on average, how far a set of predicted values is from the actual values. It is calculated as the mean of the absolute differences. The smaller the MAE, the better.
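A minimal Python sketch of the calculation, with a made-up function name and made-up values:

```python
def mean_absolute_error(predictions, targets):
    """Average of the absolute differences between predictions and targets."""
    return sum(abs(p - t) for p, t in zip(predictions, targets)) / len(targets)


# The absolute differences are 1, 0 and 2, so the mean is 1.0.
print(mean_absolute_error([2.0, 4.0, 7.0], [1.0, 4.0, 9.0]))  # 1.0
```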

Mean Squared Error (MSE)

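A type of metric. It is a value that tells us, on average, how far a set of predicted values is from the actual values. It is calculated as the mean of the squared differences, so large errors are penalized more heavily than small ones. The smaller the MSE, the better.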
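A minimal Python sketch of the calculation, with a made-up function name and made-up values:

```python
def mean_squared_error(predictions, targets):
    """Average of the squared differences between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


# The squared differences are 1, 0 and 4, so the mean is 5 / 3.
print(mean_squared_error([2.0, 4.0, 7.0], [1.0, 4.0, 9.0]))
```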

Metric

A measure of performance of a model. It is used by humans to judge the performance of the model.

Model

A mathematical equation that mimics a real-life phenomenon. This equation can be used to predict desired quantities.

Numericalization

A process where numbers are assigned to each token. Occurs after tokenization.
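A minimal Python sketch of the idea. The function name is made up, and I assume numbers are handed out in the order in which tokens first appear.

```python
def build_vocab(tokens):
    """Assign the next unused number to each token the first time it appears."""
    vocab = {}
    for token in tokens:
        if token not in vocab:
            vocab[token] = len(vocab)
    return vocab


tokens = ["the", "cat", "sat", "on", "the", "mat"]
vocab = build_vocab(tokens)
ids = [vocab[token] for token in tokens]
print(vocab)  # {'the': 0, 'cat': 1, 'sat': 2, 'on': 3, 'mat': 4}
print(ids)    # [0, 1, 2, 3, 0, 4]
```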

One Hot Encoding

A data processing technique where each class in a categorical feature is given its own column that contains true and false values.
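A minimal Python sketch of the idea, with a made-up function name and a made-up "colour" feature. Each class becomes its own column of true and false values.

```python
def one_hot_encode(values):
    """Turn one categorical column into one true/false column per class."""
    classes = sorted(set(values))
    return [{c: value == c for c in classes} for value in values]


for row in one_hot_encode(["red", "green", "red"]):
    print(row)
# {'green': False, 'red': True}
# {'green': True, 'red': False}
# {'green': False, 'red': True}
```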

OneR Classifier

The simplest type of decision tree. The tree only contains a single split.
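Finding a single split can be sketched in Python as follows. This is a simplified illustration that assumes one numeric column and tries every threshold on it; the function names and data are made up.

```python
from collections import Counter


def most_common_label(labels):
    return Counter(labels).most_common(1)[0][0]


def fit_one_split(values, labels):
    """Find the single threshold on one numeric column that makes the fewest mistakes."""
    best = None
    # The largest value is skipped so the right-hand side is never empty.
    for threshold in sorted(set(values))[:-1]:
        left = [label for v, label in zip(values, labels) if v <= threshold]
        right = [label for v, label in zip(values, labels) if v > threshold]
        # Each side of the split predicts its most common label.
        left_label = most_common_label(left)
        right_label = most_common_label(right)
        mistakes = sum(label != left_label for label in left)
        mistakes += sum(label != right_label for label in right)
        if best is None or mistakes < best[0]:
            best = (mistakes, threshold, left_label, right_label)
    _, threshold, left_label, right_label = best
    return threshold, left_label, right_label


# Made-up data.
sizes = [1, 2, 3, 10, 11, 12]
labels = ["small", "small", "small", "large", "large", "large"]
print(fit_one_split(sizes, labels))  # (3, 'small', 'large')
```

The result reads as a single if-else: if the size is at most 3, predict "small"; otherwise, predict "large".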

Random Forest

The name given to a bagged ensemble of decision trees.

Root Mean Squared Error (RMSE)

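A type of metric. It is a value that tells us, on average, how far a set of predicted values is from the actual values. It is calculated as the square root of the MSE, which puts the value back into the same units as the actual values. The smaller the RMSE, the better.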
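A minimal Python sketch of the calculation, with a made-up function name and made-up values:

```python
import math


def root_mean_squared_error(predictions, targets):
    """Square root of the average squared difference."""
    squared_errors = [(p - t) ** 2 for p, t in zip(predictions, targets)]
    return math.sqrt(sum(squared_errors) / len(targets))


# The squared differences are 1, 0 and 4, so the result is sqrt(5 / 3).
print(root_mean_squared_error([2.0, 4.0, 7.0], [1.0, 4.0, 9.0]))
```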

Root Mean Squared Logarithmic Error (RMSLE)

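A type of metric. It is a value that tells us, on average, how far a set of predicted values is from the actual values. It is calculated like the RMSE, but on the logarithms of the predicted and actual values, so it measures relative differences rather than absolute ones. The smaller the RMSLE, the better.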
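A minimal Python sketch of the calculation. I assume the common log(1 + x) form here, which also works when a value is zero; the function name and values are made up.

```python
import math


def rmsle(predictions, targets):
    """Root mean squared logarithmic error. Assumes every value is greater than -1."""
    squared_log_errors = [
        (math.log1p(p) - math.log1p(t)) ** 2 for p, t in zip(predictions, targets)
    ]
    return math.sqrt(sum(squared_log_errors) / len(targets))


# Both predictions are roughly double their targets, so they contribute
# errors of similar size even though they are off by 5 and by 500.
print(rmsle([10.0, 1000.0], [5.0, 500.0]))
```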

Tabular Data

Data in the form of a table.

Tabular Model

A model trained on tabular data. It is used to predict a specified column in the data.

Tokenization

Splitting a document into smaller pieces called tokens, typically its component words.
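A minimal Python sketch of word-level tokenization. The function name is made up, and I assume that the text is lowercased and that punctuation marks become their own tokens; real tokenizers make many other choices.

```python
import re


def tokenize(document):
    """Split a document into lowercase words and punctuation marks."""
    # \w+ matches a run of letters or digits; [^\w\s] matches one punctuation mark.
    return re.findall(r"\w+|[^\w\s]", document.lower())


print(tokenize("The cat sat on the mat."))
# ['the', 'cat', 'sat', 'on', 'the', 'mat', '.']
```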

ForBo7 // Salman Naqvi © 2022 and ForBlog™ by Salman Naqvi | Site Version 2.0.3.1 | Site Feedback | Website made by me!