CHARACTERISTICS OF THE MAIN NLP MODELS
08.06.2022 13:46
[1. Information Systems and Technologies]
Authors: Radoutskyi K. E., senior lecturer, V. N. Karazin Kharkiv National University, Kharkiv, Ukraine; Radoutska A. K., student, Kharkiv National University of Radio Electronics, Kharkiv, Ukraine
One of the tasks of language modeling is to predict the next word from the preceding text. This is needed for correcting typos, auto-completion, chatbots, etc. Therefore, in this work we want to determine the pros and cons of the most popular models.
1. Recurrent Neural Network Language Model (RNNLM)
Advantages:
- Simplicity.
- Good learnability and embedding generation.
- Availability of pre-trained versions.
Disadvantages:
- Does not capture long-term dependencies.
- Its simplicity limits the range of applications.
- Newer models are far more versatile and powerful.
2. word2vec
Advantages:
- Convenient architecture.
- Fast training and easy generation of embeddings.
- Easy interpretation of ambiguous cases.
- Versatile and useful in many areas.
Disadvantages:
- No usage context: if a word has more than one meaning, the model cannot tell which one is intended.
- Rare words are difficult to handle.
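The lack-of-context disadvantage follows directly from how word2vec is trained: every (center, context) pair inside a fixed window updates a single vector per word, regardless of which sense is meant. A minimal sketch of skip-gram pair generation (the toy corpus and window size are illustrative):

```python
def skipgram_pairs(tokens, window=2):
    """Yield (center, context) training pairs as in skip-gram word2vec."""
    pairs = []
    for i, center in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs

corpus = "the bank of the river".split()
pairs = skipgram_pairs(corpus, window=1)
```

Here "bank" receives one set of training pairs, and hence one vector, whether it refers to a riverbank or a financial institution.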
3. GloVe
Advantages:
- Simple architecture without a neural network.
- The model trains quickly, which may be sufficient for simple applications.
- More meaningful embeddings.
Disadvantages:
- Although the co-occurrence matrix provides global statistics, GloVe is still trained at the word level and does not incorporate information about the sentence or the context in which a word is used.
- Handles unknown and rare words poorly.
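The "global information" GloVe relies on is a word-word co-occurrence matrix built over the whole corpus, which the model then factorises into embeddings. A minimal sketch of building that matrix (the toy corpus, window size, and 1/distance weighting follow the standard GloVe recipe but are illustrative here):

```python
from collections import defaultdict

def cooccurrence(tokens, window=2):
    """Count symmetric word co-occurrences within a window:
    the global statistics that GloVe factorises into embeddings."""
    counts = defaultdict(float)
    for i, w in enumerate(tokens):
        for j in range(max(0, i - window), i):
            # Pairs are weighted by 1/distance, so closer words count more.
            weight = 1.0 / (i - j)
            counts[(w, tokens[j])] += weight
            counts[(tokens[j], w)] += weight
    return counts

tokens = "ice is cold and steam is hot".split()
X = cooccurrence(tokens, window=2)
```

Because the matrix only records how often words appear near each other, any sentence-level context is lost before training even begins, and a word absent from the corpus has no row at all, which is why unknown words are handled poorly.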
4. FastText
Advantages:
- Relatively simple architecture: one input layer, one hidden layer, one output layer.
- Thanks to its n-gram approach, it works well on rare words: embeddings for rare and out-of-vocabulary words are much better than those of GloVe and word2vec.
Disadvantages:
- No usage context: if a word has more than one meaning, the model cannot tell which one is intended.
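FastText's advantage on rare words comes from representing each word as a bag of character n-grams, so even an unseen word shares subword units with known ones. A minimal sketch of that decomposition (the n-gram range parameters mirror FastText's defaults but are illustrative):

```python
def char_ngrams(word, n_min=3, n_max=5):
    """Character n-grams as in FastText: the word is wrapped in
    boundary markers '<' and '>' and split into overlapping n-grams."""
    w = f"<{word}>"
    grams = [w[i:i + n]
             for n in range(n_min, n_max + 1)
             for i in range(len(w) - n + 1)]
    grams.append(w)   # FastText also keeps the whole word as a feature
    return grams

grams = char_ngrams("where", n_min=3, n_max=3)
```

A word's embedding is the sum of its n-gram vectors, so an out-of-vocabulary word like "whereabouts" still gets a sensible vector from the n-grams it shares with "where" and "about".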