Convert "the cat sat on the mat" to ["the", "cat", "sat", "on", "the", "mat"]
Things that may need to be considered: upper vs. lower case (Apple or apple), stop words ("the", "a", "of", etc.), and typo correction ("gooood" vs. "good").
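The tokenization step above can be sketched in a few lines of Python. This is only a minimal illustration: the function name `tokenize` and its parameters are made up for this example, it splits on whitespace only, and it does not attempt typo correction.

```python
def tokenize(text, lowercase=True, stop_words=None):
    """Split text on whitespace; optionally lowercase and drop stop words.

    Hypothetical helper for illustration, not a library function.
    """
    if lowercase:
        text = text.lower()  # "Apple" and "apple" become the same token
    tokens = text.split()
    if stop_words:
        tokens = [t for t in tokens if t not in stop_words]
    return tokens


print(tokenize("the cat sat on the mat"))
# ['the', 'cat', 'sat', 'on', 'the', 'mat']
print(tokenize("The Cat sat on the mat", stop_words={"the", "on"}))
# ['cat', 'sat', 'mat']
```

Whether to lowercase or remove stop words depends on the task, so both are left as options here.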
Align all sample sentences to the same length / number of tokens. Perform zero padding on sentences that are too short by filling missing words with 0's.
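A minimal sketch of the alignment step, assuming sentences have already been converted to lists of integer word indices. The helper name `pad_to_length` is hypothetical. Padding at the end and truncating from the end are assumptions made here; the notes only say that missing words are filled with 0's.

```python
def pad_to_length(sequences, length, pad_value=0):
    """Return copies of the sequences, each with exactly `length` tokens."""
    padded = []
    for seq in sequences:
        seq = list(seq)[:length]  # too long: keep the first `length` tokens
        seq = seq + [pad_value] * (length - len(seq))  # too short: fill with 0's
        padded.append(seq)
    return padded


print(pad_to_length([[5, 2, 9], [7], [1, 2, 3, 4, 5]], length=4))
# [[5, 2, 9, 0], [7, 0, 0, 0], [1, 2, 3, 4]]
```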
Step 5: Word Embedding
v is the number of words in the dictionary, which is also the length of a one-hot encoded vector
d is the dimension of the vector that represents a word
Multiplying the embedding matrix by a word's one-hot vector selects that word's vector
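The selection by matrix multiplication can be checked with a small NumPy sketch. The toy sizes, the random values, and the layout of the embedding matrix `E` (one row per word, shape v x d) are assumptions for illustration; a transposed d x v layout works the same way with the multiplication order swapped.

```python
import numpy as np

v, d = 6, 3  # toy vocabulary size and embedding dimension
rng = np.random.default_rng(0)
E = rng.normal(size=(v, d))  # embedding matrix: row i is the vector for word i


def one_hot(index, size):
    """Return a length-`size` vector with a 1 at `index` and 0 elsewhere."""
    vec = np.zeros(size)
    vec[index] = 1.0
    return vec


word_index = 2
word_vector = one_hot(word_index, v) @ E  # (v,) @ (v, d) -> (d,)

# The product is exactly row `word_index` of E.
assert np.allclose(word_vector, E[word_index])
```

Because the result is just one row of the matrix, practical implementations usually skip the multiplication and index the row directly.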
Simple RNN
There's only one set of A in an RNN model: the same parameter matrix A is reused at every time step. The values in A are initialized randomly and adjusted during training.
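To make the shared A concrete, here is a hedged NumPy sketch of a forward pass. The update rule h_t = tanh(A · [h_{t-1}, x_t] + b) is the common textbook form and is an assumption here, as are the bias `b`, the function name `rnn_forward`, and the toy dimensions.

```python
import numpy as np


def rnn_forward(xs, A, b, h0):
    """Run a simple RNN over word vectors xs, reusing the same A at every step."""
    h = h0
    states = []
    for x in xs:
        # Concatenate previous state and current word vector, then apply A.
        h = np.tanh(A @ np.concatenate([h, x]) + b)
        states.append(h)
    return states


state_dim, word_dim = 4, 3
rng = np.random.default_rng(0)
A = rng.normal(scale=0.1, size=(state_dim, state_dim + word_dim))  # random init
b = np.zeros(state_dim)
xs = rng.normal(size=(5, word_dim))  # five toy word vectors

states = rnn_forward(xs, A, b, h0=np.zeros(state_dim))
print(len(states), states[-1].shape)  # 5 (4,)
```

Training would adjust A and b; the loop itself never creates a second copy of them.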
LSTM
Conveyor Belt: information directly flows from the past to the future
Forget Gate:
For example, if a = [1, 3, 0, -2], we get:
In the above, 0.2 will not go through because it's matched with 0. -0.5 will fully go through because it's matched with 1.
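Both ideas can be reproduced with plain Python. The sketch assumes the function applied to a is the sigmoid, which squashes each entry into the range 0 to 1. The conveyor belt values `c` and gate values `gate` below are hypothetical, chosen so that 0.2 meets a 0 and -0.5 meets a 1.

```python
import math


def sigmoid(values):
    """Apply the sigmoid function to each entry of a list."""
    return [1.0 / (1.0 + math.exp(-x)) for x in values]


a = [1, 3, 0, -2]
print([round(x, 3) for x in sigmoid(a)])
# [0.731, 0.953, 0.5, 0.119]

# Gating is an elementwise product: 0 blocks a value, 1 lets it through.
c = [0.9, 0.2, -0.5, -0.1]   # hypothetical conveyor belt values
gate = [0.5, 0.0, 1.0, 0.8]  # hypothetical gate values between 0 and 1
gated = [ci * gi for ci, gi in zip(c, gate)]
# gated[1] is 0.0 (0.2 blocked); gated[2] is -0.5 (passed through unchanged)
```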
In a real model, the gate values are not hand-picked numbers like these; they are computed from the data.
The vector of the previous state is concatenated with the current word's vector, multiplied by a set of weights, and then passed through the sigmoid activation function to become ft.
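The computation of ft described above might look like the following in NumPy. The names `forget_gate`, `W_f`, and `b_f` are hypothetical, and the bias term is an assumption (the notes mention only weights).

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def forget_gate(h_prev, x_t, W_f, b_f):
    """Concatenate previous state and word vector, apply weights, then sigmoid."""
    return sigmoid(W_f @ np.concatenate([h_prev, x_t]) + b_f)


state_dim, word_dim = 4, 3
rng = np.random.default_rng(0)
W_f = rng.normal(scale=0.1, size=(state_dim, state_dim + word_dim))
b_f = np.zeros(state_dim)

f_t = forget_gate(np.zeros(state_dim), rng.normal(size=word_dim), W_f, b_f)
print(f_t.shape)  # (4,)
```

Every entry of `f_t` lies strictly between 0 and 1, so it can be multiplied elementwise with the conveyor belt to decide how much of each value is kept.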
Input Gate: decides which values of the conveyor belt to update
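There are two sets of operations here, one with sigmoid (it) and one with tanh (C̃t).
Now that we have everything, we can find Ct: the old conveyor belt is scaled elementwise by the forget gate, and the new candidate values are scaled by the input gate and added, i.e. Ct = ft ∘ Ct-1 + it ∘ C̃t.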
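Output Gate: decides what flows from the conveyor belt to the state
Ot is computed the same way as ft and it: the previous state is concatenated with the current word's vector, multiplied by its own set of weights, and passed through the sigmoid activation function.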
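Putting the gates together, one LSTM step can be sketched as below. This follows the standard LSTM formulation rather than any particular library's code; the parameter names in `params`, the bias terms, the toy dimensions, and the final state update h_t = o_t ∘ tanh(C_t) are assumptions for illustration.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM step: returns the new state h_t and conveyor belt c_t."""
    z = np.concatenate([h_prev, x_t])  # previous state + current word vector
    f_t = sigmoid(params["W_f"] @ z + params["b_f"])      # forget gate
    i_t = sigmoid(params["W_i"] @ z + params["b_i"])      # input gate
    c_tilde = np.tanh(params["W_c"] @ z + params["b_c"])  # candidate values
    c_t = f_t * c_prev + i_t * c_tilde                    # update conveyor belt
    o_t = sigmoid(params["W_o"] @ z + params["b_o"])      # output gate
    h_t = o_t * np.tanh(c_t)                              # new state
    return h_t, c_t


state_dim, word_dim = 4, 3
rng = np.random.default_rng(0)
params = {}
for name in ("f", "i", "c", "o"):  # each gate has its own weights
    params[f"W_{name}"] = rng.normal(scale=0.1, size=(state_dim, state_dim + word_dim))
    params[f"b_{name}"] = np.zeros(state_dim)

h, c = np.zeros(state_dim), np.zeros(state_dim)
for x_t in rng.normal(size=(5, word_dim)):  # five toy word vectors
    h, c = lstm_step(x_t, h, c, params)
print(h.shape, c.shape)  # (4,) (4,)
```

Note that the conveyor belt `c_t` is only ever scaled and added to, which is what lets information flow from the past to the future with little interference.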