Using a pre-trained word embedding (word2vec or Glove) in TensorFlow

  • I've recently reviewed an interesting implementation for convolutional text classification. However, all the TensorFlow code I've reviewed uses random (not pre-trained) embedding vectors, like the following:
    with tf.device('/cpu:0'), tf.name_scope("embedding"):
        W = tf.Variable(
            tf.random_uniform([vocab_size, embedding_size], -1.0, 1.0),
            name="W")
        self.embedded_chars = tf.nn.embedding_lookup(W, self.input_x)
        self.embedded_chars_expanded = tf.expand_dims(self.embedded_chars, -1)

    Does anybody know how to use the results of Word2vec or a GloVe pre-trained word embedding instead of a random one?

     
      August 3, 2021 10:49 PM IST
    0
  • I use this method to load and share embeddings:

    # Create a variable from the pre-trained matrix; trainable=False keeps it frozen
    W = tf.get_variable(name="W", shape=embedding.shape, initializer=tf.constant_initializer(embedding), trainable=False)
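    As a minimal sketch of where the embedding matrix above might come from, the following builds it from a GloVe text file and looks it up with TF1-style ops; the file name, vocabulary, and dimensions are illustrative assumptions, not part of the original answer.

    import numpy as np
    import tensorflow as tf

    # Illustrative inputs: a vocabulary (in index order) and a GloVe text file.
    vocab = ["the", "cat", "sat"]
    embedding_size = 100  # must match the dimension of the GloVe file

    # Parse the GloVe file into a {token: vector} dict.
    glove = {}
    with open("glove.6B.100d.txt", encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            glove[parts[0]] = np.asarray(parts[1:], dtype=np.float32)

    # Build the matrix in vocabulary order; tokens missing from GloVe stay random.
    embedding = np.random.uniform(-1.0, 1.0, (len(vocab), embedding_size)).astype(np.float32)
    for i, token in enumerate(vocab):
        if token in glove:
            embedding[i] = glove[token]

    # Load the matrix into a (frozen) variable and use it for lookups.
    W = tf.get_variable(name="W", shape=embedding.shape,
                        initializer=tf.constant_initializer(embedding),
                        trainable=False)
    input_x = tf.placeholder(tf.int32, shape=[None, None])
    embedded_chars = tf.nn.embedding_lookup(W, input_x)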
    
      August 17, 2021 1:06 PM IST
    0
  • The answer of @mrry is not right, because it causes the embedding weights to be overwritten every time the network is run. If you are following a minibatch approach to train your network, you end up overwriting the pre-trained embedding weights on every step. So, in my view, the right way to use pre-trained embeddings is:

    embeddings = tf.get_variable("embeddings", shape=[dim1, dim2],
                                 initializer=tf.constant_initializer(np.array(embeddings_matrix)))
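    A minimal sketch of this pattern inside a minibatch training loop, using a toy loss and randomly generated data as stand-ins for a real model (embeddings_matrix here is also a random stand-in for the pre-trained vectors):

    import numpy as np
    import tensorflow as tf

    embeddings_matrix = np.random.rand(5000, 100).astype(np.float32)  # stand-in for GloVe/word2vec
    dim1, dim2 = embeddings_matrix.shape

    # Created once when the graph is built; never re-assigned inside the loop.
    embeddings = tf.get_variable("embeddings", shape=[dim1, dim2],
                                 initializer=tf.constant_initializer(np.array(embeddings_matrix)),
                                 trainable=True)  # False keeps the vectors frozen

    input_ids = tf.placeholder(tf.int32, shape=[None, None])
    looked_up = tf.nn.embedding_lookup(embeddings, input_ids)

    # Toy objective standing in for the real model and loss.
    loss = tf.reduce_mean(tf.square(looked_up))
    train_op = tf.train.GradientDescentOptimizer(0.1).minimize(loss)

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())  # pre-trained values loaded here, once
        for _ in range(3):  # minibatch loop
            batch_ids = np.random.randint(0, dim1, size=(32, 10))
            sess.run(train_op, feed_dict={input_ids: batch_ids})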
    
      October 30, 2021 2:29 PM IST
    0
  • With TensorFlow version 2 it's quite easy if you use the Embedding layer:

    X = tf.keras.layers.Embedding(input_dim=vocab_size,
                                  output_dim=300,
                                  input_length=Length_of_input_sequences,
                                  # pass the pre-trained matrix wrapped in a constant initializer
                                  embeddings_initializer=tf.keras.initializers.Constant(
                                      matrix_of_pretrained_weights)
                                  )(ur_inp)
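
    A minimal end-to-end sketch of this layer inside a Keras model; vocab_size, the sequence length, and the random pretrained_matrix below are illustrative stand-ins for your vocabulary and GloVe/word2vec vectors:

    import numpy as np
    import tensorflow as tf

    vocab_size = 10000
    embedding_dim = 300
    sequence_length = 50
    pretrained_matrix = np.random.rand(vocab_size, embedding_dim)  # replace with your GloVe/word2vec matrix

    inputs = tf.keras.Input(shape=(sequence_length,), dtype="int32")
    x = tf.keras.layers.Embedding(
        input_dim=vocab_size,
        output_dim=embedding_dim,
        embeddings_initializer=tf.keras.initializers.Constant(pretrained_matrix),
        trainable=False)(inputs)  # freeze the pre-trained vectors
    x = tf.keras.layers.GlobalAveragePooling1D()(x)
    outputs = tf.keras.layers.Dense(1, activation="sigmoid")(x)

    model = tf.keras.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    model.summary()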

     

      November 15, 2021 12:41 PM IST
    0
  • 2.0 Compatible Answer: There are many pre-trained embeddings which have been developed by Google and open-sourced.

    Some of them are Universal Sentence Encoder (USE), ELMo, BERT, etc., and it is very easy to reuse them in your code.

    Code to reuse the Universal Sentence Encoder pre-trained embedding is shown below:

      !pip install "tensorflow_hub>=0.6.0"
      !pip install "tensorflow>=2.0.0"

      import tensorflow as tf
      import tensorflow_hub as hub

      # Load the Universal Sentence Encoder from TF Hub as a Keras layer
      module_url = "https://tfhub.dev/google/universal-sentence-encoder/4"
      embed = hub.KerasLayer(module_url)
      embeddings = embed(["A long sentence.", "single-word",
                          "http://example.com"])
      print(embeddings.shape)  # (3, 512)


    For more information on the pre-trained embeddings developed and open-sourced by Google, refer to the TF Hub link.
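
    As a follow-up, a minimal sketch of plugging the same TF Hub layer into a Keras classifier; the Dense head sizes and the binary-classification setup are illustrative assumptions:

      import tensorflow as tf
      import tensorflow_hub as hub

      module_url = "https://tfhub.dev/google/universal-sentence-encoder/4"

      # The hub layer maps raw strings to 512-dimensional sentence embeddings.
      model = tf.keras.Sequential([
          hub.KerasLayer(module_url, input_shape=[], dtype=tf.string, trainable=False),
          tf.keras.layers.Dense(64, activation="relu"),
          tf.keras.layers.Dense(1, activation="sigmoid"),  # e.g. a binary sentiment head
      ])
      model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
      model.summary()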


      August 16, 2021 3:07 PM IST
    0