jaclearn.embedding package

Submodules

jaclearn.embedding.constant module

jaclearn.embedding.embedding_utils module

jaclearn.embedding.embedding_utils.init_random(elements_to_embed, embedding_size, add_all_zeros=False, add_unknown=False)

Initialize a random embedding matrix for a collection of elements. Elements are sorted in order to ensure the same mapping from indices to elements each time.

Parameters:
  • elements_to_embed – collection of elements to construct the embedding matrix for
  • embedding_size – size of the embedding
  • add_all_zeros – add an all-zero embedding at index 0
  • add_unknown – add an embedding for unknown elements at the last index
Returns:

an embedding matrix and a dictionary mapping elements to rows in the matrix
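
The behaviour described above can be sketched in plain Python. This is an illustration under stated assumptions, not the jaclearn implementation: the function name `sketch_init_random`, the placeholder keys `"<ZEROS>"` and `"<UNK>"`, the `seed` argument, the de-duplication of elements, and the uniform initialisation range are all hypothetical, and a list of lists stands in for the matrix.

```python
import random

# Hypothetical placeholder keys; the real package may use different names.
ALL_ZEROS_KEY = "<ZEROS>"
UNKNOWN_KEY = "<UNK>"


def sketch_init_random(elements_to_embed, embedding_size,
                       add_all_zeros=False, add_unknown=False, seed=0):
    """Return (matrix, element2idx) for a collection of elements."""
    element2idx = {}
    if add_all_zeros:
        element2idx[ALL_ZEROS_KEY] = 0  # reserved row 0
    # Sorting makes the index assignment reproducible across runs.
    for element in sorted(set(elements_to_embed)):
        element2idx[element] = len(element2idx)
    if add_unknown:
        element2idx[UNKNOWN_KEY] = len(element2idx)  # last row

    # One random row per entry (assumed range: uniform in [-0.1, 0.1]).
    rng = random.Random(seed)
    matrix = [[rng.uniform(-0.1, 0.1) for _ in range(embedding_size)]
              for _ in range(len(element2idx))]
    if add_all_zeros:
        matrix[0] = [0.0] * embedding_size  # overwrite row 0 with zeros
    return matrix, element2idx


matrix, element2idx = sketch_init_random(
    ["walk", "run", "jump"], 4, add_all_zeros=True, add_unknown=True)
print(element2idx)
```

With both flags set, the three elements occupy rows 1 to 3 in sorted order, the all-zero row sits at index 0, and the unknown row is last.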

jaclearn.embedding.embedding_utils.make_element2idx(elements_to_embed, add_all_zeros=False, add_unknown=False)

jaclearn.embedding.visualize_tb module

jaclearn.embedding.visualize_tb.visualize_word_embedding_tb(emb, log_dir)

jaclearn.embedding.word_embedding module

jaclearn.embedding.word_embedding.load(path, word_index_only=False, filter=None, format='glove')

Loads pre-trained embeddings from the specified path.
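
For orientation, the GloVe text format stores one token per line followed by its space-separated vector components. The sketch below parses that format. It is a minimal illustration, not the library's loader: `sketch_load_glove` and its `keep` argument are hypothetical names, the return shape is an assumption, and how jaclearn actually interprets `filter` is not stated in this reference.

```python
import io


def sketch_load_glove(lines, word_index_only=False, keep=None):
    """Parse GloVe-style text lines: a token followed by its vector components."""
    word2idx = {}
    vectors = []
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        word, values = parts[0], parts[1:]
        if word in word2idx:
            continue  # keep the first occurrence of a repeated token
        if keep is not None and word not in keep:
            continue  # optional vocabulary restriction (assumed semantics)
        word2idx[word] = len(word2idx)
        if not word_index_only:
            vectors.append([float(v) for v in values])
    return word2idx if word_index_only else (vectors, word2idx)


sample = io.StringIO("the 0.1 0.2 0.3\ncat 0.4 0.5 0.6\n")
vectors, word2idx = sketch_load_glove(sample)
print(word2idx)  # {'the': 0, 'cat': 1}
```

Passing `word_index_only=True` skips the float conversion entirely, which is the cheap path when only the vocabulary is needed.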

jaclearn.embedding.word_embedding.load_word_index(path, filter=None, format='glove')

Loads only the word index from the embeddings file.

Returns:

a dictionary mapping words to indices

jaclearn.embedding.word_embedding.map(word, word2idx)

Get the word index for the given word. Maps all numbers to 0, lowercases if necessary.

Parameters:
  • word – the word in question
  • word2idx – dictionary constructed from an embeddings file
Returns:

integer index of the word
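
One plausible reading of this lookup is sketched below. The description is terse, so several details are assumptions rather than documented behaviour: that "maps all numbers to 0" means every digit is replaced by the character "0" before lookup, that lowercasing is tried only when the exact form is missing, and that unmatched words fall back to a hypothetical `"<UNK>"` entry. The name `sketch_map` is also hypothetical.

```python
import re

UNKNOWN_KEY = "<UNK>"  # hypothetical fallback entry


def sketch_map(word, word2idx):
    """Look up a word, trying progressively more normalised forms."""
    candidates = [word, word.lower()]
    # Assumed reading of "maps all numbers to 0": every digit becomes "0".
    candidates.append(re.sub(r"[0-9]", "0", word.lower()))
    for candidate in candidates:
        if candidate in word2idx:
            return word2idx[candidate]
    return word2idx[UNKNOWN_KEY]


word2idx = {"<UNK>": 0, "cat": 1, "0000": 2}
print(sketch_map("Cat", word2idx))   # 1
print(sketch_map("1987", word2idx))  # 2
print(sketch_map("dog", word2idx))   # 0
```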

jaclearn.embedding.word_embedding.map_sequence(word_sequence, word2idx)

Get embedding indices for the given word sequence.

Parameters:
  • word_sequence – sequence of words to process
  • word2idx – dictionary mapping words to their embedding indices
Returns:

a sequence of embedding indices
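
As a rough illustration of the sequence mapping, the sketch below converts a list of words into indices with a plain dictionary lookup. The function name `sketch_map_sequence`, the `unknown_key` argument, and the fallback to an `"<UNK>"` row are assumptions; the library version presumably applies its own per-word normalisation instead.

```python
def sketch_map_sequence(word_sequence, word2idx, unknown_key="<UNK>"):
    """Convert words to embedding indices, falling back to the unknown row."""
    unknown_idx = word2idx[unknown_key]
    return [word2idx.get(word, unknown_idx) for word in word_sequence]


word2idx = {"<UNK>": 0, "the": 1, "cat": 2, "sat": 3}
indices = sketch_map_sequence(["the", "cat", "purred"], word2idx)
print(indices)  # [1, 2, 0]
```

The resulting list can be used directly to select rows of an embedding matrix whose row order matches `word2idx`.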