OpenAI has introduced embeddings, a new endpoint in the OpenAI API, to assist in semantic search, clustering, topic modeling, and classification.
OpenAI’s embeddings outperform top models in three standard benchmarks, including a 20% relative improvement in code search. Embeddings are really useful for working with natural language and code.
The embeddings that are numerically similar are also semantically similar. For example, the embedding vector of “canine companions say” will be more similar to the embedding vector of “woof” than “meow.” The new endpoint by OpenAI uses neural network models to map text and code to a vector representation—“embedding” them in a high-dimensional space. Each dimension captures some aspect of the input.