This repository contains Python bindings for working with Nomic Atlas, the world’s most powerful unstructured data interaction platform. Atlas supports datasets ranging from hundreds to tens of millions of points, across data modalities including text, image, audio, and video.
With Nomic Atlas, you can:
Try the :notebook: Colab Demo to get started in Python
Read the :closed_book: Atlas Docs
Join our :hut: Discord to start chatting and get help
:world_map: Map of Twitter (5.4 million tweets)
:world_map: Map of StableDiffusion Generations (6.4 million images)
:world_map: Map of NeurIPS Proceedings (16,623 abstracts)
Here are just a few of the features which Atlas offers:
# Install the client
pip install nomic
# Log in to (or create) your Nomic account
nomic login
# Set your API token
nomic login [token]
from nomic import atlas
import numpy as np
# Randomly generate a set of 10,000 high-dimensional embeddings
num_embeddings = 10000
embeddings = np.random.rand(num_embeddings, 256)
# Create Atlas project
dataset = atlas.map_data(embeddings=embeddings)
print(dataset)
Atlas stores, manages and generates embeddings for your unstructured data.
You can access the latent (i.e. high-dimensional) embeddings stored in Atlas or the two-dimensional embeddings generated for web display.
# Access your Atlas map and download your embeddings
map = dataset.maps[0]
projected_embeddings = map.embeddings.projected
latent_embeddings = map.embeddings.latent
print(projected_embeddings)
# Response:
id          x          y
0    9.815330  -8.105308
1   -8.725819   5.980628
2   13.199472  -1.103389
...       ...        ...
print(latent_embeddings)
# Response:
n x d numpy.ndarray where n = number of datapoints and d = number of latent dimensions
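As a rough illustration of how the two representations can be used together, the sketch below picks out the points that fall inside a rectangular region of the 2D map and then looks up their latent vectors. It runs on randomly generated stand-in arrays rather than a real Atlas map. The helper name select_region, the coordinate ranges, and the assumption that the projected and latent embeddings share the same row order are all illustrative and not taken from the Atlas documentation.

```python
import numpy as np

# Stand-in data. In practice the 2D coordinates would come from the x and y
# columns of map.embeddings.projected and the latent array from
# map.embeddings.latent.
rng = np.random.default_rng(0)
n, d = 1000, 256
latent = rng.random((n, d))
xy = rng.uniform(-15, 15, size=(n, 2))  # hypothetical 2D map coordinates


def select_region(xy, x_range, y_range):
    """Return row indices of points whose 2D coordinates lie inside a rectangle."""
    x_min, x_max = x_range
    y_min, y_max = y_range
    inside = (
        (xy[:, 0] >= x_min) & (xy[:, 0] <= x_max)
        & (xy[:, 1] >= y_min) & (xy[:, 1] <= y_max)
    )
    return np.flatnonzero(inside)


rows = select_region(xy, x_range=(0, 5), y_range=(-5, 0))

# Assumption: row i of the 2D coordinates and row i of the latent array
# describe the same datum. Verify this against the id column on real data.
region_latent = latent[rows]
centroid = region_latent.mean(axis=0)
print(len(rows), centroid.shape)
```

On real data it is safer to match rows through the id column than to rely on position.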
Atlas automatically organizes your data into topics informed by the latent contents of your embeddings. Visually, these are represented by regions of homogeneous color on an Atlas map.
You can access and operate on topics programmatically by using the topics attribute of an AtlasMap.
# Access your Atlas map
map = dataset.maps[0]
# Access a pandas DataFrame associating each datum on your map with its topics at each topic depth.
topic_df = map.topics.df
print(topic_df)
Response:
id     topic_depth_1        topic_depth_2
0      Oil Prices           mergers and acquisitions
1      Iraq War             Trial of Thatcher
2      Oil Prices           Economic Growth
...    ...                  ...
9997   Oil Prices           Economic Growth
9998   Baseball             Giambi's contract
9999   Olympic Gold Medal   European Football
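Once the topic table is in hand, a common next step is to count how many data points fall under each topic or to list the ids that belong to one topic. The sketch below does this in plain Python on a few rows modeled on the sample response. The row layout mirrors what topic_df.to_dict("records") would produce, and the helper names topic_counts and ids_in_topic are hypothetical, not part of the nomic client. With the actual DataFrame, topic_df["topic_depth_1"].value_counts() should give the same counts.

```python
from collections import Counter

# Stand-in rows modeled on the sample response; a real table would come
# from map.topics.df.
rows = [
    {"id": 0, "topic_depth_1": "Oil Prices", "topic_depth_2": "mergers and acquisitions"},
    {"id": 1, "topic_depth_1": "Iraq War", "topic_depth_2": "Trial of Thatcher"},
    {"id": 2, "topic_depth_1": "Oil Prices", "topic_depth_2": "Economic Growth"},
    {"id": 9997, "topic_depth_1": "Oil Prices", "topic_depth_2": "Economic Growth"},
    {"id": 9998, "topic_depth_1": "Baseball", "topic_depth_2": "Giambi's contract"},
    {"id": 9999, "topic_depth_1": "Olympic Gold Medal", "topic_depth_2": "European Football"},
]


def topic_counts(rows, depth):
    """Count how many data points fall under each topic at the given depth."""
    column = f"topic_depth_{depth}"
    return Counter(row[column] for row in rows)


def ids_in_topic(rows, depth, topic):
    """Return the ids of all data points labeled with `topic` at the given depth."""
    column = f"topic_depth_{depth}"
    return [row["id"] for row in rows if row[column] == topic]


print(topic_counts(rows, 1).most_common(1))  # [('Oil Prices', 3)]
print(ids_in_topic(rows, 1, "Oil Prices"))   # [0, 2, 9997]
```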
Use Atlas to automatically find nearest neighbors in your vector database.
# Load map and perform vector search for the five nearest neighbors of datum with id "my_query_point"
map = dataset.maps[0]
with dataset.wait_for_dataset_lock():
    neighbors, _ = map.embeddings.vector_search(ids=['my_query_point'], k=5)
# Return similar data points
similar_datapoints = dataset.get_data(ids=neighbors[0])
print(similar_datapoints)
Response:
Original query point:
"Intel abandons digital TV chip project NEW YORK, October 22 (newratings.com) - Global semiconductor giant Intel Corporation (INTC.NAS) has called off its plan to develop a new chip for the digital projection televisions."
Nearest neighbors:
"Intel awaits government move on expensing options Figuring it's had enough of fighting over options, the chip giant is waiting to see what Congress comes up with."
"Citigroup Takes On Intel The financial services giant takes over non-memory semiconductor chip production."
"Intel Seen Readying New Wi-Fi Chips SAN FRANCISCO (Reuters) - Intel Corp. this week is expected to introduce a chip that adds support for a relatively obscure version of Wi-Fi, analysts said on Monday, in a move that could help ease congestion on wireless networks."
"Intel pledges to bring Itanic down to Xeon price-point EM64T a stand-in until the real anti-AMD64 kit arrives"
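To make the idea of a nearest-neighbor query concrete, here is a minimal brute-force sketch in NumPy. It is not how Atlas implements vector_search, and the choice of cosine similarity is an assumption since the metric Atlas uses is not stated here. The function name nearest_neighbors and the random stand-in embeddings are illustrative only.

```python
import numpy as np


def nearest_neighbors(embeddings, query_index, k=5):
    """Return indices of the k rows most cosine-similar to the query row.

    Brute-force illustration only; cosine similarity is an assumed metric.
    """
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / np.clip(norms, 1e-12, None)  # avoid division by zero
    similarity = unit @ unit[query_index]
    similarity[query_index] = -np.inf  # exclude the query point itself
    return np.argsort(-similarity)[:k]


# Stand-in embeddings; real ones would come from map.embeddings.latent.
rng = np.random.default_rng(0)
embeddings = rng.random((100, 16))
neighbors = nearest_neighbors(embeddings, query_index=0, k=5)
print(neighbors)
```

A full sort like this costs O(n log n) per query, which is fine for small arrays but is exactly the work a hosted vector index spares you on multi-million point datasets.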
Atlas is developed by the Nomic AI team, which is based in NYC. Nomic also developed and maintains GPT4All, an open-source LLM chatbot ecosystem.
Join the discussion on our :hut: Discord to ask questions, get help, and chat with others about Atlas, Nomic, GPT4All, and related topics. Our doors are open to enthusiasts of all skill levels.
FAQs
The official Nomic python client.
We found that nomic demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 1 open source maintainer collaborating on the project.