
Sketch is an AI code-writing assistant for pandas users that understands the context of your data, greatly improving the relevance of suggestions. Sketch is usable in seconds and doesn't require adding a plugin to your IDE.
pip install sketch
Here we follow a "standard" (hypothetical) data-analysis workflow, showing a Natural Language interface that successfully navigates many tasks in the data stack landscape.
https://user-images.githubusercontent.com/916073/212602281-4ebd090f-09c4-495d-b48d-0b4c37b9f665.mp4
It's as simple as importing sketch, and then using the .sketch extension on any pandas dataframe.
import sketch
Now, any pandas dataframe you have will have an extension registered to it. Access this new extension by appending .sketch to your dataframe's name.
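For readers curious how an extension can appear on every dataframe, pandas offers a public accessor mechanism, pd.api.extensions.register_dataframe_accessor. The following is a minimal sketch of that mechanism only; it is an assumption that sketch registers itself in a similar way, and the accessor name "demo" and the integer_columns method are hypothetical, chosen purely for illustration.

```python
import pandas as pd


# "demo" is a hypothetical accessor name used for illustration only.
@pd.api.extensions.register_dataframe_accessor("demo")
class DemoAccessor:
    def __init__(self, pandas_obj):
        # pandas passes in the dataframe the accessor is attached to
        self._df = pandas_obj

    def integer_columns(self):
        """Return the names of columns with an integer dtype."""
        return [
            col for col in self._df.columns
            if pd.api.types.is_integer_dtype(self._df[col])
        ]


df = pd.DataFrame({"units": [3, 5], "price": [9.99, 4.5], "name": ["a", "b"]})
int_cols = df.demo.integer_columns()
```

Once the class is registered, every dataframe exposes df.demo, in the same way that importing sketch makes df.sketch available.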
.sketch.ask
Ask is a basic question-answer system on sketch. It returns a text answer based on the summary statistics and description of the data.
Use ask to get an understanding of the data, get better column names, ask hypotheticals (how would I go about doing X with this data), and more.
df.sketch.ask("Which columns are integer type?")
.sketch.howto
Howto is the basic "code-writing" prompt in sketch. It returns a code block you should be able to copy and paste as a starting point (or possibly an ending point!) for any question you have about the data. Ask it how to clean the data, normalize it, create new features, plot, and even build models!
df.sketch.howto("Plot the sales versus time")
.sketch.apply
Apply is a more advanced prompt, most useful for data generation. Use it to parse fields, generate new features, and more. It is built directly on lambdaprompt. To use it, you will need to set up a free account with OpenAI and set an environment variable with your API key: OPENAI_API_KEY=YOUR_API_KEY
df['review_keywords'] = df.sketch.apply("Keywords for the review [{{ review_text }}] of product [{{ product_name }}] (comma separated):")
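import pandas as pd
# Build a small dataframe of states, then add a column generated by the prompt
states = pd.DataFrame({'State': ['Colorado', 'Kansas', 'California', 'New York']})
states['capital'] = states.sketch.apply("What is the capital of [{{ State }}]?")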
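The {{ column }} placeholders in these prompts are filled in once per row. The following is a minimal sketch of that per-row substitution, using a plain regex as a stand-in; it is not lambdaprompt's implementation, and the names PLACEHOLDER and fill_template are hypothetical. Rows are shown as plain dicts, which is what df.to_dict("records") would give for a dataframe.

```python
import re

# Matches {{ name }} with optional whitespace inside the braces.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def fill_template(template, row):
    """Replace each {{ column }} placeholder with that row's value."""
    return PLACEHOLDER.sub(lambda match: str(row[match.group(1)]), template)


# Each dict stands in for one dataframe row.
rows = [{"State": "Colorado"}, {"State": "Kansas"}]
template = "What is the capital of [{{ State }}]?"

# One filled-in prompt per row; each would then be sent to the language model.
prompts = [fill_template(template, row) for row in rows]
```

A placeholder naming a column that does not exist raises a KeyError here, which is a reasonable way to surface typos in a prompt template early.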
By default, sketch uses the prompts.approx.dev endpoint to help run with minimal setup.
You can also directly use a few pre-built Hugging Face models (right now MPT-7B and StarCoder), which will run entirely locally (once you download the model weights from HF).
Do this by setting 3 environment variables:
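import os
# Select the local Hugging Face backend instead of the remote endpoint
os.environ['LAMBDAPROMPT_BACKEND'] = 'StarCoder'
os.environ['SKETCH_USE_REMOTE_LAMBDAPROMPT'] = 'False'
# Token used to download the model weights from Hugging Face
os.environ['HF_ACCESS_TOKEN'] = 'your_hugging_face_token'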
You can also call OpenAI directly (and not use our endpoint) by using your own API key. To do this, set 2 environment variables:
(1) SKETCH_USE_REMOTE_LAMBDAPROMPT=False
(2) OPENAI_API_KEY=YOUR_API_KEY
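Inside a Python session, the same two variables can be set through os.environ, mirroring the local-model example. This is a minimal sketch, assuming the variables only need to be in place before sketch runs its first prompt (setting them before the import is the cautious order); YOUR_API_KEY is a placeholder, not a real key.

```python
import os

# Skip the hosted endpoint and talk to OpenAI directly.
os.environ["SKETCH_USE_REMOTE_LAMBDAPROMPT"] = "False"

# setdefault keeps a key already exported in the shell instead of overwriting it.
os.environ.setdefault("OPENAI_API_KEY", "YOUR_API_KEY")

# import sketch  # import afterwards, as usual
```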
Sketch uses efficient approximation algorithms (data sketches) to quickly summarize your data, and feed that information into language models. Right now it does this by summarizing the columns and writing these summary statistics as additional context to be used by the code-writing prompt. In the future we hope to feed these sketches directly into custom made "data + language" foundation models to get more accurate results.
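To make the idea of "summary statistics as additional context" concrete, here is a minimal sketch of building such a prompt. It swaps in exact statistics from the standard library for the approximate data sketches that sketch actually uses, and the prompt layout and the names summarize_columns and build_prompt are hypothetical. Columns are shown as a dict of lists, which is what df.to_dict("list") would give for a dataframe.

```python
from statistics import mean


def summarize_columns(columns):
    """Describe each column in one line: count, unique values, numeric range."""
    lines = []
    for name, values in columns.items():
        present = [v for v in values if v is not None]
        parts = [f"{name}: count={len(present)}", f"unique={len(set(present))}"]
        # Only numeric columns get min / max / mean.
        if present and all(isinstance(v, (int, float)) for v in present):
            parts.append(
                f"min={min(present)} max={max(present)} mean={mean(present):.2f}"
            )
        lines.append(", ".join(parts))
    return "\n".join(lines)


def build_prompt(question, columns):
    """Prepend the column summaries to the user's question."""
    return f"Column summaries:\n{summarize_columns(columns)}\n\nQuestion: {question}"


columns = {"sales": [10, 20, 30], "region": ["east", "west", "east"]}
prompt = build_prompt("Plot the sales versus time", columns)
```

The language model never sees the raw rows in this setup, only the compact summary, which keeps the prompt small regardless of how large the dataframe is.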
FAQs
Compute, store and operate on data sketches
We found that sketch demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 1 open source maintainer collaborating on the project.