
Read our Architecture document
Join the Discussion on the Request for Comments
See also:
OpenAdapt is the open source software adapter between Large Multimodal Models (LMMs) and traditional desktop and web Graphical User Interfaces (GUIs).
Early demos (more coming soon!):
Welcome to OpenAdapt! This Python library implements AI-First Process Automation with the power of Large Multimodal Models (LMMs) by:
The goal is similar to that of Robotic Process Automation, except that we use Large Multimodal Models instead of conventional RPA tools.
The direction is adjacent to Adept.ai, with some key differences:
Installation Method | Recommended for | Ease of Use |
---|---|---|
Scripted | Non-technical users | Streamlines the installation process for users unfamiliar with setup steps |
Manual | Technical Users | Allows for more control and customization during the installation process |
User Account Control
, click 'Yes'):
Start-Process powershell -Verb RunAs -ArgumentList '-NoExit', '-ExecutionPolicy', 'Bypass', '-Command', "iwr -UseBasicParsing -Uri 'https://raw.githubusercontent.com/OpenAdaptAI/OpenAdapt/main/install/install_openadapt.ps1' | Invoke-Expression"
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/OpenAdaptAI/OpenAdapt/HEAD/install/install_openadapt.sh)"
Prerequisite:
For the setup of any/all of the above dependencies, follow the steps in SETUP.md.
Install with Poetry:
git clone https://github.com/OpenAdaptAI/OpenAdapt.git
cd OpenAdapt
pip3 install poetry
poetry install
poetry shell
poetry run postinstall
cd openadapt && alembic upgrade head && cd ..
pytest
See how to set up system permissions on macOS here.
Run this once in every new terminal window (while inside the OpenAdapt root directory) before running any openadapt commands below:
poetry shell
You should see something like this:
% poetry shell
Using python3.10 (3.10.13)
...
(openadapt-py3.10) %
Notice the environment prefix (openadapt-py3.10).
Run the following command to start the system tray icon and launch the web dashboard:
python -m openadapt.entrypoint
This command will print the config, update the database to the latest migration, start the system tray icon and launch the web dashboard.
Create a new recording by running the following command:
python -m openadapt.record "testing out openadapt"
Wait until all three event writers have started:
| INFO | __mp_main__:write_events:230 - event_type='screen' starting
| INFO | __mp_main__:write_events:230 - event_type='action' starting
| INFO | __mp_main__:write_events:230 - event_type='window' starting
Type a few words into the terminal and move your mouse around the screen to generate some events, then stop the recording by pressing CTRL+C.
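As a rough illustration of the "wait until all three event writers have started" step, the sketch below scans captured log lines for the three `event_type` values shown above. The helper names (`writers_started`, `all_writers_started`) are hypothetical and not part of OpenAdapt; the only thing taken from this README is the log line format.

```python
import re

# Event types named in the log lines shown above.
EXPECTED_WRITERS = {"screen", "action", "window"}

# Matches the "event_type='...' starting" fragment of a log line.
_STARTING = re.compile(r"event_type='(\w+)' starting")


def writers_started(log_lines):
    """Return the set of event types whose writer has logged 'starting'."""
    started = set()
    for line in log_lines:
        match = _STARTING.search(line)
        if match:
            started.add(match.group(1))
    return started


def all_writers_started(log_lines):
    """Return True once every expected writer has appeared in the log."""
    return EXPECTED_WRITERS <= writers_started(log_lines)


# Only two of the three writers have started in this sample.
sample = [
    "| INFO | __mp_main__:write_events:230 - event_type='screen' starting",
    "| INFO | __mp_main__:write_events:230 - event_type='action' starting",
]
print(all_writers_started(sample))  # False: 'window' has not started yet
```

In practice you would simply watch the terminal; a check like this is only useful if you wrap the recorder in a script of your own.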
Current limitations:
Quickly visualize the latest recording you created by running the following command:
python -m openadapt.visualize
This will generate an HTML file and open a tab in your browser that looks something like this:
For a more powerful dashboard, run:
python -m openadapt.app.dashboard.run
This will start a web server locally, and then open a tab in your browser that looks something like this:
You can play back the recording using the following command:
python -m openadapt.replay NaiveReplayStrategy
Other replay strategies include:
StatefulReplayStrategy: Early proof-of-concept which uses the OpenAI GPT-4 API with prompts constructed via OS-level window data.
VisualReplayStrategy: Uses Fast Segment Anything Model (FastSAM) to segment active window.
VanillaReplayStrategy: Assumes the model is capable of directly reasoning on states and actions accurately. With future frontier models, we hope that this script will suddenly work a lot better.
VisualBrowserReplayStrategy: Like VisualReplayStrategy but generates segments from the visible DOM read by the browser extension.
The (*) prefix indicates strategies which accept an "instructions" parameter that is used to modify the recording, e.g.:
python -m openadapt.replay VanillaReplayStrategy --instructions "calculate 9-8"
See https://github.com/OpenAdaptAI/OpenAdapt/tree/main/openadapt/strategies for a complete list. More ReplayStrategies coming soon! (see Contributing).
To record browser events in Google Chrome (required by the BrowserReplayStrategy), follow these steps:
Go to your Chrome extensions page by entering chrome://extensions in your address bar.
Enable Developer mode (located at the top right).
Click Load unpacked (located at the top left).
Select the chrome_extension directory in the OpenAdapt repo.
Make sure the Chrome extension is enabled (the switch to the right of the OpenAdapt extension widget is turned on).
Set the RECORD_BROWSER_EVENTS flag to true in openadapt/data/config.json.
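If you prefer to flip the flag programmatically rather than edit the file by hand, a minimal sketch follows. It assumes the config file is a single flat JSON object with `RECORD_BROWSER_EVENTS` at the top level; the function name `set_record_browser_events` and the `SOME_OTHER_KEY` entry are hypothetical and used only for illustration. The demo writes to a temporary file, not to the real config.

```python
import json
import tempfile
from pathlib import Path


def set_record_browser_events(config_path, enabled=True):
    """Set RECORD_BROWSER_EVENTS in a JSON config file, preserving other keys.

    Assumes the file holds a single flat JSON object.
    """
    path = Path(config_path)
    config = json.loads(path.read_text())
    config["RECORD_BROWSER_EVENTS"] = enabled
    path.write_text(json.dumps(config, indent=4) + "\n")
    return config


# Demonstrate on a throwaway file rather than the real config.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "config.json"
    demo_path.write_text(json.dumps({"SOME_OTHER_KEY": 1}))  # hypothetical key
    updated = set_record_browser_events(demo_path)
    reloaded = json.loads(demo_path.read_text())
    print(reloaded["RECORD_BROWSER_EVENTS"])  # True
```

To apply it to a real checkout, you would pass the path to openadapt/data/config.json from the repository root.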
We are thrilled to open new contract positions for developers passionate about pushing boundaries in technology. If you're ready to make a significant impact, consider the following roles:
[Proposal] <your title here>
We're looking forward to your contributions. Let's build the future 🚀
Our goal is to automate the task described and demonstrated in a Recording. That is, given a new Screenshot, we want to generate the appropriate ActionEvent(s) based on the previously recorded ActionEvents in order to accomplish the task specified in the Recording.task_description and narrated by the user in AudioInfo.words_with_timestamps, while accounting for differences in screen resolution, window size, application behavior, etc.
If it's not clear what ActionEvent is appropriate for the given Screenshot (e.g. if the GUI application is behaving in a way we haven't seen before), we can ask the user to take over temporarily to demonstrate the appropriate course of action.
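To make "accounting for differences in screen resolution" concrete, here is a deliberately naive sketch that rescales a recorded click position proportionally to a new screen size. `RecordedClick` and `rescale_click` are hypothetical stand-ins, not OpenAdapt classes or functions, and proportional scaling is only one simple assumption; it ignores window position, layout changes, and application behavior.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class RecordedClick:
    """Hypothetical stand-in for a recorded mouse ActionEvent."""

    x: float
    y: float


def rescale_click(click, recorded_size, current_size):
    """Scale click coordinates from the recorded screen size to the current one."""
    rec_w, rec_h = recorded_size
    cur_w, cur_h = current_size
    return replace(click, x=click.x * cur_w / rec_w, y=click.y * cur_h / rec_h)


# A click at the center of a 1920x1080 screen maps to the center of 2560x1440.
click = RecordedClick(x=960, y=540)
scaled = rescale_click(click, (1920, 1080), (2560, 1440))
print(scaled)  # RecordedClick(x=1280.0, y=720.0)
```

A real replay strategy would likely need to reason about what is on screen rather than rely on coordinates alone, which is the motivation for the model-based strategies described earlier.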
The data model consists of the following entities:
Recording: Contains information about the screen dimensions, platform, and other metadata.
ActionEvent: Represents a user action event such as a mouse click or key press. Each ActionEvent has an associated Screenshot taken immediately before the event occurred. ActionEvents are aggregated to remove unnecessary events (see visualize).
Screenshot: Contains the PNG data of a screenshot taken during the recording.
WindowEvent: Represents a window event such as a change in window title, position, or size.
You can assume that you have access to the following functions:
create_recording("doing taxes"): Creates a recording.
get_latest_recording(): Gets the latest recording.
get_events(recording): Returns a list of ActionEvent objects for the given recording.
See GitBook Documentation for more.
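For experimenting without a database, the entities and the three functions above can be mimicked with an in-memory sketch like the one below. The field names (`png_data`, `name`, `action_events`) and the module-level `_RECORDINGS` list are assumptions made for illustration; only the entity names, `task_description`, and the three function names come from this README, and the real implementations may differ.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Screenshot:
    png_data: bytes  # hypothetical field name for the PNG bytes


@dataclass
class ActionEvent:
    name: str  # e.g. "click" or "press" (hypothetical values)
    screenshot: Optional[Screenshot] = None  # taken just before the event


@dataclass
class Recording:
    task_description: str
    action_events: List[ActionEvent] = field(default_factory=list)


# In-memory stand-in for persistent storage.
_RECORDINGS: List[Recording] = []


def create_recording(task_description):
    """Create a recording and remember it as the latest one."""
    recording = Recording(task_description)
    _RECORDINGS.append(recording)
    return recording


def get_latest_recording():
    """Return the most recently created recording, or None if there is none."""
    return _RECORDINGS[-1] if _RECORDINGS else None


def get_events(recording):
    """Return a copy of the recording's ActionEvent list."""
    return list(recording.action_events)


recording = create_recording("doing taxes")
recording.action_events.append(ActionEvent("click", Screenshot(b"")))
latest = get_latest_recording()
print(latest.task_description, len(get_events(latest)))  # doing taxes 1
```

Stubs like these can be handy for unit-testing a replay strategy's logic before running it against real recordings.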
Join us on Discord. Then:
Your submission will be evaluated based on the following criteria:
Functionality : Your implementation should correctly generate the new ActionEvent objects that can be replayed in order to accomplish the task in the original recording.
Code Quality : Your code should be well-structured, clean, and easy to understand.
Scalability : Your solution should be efficient and scale well with large datasets.
Testing : Your tests should cover various edge cases and scenarios to ensure the correctness of your implementation.
Commit your changes to your forked repository.
Create a pull request to the original repository with your changes.
In your pull request, include a brief summary of your approach, any assumptions you made, and how you integrated external libraries.
Bonus: interacting with ChatGPT and/or other language transformer models in order to generate code and/or evaluate design decisions is encouraged. If you choose to do so, please include the full transcript.
macOS: if you encounter system alert messages or run into issues when making and replaying recordings, make sure to set up permissions accordingly.
In summary (from https://stackoverflow.com/a/69673312):
From inside the openadapt directory (containing alembic.ini):
alembic revision --autogenerate -m "<msg>"
To ensure code quality and consistency, OpenAdapt uses pre-commit hooks. These hooks will be executed automatically before each commit to perform various checks and validations on your codebase.
The following pre-commit hooks are used in OpenAdapt:
--preview feature is used.
To set up the pre-commit hooks, follow these steps:
Navigate to the root directory of your OpenAdapt repository.
Run the following command to install the hooks:
pre-commit install
Now, the pre-commit hooks are installed and will run automatically before each commit. They will enforce code quality standards and prevent committing code that doesn't pass the defined checks.
When you submit a PR, the "Python CI" workflow is triggered for code consistency. It follows organized steps to review your code:
Python Black Check : This step verifies code formatting against the Black style, using the --preview flag.
Flake8 Review : Next, Flake8 checks code structure, including the flake8-annotations and flake8-docstrings plugins. Although GitHub Actions automates these checks, it is wise to run flake8 . locally before finalizing changes so that issues are spotted and resolved sooner.
Please submit any issues to https://github.com/OpenAdaptAI/OpenAdapt/issues with the following information: