UI automation can be frustrating, often involving a maze of #ids, data-test attributes, and .selectors that are difficult to maintain, especially when the page undergoes a refactor.
Introducing Midscene.js, an innovative SDK designed to bring joy back to programming by simplifying automation tasks.
Midscene.js leverages a multimodal Large Language Model (LLM) to intuitively “understand” your user interface and carry out the necessary actions. You can simply describe the interaction steps or expected data formats, and the AI will handle the execution for you.
Use .aiAction to perform a series of actions by describing the steps, .aiQuery to extract customized data from the UI, and .aiAssert to perform assertions on the page.
All of this is driven by natural-language descriptions, which gives you a new way to write automation.
For example
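The sketch below shows how the three calls might read in a script. Only the method names .aiAction, .aiQuery, and .aiAssert come from the text above. The `UiAgent` interface, its parameter and return types, the `Item` shape, the prompts, and the `RecordingAgent` stand-in are assumptions made for illustration, not Midscene's actual API surface.

```typescript
// Hypothetical interface: only the three method names come from the text;
// the parameter and return types are assumptions made for this sketch.
interface UiAgent {
  aiAction(steps: string): Promise<void>;
  aiQuery<T>(dataDescription: string): Promise<T>;
  aiAssert(assertion: string): Promise<void>;
}

// Assumed shape of the data we ask the agent to extract.
interface Item {
  itemTitle: string;
  price: number;
}

// Describe the interaction, the expected data format, and the expectation
// in plain language, and let the agent carry them out.
async function findHeadphones(agent: UiAgent): Promise<Item[]> {
  await agent.aiAction('type "Headphones" in the search box, hit Enter');
  const items = await agent.aiQuery<Item[]>(
    "{itemTitle: string, price: number}[], find the items in the list and their prices",
  );
  await agent.aiAssert("There is a category filter on the left");
  return items;
}

// Stand-in agent that records prompts and returns canned data instead of
// calling a model, so the sketch runs without a browser or an API key.
class RecordingAgent implements UiAgent {
  readonly prompts: string[] = [];
  private readonly cannedQueryResult: unknown;

  constructor(cannedQueryResult: unknown) {
    this.cannedQueryResult = cannedQueryResult;
  }

  async aiAction(steps: string): Promise<void> {
    this.prompts.push(`action: ${steps}`);
  }

  async aiQuery<T>(dataDescription: string): Promise<T> {
    this.prompts.push(`query: ${dataDescription}`);
    return this.cannedQueryResult as T;
  }

  async aiAssert(assertion: string): Promise<void> {
    this.prompts.push(`assert: ${assertion}`);
  }
}
```

With a real agent, for instance one that wraps a Puppeteer or Playwright page, the same `findHeadphones` function would be handed that agent in place of `RecordingAgent`.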
Quickly integrate GPT-4o and web automation tools like Playwright or Puppeteer into your project with Midscene.
You can set up your automation right away with the tools you already know. No custom training is needed.
With our visualization tool, you can easily debug the prompt and AI response. All intermediate data, such as queries, plans, and actions, can be visualized.
You may open the Online Visualization Tool to see the showcase.
Here is a flowchart that describes the core process of the interaction between Midscene and AI.