The Next Generation of Web Exploration
Powered by cutting-edge Large Multimodal Models, delivering human-like intelligence for seamless web interactions.
Tell us what you need in plain language
Agent searches relevant websites
AI processes the content
Get comprehensive results
Creating Advanced Web Agents: An Iterative Approach to Exploration, Learning, and Performance Tuning
CA: COMING SOON
Integrates visual and textual information from web pages using advanced LMM capabilities
Processes observations and plans actions using a generalist planning approach
Executes planned actions through Selenium-powered web interaction
Seamlessly interacts with real-world websites through a robust online browsing environment
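The four components above can be read as an observe, plan, act loop. The following is a minimal sketch of that loop, not the project's actual implementation: the `Observation`, `Action`, `Browser`, and `run_agent` names, the action schema, and the `max_steps` cutoff are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Protocol


@dataclass
class Observation:
    """What the agent perceives at one step (hypothetical structure)."""
    url: str
    text: str            # textual content of the page
    screenshot_b64: str  # base64-encoded screenshot for the multimodal model


@dataclass
class Action:
    """A planned browser action (hypothetical schema)."""
    kind: str         # "goto", "click", "type", or "stop"
    target: str = ""  # URL or CSS selector
    value: str = ""   # text to type, or the final answer for "stop"


class Browser(Protocol):
    """Anything that can report page state and carry out an action."""
    def observe(self) -> Observation: ...
    def execute(self, action: Action) -> None: ...


# A planner maps (task, current observation, past actions) to the next action.
# In a real agent this would call a multimodal model; here it is just a callable.
Planner = Callable[[str, Observation, List[Action]], Action]


def run_agent(task: str, browser: Browser, plan: Planner,
              max_steps: int = 10) -> List[Action]:
    """Run the observe -> plan -> act loop until "stop" or max_steps."""
    history: List[Action] = []
    for _ in range(max_steps):
        observation = browser.observe()            # perceive the page
        action = plan(task, observation, history)  # decide what to do next
        history.append(action)
        if action.kind == "stop":                  # planner says the task is done
            break
        browser.execute(action)                    # act on the page
    return history
```

Keeping the browser and the planner behind small interfaces means the loop can be exercised with a scripted planner and a fake browser before a Selenium driver or a model call is plugged in.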
Watch our AI learn from real-world interactions in real time
Explore, Extract, Excel: AI-Powered Web Navigation Made Easy
# Install the dependencies
pip install selenium openai
# Initialize RosieExplorer
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from openai import OpenAI
# Setup configuration
chrome_options = Options()
chrome_options.add_argument('--headless')  # Optional: run in headless mode
chrome_options.add_argument('--window-size=1024,768')
# Initialize the agent
client = OpenAI(api_key="YOUR_OPENAI_API_KEY")
driver = webdriver.Chrome(options=chrome_options)
# Run a task
try:
    # Navigate to website
    driver.get('https://explorer.dev')
    # Take a screenshot for visual analysis (base64-encoded PNG)
    screenshot_b64 = driver.get_screenshot_as_base64()
    # Process with GPT-4V; the image is passed as a base64 data URL
    response = client.chat.completions.create(
        model="gpt-4-vision-preview",  # swap in a current vision-capable model if unavailable
        messages=[
            {"role": "user", "content": [
                {"type": "text", "text": "What do you see on this webpage?"},
                {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{screenshot_b64}"}}
            ]}
        ],
        max_tokens=300  # cap the length of the description
    )
    print(response.choices[0].message.content)
finally:
    # Always close the browser, even if the request fails
    driver.quit()
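The example above stops at describing the page. To illustrate the "executes planned actions" step, here is a hedged sketch of how a model reply could be turned into a Selenium call. It assumes the model has been prompted to answer with a small JSON object; that JSON schema and the `parse_action` and `execute_action` helpers are hypothetical and not part of RosieExplorer. The driver calls used (`get`, `find_element`, `click`, `clear`, `send_keys`) are standard Selenium WebDriver methods.

```python
import json

# Same string value as selenium.webdriver.common.by.By.CSS_SELECTOR;
# spelled out here so the sketch has no import-time dependency on Selenium.
CSS_SELECTOR = "css selector"

SUPPORTED_KINDS = {"goto", "click", "type"}


def parse_action(reply: str) -> dict:
    """Parse a reply such as '{"kind": "click", "target": "#submit"}'.

    The schema (kind / target / value) is an assumption for this sketch.
    """
    action = json.loads(reply)
    if not isinstance(action, dict) or action.get("kind") not in SUPPORTED_KINDS:
        raise ValueError(f"unsupported action: {action!r}")
    return action


def execute_action(driver, action: dict) -> None:
    """Carry out one parsed action on a Selenium WebDriver instance."""
    kind = action["kind"]
    if kind == "goto":
        driver.get(action["target"])
    elif kind == "click":
        driver.find_element(CSS_SELECTOR, action["target"]).click()
    elif kind == "type":
        element = driver.find_element(CSS_SELECTOR, action["target"])
        element.clear()                       # remove any existing text first
        element.send_keys(action["value"])
```

With a live driver, `execute_action(driver, parse_action(reply))` would run one step. Validating the reply before touching the browser keeps malformed model output from reaching the page.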
Fork, contribute, and customize to your needs. Join our community of developers.
View on GitHub →