Explore the web with
LMM-Powered intelligence.

Creating Advanced Web Agents: An Iterative Approach to Exploration, Learning, and Performance Tuning

CA: COMING SOON

How It Works

Multimodal Observation

Integrates visual and textual information from web pages using advanced LMM capabilities

Inputs

Screenshots, DOM Elements, HTML Structure
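As a rough illustration of how the three inputs above could be gathered in one pass, here is a minimal sketch. The `Observation` class, the `observe` helper, and the `INTERACTIVE_SELECTOR` value are hypothetical names chosen for this example, not part of RosieExplorer's published API. The `driver` argument is assumed to be a Selenium WebDriver (or any object exposing the same `find_elements`, `get_screenshot_as_base64`, `page_source`, and `current_url` members).

```python
from dataclasses import dataclass, field
from typing import Any

# Assumption: these tags cover most elements an agent can interact with.
INTERACTIVE_SELECTOR = "a, button, input, select, textarea"


@dataclass
class Observation:
    """One multimodal snapshot of a page (hypothetical structure)."""
    url: str
    screenshot_b64: str   # visual input for the LMM, base64-encoded PNG
    html: str             # raw HTML structure of the page
    elements: list[dict[str, Any]] = field(default_factory=list)  # DOM summary


def observe(driver, max_elements: int = 50) -> Observation:
    """Collect a screenshot, the HTML, and a short list of interactive elements."""
    # "css selector" is the locator strategy string behind Selenium's By.CSS_SELECTOR.
    found = driver.find_elements("css selector", INTERACTIVE_SELECTOR)
    elements = [
        {"index": i, "tag": el.tag_name, "text": el.text.strip()[:80]}
        for i, el in enumerate(found[:max_elements])
    ]
    return Observation(
        url=driver.current_url,
        screenshot_b64=driver.get_screenshot_as_base64(),
        html=driver.page_source,
        elements=elements,
    )
```

Capping the element list and truncating each label keeps the textual part of the observation small enough to fit in a model prompt next to the screenshot.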

Reasoning Engine

Processes observations and plans actions using a generalist planning approach

Capabilities

Action Planning, State Tracking, Decision Making
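The planning and state-tracking capabilities listed above might look something like the sketch below. Everything here is an assumption made for illustration: the `Action` and `AgentState` classes, the JSON reply format, and the `llm` callable (any function that takes a prompt string and returns the model's text reply) are hypothetical and do not describe RosieExplorer's actual internals.

```python
import json
from dataclasses import dataclass, field
from typing import Callable

# Assumption: the agent supports the four page actions plus an explicit stop.
ALLOWED_ACTIONS = {"click", "type", "scroll", "navigate", "stop"}


@dataclass
class Action:
    kind: str
    target: str = ""   # CSS selector or URL, depending on the action
    value: str = ""    # text to type or pixels to scroll


@dataclass
class AgentState:
    goal: str
    history: list[Action] = field(default_factory=list)  # state tracking


def parse_action(reply: str) -> Action:
    """Turn a model reply into an Action, falling back to 'stop' on bad output."""
    try:
        data = json.loads(reply)
    except json.JSONDecodeError:
        return Action("stop")
    if not isinstance(data, dict):
        return Action("stop")
    kind = str(data.get("action", "")).lower()
    if kind not in ALLOWED_ACTIONS:
        return Action("stop")
    return Action(kind, str(data.get("target", "")), str(data.get("value", "")))


def plan_step(state: AgentState, page_summary: str, llm: Callable[[str], str]) -> Action:
    """Ask the model for the next action and record it in the agent state."""
    past = "; ".join(f"{a.kind}({a.target})" for a in state.history) or "none"
    prompt = (
        f"Goal: {state.goal}\n"
        f"Actions so far: {past}\n"
        f"Page: {page_summary}\n"
        'Reply with JSON: {"action": ..., "target": ..., "value": ...}'
    )
    action = parse_action(llm(prompt))
    state.history.append(action)
    return action
```

Treating any unparseable or unknown reply as `stop` is a deliberately conservative choice: the agent halts rather than executing an action the model never clearly requested.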

Action Execution

Executes planned actions through Selenium-powered web interaction

Actions

Click, Type, Scroll, Navigate
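One way the four actions above could map onto Selenium calls is sketched below. The `execute_action` function and its parameter names are hypothetical; only the WebDriver methods it calls (`get`, `find_element`, `click`, `clear`, `send_keys`, `execute_script`) are standard Selenium API. The default scroll distance of 600 pixels is an arbitrary assumption.

```python
def execute_action(driver, kind: str, target: str = "", value: str = "") -> None:
    """Dispatch one planned action to the browser (illustrative sketch)."""
    if kind == "navigate":
        driver.get(target)  # target is a URL
    elif kind == "click":
        # "css selector" is the locator strategy string behind By.CSS_SELECTOR.
        driver.find_element("css selector", target).click()
    elif kind == "type":
        box = driver.find_element("css selector", target)
        box.clear()            # remove any existing text first
        box.send_keys(value)
    elif kind == "scroll":
        pixels = int(value) if value else 600  # assumed default distance
        driver.execute_script("window.scrollBy(0, arguments[0]);", pixels)
    else:
        raise ValueError(f"Unsupported action: {kind!r}")
```

A typical call would be `execute_action(driver, "type", "input[name=q]", "web agents")`, where `driver` is a `webdriver.Chrome` instance like the one created in the quick-start snippet.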

Your Smart Web Companion

Seamlessly interacts with real-world websites through a robust online browsing environment

Integrate today

Explore, Extract, Excel – AI-Powered Web Navigation Made Easy

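    # Install the packages first (run in a shell):
    #   pip install selenium openai

    # Initialize RosieExplorer
    import base64

    from openai import OpenAI
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    # Setup configuration
    chrome_options = Options()
    chrome_options.add_argument('--headless')  # Optional: run in headless mode
    chrome_options.add_argument('--window-size=1024,768')

    # Initialize the agent (openai>=1.0 uses a client object instead of openai.ChatCompletion)
    client = OpenAI(api_key="YOUR_OPENAI_API_KEY")
    driver = webdriver.Chrome(options=chrome_options)

    # Run a task
    try:
        # Navigate to website
        driver.get('https://explorer.dev')

        # Take screenshot for visual analysis
        screenshot = driver.get_screenshot_as_png()

        # The API takes images as URLs, so embed the PNG bytes as a base64 data URL
        image_url = "data:image/png;base64," + base64.b64encode(screenshot).decode("ascii")

        # Process with GPT-4V
        response = client.chat.completions.create(
            # This preview model may no longer be available;
            # substitute any vision-capable model your account offers.
            model="gpt-4-vision-preview",
            max_tokens=300,
            messages=[
                {"role": "user", "content": [
                    {"type": "text", "text": "What do you see on this webpage?"},
                    {"type": "image_url", "image_url": {"url": image_url}}
                ]}
            ]
        )

        print(response.choices[0].message.content)

    finally:
        # Always close the browser, even if the request fails
        driver.quit()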

Open Source

Fork, contribute, and customize to your needs. Join our community of developers.

View on GitHub →

Welcome to RosieExplorer

End-to-End Web Agent

How it Works

Describe

Explore

Analyze

Deliver

Key Features
