👋

Welcome to RosieExplorer

The Next Generation of Web Exploration

🧠

End-to-End Web Agent

Powered by cutting-edge Large Multimodal Models, delivering human-like intelligence for seamless web interactions.

How It Works

1
💭

Describe

Tell us what you need in plain language

2
🔍

Explore

Agent searches relevant websites

3
⚡

Analyze

AI processes the content

4
✨

Deliver

Get comprehensive results

Key Features

🎯 Natural language understanding
👁️ Visual content analysis
🌐 Real-time web exploration
🧪 Intelligent data synthesis

Explore the web with
LMM-Powered intelligence.

Creating Advanced Web Agents: An Iterative Approach to Exploration, Learning, and Performance Tuning

CA: COMING SOON

How It Works

Multimodal Observation

Integrates visual and textual information from web pages using advanced LMM capabilities

Inputs
Screenshots, DOM Elements, HTML Structure
Visual observation demo
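
As a rough illustration of how the three inputs above might be bundled into a single observation, here is a minimal sketch. The `Observation` and `DomElement` classes and the `to_prompt_parts` helper are hypothetical names chosen for this example, not RosieExplorer's actual code; the output uses the text plus `image_url` message layout that OpenAI's chat API accepts for images.

```python
import base64
from dataclasses import dataclass, field


@dataclass
class DomElement:
    """One interactive element extracted from the page (hypothetical shape)."""
    index: int
    tag: str
    text: str = ""


@dataclass
class Observation:
    """Bundle of the three inputs: screenshot, DOM elements, HTML structure."""
    screenshot_png: bytes
    html: str
    elements: list[DomElement] = field(default_factory=list)

    def to_prompt_parts(self, max_html_chars: int = 2000) -> list[dict]:
        """Turn the observation into text and image parts for a multimodal model."""
        listing = "\n".join(
            f"[{e.index}] <{e.tag}> {e.text}".rstrip() for e in self.elements
        )
        text = (
            f"Interactive elements:\n{listing}\n\n"
            f"HTML (truncated):\n{self.html[:max_html_chars]}"
        )
        image_b64 = base64.b64encode(self.screenshot_png).decode("ascii")
        return [
            {"type": "text", "text": text},
            {"type": "image_url",
             "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
        ]


# Placeholder bytes stand in for a real screenshot so the sketch runs offline.
obs = Observation(
    screenshot_png=b"\x89PNG placeholder bytes",
    html="<html><body><button>Search</button></body></html>",
    elements=[DomElement(0, "button", "Search")],
)
parts = obs.to_prompt_parts()
print(parts[0]["text"])
```

Numbering the elements in the text part gives the model a short handle (such as `[0]`) to refer to when it later chooses what to click or type into.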

Reasoning Engine

Processes observations and plans actions using a generalist planning approach

Capabilities
Action Planning, State Tracking, Decision Making
Reasoning process visualization
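
The three capabilities above can be pictured with a small sketch. Everything here is an assumption made for illustration: the `Action` and `AgentState` classes, the `plan_next` and `parse_action` helpers, and the plain-text reply format (`"click 0"`, `"type 3 hello"`) are hypothetical, and `fake_model` stands in for the real LMM call so the example runs offline.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Action:
    kind: str          # "click", "type", "scroll", "navigate", or "done"
    target: str = ""   # element index, URL, or scroll direction
    value: str = ""    # text to type, if any


@dataclass
class AgentState:
    """State tracking: the task plus every action taken so far."""
    task: str
    history: list[Action] = field(default_factory=list)


def parse_action(reply: str) -> Action:
    """Decision making: parse a reply such as 'type 3 hello world'."""
    parts = reply.strip().split(maxsplit=2)
    if not parts:
        return Action("done")
    kind = parts[0].lower()
    target = parts[1] if len(parts) > 1 else ""
    value = parts[2] if len(parts) > 2 else ""
    return Action(kind, target, value)


def plan_next(state: AgentState, observation: str,
              ask_model: Callable[[str], str]) -> Action:
    """Action planning: build a prompt from state and observation, then decide."""
    past = "; ".join(
        f"{a.kind} {a.target} {a.value}".strip() for a in state.history
    ) or "none"
    prompt = (
        f"Task: {state.task}\nPrevious actions: {past}\n"
        f"Observation: {observation}\nNext action?"
    )
    action = parse_action(ask_model(prompt))
    state.history.append(action)
    return action


def fake_model(prompt: str) -> str:
    """Stand-in for the LMM: click once, then report completion."""
    return "click 0" if "Previous actions: none" in prompt else "done"


state = AgentState(task="Find the search button")
first = plan_next(state, "[0] <button> Search", fake_model)
second = plan_next(state, "results page", fake_model)
print(first.kind, first.target, "->", second.kind)  # click 0 -> done
```

Feeding the action history back into each prompt is what lets the planner avoid repeating a step it has already taken.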

Action Execution

Executes planned actions through Selenium-powered web interaction

Actions
Click, Type, Scroll, Navigate
Action execution demo
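
One way the four actions could map onto Selenium calls is sketched below. The `execute` dispatcher and the recording classes are hypothetical and exist only for this example; the driver methods it calls (`get`, `find_element`, `execute_script`, and the element methods `click`, `clear`, `send_keys`) are standard Selenium WebDriver calls. Addressing elements by CSS selector and scrolling in 600-pixel steps are assumptions, not documented RosieExplorer behavior.

```python
CSS = "css selector"  # same string as selenium.webdriver.common.by.By.CSS_SELECTOR


def execute(driver, kind: str, target: str = "", value: str = "") -> None:
    """Dispatch one planned action to a Selenium-style driver."""
    if kind == "navigate":
        driver.get(target)
    elif kind == "click":
        driver.find_element(CSS, target).click()
    elif kind == "type":
        element = driver.find_element(CSS, target)
        element.clear()
        element.send_keys(value)
    elif kind == "scroll":
        pixels = -600 if target == "up" else 600
        driver.execute_script("window.scrollBy(0, arguments[0]);", pixels)
    else:
        raise ValueError(f"unknown action: {kind}")


class RecordingElement:
    """Test double that logs element calls instead of touching a browser."""
    def __init__(self, log, selector):
        self.log, self.selector = log, selector

    def click(self):
        self.log.append(("click", self.selector))

    def clear(self):
        self.log.append(("clear", self.selector))

    def send_keys(self, text):
        self.log.append(("send_keys", self.selector, text))


class RecordingDriver:
    """Test double exposing the subset of the WebDriver interface used above."""
    def __init__(self):
        self.log = []

    def get(self, url):
        self.log.append(("get", url))

    def find_element(self, by, value):
        return RecordingElement(self.log, value)

    def execute_script(self, script, *args):
        self.log.append(("script", args))


driver = RecordingDriver()
execute(driver, "navigate", "https://example.com")
execute(driver, "type", "input[name=q]", "web agents")
execute(driver, "click", "button[type=submit]")
execute(driver, "scroll", "down")
print(driver.log)
```

With a real browser, passing a `webdriver.Chrome(...)` instance in place of `RecordingDriver()` should work unchanged, since `execute` relies only on those shared method names.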

Your Smart Web Companion

Seamlessly interacts with real-world websites through a robust online browsing environment

Web Environment Integration

Building Our Dataset

Watch our AI learn from real-world interactions in real time

Prompts Collected: 5041
Active Users: 45
Dataset Size (GB): 98

Integrate today

Explore, Extract, Excel – AI-Powered Web Navigation Made Easy

# Install the dependencies (run this in a shell, not in Python):
#   pip install selenium openai

# Initialize RosieExplorer
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from openai import OpenAI

# Setup configuration
chrome_options = Options()
chrome_options.add_argument('--headless')  # Optional: run in headless mode
chrome_options.add_argument('--window-size=1024,768')

# Initialize the agent (client interface of openai>=1.0)
client = OpenAI(api_key="YOUR_OPENAI_API_KEY")
driver = webdriver.Chrome(options=chrome_options)

# Run a task
try:
    # Navigate to website
    driver.get('https://explorer.dev')

    # Take a screenshot for visual analysis, base64-encoded for the API
    screenshot_b64 = driver.get_screenshot_as_base64()

    # Process with GPT-4V; substitute another vision-capable model
    # if this one is not available to your account
    response = client.chat.completions.create(
        model="gpt-4-vision-preview",
        messages=[
            {"role": "user", "content": [
                {"type": "text", "text": "What do you see on this webpage?"},
                # Images are passed as a URL; a data URL carries the PNG inline
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{screenshot_b64}"}}
            ]}
        ],
        max_tokens=300,  # Cap the length of the description
    )

    print(response.choices[0].message.content)

finally:
    # Always close the browser, even if the request fails
    driver.quit()

Open Source

Fork, contribute, and customize to your needs. Join our community of developers.

View on GitHub →