Convergence AI Proxy 1.0 Challenges OpenAI's Operator.

Convergence introduced Proxy 1.0, an AI agent designed to automate web-based tasks. OpenAI launched the SWE-Lancer benchmark to evaluate AI models on real-world freelance software tasks. ByteDance introduced Phantom, advancing AI-powered video generation.

Sanchay Thalnerkar & Ashutosh Shrivastava
February 19, 2025

Hello, AI Enthusiasts

In Today’s AI Compass:

Convergence launched Proxy 1.0: London-based startup tops AI web agent benchmarks at $20/month, giving OpenAI's Operator some real competition.
OpenAI introduced SWE-Lancer benchmark: Can frontier LLMs earn $1 million from real-world freelance software engineering?
ByteDance introduces Phantom: A groundbreaking paper focusing on subject-consistent video generation.
Quick Industry Moves: Claude's getting web search and reasoning, Grok's going desktop, and Moonshot's tackling long-context scaling.

Read Time: 5 minutes

TOP NEWS

CONVERGENCE

Meet Proxy 1.0: The London-Built AI Agent Challenging OpenAI Operator

source: convergence

Convergence, a London-based startup, has introduced Proxy 1.0, an AI agent designed to automate web-based tasks. The agent is ten times cheaper than the comparable OpenAI Operator and is available globally with a free tier. Proxy 1.0 can perform actions like clicking, scrolling, and typing, which could change how tasks such as job applications and marketing campaigns are managed. Try here

Details:

Benchmark Performance: Leading WebVoyager scores across multi-step web tasks, outperforming OpenAI's Operator
Core Technology: Built-in task learning system that remembers and optimizes repeatable workflows
Automation Capabilities: Handles everything from form-filling to database queries with minimal human input
Pricing & Availability: $20/month with free tier, global launch (vs. OpenAI's US-only, premium-priced Operator)
Coming Soon: Parallel processing support, custom workflow marketplace, document analysis features
Proxy turns regular tasks into set-and-forget automations.
source: convergence
Want to browse the web? Proxy handles that like a human would
source: convergence

OPENAI

OpenAI's SWE-Lancer: AI Models Face $1M Worth of Real Coding Jobs

source: OpenAI

OpenAI has launched the SWE-Lancer benchmark, designed to evaluate AI models on real-world freelance software engineering tasks from Upwork, valued at $1 million in total payouts. This new benchmark tests AI's capability across the full engineering stack, highlighting both technical and managerial tasks to assess AI's economic impact in software development.
GitHub repo for SWE-Lancer: This repo contains the dataset and code for the paper.

Details:

Benchmark: SWE-Lancer for evaluating AI coding performance
Task Source: Over 1,400 freelance tasks from Upwork
Total Value: Collectively worth $1 million
Task Range: From $50 bug fixes to $32,000 feature implementations
Scope: Covers both coding and managerial decision-making tasks
Objective: Map AI performance to real-world economic value and promote research on AI's role in software engineering
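To make the "economic value" idea concrete, here is a minimal sketch of a payout-weighted score: a model is credited with the dollar value of each task it solves, rather than a simple count of passes. The Task class, the earnings and earn_rate helpers, and the three sample tasks are hypothetical illustrations, not taken from the SWE-Lancer repo. Only the $50 and $32,000 endpoints come from the task range above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """Hypothetical record for one freelance task (not the repo's schema)."""
    task_id: str
    payout_usd: float
    solved: bool


def earnings(tasks):
    """Total dollars credited: the sum of payouts for solved tasks only."""
    return sum(t.payout_usd for t in tasks if t.solved)


def earn_rate(tasks):
    """Share of the total available payout that was actually earned."""
    total = sum(t.payout_usd for t in tasks)
    return earnings(tasks) / total if total else 0.0


# Made-up sample: two cheap tasks solved, one expensive task missed.
tasks = [
    Task("bug-fix", 50.0, True),
    Task("feature", 32000.0, False),
    Task("review", 1000.0, True),
]

print(f"earned ${earnings(tasks):,.2f} ({earn_rate(tasks):.1%} of available payout)")
```

In this made-up sample the model solves two of three tasks but earns only a small share of the money, which is why a dollar-weighted score can tell a different story than a pass rate.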

Current frontier model performance

Anthropic's Claude 3.5 Sonnet solved more problems than OpenAI's GPT-4o and o1.

source: OpenAI

BYTEDANCE

ByteDance Presents Phantom: Subject-Consistent Video Generation

source: X (@ai_for_success)

ByteDance introduces Phantom, a groundbreaking paper focusing on subject-consistent video generation through cross-modal alignment. Phantom allows for the creation of videos where the subject's identity is preserved across frames, utilizing both single-reference and multi-reference approaches for enhanced video quality. Read More

Details:

Identity-Preserving Video Generation: Generates videos using a facial reference image while strictly maintaining the subject’s identity and following the given prompt.
Single-Reference Subject-to-Video Generation: Creates videos from a single reference image, preserving details of objects, clothing, animals, virtual characters, and more.
Multi-Reference Subject-to-Video Generation: Uses multiple reference images to generate realistic interactions, such as group scenes, product demonstrations, and virtual try-ons.
Identity-Preserving Video Generation (source: X, @ai_for_success)
Single-Reference Subject-to-Video Generation (source: X, @ai_for_success)
Multi-Reference Subject-to-Video Generation (source: X, @ai_for_success)

Free AI Training Course

DeepLearning.AI launched a new free short course in AI, “Attention in Transformers: Concepts and Code in PyTorch.” Learn here.

In the Spotlight

Anthropic Teases Major Claude Upgrade with New iOS App Features
Code from Claude's iOS application reveals new icons labeled "Steps," "Think," and "MagnifyingGlass," hinting at upcoming enhancements in reasoning and web search capabilities. These features suggest Anthropic is preparing to expand Claude’s cognitive and research abilities, potentially making it a more powerful AI assistant.
Grok AI Expands Beyond X with Standalone Desktop Apps
Grok, X’s AI assistant, is breaking out of the platform with dedicated macOS and Windows applications. This expansion marks a significant step toward broader accessibility, letting users interact with Grok outside of X's ecosystem, alongside the web-based chatbot at grok.com.
Moonshot AI Unveils MoBA Architecture for Faster Long-Context Processing
Moonshot AI has introduced the MoBA architecture, delivering a 6.5x speed boost for processing million-token contexts. Already implemented at Kimi.ai, the framework is now open-source on GitHub, enabling developers to explore its potential for large-scale AI tasks.

That’s a Wrap

That's it for today!

Thank you for reading today’s newsletter!
If you have any feedback or suggestions on how we can improve, feel free to reply and let us know.
