1 Image-Driven Conversational Agent Framework for Web PFE (final-year project)

Lanterns Studios

Internship | Hybrid | 3 to 6 months | Application deadline: 10 Dec. 2025
Computer Vision (CLIP/BLIP) | Mobile & Web Development | AI / Machine Learning

Description

Overview

  • This project focuses on developing a lightweight framework that generates a conversational AI agent from a single 2D image, designed specifically for web environments.
  • The system must produce a responsive on-screen persona capable of real-time interaction through text and optional speech, while prioritizing fast loading and minimal computational overhead.

Key Features / Objectives

  • Generate an interactive AI persona from a single static 2D image, with lightweight facial reactions or expression cues and no 3D rendering.
  • Provide real-time conversational capabilities (text and optional voice) and prompt-based configuration for personality, tone, and behavior.
  • Optimize for browser performance on low-spec devices and enable simple integration into existing web applications.
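As an illustration of what prompt-based configuration could look like, here is a minimal TypeScript sketch. Every name in it (PersonaConfig, buildSystemPrompt, parseReply, the expression list, and the bracketed tag convention) is a hypothetical choice made for this example, not something specified by the project. It assumes the chat model is asked to prefix each reply with an expression tag, which the front end could then map to a 2D cue on the static image.

```typescript
// Hypothetical persona configuration; none of these names come from the posting.
type Expression = "neutral" | "smile" | "surprised" | "thinking";

interface PersonaConfig {
  name: string;
  personality: string;
  tone: string;
  behaviorRules: string[];
  expressions: Expression[]; // cues the 2D avatar is able to display
}

interface ParsedReply {
  cue: Expression;
  text: string;
}

// Turn the configuration into a system prompt for whichever chat model is used.
function buildSystemPrompt(cfg: PersonaConfig): string {
  const rules = cfg.behaviorRules.map((r) => "- " + r).join("\n");
  const tags = cfg.expressions.map((e) => "[" + e + "]").join(" ");
  return [
    `You are ${cfg.name}. Personality: ${cfg.personality}. Tone: ${cfg.tone}.`,
    `Rules:\n${rules}`,
    `Start every reply with exactly one of these tags: ${tags}`,
  ].join("\n");
}

// Split a model reply into the expression cue and the text to display.
// Falls back to "neutral" when the tag is missing or not in the allowed list.
function parseReply(raw: string, cfg: PersonaConfig): ParsedReply {
  const match = raw.match(/^\s*\[(\w+)\]\s*/);
  if (!match) return { cue: "neutral", text: raw.trim() };
  const tag = match[1] as Expression;
  const cue: Expression = cfg.expressions.indexOf(tag) >= 0 ? tag : "neutral";
  return { cue, text: raw.slice(match[0].length).trim() };
}
```

With this convention, a reply such as "[smile] Hello there!" would yield the cue "smile" and the display text "Hello there!", and the cue could drive a simple image transform or overlay instead of a 3D rig.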

Technical Stack & Responsibilities

  • Implement using JavaScript and WebAssembly, leveraging ONNX Runtime Web or TensorFlow.js for model inference in-browser.
  • Integrate Speech-to-Text and Text-to-Speech APIs for optional voice interaction and use lightweight vision models for facial cue generation.
  • Responsibilities include designing the framework architecture, model selection/tuning for web inference, performance optimization, and creating integration examples/demos.
  • Deliverables expected: working web prototype, performance benchmarks on low-spec devices, integration guide, and documentation for prompt-based configuration.
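To make the low-spec optimization and benchmarking items more concrete, the sketch below shows one possible approach in TypeScript. It is only an illustration under stated assumptions: DeviceHints, planSession, summarizeLatencies, and the thresholds are hypothetical names and values chosen for this example. The plan it returns is meant to be applied to ONNX Runtime Web by setting ort.env.wasm.numThreads and passing executionProviders to ort.InferenceSession.create; multi-threaded WebAssembly additionally requires a cross-origin isolated page.

```typescript
// Hypothetical device hints. In a browser they could be read from
// navigator.hardwareConcurrency, navigator.deviceMemory (Chromium-only),
// and the global crossOriginIsolated flag.
interface DeviceHints {
  cores: number;
  memoryGB?: number;
  crossOriginIsolated: boolean; // required for multi-threaded WebAssembly
}

interface SessionPlan {
  executionProviders: string[]; // to pass to ort.InferenceSession.create
  wasmThreads: number;          // to assign to ort.env.wasm.numThreads
}

// Pick conservative settings for low-spec devices (thresholds are assumptions).
function planSession(d: DeviceHints): SessionPlan {
  const lowSpec = d.cores <= 2 || (d.memoryGB !== undefined && d.memoryGB <= 2);
  const wasmThreads = !d.crossOriginIsolated || lowSpec ? 1 : Math.min(4, d.cores);
  return { executionProviders: ["wasm"], wasmThreads };
}

interface LatencySummary {
  runs: number;
  medianMs: number;
  p95Ms: number;
}

// Summarize per-inference latencies (in ms) with nearest-rank percentiles.
function summarizeLatencies(samplesMs: number[]): LatencySummary {
  if (samplesMs.length === 0) throw new Error("no samples to summarize");
  const sorted = samplesMs.slice().sort((a, b) => a - b);
  const rank = (q: number): number => {
    const i = Math.max(0, Math.ceil(q * sorted.length) - 1);
    return sorted[i] as number;
  };
  return { runs: sorted.length, medianMs: rank(0.5), p95Ms: rank(0.95) };
}
```

Each latency sample could be collected as the difference between two performance.now() readings taken around a single inference call, after a few warm-up runs so that model loading and compilation do not skew the results.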

How to Apply

  • To apply, send your CV and a brief motivation letter referencing this project to recruitment@lanterns-studios.com.
  • You can also include links to relevant demos or repositories that demonstrate experience with Web ML, JavaScript, or lightweight vision models.