Embedded Attention - Vision Transformers & Micro-LLMs for Space Systems

Toulouse, France Fixed-term (12 months)

About IRT Saint Exupéry

The Saint Exupéry Technological Research Institute (IRT) is an accelerator for science, technological research and transfer to the aeronautics and space industries for the development of innovative solutions that are safe, robust, certifiable and sustainable.

At our sites in Toulouse, Bordeaux and Sophia Antipolis, we offer an integrated collaborative environment of engineers, researchers, experts and doctoral students from industrial and academic backgrounds. They work on research projects and R&T services backed by technological platforms in four areas: advanced manufacturing technologies, greener technologies, methods and tools for the development of complex systems, and smart technologies.

The technologies we develop meet the needs of industry and integrate the results of academic research.

Toulouse

IRT Saint Exupéry is the main tenant of building B612, Toulouse Aerospace's innovation center, occupying 10,900 m² of the 24,000 m² available. Located in the Montaudran district, at the heart of a rich and rapidly changing ecosystem, the B612 is home to the major players in innovation: U-Space, Airbus OneWeb satellites, ANITI, ESSP, Aerospace Valley and Capgemini.

3 reasons to join us:

- Take part in innovative research projects, in the service of French technological research and for the benefit of industry based in France and Europe.

- Live your passion for technology, give yourself the freedom to innovate and develop your pioneering and team spirit!

- Work in a collaborative and multicultural environment alongside colleagues from academic research and industry: researchers, doctoral students, engineers, technicians, etc.

Job description

This postdoctoral project is the result of a collaboration between the Centre National d’Etudes Spatiales (CNES) and the Institute of Technological Research Saint Exupéry (IRT Saint Exupéry). The successful candidate will be employed by CNES. The primary location is the IRT Saint Exupéry site in Sophia Antipolis or Toulouse.

About the Centre National d’Etudes Spatiales (CNES)

CNES is the public institution responsible for proposing and implementing France’s space policy. Through its innovative activities, CNES plays a major role within the European space sector and is a leading player in major international programs.

As an incubator of projects and a laboratory for new ideas, CNES’s mission is to continue inventing the space sector of tomorrow by offering unique career paths. Working at CNES means joining 2,350 employees in a cutting-edge field firmly oriented toward the future and innovation.

Attention has become a core building block of modern AI across language, vision, and generative models. In imaging, Vision Transformers (ViT) demonstrated that a “pure” transformer on image patches can match or surpass convolutional neural networks (CNNs) when pre-trained at scale, making attention a unifying alternative to convolution and applicable to tasks such as classification, detection, and segmentation [1]. This project aims to adapt attention-based modules for embedded architectures destined for space deployment, where power and memory are tightly constrained. Porting attention preserves compatibility with future attention-centric backbones for image processing and, eventually, with other on-board applications such as large language models (LLMs).

Enabling attention on board extends Earth observation (EO) processing beyond CNN-only models. ViTs can be used for tasks such as object detection, scene classification, or event segmentation, reducing downlink needs and prioritizing valuable data. Compact LLMs can further support spacecraft operations by summarizing telemetry, prioritizing events (e.g., acquisitions, downlink queues), and highlighting anomalies. The benefits are smaller data volumes, faster decisions between ground contacts, and increased autonomy under degraded communications.

Deploying attention in this context remains difficult. Standard attention has quadratic cost in sequence length (or number of patches) and is often constrained by heavy memory use, challenging strict latency and energy budgets. Current embedded toolchains also do not always support the operators required to exploit recent advances, limiting practical efficiency. Several strategies address these issues. IO-aware exact attention, such as FlashAttention, streams Q/K/V blocks through fast memory without materializing the full score matrix, reducing off-chip transfers and improving throughput [2]. Linear or approximate alternatives such as Performer replace softmax with kernel features (FAVOR+), achieving linear complexity with acceptable accuracy trade-offs [3]. For LLMs, activation-aware weight-only quantization (AWQ) compresses weights to 4-bit precision with limited accuracy loss, making micro-LLMs feasible on constrained hardware [4]. To ease memory pressure, PagedAttention introduces page-level allocation and reuse of the K/V cache, enabling longer contexts under tight RAM budgets [5]. Together, these methods—FlashAttention or windowed attention for ViTs, linear attention where suitable, AWQ for LLMs, and paged or quantized K/V caches—provide a practical path toward space-grade deployment.
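To illustrate the idea behind IO-aware exact attention mentioned above, here is a minimal sketch in plain Python. It is not the FlashAttention kernel [2]: it only shows the running-softmax trick that lets K/V be consumed block by block without storing the full score matrix. The function names and the `block` parameter are assumptions made for this illustration, and a real implementation would also tile the queries and keep each block in fast on-chip memory.

```python
import math


def naive_attention(Q, K, V):
    """Reference softmax attention: builds every row of the n x n score matrix."""
    scale = 1.0 / math.sqrt(len(Q[0]))
    out = []
    for q in Q:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in K]
        m = max(scores)
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        out.append([
            sum(w * v[j] for w, v in zip(weights, V)) / total
            for j in range(len(V[0]))
        ])
    return out


def tiled_attention(Q, K, V, block=4):
    """Same result, but K/V are visited in blocks with a running softmax."""
    scale = 1.0 / math.sqrt(len(Q[0]))
    dv = len(V[0])
    out = []
    for q in Q:
        running_max = -math.inf   # largest score seen so far
        running_sum = 0.0         # softmax denominator under running_max
        acc = [0.0] * dv          # unnormalized weighted sum of V rows
        for start in range(0, len(K), block):
            k_blk = K[start:start + block]
            v_blk = V[start:start + block]
            scores = [scale * sum(a * b for a, b in zip(q, k)) for k in k_blk]
            new_max = max(running_max, max(scores))
            # Rescale what was accumulated under the previous maximum.
            correction = math.exp(running_max - new_max)
            running_sum *= correction
            acc = [a * correction for a in acc]
            for s, v in zip(scores, v_blk):
                w = math.exp(s - new_max)
                running_sum += w
                for j in range(dv):
                    acc[j] += w * v[j]
            running_max = new_max
        out.append([a / running_sum for a in acc])
    return out
```

Both functions should agree up to floating-point rounding for any block size. The tiled version only ever holds `block` scores per query at a time, which is the property that reduces off-chip memory traffic on real hardware.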

The proposed work plan is as follows:

• State of the art. Survey barriers to embedded attention: unsupported operators, RAM/bandwidth pressure, and limited runtime optimizations. Catalog acceleration methods—windowed/local attention, quantization (fp8/int8/int4), IO-aware attention (FlashAttention), and linear/approximate attention (Performer)—along with deployment chains (TensorRT-LLM, Vitis AI, ONNX Runtime, Arm Compute Library, Apache TVM) and existing attention models on similar hardware.

• Space use cases & model selection. Select space-relevant cases: an embedded ViT for EO applications (e.g. object detection, scene classification, or event segmentation) and, optionally, a micro-LLM for telemetry (summarization, downlink prioritization). Implement optimizations consistent with mission constraints and toolchain support.

• Implementation & porting. Map the models to embedded targets (Jetson Orin GPU, Versal FPGA, Arm CPU) and adapt them for compatibility with each platform. Align operator sets, tensor layouts, and precisions with toolchains, adding only minimal plugins or kernels where required.

• Evaluation & validation. Measure both algorithmic and hardware performance, focusing on RAM usage, power consumption, throughput, and latency. Assess utility (mIoU/F1 for EO; precision/recall for telemetry triage) and compare against CNN or non-attention baselines. Conduct ablations to evaluate the impact of each optimization technique.

• Demonstration & dissemination. Present results through scientific conferences and peer-reviewed publications. If possible, complement this dissemination with an in-orbit demonstration (e.g., OPS-SAT-2), showcasing the feasibility of embedded attention under real space conditions.
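As an illustration of the utility metrics named in the evaluation step, mean intersection-over-union (mIoU) for a segmentation output can be computed as sketched below. This is a minimal example in plain Python, assuming flattened lists of integer class labels. The function name `mean_iou` and the choice to skip classes absent from both prediction and ground truth are assumptions of this sketch, not a prescribed project convention.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over the classes that appear in pred or target."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union:  # ignore classes absent from both label maps
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else float("nan")
```

Computing the same metric for the attention model and for the CNN baseline, before and after each optimization, gives the accuracy side of the ablations described in the work plan.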

Profile

PhD holder in Embedded Systems, Artificial Intelligence, or Computer Vision. Experience in Earth Observation image processing is a valuable asset.
