From imaging algorithms to quantum methods Seminar

Europe/Warsaw
https://cern.zoom.us/j/66151941204?pwd=n7upvvZYibexBhbtyn5kvTpy36L0Wo.1 (Zoom)


Konrad Klimaszewski (NCBJ), Wojciech Krzemień (NCBJ)

## **Title:** DSS-GAN
**Presenter:** Aleksander Ogonowski
**Date:** 20 April 2026

---

## **Participants**
- Aleksander Ogonowski (AO)
- Wojciech Krzemień (WK)
- Konrad Klimaszewski (KK)
- Krzysztof Nawrocki (KN)
- Michał Mazurek (MM)
- Roman Shopa (RS)
- Michał Obara (MO)
- Lech Raczyński (LR)
- Rafał Możdzonek (RM)
- Mateusz Bała (MB)

---

## **Discussion Summary**

**MM:** What does a *non-saturating GAN* mean?
**AO:** I think it refers to the fact that in StyleGAN the discriminator outputs raw logits with no sigmoid applied, so the outputs are unbounded. These logits are then passed to the softplus function.
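
For context on this exchange: in the GAN literature, "non-saturating" usually names the generator objective, which minimizes -log D(G(z)) instead of log(1 - D(G(z))). Written on raw discriminator logits, this is softplus(-logit), consistent with the softplus usage AO mentions. The sketch below is illustrative only. Its function names are hypothetical and it is not taken from the DSS-GAN code. It compares the gradient of both generator losses when the discriminator confidently rejects a fake sample.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def generator_loss_non_saturating(fake_logit: float) -> float:
    # -log(sigmoid(l)) == softplus(-l)
    return softplus(-fake_logit)


def generator_loss_saturating(fake_logit: float) -> float:
    # log(1 - sigmoid(l)) == -softplus(l)
    return -softplus(fake_logit)


def discriminator_loss(real_logit: float, fake_logit: float) -> float:
    # Logistic loss on raw logits: push real logits up and fake logits down.
    return softplus(-real_logit) + softplus(fake_logit)


def grad_non_saturating(fake_logit: float) -> float:
    # d/dl softplus(-l) = -sigmoid(-l)
    return -sigmoid(-fake_logit)


def grad_saturating(fake_logit: float) -> float:
    # d/dl [-softplus(l)] = -sigmoid(l)
    return -sigmoid(fake_logit)


# A strongly negative logit means the discriminator is sure the sample is fake.
for logit in (-8.0, 0.0, 8.0):
    print(logit, grad_non_saturating(logit), grad_saturating(logit))
```

For a confidently rejected fake (strongly negative logit), the non-saturating gradient stays close to -1 while the saturating one shrinks towards 0, which is the motivation for the name.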

---

**RM:** The Mamba architecture assumes linearity and struggles with capturing global properties of 2D images. Typically, attention blocks are used to address this.
**AO:** The idea here was to avoid attention blocks in the architecture to reduce quadratic complexity.
**KN:** In the analysis of astronomical images using Visual Mamba, a trace of the direction in which Mamba processes the data remains visible, which leads to artifacts. Setting the buffer length, i.e. how long Mamba should retain memory, is a non-trivial problem.
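
The complexity argument in AO's reply can be illustrated with a rough count of token interactions. This is a back-of-the-envelope sketch, assuming one token per feature-map pixel and four scan directions. Both are assumptions made for illustration, and the helper names are hypothetical. The numbers are operation counts only, not measured runtime or memory, which also depend on constant factors and kernel implementations.

```python
def token_count(height: int, width: int, patch: int = 1) -> int:
    """Number of tokens when the map is cut into patch x patch tiles."""
    return (height // patch) * (width // patch)


def attention_pairs(n_tokens: int) -> int:
    """Full self-attention compares every token with every token: O(N^2)."""
    return n_tokens * n_tokens


def scan_steps(n_tokens: int, n_directions: int = 1) -> int:
    """A recurrent scan visits each token once per direction: O(N)."""
    return n_tokens * n_directions


for side in (64, 512, 2048):
    n = token_count(side, side)
    pairs = attention_pairs(n)
    steps = scan_steps(n, n_directions=4)  # four directions is an assumption
    print(f"{side}x{side}: tokens={n}, attention pairs={pairs}, "
          f"scan steps={steps}, ratio={pairs // steps}")
```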

---

**RM:** Which version of Mamba are you using?
**AO:** Mamba v3.
**RM:** I recommend comparing it with Flash Attention. In my experience, Flash Attention uses significantly fewer resources than Mamba.

---

**RM:** What is the target image size?
**AO:** Currently 512×512. In the CaloChallenge, the images are smaller.
**MM:** In real LHCb applications, they will be even smaller, which should be beneficial for diffusion models.
**RM:** In that case, there is a concern. Some articles claim that for images below 2K×2K, Mamba performs worse than architectures with attention blocks.

---

**KN:** Regarding sampling directions: can this be tested using simple synthetic images (e.g., rectangles) to quantify the problem? For example, is using two directions better than three, and why?
**KK:** My hypothesis is that we are effectively composing these directions. The question is how to measure this effect using synthetic data and appropriate metrics.
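
One possible shape for such a synthetic test is sketched below. It is a toy model under loud assumptions: a scalar causal recurrence stands in for a Mamba scan, rows and columns are scanned independently instead of as one flattened sequence, directional outputs are simply averaged, and the metric is the mean absolute difference between the output and its mirror images. None of these choices come from the DSS-GAN implementation, and all names are hypothetical.

```python
def make_rectangle(size=16, top=5, left=4, height=6, width=8):
    """Binary image with one centred, mirror-symmetric filled rectangle."""
    return [[1.0 if top <= r < top + height and left <= c < left + width else 0.0
             for c in range(size)] for r in range(size)]


def scan_1d(seq, decay=0.8):
    """Causal recurrence h_t = decay*h_{t-1} + (1-decay)*x_t (toy scan)."""
    h, out = 0.0, []
    for x in seq:
        h = decay * h + (1.0 - decay) * x
        out.append(h)
    return out


def transpose(img):
    return [list(col) for col in zip(*img)]


def scan_image(img, direction):
    """Scan each row or column independently in one of four directions."""
    if direction == "lr":   # left to right
        return [scan_1d(row) for row in img]
    if direction == "rl":   # right to left
        return [scan_1d(row[::-1])[::-1] for row in img]
    if direction == "tb":   # top to bottom
        return transpose(scan_image(transpose(img), "lr"))
    if direction == "bt":   # bottom to top
        return transpose(scan_image(transpose(img), "rl"))
    raise ValueError(f"unknown direction: {direction}")


def compose(img, directions):
    """Average the outputs of several directional scans."""
    outs = [scan_image(img, d) for d in directions]
    rows, cols = len(img), len(img[0])
    return [[sum(o[r][c] for o in outs) / len(outs) for c in range(cols)]
            for r in range(rows)]


def asymmetry(img):
    """Mean |img - mirror(img)| for the horizontal and vertical mirror.

    The input rectangle is mirror-symmetric, so any non-zero value is a
    directional trace left by the scans.
    """
    rows, cols = len(img), len(img[0])
    total = rows * cols
    horiz = sum(abs(img[r][c] - img[r][cols - 1 - c])
                for r in range(rows) for c in range(cols)) / total
    vert = sum(abs(img[r][c] - img[rows - 1 - r][c])
               for r in range(rows) for c in range(cols)) / total
    return horiz, vert


image = make_rectangle()
results = {}
for dirs in (("lr",), ("lr", "rl"), ("lr", "rl", "tb"), ("lr", "rl", "tb", "bt")):
    results[dirs] = asymmetry(compose(image, dirs))
    print(dirs, results[dirs])
```

In this toy setting the trace depends on which directions are combined, not only on how many: a pair of opposite directions cancels the asymmetry along its axis, while adding a third, unpaired direction introduces a new trace along the other axis. Whether this carries over to a trained Mamba generator is an open question that the test would have to answer.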

    • 10:00 - 11:00
      DSS-GAN (Directional State Space GAN): A Generative Adversarial Network with Directional State Spaces Utilizing the Mamba Architecture (1h)

      DSS-GAN is the first generative adversarial network to utilize the Mamba architecture as a hierarchical generator, synthesizing images from noise under class-conditional settings. The key contribution is the Directional Latent Routing (DLR) mechanism, which decomposes the latent vector and class signal into direction-specific sub-vectors, each independently conditioning the corresponding scanning direction along various spatial axes of the feature map.

      Speaker: Aleksander Ogonowski
    • 11:00 - 11:30
      Discussion (30m)