## **Title:** DSS-GAN
**Presenter:** Aleksander Ogonowski
**Date:** 20 April 2026
---
## **Participants**
- Aleksander Ogonowski (AO)
- Wojciech Krzemień (WK)
- Konrad Klimaszewski (KK)
- Krzysztof Nawrocki (KN)
- Michał Mazurek (MM)
- Roman Shopa (RS)
- Michał Obara (MO)
- Lech Raczyński (LR)
- Rafał Możdzonek (RM)
- Mateusz Bała (MB)
---
## **Discussion Summary**
**MM:** What does a *non-saturating GAN* mean?
**AO:** I think it refers to the fact that in StyleGAN the discriminator outputs raw logits with no sigmoid applied, so the outputs are unbounded. These logits are then passed into the softplus function.
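For reference, "non-saturating" is commonly used for the generator loss `-log(sigmoid(D(G(z))))`, which can be written as `softplus(-logit)`, as opposed to the original minimax form `log(1 - sigmoid(D(G(z))))`. The sketch below assumes this standard logistic formulation; the function names are illustrative and are not taken from the DSS-GAN code. It compares the gradient of both losses with respect to the discriminator logit when the discriminator confidently rejects a generated sample.

```python
import math

def softplus(x: float) -> float:
    # Numerically stable log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def sigmoid(x: float) -> float:
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def g_loss_non_saturating(logit_fake: float) -> float:
    # -log(sigmoid(logit)) expressed as softplus(-logit).
    return softplus(-logit_fake)

def g_loss_saturating(logit_fake: float) -> float:
    # log(1 - sigmoid(logit)) expressed as -softplus(logit).
    return -softplus(logit_fake)

def grad_non_saturating(logit_fake: float) -> float:
    # d/dx softplus(-x) = -sigmoid(-x)
    return -sigmoid(-logit_fake)

def grad_saturating(logit_fake: float) -> float:
    # d/dx [-softplus(x)] = -sigmoid(x)
    return -sigmoid(logit_fake)

# A strongly negative logit: the discriminator is confident the sample is fake.
logit = -8.0
print("non-saturating gradient:", grad_non_saturating(logit))
print("saturating gradient:    ", grad_saturating(logit))
```

With a strongly negative logit, the non-saturating gradient stays close to -1, while the saturating one is close to 0, which is the vanishing-gradient behaviour the non-saturating form is meant to avoid.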
---
**RM:** The Mamba architecture assumes linearity and struggles with capturing global properties of 2D images. Typically, attention blocks are used to address this.
**AO:** The idea here was to avoid attention blocks in the architecture and, with them, their quadratic complexity.
**KN:** In the analysis of astronomical images using Visual Mamba, a trace of the direction in which Mamba processes the data remains visible, which leads to artifacts. Setting the buffer length, i.e., how long Mamba should retain memory, is a non-trivial problem.
---
**RM:** Which version of Mamba are you using?
**AO:** Mamba v3.
**RM:** I recommend comparing it with FlashAttention. In my experience, FlashAttention uses significantly fewer resources than Mamba.
---
**RM:** What is the target image size?
**AO:** Currently 512×512. In the CaloChallenge, the images are smaller.
**MM:** In real LHCb applications, they will be even smaller, which should be beneficial for diffusion models.
**RM:** In that case, there is a concern. Some articles claim that for images below 2K×2K, Mamba performs worse than architectures with attention blocks.
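To put rough numbers on the image-size question, the sketch below compares how the work grows with sequence length for pairwise attention versus a linear scan. The patch size of 16 and the helper names are assumptions made for illustration, not details of DSS-GAN. The counts ignore constant factors and kernel-level optimizations, so they do not indicate where the actual crossover lies.

```python
def num_tokens(height: int, width: int, patch: int = 16) -> int:
    # Sequence length after splitting the image into patch x patch tiles
    # (patch=16 is an assumed value, not taken from the presentation).
    return (height // patch) * (width // patch)

def attention_pairs(n: int) -> int:
    # Full self-attention scores every token against every token.
    return n * n

def scan_steps(n: int) -> int:
    # A linear-time scan visits each token once per direction.
    return n

for side in (512, 2048):
    n = num_tokens(side, side)
    print(f"{side}x{side}: tokens={n}, "
          f"attention pairs={attention_pairs(n)}, scan steps={scan_steps(n)}")
```

Under these assumptions a 512×512 image gives 1024 tokens and about 1.0 million attention pairs, while 2048×2048 gives 16384 tokens and about 268 million pairs. The asymptotic advantage of a linear scan therefore only becomes large for big inputs, and measured runtime and memory on the target image size remain the deciding factor.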
---
**KN:** Regarding sampling directions: can this be tested using simple synthetic images (e.g., rectangles) to quantify the problem? For example, is using two directions better than three, and why?
**KK:** My hypothesis is that we are effectively composing these directions. The question is how to measure this effect using synthetic data and appropriate metrics.
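One possible starting point for such a test is sketched below. It builds a binary rectangle image, flattens it along two hypothetical scan directions, and computes two simple proxies: the number of value transitions along the sequence, and the worst sequence distance between neighbouring pixels when several directions are combined. The scan definitions, the metrics, and all names are assumptions for illustration; they were not proposed at the meeting and do not involve an actual Mamba model.

```python
from itertools import product

def make_rectangle(h, w, top, left, rect_h, rect_w):
    """Binary h x w image containing one filled axis-aligned rectangle."""
    return [[1 if top <= r < top + rect_h and left <= c < left + rect_w else 0
             for c in range(w)] for r in range(h)]

def scan_order(h, w, direction):
    """Pixel visiting order for a (hypothetical) scan direction."""
    if direction == "row":
        return [(r, c) for r in range(h) for c in range(w)]
    if direction == "col":
        return [(r, c) for c in range(w) for r in range(h)]
    raise ValueError(f"unknown direction: {direction}")

def transitions(image, direction):
    """Number of value changes along the flattened pixel sequence."""
    h, w = len(image), len(image[0])
    seq = [image[r][c] for r, c in scan_order(h, w, direction)]
    return sum(a != b for a, b in zip(seq, seq[1:]))

def worst_neighbour_gap(h, w, directions):
    """Largest sequence distance between 4-neighbours, where each pair
    is credited with its smallest gap over the given directions."""
    positions = [{p: i for i, p in enumerate(scan_order(h, w, d))}
                 for d in directions]
    worst = 0
    for r, c in product(range(h), range(w)):
        for rr, cc in ((r, c + 1), (r + 1, c)):
            if rr < h and cc < w:
                best = min(abs(pos[(r, c)] - pos[(rr, cc)])
                           for pos in positions)
                worst = max(worst, best)
    return worst

image = make_rectangle(h=8, w=12, top=2, left=3, rect_h=2, rect_w=6)
for d in ("row", "col"):
    print(d, "transitions:", transitions(image, d))
for dirs in (["row"], ["col"], ["row", "col"]):
    print(dirs, "worst neighbour gap:", worst_neighbour_gap(8, 12, dirs))
```

In this toy setting a single direction keeps neighbours along one axis adjacent in the sequence but separates neighbours along the other axis by a full row or column, while the two directions together keep every neighbouring pair adjacent in at least one sequence. Whether this kind of proxy correlates with visible artifacts in generated images would still need to be checked with a trained model.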