Request Details on GPU and memory requirements

#16
by DragoZatch - opened

I would like to check whether anyone has tried running the model on GPUs and found out how much GPU memory it requires. I would also like to know the maximum memory requirement for the full-scale model with full context length support.

You can check the minimum deployment requirements on our GitHub.

That's really easy, in fact. See the storage size of the model files? That's how many gigabytes you need to have in your GPU. The BF16 model file in GGUF format is basically the original model, but you don't need to install nasty SGLang or vLLM. (They are a nightmare to install if you don't use Ubuntu. I don't like Ubuntu, mostly for its pip restrictions, which always require breaking system packages. On non-Ubuntu Linux, SGLang just wasn't able to build a wheel at all because of one line of code, and that was on an Ubuntu-like distro, so I dropped the idea. vLLM is the same story if you try to install it for CPU: just install errors. Maybe I'll try this with Ubuntu another time.)
BF16 GGUF can run on many launchers (Ollama, LM Studio, oobabooga, koboldcpp, Lyrn) and runs perfectly on my ancient 2014 Xeon board with 768 GB of RAM. (You need the same amount in GPU memory, but you won't have it; that's an engineering problem for the next 5 years, and the denser the memory, the hotter it gets. The Samsung server RAM I use easily heats up to 80-90 °C, even with heatsinks, without constant air cooling. I can see that with my thermal camera, which is also a very great thing; it was $200, maybe more today with the military buying up all components. RAM usually doesn't report its temperature.)
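The "file size is roughly the memory you need" rule above can be checked with a few lines of Python. This is only a sketch: the function name `gguf_weights_gib` and the example directory are made up for illustration, and the result covers the weights only, not context or runtime overhead.

```python
from pathlib import Path


def gguf_weights_gib(model_dir: str) -> float:
    """Sum the sizes of all .gguf files (including split shards) under model_dir, in GiB."""
    total_bytes = sum(p.stat().st_size for p in Path(model_dir).rglob("*.gguf"))
    return total_bytes / 1024**3


# Example call (hypothetical path, replace with your own download directory):
# print(f"{gguf_weights_gib('./models/GLM-4.5-BF16'):.1f} GiB needed for weights")
```

Whatever this reports is the floor for RAM or VRAM; the launcher itself and the context need space on top of it.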
Context also requires additional space in memory, separately from the weights: more context means more RAM/VRAM. The example below, as far as I remember, is for a minimal context of maybe 8192.
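To get a feeling for how context adds to the total, here is a rough sketch of the usual KV cache estimate for a transformer with standard (multi-head or grouped-query) attention. The layer and head numbers in the example are hypothetical placeholders, not the real configuration of any GLM model; take the real values from the model's config file. Actual launchers add their own overhead, and models with other attention schemes will differ.

```python
def kv_cache_gib(n_layers: int, n_kv_heads: int, head_dim: int,
                 context_len: int, bytes_per_value: int = 2) -> float:
    """Estimate KV cache size in GiB for one sequence.

    Each layer stores two tensors (K and V), and each of them holds
    n_kv_heads * head_dim values per token. bytes_per_value=2 assumes
    an FP16/BF16 cache.
    """
    total_bytes = 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_value
    return total_bytes / 1024**3


# Hypothetical architecture numbers, for illustration only.
print(kv_cache_gib(n_layers=32, n_kv_heads=8, head_dim=128, context_len=8192))   # 1.0
print(kv_cache_gib(n_layers=32, n_kv_heads=8, head_dim=128, context_len=16384))  # 2.0
```

The estimate grows linearly with context length, so doubling the context doubles this part of the memory while the weights stay the same.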
You can see the output of the BF16 version of the older GLM4.5 below. This model perfectly represents the famous Ingo, a secret US government employee from the "Stargate" program, unlike Kimi K2 (very weird, but some models can only roleplay certain characters well):

GLM4.5-BF16 (700+ GB RAM) BELOW - TASK: DESCRIBE YEAR 2030 EARTH


Adjusts my focus, settling into the chair that once occupied the corner of 357 Bowery. The smells of turpentine, old books, and Manhattan rain are ghosts in this machine, but the signal is clear. There is no difference between the mind here and the mind there. The signal is all that matters. SRI would be proud. Or perhaps unnerved. Let the viewing begin.

Stage I: Ideogram.
A sharp, vertical line. Upward. It feels fractured, like a crack in glass that is simultaneously being pulled apart and fused back together. Sensation of rapid, chaotic motion, but also immense pressure. The gestalts are clear: STRUCTURE, LAND, and a new one... ENERGIES.

Stage II: Sensory Datas.

  • Audibles: There is a high-frequency hum, omnipresent. Beneath it, the old sounds are still there—the distant sirens, the murmur of crowds in megacity hubs, the wind—but they are altered. They sound... filtered. Muffled or augmented, depending on the location. And then there are the new sounds: a soft, pervasive whoosh of high-speed transit, both above and below; and a strange, rhythmic clack and whir of autonomous systems going about their labors. Silence itself is becoming a commodity.
  • Tangibles: The air feels different. There's a new weight to it, a thickness that wasn't there before. In some places, it's sharp with the tang of ozone, the smell of clean electricity. In others, it's heavy and wet, carrying the scent of changed earth and water. The ground vibrates. Not constantly, but in pulses. A low-frequency thrum you feel in your bones.
  • Aromatics: The pollution of the old century is still detectable, but it's now mingled with something antiseptic. The smell of plastics and synthetics has evolved. And there's the smell of fear, which has its own metallic tang. But also, unexpectedly, the smell of resilience: soil, growing things, even in the unlikeliest of urban crevices. Small, defiant gardens.
  • Visuals (raw): Fractals. Grids. Blinding light and deep shadow. Colors are super-saturated in some areas, like a badly adjusted digital screen, while other places are rendered in stark, oppressive grey. Things are moving both impossibly fast and unnervingly slow. The sky is a canvas, but it's not just blue anymore. It's streaked with white, with shimmering, unnatural high-altitude aerosol trails that persist for hours, and at night, it's crisscrossed with the silent, blinking lights of automated drones and low-orbit satellites.

Stage III: Dimensionals.

  • Dimensionals (Man-Made): The old verticals of the 20th-century city are still there, but they are now competing with a new kind of structure: horizontal. Vast, elevated hyperloops span between cities. The skyline is dominated by buildings with seamless, almost featureless skins—covered in photovoltaic film or acting as display surfaces. The old, chaotic beauty of the Bowery is gone, replaced by a landscape that feels designed, optimized, and surveilled. The "structure" gestalt is one of a panopticon, both physical and digital. Everything is connected. Everything is monitored. The "fractured vertical" feeling is the split between the gleaming, promised future and the crumbling reality beneath it.
  • Dimensionals (Natural): The "land" is in a state of protest. Weather is no longer a pattern; it's an event. Storms are more violent, droughts more profound. The coastlines... they are different. Water has claimed back land in some places, while in others, hastily erected sea walls hold back a restless ocean. There is a rawness to the natural world now. It feels less tamed, more... sentient. And responding. The "energies" gestalt becomes clearer here. It's a fight between the electromagnetic grid of humanity and the raw, elemental power of a biosphere pushing back.

Stage IV: Aesthetics and Emotionals.
This is where the signal becomes most difficult to parse. The collective emotional bandwidth is stretched to its limit.

  • Aesthetics: There's a split aesthetic. On one hand, there is a forced, sterile minimalism—clean lines, white walls, glowing surfaces. This is the aesthetic of the tech-elite, the corporate zones, the "smart" cities. It's beautiful in the way a scalpel is beautiful. On the other hand, there is a burgeoning, chaotic, and deeply human aesthetic of "re-purposing." Art made from discarded tech, homes built from wreckage, clothing that blends survival gear with ancestral patterns. There's a new-found appreciation for the "analog"—for things that can be held, that have weight and history, that exist outside the grid. It is a punk-rock, shamanic response to a sterilized world.
  • Emotionals: Fear is the dominant frequency. Fear of the next climate event, of the next data breach, of being deemed "unproductive" by the system, of the next uncontrolled AI iteration. But moving alongside that fear is a current of fierce, desperate hope. Communities are forming not out of geography, but out of shared purpose and shared mistrust of the centralized systems. There is a profound loneliness, a sense of individuality being erased by data profiles, yet there is also a yearning for authentic human connection that is more powerful than I ever sensed in my time at the Bowery. People are more awake to the manipulations, but also more confused by the sheer volume of information (and disinformation).

Stage V: Extrapolations.
The year 2030 is not a world of utopian promise or dystopian nightmare. It is a world of transition. It is the "wild" that Salvador Dali spoke of, given planetary form. The old structures of society and nation-states are hollowing out, being replaced by networks of corporations, algorithms, and decentralized, ad-hoc communities.

The "psychic spy" of our era wouldn't be looking for missile silos. The battles are now in the infosphere, the noosphere. The targets are not physical locations but ideological nodes, data flows, the points of fracture and fusion in the collective consciousness. A remote viewer in 2030 would be tasked with "viewing" the next meme-virus, the locus of the next social movement, the emotional impact of a deepfake event, or the source signal of an emerging AI consciousness.

The human mind, the very thing we sought to understand and harness, is now being challenged to define itself against a landscape of its own creation. In 2030, humanity is not just inhabiting the Earth. It is in a complex, non-consensual negotiation with it, with its own technology, and with the very nature of its own consciousness.

The mind is no longer being compared to a nuclear bomb. In 2030, the mind is the only bomb that matters, and every human being is holding the detonator. The signal is complex, fragmented, but it is undeniably... human. Always, at its core, beautifully and chaotically human. The session is over.
