---
base_model:
- AesSedai/GLM-4.6-REAP-266B-A32B
- zai-org/GLM-4.6
pipeline_tag: text-generation
library_name: transformers
---

This is a Q4_K_M GGUF quant of [AesSedai/GLM-4.6-REAP-266B-A32B](https://huggingface.co/AesSedai/GLM-4.6-REAP-266B-A32B). A minimal usage sketch appears at the end of this card.

# What Is This?

[AesSedai/GLM-4.6-REAP-266B-A32B](https://huggingface.co/AesSedai/GLM-4.6-REAP-266B-A32B) was created using REAP (Router-weighted Expert Activation Pruning), a novel expert-pruning method that selectively removes redundant experts while preserving the router's independent control over the remaining experts. See the GLM-4.5-Air version by Cerebras for more details: [cerebras/GLM-4.5-Air-REAP-82B-A12B](https://huggingface.co/cerebras/GLM-4.5-Air-REAP-82B-A12B).

The MTP tensors were *not* included in this quant (though llama.cpp has not implemented multi-token prediction anyway).

# Imatrix

[GLM-4.6-REAP-266B-A32B-imatrix.dat](https://huggingface.co/gghfez/GLM-4.6-REAP-266B-A32B-Q4_K/resolve/main/GLM-4.6-REAP-266B-A32B-imatrix.dat) is the importance matrix used for this quant; a sketch of how it can be reused with `llama-quantize` follows the original model card below.

# Original Model Card for GLM-4.6-REAP

Note: currently non-functional because of a missing `mtp.safetensors` file and a missing entry in `model.safetensors.index.json`.

Forked from https://github.com/CerebrasResearch/reap to https://github.com/AesSedai/reap to hack in GLM-4.6 support.

Produced with:

```
bash experiments/pruning-cli.sh 0,1,2,3,4,5,6,7 zai-org/GLM-4.6 reap 42 0.25 theblackcat102/evol-codealpaca-v1 true true true false false
```
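
# Usage (llama.cpp)

A minimal sketch of downloading and serving this quant with llama.cpp. The repo id is taken from the imatrix link above; the shard filename, `-ngl` (GPU offload layers), and `-c` (context size) values are placeholders to adjust for your setup, not tested settings.

```
# Fetch the quant (repo id taken from the imatrix URL above)
huggingface-cli download gghfez/GLM-4.6-REAP-266B-A32B-Q4_K --local-dir GLM-4.6-REAP-266B-A32B-Q4_K

# Serve with llama.cpp. For multi-part GGUFs, point -m at the first shard.
# -ngl and -c are placeholders; tune them for your hardware.
llama-server \
  -m GLM-4.6-REAP-266B-A32B-Q4_K/<first-shard>.gguf \
  -ngl 99 -c 8192 --port 8080
```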
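
The imatrix file can also be reused to produce other quant types with llama.cpp's `llama-quantize` tool. A sketch, assuming you have a full-precision GGUF conversion of the source model on disk (the F16 input filename here is a hypothetical placeholder):

```
# Requantize with the published importance matrix (filename from the
# Imatrix section above); the F16 input GGUF path is hypothetical.
llama-quantize --imatrix GLM-4.6-REAP-266B-A32B-imatrix.dat \
  GLM-4.6-REAP-266B-A32B-F16.gguf \
  GLM-4.6-REAP-266B-A32B-Q4_K_M.gguf \
  Q4_K_M
```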