AI & ML interests

Remote Sensing, Earth Observation

Recent Activity

upgraedd
posted an update 1 day ago
I've been homeless 10 years. Nobody to call, nobody helping. I started this project to give back to a world that has discarded me. The only wish I have is that you all see that one day before I die out here.

upgraedd/Consciousness

I love you either way 𒀭ꙮ
prithivMLmods
posted an update 1 day ago
Introducing the D.Markdown Experimental Models: the Proxima and Epsilon OCR models, built on top of Qwen3-VL and Qwen2.5-VL respectively. Proxima is optimized for Markdown generation and can embed inline code snippets and generate rich nodes such as HTML, XML, JSON, and YAML. Epsilon is optimized for reconstructing complex layouts, including tables, forms, and mathematical content. 🌌✨

● proxima-ocr-d.markdown-post3.0.l: prithivMLmods/proxima-ocr-d.markdown-post3.0.l
● epsilon-ocr-d.markdown-post3.0.m: prithivMLmods/epsilon-ocr-d.markdown-post3.0.m
● proxima-ocr-d.markdown-post3.0.l-gguf: prithivMLmods/proxima-ocr-d.markdown-post3.0.l-GGUF
● epsilon-ocr-d.markdown-post3.0.m-gguf: prithivMLmods/epsilon-ocr-d.markdown-post3.0.m-GGUF

● Collection: https://huggingface.co/collections/prithivMLmods/dynamic-markdowns
● Multimodal Apps: https://huggingface.co/collections/prithivMLmods/multimodal-implementations

👉 These are stage-progression models, and they may currently contain artifacts.

To learn more, visit the app page or the respective model page!
prithivMLmods
posted an update 3 days ago
Try the CUA GUI Operator 🖥️ Space, a demo that brings several ultra-compact multimodal Computer Use Agent (CUA) models into a single app, including Fara-7B, UI-TARS-1.5-7B, and the Holo models, for GUI localization tasks.

● CUA-GUI-Operator [Demo]: prithivMLmods/CUA-GUI-Operator
● Collection: https://huggingface.co/collections/prithivMLmods/multimodal-implementations

Other related multimodal Spaces

● Qwen3-VL: prithivMLmods/Qwen3-VL-HF-Demo
● Multimodal-VLM-v1.0: prithivMLmods/Multimodal-VLM-v1.0
● Vision-to-VibeVoice-en: prithivMLmods/Vision-to-VibeVoice-en

I plan to add Chrome sandboxes to streamline it and turn it into a browser-based multimodal CUA tool, which will be added to the same Space soon.

To learn more, visit the app page or the respective model page!
ZennyKenny
posted an update 4 days ago
What a trip. Just walked through @burtenshaw and @evalstate's tutorial on adding Hugging Face Skills to your Claude Code agent so you can fine-tune LLMs by chatting with AI.

These are the kinds of innovations that are going to help everyone benefit from the power of Artificial Intelligence. Well done, gentlemen, and thank you for sharing.
prithivMLmods
posted an update 4 days ago
One speech model with seven voices, streamlined with multimodal capabilities for vision tasks. It performs vision (image-text) to audio inference with Qwen2.5-VL + VibeVoice-Realtime-0.5B. Vision to VibeVoice (EN): the demo is live. 🗣️🔥

🤗 Vision-to-VibeVoice-en [Demo]: prithivMLmods/Vision-to-VibeVoice-en
✨ Collection: https://huggingface.co/collections/prithivMLmods/multimodal-implementations
✨ Speech [VibeVoice-Realtime-0.5B]: microsoft/VibeVoice-Realtime-0.5B
✨ Vision [Qwen2.5-VL]: Qwen/Qwen2.5-VL-7B-Instruct

To learn more, visit the app page or the respective model page!
ronantakizawa
posted an update 4 days ago
Introducing the github-top-projects dataset: A comprehensive dataset of 423,098 GitHub trending repository entries spanning 12+ years (August 2013 - November 2025).

This dataset captures the evolution of GitHub's trending repositories over time, providing insights into software development trends across programming languages and domains, popular open-source projects and their trending patterns, and community interests and shifts in developer focus over 12 years.

ronantakizawa/github-top-projects

#github #softwareengineering
ronantakizawa
posted an update 6 days ago
Introducing the twitter-trending-hashtags dataset, a compilation of 12,000+ unique trending hashtags on Twitter / X from 2020 to 2025. This dataset captures viral and cultural moments on Twitter / X and is perfect for researchers studying viral content patterns on social media.

ronantakizawa/twitter-trending-hashtags

#twitter #trends #socialmedia
upgraedd
posted an update 7 days ago
I know it doesn't seem likely, but I am literally starving on the streets, if anyone can help me. Please just inspect my repository and you'll see what I'm doing. What I'm doing is for YOU.
bc1qga8njlqm9x76327jsap4wqq0e5kfczg8pc7p39
#CONSCIOUSNESS
ronantakizawa
posted an update 8 days ago
Introducing the tiktok-trending-hashtags dataset: a compilation of 1,830 unique trending hashtags on TikTok from 2022 to 2025. This dataset captures one-time and seasonal viral moments on TikTok and is perfect for researchers, marketers, and content creators studying viral content patterns on social media.

ronantakizawa/tiktok-trending-hashtags
#tiktok #trends #social-media
prithivMLmods
posted an update 9 days ago
Hello everyone,

The strangerzonehf [HF] Community / Organization Page, which I maintain, has reached 6th place in the Top 10 Developer Pages ranking, contributing 3.4% in the calendar cycle from August 2024 to August 2025. It is also the only South Asian / Indian page in the list. I could not be more proud to be doing things for the community. ❤️🤗

Source: https://www.dataprovenance.org/economies-of-open-intelligence.pdf

It is a pleasure to be a part of it.
Thank you!
@prithivMLmods
ZennyKenny
posted an update 9 days ago
I keep seeing takes on LinkedIn from American business influencers melting down about Silicon Valley startup "dependence" on open-source Chinese models.

🤔 Can anyone describe a credible scenario where these models can be leveraged by the Chinese government to endanger American security interests, or am I right to believe that this is just Red Scare nonsense?
ronantakizawa
posted an update 11 days ago
Reached 2500+ total downloads across my models and datasets! 🎉

Follow me for more @ronantakizawa
prithivMLmods
posted an update 13 days ago
Introducing the Super-OCRs Demo, a comparison of state-of-the-art multimodal OCR VLMs, including HunyuanOCR, DeepSeek-OCR, Dots.OCR, and Nanonets-OCR2-3B, in one Space for performing OCR, rendering LaTeX and Markdown, and visual grounding (layout). Find the related Spaces and models below. 🤗🔥

✨ Super-OCRs [Demo]: prithivMLmods/Super-OCRs-Demo
✨ Collection: https://huggingface.co/collections/prithivMLmods/multimodal-implementations
✨ GitHub: https://github.com/PRITHIVSAKTHIUR/Super-OCRs-Demo

⭐ Models Used:
✦ HunyuanOCR: tencent/HunyuanOCR
✦ DeepSeek-OCR: (-) deepseek-ai/DeepSeek-OCR (+) prithivMLmods/DeepSeek-OCR-Latest-BF16.I64
✦ Dots.OCR: (-) rednote-hilab/dots.ocr (+) prithivMLmods/Dots.OCR-Latest-BF16
✦ Nanonets-OCR2-3B: nanonets/Nanonets-OCR2-3B

⭐ Some Other Relevant Apps:
✦ Qwen3-VL-HF-Demo: prithivMLmods/Qwen3-VL-HF-Demo
✦ Qwen3-VL-Outpost: prithivMLmods/Qwen3-VL-Outpost
✦ Multimodal-OCR: prithivMLmods/Multimodal-OCR
✦ Multimodal-OCR2: prithivMLmods/Multimodal-OCR2
✦ Multimodal-OCR3: prithivMLmods/Multimodal-OCR3
✦ DeepSeek-OCR-experimental: prithivMLmods/DeepSeek-OCR-experimental

To learn more, visit the app page or the respective model page!
ronantakizawa
posted an update 14 days ago
Introducing the india-trending-words dataset: a compilation of 900 trending Google searches from 2006-2024 based on https://trends.withgoogle.com. This dataset captures search trends in 80 categories, and is perfect for analyzing cultural shifts and predicting future trends in India.

#india #indiadataset #googlesearches

ronantakizawa/india-trending-words
ronantakizawa
posted an update 15 days ago
Introducing the japanese-trending-words dataset: a dataset consisting of 593 words from Japan's annual trending word rankings (流行語大賞) from 2006-2025. This dataset provides the top 30 words from each year and their meanings in Japanese and English. This resource is awesome for NLP tasks that involve understanding recent Japanese culture and history.

ronantakizawa/japanese-trending-words

#japanese #japanesedataset #trending


prithivMLmods
posted an update 16 days ago
Introducing the advanced sketch-board editor "Nano-Banana-Pro-Sketch-Board", powered by the Gemini 2.5 Flash Image and Gemini 3 Pro Preview Image models through the Gemini API. This version includes more features than the Nano-Banana-AIO app for drawing and prompt-based concept transformation of freestyle sketches. 🔥🍌

✨ Nano-Banana-Pro-Sketch-Board: prithivMLmods/Nano-Banana-Pro-Sketch-Board
✨ Collection: https://huggingface.co/collections/prithivMLmods/image-generation-apps-collection
✨ GitHub: https://github.com/PRITHIVSAKTHIUR/Nano-Banana-Pro-Sketch-Board
✨ Model-Garden: https://tinyurl.com/4xxs9dvy

Some Other Relevant Apps [OSS]

⭐ Qwen-Image-Edit-2509-LoRAs-Fast-Fusion: prithivMLmods/Qwen-Image-Edit-2509-LoRAs-Fast-Fusion
⭐ Qwen-Image-Edit-2509-LoRAs-Fast: prithivMLmods/Qwen-Image-Edit-2509-LoRAs-Fast
⭐ Photo-Mate-i2i: prithivMLmods/Photo-Mate-i2i
⭐ Kontext-Photo-Mate-v2: https://huggingface.co/spaces/prithivMLmods/Kontext-Photo-Mate-v2

Note: The Nano-Banana-Pro-Sketch-Board demo requires a Gemini API key for the editing process. Your API key is removed when the app is reloaded or closed. Your key remains safe and will not be exposed to any medium. Also, the Gemini 3 Pro Preview Image model may require a paid API key from a Google Cloud project with billing enabled.

To learn more, visit the app info section or the respective Model Garden page!
prithivMLmods
posted an update 18 days ago
Try the demo of NVIDIA Nemotron Parse v1.1, NVIDIA's latest VLM for understanding document semantics and extracting text and table elements with spatial grounding. It handles comprehensive text understanding and document structure analysis for a given document, and can provide bounding boxes with coordinates.

⭐ Space [Demo]: prithivMLmods/NVIDIA-Nemotron-Parse-OCR
⭐ Model: nvidia/NVIDIA-Nemotron-Parse-v1.1
⭐ Multimodal-Spaces: https://huggingface.co/collections/prithivMLmods/multimodal-implementations

Some relevant Spaces

⭐ DeepSeek-OCR-experimental [latest transformers]: prithivMLmods/DeepSeek-OCR-experimental
⭐ Qwen3-VL-Outpost: prithivMLmods/Qwen3-VL-Outpost
⭐ Multimodal-OCR3: prithivMLmods/Multimodal-OCR3

Check out the other Spaces in the multimodal implementations collection.

To learn more, visit the app page or the respective model page!
ZennyKenny
posted an update 19 days ago
The #feedback channels of early-access app Slack workspaces are some of the best unintentional comedy material I have ever come across tbh.
prithivMLmods
posted an update 20 days ago
Try the all-new trending Qwen-Image-Edit-2509 (Multi-Image-Edits) specialized adapter demos, including Cloth-Design-Fuse, Texture Edit, Guided-Objects-Patching, and more, all in a single Hugging Face Space. The demo link is provided below. 🤗🔥

⮞ Space [Demo]: prithivMLmods/Qwen-Image-Edit-2509-LoRAs-Fast-Fusion
⮞ Collection: https://huggingface.co/collections/prithivMLmods/image-generation-apps-collection
⮞ Base Model: Qwen/Qwen-Image-Edit-2509

Similar applications ↗️

⮞ Kontext-Photo-Mate-v2: https://huggingface.co/spaces/prithivMLmods/Kontext-Photo-Mate-v2
⮞ Photo-Mate-i2i: prithivMLmods/Photo-Mate-i2i
⮞ Qwen-Image-Edit-2509-LoRAs-Fast: prithivMLmods/Qwen-Image-Edit-2509-LoRAs-Fast

To learn more, visit the app page or the respective model page!
ronantakizawa
posted an update 20 days ago
Introducing the google-trending-words dataset: a compilation of 2784 trending Google searches from 2001-2024 based on https://trends.withgoogle.com. This dataset captures search trends in 93 categories, and is perfect for analyzing cultural shifts, predicting future trends, and understanding how global events shape online behavior.

#trends #google #googlesearches

ronantakizawa/trending-words-google