|
|
--- |
|
|
library_name: transformers |
|
|
tags: |
|
|
- gpt |
|
|
- byte-tokenization |
|
|
- mobile |
|
|
- embedded |
|
|
- onnx |
|
|
license: cc-by-nc-4.0 |
|
|
datasets: |
|
|
- custom |
|
|
- web |
|
|
language: en |
|
|
widget: |
|
|
- text: "In order to make pancakes, you need to" |
|
|
- text: "Once upon a time" |
|
|
--- |
|
|
|
|
|
<p align="center"> |
|
|
<img src="logo.png" alt="IJK Technology" width="150"> |
|
|
</p> |
|
|
|
|
|
<h1 align="center">IJK Technology β ByteGPT-small</h1> |
|
|
|
|
|
|
|
|
**ByteGPT-small** is a small GPT-style language model trained with byte-level tokenization inspired by the ByT5 paper. It is designed for compute- and memory-constrained devices such as mobile phones and embedded systems.
|
|
|
|
|
## Overview
|
|
- **Model Type:** GPT-style causal language model |
|
|
- **Tokenizer:** Byte-level tokenization (from ByT5) |
|
|
- **Intended Use:** Edge devices, mobile phones, embedded systems |
|
|
- **Size:** Small (initial prototype) |
|
|
- **Training:** Custom-trained from scratch |
|
|
|
|
|
## Why Byte Tokenization?
|
|
Byte tokenization offers several advantages for small-scale, efficient models: |
|
|
|
|
|
1. **Reduced Memory Footprint:** |
|
|
Byte-level tokenization shrinks the embedding layer to a few hundred entries (one per byte value, plus special tokens), rather than the tens of thousands required by subword vocabularies, making the model suitable for devices with limited RAM.
|
|
|
|
|
2. **No External Dependencies:** |
|
|
Unlike subword tokenizers (e.g., SentencePiece, BPE), byte tokenization requires no external libraries: a few lines of plain Python suffice (see the sketch after this list).
|
|
|
|
|
3. **Robustness to Noise:** |
|
|
Byte-level models degrade gracefully on misspellings, typos, and rare strings that subword vocabularies would split into unknown or poorly trained tokens.
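
The first two points are easy to demonstrate in plain Python. Below is a minimal sketch of the idea, not the exact ByteGPT tokenizer (which may reserve IDs for special tokens); the embedding width of 512 is a hypothetical value for illustration:

```python
# One token per UTF-8 byte; the round trip is lossless even for
# accented characters.
text = "Byte tokenization handles café and naïve just fine."
token_ids = list(text.encode("utf-8"))           # each ID is in 0..255
assert bytes(token_ids).decode("utf-8") == text  # lossless round trip

# The embedding table needs only ~256 rows instead of tens of thousands.
print(256 * 512)     # 131,072 embedding parameters (byte vocabulary)
print(50_000 * 512)  # 25,600,000 embedding parameters (typical BPE vocabulary)
```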
|
|
|
|
|
## Future Plans
|
|
This is the **first** in a series of models. Its small size limits how useful it is on its own, but it lays the foundation for future versions. Upcoming releases will include:
|
|
|
|
|
- **Larger Models:** Scaled-up versions with better performance |
|
|
- **Distilled Models:** Using GRPO distillation to create highly efficient small models
|
|
- **Benchmark Results:** Comparative performance on mobile devices |
|
|
|
|
|
## Usage
|
|
|
|
|
### Quick Start (with `transformers`)
|
|
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# trust_remote_code=True is required because the model class ships as
# custom code inside the repository
model = AutoModelForCausalLM.from_pretrained("ijktech/ByteGPT-small", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("ijktech/ByteGPT-small")

input_text = "What is the capital of France?"
inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
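
Alternatively, the high-level `pipeline` API should work with the same repository. A minimal sketch, assuming the repo's custom code registers for the standard `text-generation` task:

```python
from transformers import pipeline

generator = pipeline("text-generation", model="ijktech/ByteGPT-small", trust_remote_code=True)
print(generator("Once upon a time", max_new_tokens=50)[0]["generated_text"])
```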
|
|
|
|
|
### Tokenizer |
|
|
|
|
|
The tokenizer is byte-level and compatible with Hugging Face's `AutoTokenizer`:
|
|
|
|
|
```python
tokenizer = AutoTokenizer.from_pretrained("ijktech/ByteGPT-small")
```
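
Since every token is a single byte, ASCII text encodes to roughly one ID per character. A quick round-trip check (the exact count may include a few special tokens):

```python
ids = tokenizer("Hello!", return_tensors="pt")["input_ids"]
print(ids.shape[1])  # roughly one token per byte of input
print(tokenizer.decode(ids[0], skip_special_tokens=True))  # Hello!
```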
|
|
|
|
|
### ONNX |
|
|
|
|
|
The model is also available in ONNX format and can be used with ONNX Runtime:
|
|
|
|
|
```python
import onnxruntime as ort
import torch

# Create an ONNX Runtime session
ort_session = ort.InferenceSession("model.onnx")

# Helper function to generate text with the ONNX model.
# `model` and `tokenizer` come from the Quick Start snippet above;
# `model.block_size` is the model's maximum context length.
def generate_with_onnx(prompt_ids, max_new_tokens=50, temperature=1.0):
    input_ids = prompt_ids.clone()

    for _ in range(max_new_tokens):
        # Keep only the last block_size tokens if the input is too long
        if input_ids.shape[1] > model.block_size:
            input_ids = input_ids[:, -model.block_size:]

        # Run inference
        ort_inputs = {"input": input_ids.cpu().numpy()}
        logits = ort_session.run(None, ort_inputs)[0]

        # Take only the last position's predictions
        logits = torch.from_numpy(logits)[:, -1, :]

        # Apply temperature scaling
        if temperature != 1.0:
            logits = logits / temperature

        # Sample the next token from the softmax distribution
        probs = torch.nn.functional.softmax(logits, dim=-1)
        next_token = torch.multinomial(probs, num_samples=1)

        # Append the new token
        input_ids = torch.cat([input_ids, next_token], dim=1)

    return input_ids

# Test the generation
prompt = "Hello"
prompt_ids = tokenizer(prompt, return_tensors="pt")["input_ids"]
generated_ids = generate_with_onnx(prompt_ids)
generated_text = tokenizer.decode(generated_ids[0], skip_special_tokens=True)
print(f"Generated text: {generated_text}")
# Generated text: Hello everyone!
# A dinner is only available for St. Loui
```
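
The helper samples with `torch.multinomial` after temperature scaling: values below 1.0 sharpen the distribution toward the most likely bytes, while values above 1.0 flatten it. Replacing the sampling step with `logits.argmax(-1, keepdim=True)` gives deterministic greedy decoding instead.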
|
|
|
|
|
### Android Usage |
|
|
|
|
|
We've just released an Android SDK. You can find the SDK on our [GitHub](https://github.com/ijktech/ByteGPT-Android). |
|
|
|
|
|
The SDK can be included in your Android project by adding the following to your `build.gradle` file: |
|
|
|
|
|
```gradle
repositories {
    maven {
        url = uri("https://raw.githubusercontent.com/ijktech/ByteGPT-Android/maven-repo")
    }
}

dependencies {
    implementation("com.github.ijktech:ByteGPT-Android:1.0.9")
}
```
|
|
|
|
|
|
|
|
### iOS Usage |
|
|
|
|
|
Coming Soon! |
|
|
|
|
|
|
|
|
## License
|
|
**CC-BY-NC-4.0**: Free for non-commercial use.
|
|
|
|
|
**Commercial Use**: Contact IJK Technology Ltd for licensing at [[email protected]](mailto:[email protected]).
|
|
|
|
|
## About IJK Technology Ltd
|
|
IJK Technology Ltd (IJKTech) develops innovative machine learning models optimized for on-device inference. Our focus is on efficiency, privacy, and usability across mobile and embedded platforms. |