---
license: mit
base_model: bharatgenai/LegalParam
tags:
  - gguf
  - llama.cpp
  - ollama
  - quantized
  - 2.9B
  - indian-law
  - legal
  - llama
  - en
---
# LegalParam GGUF Models
GGUF quantized versions of bharatgenai/LegalParam for use with Ollama.
## Model Information

**Original Model:** [bharatgenai/LegalParam](https://huggingface.co/bharatgenai/LegalParam)

- **Architecture:** ParamBharatGen (LLaMA-based)
- **Parameters:** 2.9B
- **Context Length:** 2048 tokens
- **Purpose:** Specialized AI assistant for Indian law
## Available Quantizations

| Quantization | File Size | Description | Use Case |
|---|---|---|---|
| Q4_K_M | 1.7GB | 4-bit quantized | Recommended for most use cases |
| Q6_K | 2.2GB | 6-bit quantized | Higher quality, moderate resource usage |
| F16 | 5.4GB | 16-bit float (no quantization) | Highest quality, requires more memory |
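As a rough illustration of how to pick from the table, the sketch below selects the largest quantization that fits a given memory budget. The helper name and the "file size as memory budget" simplification are ours; actual RAM use during inference is somewhat higher than the file size.

```python
# Hypothetical helper: pick a quantization from the table above based on
# available memory. Sizes are the file sizes listed in this README.

QUANTS = [
    ("Q4_K_M", 1.7),  # recommended default
    ("Q6_K", 2.2),    # higher quality
    ("F16", 5.4),     # full precision
]

def pick_quant(available_gb: float) -> str:
    """Return the highest-quality quantization whose file fits in available_gb."""
    best = None
    for name, size_gb in QUANTS:  # ordered smallest to largest
        if size_gb <= available_gb:
            best = name
    if best is None:
        raise ValueError("Not enough memory for any quantization")
    return best

print(pick_quant(3.0))  # -> Q6_K
print(pick_quant(8.0))  # -> F16
```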
## Quick Start

### 1. Install Ollama

```bash
curl -fsSL https://ollama.com/install.sh | sh
```
### 2. Create the Model

Choose a quantization level:

```bash
# Q4_K_M (Recommended - 1.7GB)
ollama create legalparam:q4 -f Modelfile

# Q6_K (Higher quality - 2.2GB)
ollama create legalparam:q6 -f Modelfile-q6

# F16 (Highest quality - 5.4GB)
ollama create legalparam:f16 -f Modelfile-f16
```
### 3. Run the Model

```bash
# Interactive chat
ollama run legalparam:q4

# Single query
ollama run legalparam:q4 "What steps should a farmer take to legally transfer agricultural land ownership?"
```
## Python Usage

```python
from ollama import Client

client = Client()
response = client.chat(model='legalparam:q4', messages=[
    {'role': 'user', 'content': 'What are the fundamental rights in the Indian Constitution?'}
])
print(response['message']['content'])
```
## Model File Details

All Modelfiles include:

- The correct chat template matching the tokenizer's format
- Pre-configured stop tokens to prevent infinite generation loops
- Parameters tuned for legal question answering
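For reference, a minimal Modelfile along these lines might look as follows. This is a sketch, not the shipped file: the GGUF filename and the choice of stop token are assumptions, and the provided Modelfiles remain authoritative.

```
FROM ./legalparam-q4_k_m.gguf

TEMPLATE """<user>
{{ .Prompt }}
<assistant>
"""

PARAMETER num_ctx 2048
PARAMETER stop "<user>"
```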
## Chat Template Format

```
<user>
{user_message}
<assistant>
{assistant_response}
```
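When sending raw prompts outside Ollama's chat API, the template above can be applied by hand. A minimal sketch (the helper name is ours, not part of the model or Ollama):

```python
# Wrap a user message in the model's chat template, leaving the
# assistant turn open so the model completes it.

def format_prompt(user_message: str) -> str:
    return f"<user>\n{user_message}\n<assistant>\n"

prompt = format_prompt("What is judicial review?")
print(prompt)
```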
## Context Window

- **Default:** 2048 tokens (combined input + output)
- **Scaling:** Can be extended with RoPE scaling in Ollama (experimental)
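Because input and output share the 2048-token window, long prompts leave less room for the answer. The sketch below estimates whether a prompt leaves a reply budget; the ~4 characters/token heuristic is a crude English-text approximation, not the model's real tokenizer.

```python
# Rough budget check for the shared 2048-token context window.
# The chars-per-token estimate is a heuristic, not the actual tokenizer.

CONTEXT_TOKENS = 2048

def fits_with_reply(prompt: str, reply_budget: int = 512) -> bool:
    est_prompt_tokens = len(prompt) // 4 + 1
    return est_prompt_tokens + reply_budget <= CONTEXT_TOKENS

print(fits_with_reply("Explain the concept of judicial review in India"))  # True
print(fits_with_reply("x" * 10000))  # ~2500 estimated tokens -> False
```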
## Example Queries
The model excels at Indian legal queries:
- "Explain the First Amendment of the Indian Constitution"
- "What is the procedure for filing a civil suit in India?"
- "What are the key provisions of the Land Acquisition Act?"
- "Explain the concept of judicial review in India"
- "What are the powers of the Supreme Court of India?"
## Technical Specifications

### Model Architecture

- Hidden size: 2048
- Layers: 32
- Attention heads: 16
- KV heads: 8 (Grouped Query Attention)
- Vocabulary: 256,006 tokens
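The GQA layout above directly determines the KV-cache footprint. A back-of-the-envelope calculation from these specs (assuming an F16 cache at the full 2048-token window; Ollama may use other cache types, so treat this as a sketch, not a measurement):

```python
# KV-cache size estimate from the architecture specs above.
# Assumes a 2-byte (F16) cache entry per value.

hidden_size = 2048
layers = 32
attn_heads = 16
kv_heads = 8          # grouped-query attention
ctx = 2048
bytes_per_value = 2   # F16

head_dim = hidden_size // attn_heads          # 128
per_token = 2 * layers * kv_heads * head_dim  # K and V across all layers
cache_bytes = per_token * ctx * bytes_per_value

print(f"{cache_bytes / 2**20:.0f} MiB")  # 256 MiB for a full window
```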
### Special Tokens

- BOS token: beginning-of-sequence marker
- EOS token: end-of-sequence marker
- `<user>`: user message marker
- `<assistant>`: assistant message marker
## Limitations
- Context limited to 2048 tokens
- Training data cutoff: August 2023
- Optimized for Indian law queries
- May not perform well on non-legal topics
## Original Model
This is a quantized version of bharatgenai/LegalParam. For the original PyTorch model, training details, and full documentation, please refer to the original repository.
## License
Please refer to the original model repository for licensing information.
## Conversion Process

These models were converted from the original HuggingFace format to GGUF using llama.cpp with the following process:

1. Loaded the original model with `transformers`
2. Converted to GGUF format
3. Quantized to Q4_K_M, Q6_K, and F16 precision
4. Validated with the Ollama inference engine
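The steps above can be sketched with llama.cpp's tooling. The paths below are placeholders and the exact script/binary names depend on your llama.cpp version, so treat this as an illustration rather than the exact commands used:

```
# Convert the HuggingFace checkpoint to an F16 GGUF
python convert_hf_to_gguf.py ./LegalParam --outfile legalparam-f16.gguf --outtype f16

# Quantize the F16 GGUF to the smaller variants
./llama-quantize legalparam-f16.gguf legalparam-q4_k_m.gguf Q4_K_M
./llama-quantize legalparam-f16.gguf legalparam-q6_k.gguf Q6_K
```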
## Troubleshooting

### Model repeats or loops

- Ensure you're using the provided Modelfiles
- Stop tokens are pre-configured to prevent infinite loops

### Out of memory errors

- Try a smaller quantization (Q4_K_M instead of Q6_K)
- Reduce the `num_ctx` parameter in Ollama

### Poor quality responses

- Try F16 quantization for highest quality
- Ensure proper prompt formatting with the `<user>` and `<assistant>` tags
## Acknowledgments
- Original model: bharatgenai/LegalParam
- GGUF conversion: llama.cpp
- Inference engine: Ollama