Install Whisper2Linux
This guide walks you through setting up Whisper2Linux on your Linux system. Follow the steps in order to ensure a smooth installation and configuration.
Prerequisites
Running Services
Before you begin, make sure you have completed either the Docker or the Remote setup requirements. If neither setup is running, do not continue.
whisper2linux-ollama
whisper2linux-openedai-speech-server
whisper2linux-whisper-asr
You should be running these services at this point. Whisper2Linux can only function if the services are available.
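A quick way to confirm that the services are reachable is to try opening a TCP connection to each one. The following is a minimal sketch, not part of Whisper2Linux itself: the port numbers come from the default URLs shown in Step 3, while the pairing of service names to ports and the `127.0.0.1` host are assumptions that you should adjust to your own setup.

```python
import socket

# Ports come from the default URLs in Step 3. The pairing of service name
# to port is an assumption made for illustration.
SERVICES = {
    "whisper2linux-whisper-asr": 9000,
    "whisper2linux-openedai-speech-server": 8000,
    "whisper2linux-ollama": 11434,
}


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    host = "127.0.0.1"  # assumption: replace with the machine running the services
    for name, port in SERVICES.items():
        state = "reachable" if port_is_open(host, port) else "NOT reachable"
        print(f"{name} ({host}:{port}): {state}")
```

An open port only shows that something is listening there. It does not prove that the service behind it is healthy.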
Skillset
- Administrative access to your Linux system
- Basic familiarity with the command line
- An active internet connection
Step 1: Install Dependencies
First, we need to install the necessary system packages and Python libraries.
System Packages
Install xdotool, which is used for simulating keyboard input:
For Arch-based distributions (e.g., Manjaro):
sudo pacman -S xdotool
For Debian-based distributions (e.g., Ubuntu):
sudo apt-get update
sudo apt-get install xdotool
Python and pip
Ensure you have Python 3.8 or higher installed:
python --version
If Python is not installed or is an older version, install it using your distribution's package manager.
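If you prefer to check the version from inside the interpreter you will actually run, a small script can compare it against the 3.8 minimum stated above. This is an illustrative sketch; the helper name `meets_minimum` is hypothetical.

```python
import sys

MINIMUM = (3, 8)  # minimum version stated in this guide


def meets_minimum(version, minimum=MINIMUM) -> bool:
    """Compare the (major, minor) part of a version tuple with the minimum."""
    return tuple(version[:2]) >= tuple(minimum)


if __name__ == "__main__":
    found = ".".join(str(part) for part in sys.version_info[:3])
    if meets_minimum(sys.version_info):
        print(f"Python {found} is new enough.")
    else:
        print(f"Python {found} is too old; 3.8 or higher is required.")
```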
Install pip if it's not already available:
sudo apt-get install python3-pip # For Debian-based systems
sudo pacman -S python-pip # For Arch-based systems
Python Libraries
Install the required Python packages:
pip install requests sounddevice soundfile numpy Xlib rapidfuzz
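After the install finishes, you can confirm that each library is importable without actually loading it. The sketch below assumes that the import names match the package names listed in the pip command above, which is not guaranteed for every package; `missing_modules` is a hypothetical helper.

```python
from importlib.util import find_spec

# Assumption: the import names are the same as the pip package names above.
REQUIRED_MODULES = ["requests", "sounddevice", "soundfile", "numpy", "Xlib", "rapidfuzz"]


def missing_modules(names):
    """Return the names that cannot be found on the current Python path."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

Run it with the same interpreter you will use for Whisper2Linux, since packages installed for one Python installation are not visible to another.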
Step 2: Clone the Repository
Clone the Whisper2Linux repository to your local machine:
git clone https://github.com/yourusername/whisper2linux.git
cd whisper2linux
Step 3: Configuration
Before running Whisper2Linux, you may need to configure some settings:
- Open the whisper2linux.py file in a text editor.
- Locate the following variables and update them if necessary:
  TRIGGER_WORD = "olga" # Change this if you want a different trigger word
  WHISPER_API_URL = "http://192.168.1.186:9000/asr"
  TTS_API_URL = "http://192.168.1.186:8000/v1/audio/speech"
  OLLAMA_API_URL = "http://192.168.1.186:11434/api/chat"
  OLLAMA_MODEL = "mistral-nemo" # Change this to your preferred model
  TTS_VOICE = "alloy" # Change this to your preferred voice
  Ensure that the API URLs point to your actual API endpoints if you're not using the default local setup.
- If you're using a non-standard microphone or audio setup, you may need to adjust the MIC_DEVICE and SAMPLE_RATE variables:
  MIC_DEVICE = None # Set to a specific device index if needed
  SAMPLE_RATE = 16000 # Adjust if your microphone requires a different sample rate
- Save the changes to the file.
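A mistyped URL is an easy mistake to make when editing these variables. The sketch below shows one way to sanity-check the three endpoint values before starting Whisper2Linux. It is not part of the project; `check_endpoint_url` is a hypothetical helper, and it checks only the shape of the URL, not whether the service responds.

```python
from urllib.parse import urlparse


def check_endpoint_url(url: str) -> list:
    """Return a list of problems found in an endpoint URL (empty if none)."""
    problems = []
    parts = urlparse(url)
    if parts.scheme not in ("http", "https"):
        problems.append("scheme should be http or https")
    if not parts.hostname:
        problems.append("missing host")
    try:
        parts.port  # accessing the property validates the port number
    except ValueError:
        problems.append("port is not a valid number")
    return problems


if __name__ == "__main__":
    # Default values copied from the configuration variables above.
    endpoints = {
        "WHISPER_API_URL": "http://192.168.1.186:9000/asr",
        "TTS_API_URL": "http://192.168.1.186:8000/v1/audio/speech",
        "OLLAMA_API_URL": "http://192.168.1.186:11434/api/chat",
    }
    for name, url in endpoints.items():
        problems = check_endpoint_url(url)
        print(name, "OK" if not problems else "; ".join(problems))
```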
Step 4: Setting Up API Endpoints (Optional)
If you haven't set up the required API endpoints locally, you have a few options:
- Local Setup: Follow the documentation for Whisper, a Text-to-Speech service, and Ollama to set up these services on your local machine or network.
- Remote Services: Use remote API endpoints for these services. Ensure you have the necessary API keys and update the URLs in the configuration.
- Akash Network: To use the Akash Network for running these services:
  - Set up an Akash Network account and obtain AKT tokens
  - Use the provided Docker Compose setup to deploy the services
  - Update the API URLs in the Whisper2Linux configuration to point to your Akash deployment
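Whichever option you choose, the three URLs in the configuration need to point at the new host. As a rough sketch, the values can be derived from a single host name. This assumes the deployment keeps the same ports and paths as the defaults in Step 3, which may not hold for a remote or Akash deployment where ports are remapped; `endpoint_urls` is a hypothetical helper.

```python
def endpoint_urls(host: str, scheme: str = "http") -> dict:
    """Build the three endpoint URLs for a host.

    Assumption: ports and paths match the defaults shown in Step 3.
    """
    return {
        "WHISPER_API_URL": f"{scheme}://{host}:9000/asr",
        "TTS_API_URL": f"{scheme}://{host}:8000/v1/audio/speech",
        "OLLAMA_API_URL": f"{scheme}://{host}:11434/api/chat",
    }


if __name__ == "__main__":
    # "localhost" is a placeholder; substitute your own host.
    for name, url in endpoint_urls("localhost").items():
        print(f'{name} = "{url}"')
```

The printed lines can be pasted over the corresponding variables in whisper2linux.py.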
Step 5: Running Whisper2Linux
Now that everything is set up, you can run Whisper2Linux:
python whisper2linux.py
You can also use command-line arguments to control logging:
# Run with memory logging
python whisper2linux.py --log memory
# Run with file logging
python whisper2linux.py --log file --log-file /path/to/logfile.log
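For readers curious how flags like these are typically handled, the sketch below parses `--log` and `--log-file` with the standard argparse module. Only the two flag names and the `memory` and `file` values come from this guide. The defaults and help text are assumptions, and the actual script may parse its arguments differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the two logging flags described above."""
    parser = argparse.ArgumentParser(prog="whisper2linux.py")
    parser.add_argument(
        "--log",
        choices=["memory", "file"],
        default=None,  # assumption: no extra logging unless requested
        help="where to send log output",
    )
    parser.add_argument(
        "--log-file",
        default=None,  # assumption: no default path
        help="path of the log file when --log file is used",
    )
    return parser


if __name__ == "__main__":
    # Parse an explicit argument list for demonstration.
    args = build_parser().parse_args(["--log", "file", "--log-file", "/path/to/logfile.log"])
    print(args.log, args.log_file)
```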
Step 6: Testing the Installation
- Once Whisper2Linux is running, you should see a message indicating that it's listening for the Ctrl+Alt key combination.
- Hold down Ctrl+Alt and speak a command, for example: "Olga, what's the weather like today?"
- Release the keys and wait for the response. If everything is set up correctly, you should see (or hear) a response from the AI assistant.
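The example command starts with the trigger word from Step 3. To illustrate why that matters, here is a minimal sketch of how a transcript could be checked for a leading trigger word with some tolerance for transcription errors. This is not the project's actual logic: it uses the standard-library difflib rather than rapidfuzz so that it runs without extra packages, and the `split_trigger` helper and the 0.8 threshold are assumptions.

```python
import difflib
import string

TRIGGER_WORD = "olga"  # default value from the configuration step


def split_trigger(transcript: str, trigger: str = TRIGGER_WORD, threshold: float = 0.8):
    """Return the text after the trigger word, or None if the first word does not match."""
    words = transcript.strip().split(maxsplit=1)
    if not words:
        return None
    # Drop surrounding punctuation such as the comma in "Olga,".
    first = words[0].strip(string.punctuation).lower()
    ratio = difflib.SequenceMatcher(None, first, trigger.lower()).ratio()
    if ratio < threshold:
        return None
    return words[1].strip() if len(words) > 1 else ""


if __name__ == "__main__":
    print(split_trigger("Olga, what's the weather like today?"))
    print(split_trigger("Hello there"))
```

If a command seems to be ignored during testing, one thing to check is whether the transcription actually begins with the configured trigger word.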
Troubleshooting
If you encounter any issues during setup or running Whisper2Linux:
- Check the console output for any error messages.
- Review the log file (if you've enabled logging) for more detailed information.
- Ensure all API endpoints are accessible and responding correctly.
- Verify that your microphone is working and properly detected by your system.
- If using GPU acceleration, ensure your GPU drivers are up to date.
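For the microphone check, it can help to list the input devices that the sounddevice library sees, since their indexes are the kind of value MIC_DEVICE expects. The sketch below is a hedged example rather than project code: `input_devices` is a hypothetical helper, and it relies on `sounddevice.query_devices()` returning entries with `name`, `max_input_channels`, and `default_samplerate` keys.

```python
def input_devices(devices):
    """Return (index, name, default_samplerate) for devices that can record."""
    return [
        (index, dev["name"], dev["default_samplerate"])
        for index, dev in enumerate(devices)
        if dev["max_input_channels"] > 0
    ]


if __name__ == "__main__":
    try:
        import sounddevice as sd  # installed in Step 1

        found = input_devices(sd.query_devices())
    except Exception as exc:  # missing package, missing PortAudio, or no audio backend
        print(f"Could not query audio devices: {exc}")
    else:
        if not found:
            print("No input devices detected.")
        for index, name, rate in found:
            print(f"{index}: {name} (default sample rate {rate:.0f} Hz)")
```

If your microphone appears in the list, set MIC_DEVICE to its index and compare its default sample rate with SAMPLE_RATE.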
Next Steps
Now that you have Whisper2Linux set up and running:
- Familiarize yourself with the available commands and features.
- Explore the customization options to tailor Whisper2Linux to your needs.
- Consider contributing to the project by reporting bugs, suggesting features, or submitting pull requests.
Congratulations! You've successfully set up Whisper2Linux. Enjoy your new voice-controlled Linux experience!