Install Whisper2Linux

This guide walks you through setting up Whisper2Linux on your Linux system. Follow the steps in order to ensure a smooth installation and configuration.

Prerequisites

Running Services

Before you begin, make sure you have completed either the Docker or the Remote setup requirements. Do not continue until one of these setups is running and provides the following services:

  whisper2linux-ollama
  whisper2linux-openedai-speech-server
  whisper2linux-whisper-asr

These services should be running at this point. Whisper2Linux can only function if they are available.
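A quick way to confirm that the services accept connections is to open a TCP connection to each one. The following is a minimal sketch, not part of Whisper2Linux. The host 192.168.1.186 and the ports are assumptions taken from the example URLs in Step 3, and the mapping of service names to ports is inferred from those URLs, so adjust both to your setup. A successful connection only shows that the port is open, not that the service is healthy.

```python
import socket

# Assumed host and ports, copied from the example URLs in Step 3.
# The name-to-port mapping is an inference; change it to match your deployment.
SERVICES = {
    "whisper2linux-whisper-asr": ("192.168.1.186", 9000),
    "whisper2linux-openedai-speech-server": ("192.168.1.186", 8000),
    "whisper2linux-ollama": ("192.168.1.186", 11434),
}


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False


def check_services(services=SERVICES):
    """Map each service name to True or False depending on TCP reachability."""
    return {name: port_open(host, port) for name, (host, port) in services.items()}
```

Calling check_services() returns a dictionary such as one entry per service name with a True or False value. Any False entry means that service should be fixed before continuing.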

Skillset

Administrative access to your Linux system
Basic familiarity with the command line
An active internet connection

Step 1: Install Dependencies

First, we need to install the necessary system packages and Python libraries.

System Packages

Install xdotool, which is used for simulating keyboard input:

For Arch-based distributions (e.g., Manjaro):

sudo pacman -S xdotool

For Debian-based distributions (e.g., Ubuntu):

sudo apt-get update
sudo apt-get install xdotool

Python and pip

Ensure you have Python 3.8 or higher installed:

python --version

If Python is not installed or is older than 3.8, install a current version using your distribution's package manager. On systems where the python command is not found, try python3 --version instead.
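On some distributions the python command is missing or points to a different interpreter than the one you intend to use, so it can help to check the version from inside the interpreter that will run Whisper2Linux. This is a small sketch. The 3.8 minimum comes from this guide, and the name version_ok is only illustrative.

```python
import sys

MIN_VERSION = (3, 8)  # minimum version stated in this guide


def version_ok(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


# Report the version of the interpreter that is running this script.
print("Python", sys.version.split()[0], "OK" if version_ok() else "too old")
```

Run it with the same command you plan to use for Whisper2Linux, for example python or python3.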

Install pip if it's not already available:

sudo apt-get install python3-pip  # For Debian-based systems
sudo pacman -S python-pip  # For Arch-based systems

Python Libraries

Install the required Python packages:

pip install requests sounddevice soundfile numpy Xlib rapidfuzz
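After the installation finishes, you can check that each library is visible to your interpreter without importing it. The sketch below assumes that the import names match the package names in the pip command above. That is an assumption, so adjust the list if a package uses a different import name.

```python
import importlib.util

# Import names assumed to correspond to the packages in the pip command above.
REQUIRED_MODULES = ["requests", "sounddevice", "soundfile", "numpy", "Xlib", "rapidfuzz"]


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that the current interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules()
print("Missing modules:", ", ".join(missing) if missing else "none")
```

If any names are reported as missing, the packages were probably installed for a different Python interpreter than the one running the check.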

Step 2: Clone the Repository

Clone the Whisper2Linux repository to your local machine:

git clone https://github.com/yourusername/whisper2linux.git
cd whisper2linux

Step 3: Configuration

Before running Whisper2Linux, you may need to configure some settings:

Open the whisper2linux.py file in a text editor.

Locate the following variables and update them if necessary:

TRIGGER_WORD = "olga"  # Change this if you want a different trigger word
WHISPER_API_URL = "http://192.168.1.186:9000/asr"
TTS_API_URL = "http://192.168.1.186:8000/v1/audio/speech"
OLLAMA_API_URL = "http://192.168.1.186:11434/api/chat"
OLLAMA_MODEL = "mistral-nemo"  # Change this to your preferred model
TTS_VOICE = "alloy"  # Change this to your preferred voice

The URLs above use the example address 192.168.1.186. Replace the host and port in each URL with those of your own services, for example localhost if the services run on the same machine.

If you're using a non-standard microphone or audio setup, you may need to adjust the MIC_DEVICE and SAMPLE_RATE variables:

MIC_DEVICE = None  # Set to a specific device index if needed
SAMPLE_RATE = 16000  # Adjust if your microphone requires a different sample rate

Save the changes to the file.
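To find a suitable value for MIC_DEVICE, you can list the capture devices that the sounddevice library reports. This is a sketch under the assumption that MIC_DEVICE is used as a sounddevice device index, as the comment in the configuration suggests. The helper names input_devices and list_microphones are illustrative and not part of Whisper2Linux.

```python
def input_devices(devices):
    """Return (index, name, default_samplerate) for devices that can record."""
    return [
        (index, dev["name"], dev["default_samplerate"])
        for index, dev in enumerate(devices)
        if dev["max_input_channels"] > 0
    ]


def list_microphones():
    """Print the capture devices reported by sounddevice."""
    # Imported here so that input_devices() has no third-party dependency.
    import sounddevice as sd

    for index, name, rate in input_devices(sd.query_devices()):
        print(f"{index}: {name} (default {rate:.0f} Hz)")
```

Call list_microphones() and use the printed index as MIC_DEVICE. If the default sample rate shown for your microphone differs from 16000, that is a hint that SAMPLE_RATE may need adjusting.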

Step 4: Setting Up API Endpoints (Optional)

If you haven't set up the required API endpoints locally, you have a few options:

Local Setup: Follow the documentation for Whisper, a Text-to-Speech service, and Ollama to set up these services on your local machine or network.
Remote Services: Use remote API endpoints for these services. Ensure you have the necessary API keys and update the URLs in the configuration.
Akash Network: To use the Akash Network for running these services:
- Set up an Akash Network account and obtain AKT tokens
- Use the provided Docker Compose setup to deploy the services
- Update the API URLs in the Whisper2Linux configuration to point to your Akash deployment

Step 5: Running Whisper2Linux

Now that everything is set up, you can run Whisper2Linux:

python whisper2linux.py

You can also use command-line arguments to control logging:

# Run with memory logging
python whisper2linux.py --log memory

# Run with file logging
python whisper2linux.py --log file --log-file /path/to/logfile.log

Step 6: Testing the Installation

Once Whisper2Linux is running, you should see a message indicating that it's listening for the Ctrl+Alt key combination.
Hold down Ctrl+Alt and speak a command, for example: "Olga, what's the weather like today?"
Release the keys and wait for the response. If everything is set up correctly, you should see (or hear) a response from the AI assistant.

Troubleshooting

If you encounter any issues during setup or running Whisper2Linux:

Check the console output for any error messages.
Review the log file (if you've enabled logging) for more detailed information.
Ensure all API endpoints are accessible and responding correctly.
Verify that your microphone is working and properly detected by your system.
If using GPU acceleration, ensure your GPU drivers are up to date.
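One common cause of missing responses is that the model named in OLLAMA_MODEL has not been pulled on the Ollama server. The sketch below queries Ollama's /api/tags endpoint, which lists the locally available models, and checks for the configured name. The helper names are illustrative, and the base URL in the usage note is the example address from Step 3.

```python
import json
import urllib.request


def model_available(tags_response, model):
    """Return True if model appears in a parsed /api/tags response."""
    names = [entry.get("name", "") for entry in tags_response.get("models", [])]
    # Ollama lists models with a tag suffix such as "mistral-nemo:latest",
    # so accept either the full name or the part before the colon.
    return any(name == model or name.split(":")[0] == model for name in names)


def fetch_tags(base_url, timeout=5.0):
    """Fetch and parse the model list from an Ollama server."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        return json.load(resp)
```

For example, model_available(fetch_tags("http://192.168.1.186:11434"), "mistral-nemo") should return True once the model has been pulled on that server.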

Next Steps

Now that you have Whisper2Linux set up and running:

Familiarize yourself with the available commands and features.
Explore the customization options to tailor Whisper2Linux to your needs.
Consider contributing to the project by reporting bugs, suggesting features, or submitting pull requests.

Congratulations! You've successfully set up Whisper2Linux. Enjoy your new voice-controlled Linux experience!
