Build an Auto Karaoke System with React and Vercel

Build an Auto Karaoke System with React and Vercel

Featured on Hashnode

Hello again. Today we'll create an auto karaoke system with React and Vercel.

Our project involves recognising a song from an audio sample and then displaying its lyrics to follow along. We can use ACRCloud for the audio recognition and Musixmatch to provide lyrics. Also, we will need backend code to interact with these two services. That's why we'll learn how to set up serverless functions with Vercel, to not have to create a server.

Project Setup

Let's start by initialising a Vite project with React and Typescript.

Then, we'll use the following NPM packages to get up and running:

npm i acrcloud formidable formidable-serverless jotai react-feather react-input-slider react-media-recorder

npm i -D @vercel/node @types/formidable

Song Identification

Using ACRCloud API

We need some credentials before accessing the ACRCloud API. If you haven't already, create an account at acrcloud.com and navigate to your Audio and Video Recognition Projects. On a free-tier account, you are limited to only one project. So if you already have a project, just use the credentials from there. If not, create a new project with the provided configuration.

Make sure that isrc is checked after selecting the required bucket. This will be helpful later.

Copy the host, access key and secret key, then paste them into a .env file in the root of the project.

...
ACR_HOST="identify-eu-west-1.acrcloud.com"
ACR_ACCESS_KEY="INSERT_ACCESS_KEY"
ACR_SECRET="INSERT_SECRET_KEY"

Serverless Function Setup

Install the Vercel CLI using npm i -g vercel . Now run npx vercel link to link the folder with a Vercel project. You can either link an existing project or create a new one from the command line.

Once the project has been linked, create a api folder which will store separate files for each of our serverless functions. Let's create our first function to identify a song and return its ISRC.

Create a new file named acr-identify.ts in this folder, which contains our code, accessible at http://localhost:[PORT]/api/acr-identify for the front end.

import type { VercelResponse, VercelRequest } from "@vercel/node";
import { readFile } from "fs/promises";
const acrcloud = require("acrcloud");
const formidable = require("formidable-serverless");

const acr = new acrcloud({
    host: process.env.ACR_HOST as string,
    access_key: process.env.ACR_ACCESS_KEY as string,
    access_secret: process.env.ACR_SECRET as string,
    data_type: "audio",
});

const form = formidable();

// Responds with a music ISRC for an uploaded sample
export default async function (req: VercelRequest, res: VercelResponse) {
    try {
        // Promisify form.parse so that errors can be caught with only one try-catch block
        const { files } = await new Promise<{
            fields: any;
            files: { file: any };
        }>((resolve, reject) =>
            form.parse(
                req,
                async (
                    err: string | undefined,
                    fields: any,
                    files: { file: any }
                ) => {
                    if (err) {
                        reject(err);
                    }

                    resolve({ fields, files });
                }
            )
        );

        const file = files.file as any;

        // Get the file data from the uploaded file
        const fileData = await readFile(file.path);

        const data = await acr.identify(fileData);
        const metadata = data.metadata;

        if (data.status.code !== 0) {
            throw new Error(data.status.msg);
        }

        if (metadata.music?.length === 0) {
            throw new Error("No music found");
        }

        const music = metadata.music[0];

        res.status(200).json(music.external_ids.isrc);
    } catch (err) {
        res.status(500).send((err as Error).message);
    }
}

Getting Song Lyrics

Setting up Musixmatch

Similar to ACRCloud, create an account at Musixmatch Developers if you haven't already. Once you have created an application, head over to the Applications page to find the API key. Now, add the API key to your .env file.

Creating the Function

We'll use axios to request the API, first for the Musixmatch track ID for a given ISRC, then for the track's lyrics. We use the same approach as before for creating a function, but now at /api/find-lyrics. Add the following code to the new file:

import type { VercelResponse, VercelRequest } from "@vercel/node";
import axios from "axios";

const BASE_URL = "https://api.musixmatch.com/ws/1.1/";

const baseParams = {
    apikey: process.env.MUSIXMATCH_API_KEY as string,
    format: "json",
};

// Find the lyrics for a given track by its ISRC
export default async function (req: VercelRequest, res: VercelResponse) {
    try {
        const { isrc } = req.query;

        if (!isrc) {
            res.status(400).send("Song ISRC required");
            return;
        }

        let response = await axios.get(BASE_URL + "track.get", {
            params: {
                ...baseParams,
                track_isrc: isrc,
            },
        });

        const track = response.data.message.body.track;

        response = await axios.get(BASE_URL + "track.lyrics.get", {
            params: {
                ...baseParams,
                track_id: track.track_id,
            },
        });

        const lyrics = response.data.message.body.lyrics.lyrics_body;
        const title = `${track.track_name} by ${track.artist_name}`;

        res.status(200).json({ title, lyrics });
    } catch (err) {
        res.status(500).send((err as Error).message);
    }
}

The HTTP method used by serverless functions is not specified. Therefore, we'll assume the client makes a GET request, so we read the URL parameters to get the ISRC.

Atoms

We use jotai to manage the state of our React application. Create a /src/store.ts file and add the following atom configurations:

import { atom } from "jotai";

// Stores errors returned from the serverless functions
export const errorAtom = atom("");

// Stores the song title
export const titleAtom = atom("");

// Measured in words per minute
export const songRateAtom = atom(160);

// Displaying lyrics
export const lyricsAtom = atom("");
export const currentWordAtom = atom(0);

// Allows changing the index of the next word to highlight
export const newWordAtom = atom(-1);

// Write-only atom, updates the index of the highlighted word
export const nextWordAtom = atom(null, (get, set) => {
    // Get every word in the lyrics
    const words = [...get(lyricsAtom).matchAll(/[\w']+/g)].map(
        (match) => match[0]
    );
    set(currentWordAtom, (prev) => {
        // Override the new word index with any value in `newWordAtom`
        const newWord = get(newWordAtom);
        if (newWord !== -1) {
            set(newWordAtom, -1);
            return newWord;
        }

        return (prev + 1) % words.length;
    });
});

Building the App Component

If no lyrics have been returned, we display a microphone button for the user. If not, we display the lyrics and other song information. Use the following code in /src/App.tsx :

import React from "react";
import { useAtomValue } from "jotai";
import LyricsBody from "./lyrics/LyricsBody";
import MicrophoneInput from "./MicrophoneInput";
import SongTitle from "./SongTitle";
import { lyricsAtom } from "./store";
import SongRateInput from "./SongRateInput";
import ErrorMessage from "./ErrorMessage";

export default function App() {
    const lyrics = useAtomValue(lyricsAtom);

    const renderedChild =
        lyrics === "" ? (
            <MicrophoneInput />
        ) : (
            <>
                <SongTitle />
                <LyricsBody />
                <SongRateInput />
            </>
        );

    return (
        <main className="w-full h-full px-10 flex flex-col justify-center items-center">
            <h1 className="font-bold font-sans text-6xl">KaraokeNow</h1>
            <p className="mb-5">React ⚛️ + Vite ⚡ + Replit 🌀</p>
            {renderedChild}
            <ErrorMessage />
        </main>
    );
}

Note the uses of song rate within our code. Later, we'll see how to highlight the words of the lyrics so users can follow along.

Displaying Errors

Errors that occur for our auto karaoke system will often involve songs or lyrics not being found. So, we can provide this feedback to the user using a new component, /src/ErrorMessage.tsx.

import React from "react";
import { errorAtom } from "./store";
import { useAtomValue } from "jotai";

export default function ErrorMessage() {
    const error = useAtomValue(errorAtom);
    if (!error) return null;

    return (
        <div className="bg-red-500 px-6 py-4 m-7 h-24 flex overflow-auto rounded-2xl">
            <p className="font-mono my-auto w-80 whitespace-nowrap text-center">
                {error}
            </p>
        </div>
    );
}

Recording the Audio

Let's create a new MicrophoneInput the component at /src/MicrophoneInput.tsx. Here, we will use react-media-recorder to record a certain length of audio, and send the sample to our serverless function.

File Structure

Begin by defining the structure of the file.

import React, { useEffect, useState } from "react";
import { useReactMediaRecorder } from "react-media-recorder";
import { Mic, Activity } from "react-feather";
import axios, { AxiosError } from "axios";
import { useSetAtom } from "jotai";
import { errorAtom, lyricsAtom, titleAtom } from "./store";

const RECORD_DURATION = 15000;

export default function MicrophoneInput() {
    // ...
    return (
        <button type="button">
            Implement me
        </button>
    );
}

Hook Usage

Now, we need functions to update the state when we interact with the serverless functions.

// ...
const setError = useSetAtom(errorAtom);
const setLyrics = useSetAtom(lyricsAtom);
const setTitle = useSetAtom(titleAtom);
// ...

react-media-recorder provides us with hooks to simplify media recording within React. Here we add a stop event handler which submits the audio.

const { status, startRecording, stopRecording } = useReactMediaRecorder({
    audio: true,
    onStop: (_url, blob) => {
        onSubmit(blob);
    },
});

Start Recording

Recording starts when the button is clicked and stops once RECORD_DURATION milliseconds have elapsed.

const onClick = async () => {
    if (status === "recording") {
        return;
    }

    // Clear the error when recording again
    setError("");
    await startRecording();

    // Trigger the animation to start
    setEndTime(Date.now() + RECORD_DURATION);

    await new Promise((resolve) => {
        setTimeout(resolve, RECORD_DURATION);
    });

    stopRecording();
};

Animating Recording Progress

We have to use requestAnimationFrame to update the state of our progress animation. From the code above, the animation shows a bar which reduces its height during the process of recording.

useEffect(() => {
    if (endTime === 0) return;
    const request = window.requestAnimationFrame(updateProgress);

    return () => {
        window.cancelAnimationFrame(request);
    };
}, [endTime]);

And we update the progress of the animation with this function:

const updateProgress = () => {
    // elapsed time / duration
    const recorded = Math.max(0, endTime - Date.now()) / RECORD_DURATION;

    // Reset the height to full when completed
    setProgressHeight(recorded === 0 ? 100 : recorded * 100);
    window.requestAnimationFrame(updateProgress);
};

Submitting Audio Data

And now we can create a function to submit the audio file, as described before.

const onSubmit = async (blob: Blob) => {
    const body = new FormData();
    body.append("file", blob);

    try {
        let response = await axios.post("/api/acr-identify", body);

        const trackISRC = response.data;

        response = await axios.get("/api/find-lyrics", {
            params: { isrc: trackISRC },
        });
        const { title, lyrics } = response.data;

        setTitle(title);
        setLyrics(lyrics);
    } catch (err) {
        let message: string;

        if (err instanceof AxiosError) {
            message = err.response?.data;
        } else {
            message = (err as Error).message;
        }

        setError(message);
    }
};

Rendering the Animation

Let's now render a button that incorporates all the code explained above.

return (
    <button
        type="button"
        className="bg-indigo-300 rounded-2xl border-gray-300 border-2 p-7 relative"
        onClick={onClick}
        disabled={status === "recording"}
    >
        <div
            style={{
                height: `${progressHeight}%`,
            }}
            className="absolute rounded-2xl w-full bg-indigo-500 bottom-0 left-0"
        />
        <Icon className="relative z-10 w-12 h-12" />
    </button>
);

Displaying the Song Title

In /src/SongTitle.tsx, create a component to display the identified song's title.

import React from "react";
import { useAtomValue } from "jotai";
import { titleAtom } from "./store";

export default function SongTitle() {
    const title = useAtomValue(titleAtom);
    return (
        <div
            title={title}
            className="px-4 py-2 bg-gray-400/60 rounded-3xl shadow-md mb-2"
        >
            <h4 className="font-bold text-xl">{title}</h4>
        </div>
    );
}

Changing the Singing Rate

We're going to use react-input-slider to create a slider component for controlling the song rate. In reality, the lyrics won't all be sung at the same rate, but the user still has control over how fast they want to sing karaoke.

If you're interested in more accurate lyric tracking, check out Rich sync from Musixmatch.

Create the file at /src/SongRateInput.tsx and add the following code for the component:

import React from "react";
import { useAtom } from "jotai";
import { songRateAtom } from "./store";
import Slider from "react-input-slider";

export default function SongRateInput() {
    const [rate, setRate] = useAtom(songRateAtom);

    const onChange = ({ x }: { x: number }) => setRate(x);

    return (
        <div className="max-w-2xl w-full absolute bottom-8">
            <Slider
                axis="x"
                x={rate}
                xmin={100}
                xmax={250}
                onChange={onChange}
                styles={{
                    track: {
                        width: "100%",
                    },
                }}
            />
            <p className="text-center">Song Rate (WPM)</p>
        </div>
    );
}

Displaying Lyrics

To be able to follow along with the lyrics, each word will have to be highlighted individually, as shown in the demo.

Updating the Current Word

We can create a folder named lyrics inside src. The first file of three here will be LyricsBody.tsx. This component is responsible for displaying all the words of the lyrics and updating the current highlighted word.

import React, { useEffect, useMemo } from "react";
import { useAtomValue, useSetAtom } from "jotai";
import { lyricsAtom, nextWordAtom, songRateAtom } from "../store";
import parseLyrics from "./parser";

export default function LyricsBody() {
    const lyrics = useAtomValue(lyricsAtom);
    const nextWord = useSetAtom(nextWordAtom);
    const songRate = useAtomValue(songRateAtom);

    const segments = useMemo(() => parseLyrics(lyrics), [lyrics]);

    useEffect(() => {
        let request: number;

        // Converts a rate of words per minute to milliseconds per word
        const delay = (60 * 1000) / songRate;
        let start: number;
        let previousTime: number;

        const animateWords = (time: number) => {
            if (!start) {
                start = time;
                previousTime = time;
            }

            // If it's been at least `delay` milliseconds since the last word had been highlighted
            if (time - previousTime >= delay) {
                nextWord();
                previousTime = time;
            }

            request = window.requestAnimationFrame(animateWords);
        };

        request = window.requestAnimationFrame(animateWords);

        return () => {
            window.cancelAnimationFrame(request);
        };
    }, [songRate]);

    return (
        <div className="border-dashed border-4 border-slate-400 rounded-3xl p-2 w-full">
            <p className="text-3xl font-serif leading-loose h-80 overflow-y-auto text-center whitespace-pre-wrap">
                {segments}
            </p>
        </div>
    );
}

We're using requestAnimationFrame again here to make updating the highlight colour for each word a smoother process.

But note how we parse the lyrics into segments, before rendering them. This step is crucial so that each word can be styled individually.

Parsing Lyrics

Create the parser.tsx file in the lyrics folder. Remember to keep the .tsx extension, so that we can use JSX components.

This file will export the following function to chunk up the lyrics into components of individual words.

import React from "react";
import LyricWord from "./LyricWord";

export default function parseLyrics(lyrics: string) {
    // Match all words within the text, including words with apostrophes
    const matches = [...lyrics.matchAll(/[\w']+/g)];

    // Derived array to find matches by start letter index
    const matchStarts = matches.map((match) => match.index);

    const nodes = [];

    let i = 0;
    let letter: string;

    while (i < lyrics.length) {
        letter = lyrics[i];

        // Find the match index by the start letter's index
        const matchIndex = matchStarts.indexOf(i);

        if (matchIndex !== -1) {
            const match = matches[matchIndex];
            const word = match[0];

            // The match index is the word index
            nodes.push(<LyricWord key={i} index={matchIndex} word={word} />);

            // Skip to the character after the word
            i += word.length;
            continue;
        } else if (letter === "\n") {
            // Replace newlines with <br>
            nodes.push(<br key={i} />);
        } else {
            nodes.push(letter);
        }

        i++;
    }

    return nodes;
}

Displaying Each Word of Lyric

Create a new component at /src/lyrics/LyricWord.tsx . It requires the index of the word and the value of the word itself.

Then, class names are applied to the rendered <span> if the index matches the index of the current word.

import React, { useEffect, useRef } from "react";
import { useAtomValue, useSetAtom } from "jotai";
import { currentWordAtom, newWordAtom } from "../store";

interface LyricWordProps {
    index: number;
    word: string;
}

export default function LyricWord({ index, word }: LyricWordProps) {
    const elementRef = useRef<HTMLSpanElement | null>(null);
    const currentWord = useAtomValue(currentWordAtom);
    const setNewWord = useSetAtom(newWordAtom);

    useEffect(() => {
        if (index !== currentWord || elementRef.current === null) return;
        // Scroll the higlighted word into view
        elementRef.current.scrollIntoView();
    }, [currentWord]);

    // Override the value for the current word when clicked
    const onClick = () => {
        setNewWord(index);
    };

    let highlight: string;

    const distanceToCurrent = Math.abs(index - currentWord);

    if (index === currentWord) {
        highlight = "text-purple-800 font-bold border-2 p-2 rounded-xl";
    } else if (distanceToCurrent == 1) {
        highlight = "text-purple-400";
    } else {
        highlight = "text-gray-700";
    }

    return (
        <span
            ref={elementRef}
            className={`cursor-pointer translate-y-20 ${highlight}`}
            onClick={onClick}
            title="Skip here"
        >
            {word}
        </span>
    );
}

Final notes

In conclusion, we used Vercel serverless functions to identify music and find song lyrics. Then, we created a front-end interface to highlight the words one-by-one and tie everything together.

When running the project, use npx vercel dev to make sure the serverless functions are available.

Here is something else to consider:

The lyrics returned by Musixmatch weren't complete. Their free plan only includes access to 30% of lyrics per song, and full access is only available through an enterprise plan.

An alternative to Musixmatch could be genius.com, as they have an API that returns the URL to the lyrics of a song. All we would have to do is access their API and scrape the lyrics. I may publish an article explaining web scraping and this API soon.

If you liked this article and want to explore further, you can get the entire code here.

Could you add a button to reset the lyrics? Read my article explaining Jotai to find out how.

References

Getting Started | Vite (vitejs.dev)

Identification API - ACRCloud

Documentation (musixmatch.com)

react-media-recorder - npm (npmjs.com)

Install Tailwind CSS with Vite - Tailwind CSS

Average Speaking Rate and Words per Minute (virtualspeech.com)

Composing atoms — Jotai

Using requestAnimationFrame with React Hooks | CSS-Tricks - CSS-Tricks

Did you find this article valuable?

Support CS310 by becoming a sponsor. Any amount is appreciated!